hash it 1,000,000 times. This gives the first "random" number.
Hashed 999,999 times gives the second number, etc. ie:
- R(1,000,000) = seed
- R(N-1) = SHA256(R(N))
+ R(0) = seed
+ R(N+1) = SHA256(R(N))
This way the remote node needs only to remember the last R(N) it was
-given, and it can calculate any R for N-1 or below.
+given, and it can calculate any R for N+1 or above.
However, this means we need to generate 1 million hashes up front, and
then do almost as many hashes to derive the next number. That's slow.
-A More Complex Solution
------------------------
-
-Instead of a one-dimensional chain, we can use two dimensions: 1000
-chains of 1000 values each. Indeed, we can set generate the "top" of
-each chain much like we generated a single chain:
-
- Chain 1000 Chain 999 Chain 998 ...........Chain 1
- seed SHA256(C1000) SHA256(C999) ....... SHA256(C2)
-
-Now, deriving chain 1000 from seed doesn't quite work, because it'll
-look like this chain, so we flip the lower bit to generate the chain:
-
- Chain 1000 Chain 999 Chain 998 ...........Chain 1
-1000 seed^1 SHA256(C1000)^1 SHA256(C999)^1...... SHA256(C2)^1
- 999 SHA256(above) SHA256(above) SHA256(above) ..... SHA256(above)
- 998 SHA256(above) SHA256(above) SHA256(above) ..... SHA256(above)
- ...
-
-Now, we can get the first value to give out (chain 1, position 1) with
-999 hashes to get to chain 1, and 999 hashes to get to the end of the
-chain. 2000 hashes is much better than the 999,999 hashes it would
-have taken previously.
-
-Why Stop at 2 Dimensions?
--------------------------
-
-Indeed, the implementation uses 64 dimensions rather than 2, and a
-chain length of 2 rather than 1000, giving a worst-case of 63 hashes
-to derive any of 2^64 values. Each dimension flips a different bit of
-the hash, to ensure the chains are distinct.
-
-For simplicity, I'll explain what this looks like using 8 dimensions,
-ie. 8 bits. The seed value always sits the maximum possible index, in
-this case index 0xFF (b11111111).
-
-To generate the hash for 0xFE (b11111110), we need to move down
-dimension 0, so we flip bit 0 of the seed value, then hash it. To
-generate the hash for 0xFD (b11111101) we need to move down dimension
-1, so we flip bit 1 of the seed value, then hash it.
-
-To reach 0xFC (b11111100) we need to move down dimension 1 then
-dimension 0, in that order.
-
-Spotting the pattern, it becomes easy to derive how to reach any value:
-
- hash = seed
- for bit in 7 6 5 4 3 2 1 0:
- if bit not set in index:
- flip(bit) in hash
- hash = SHA256(hash)
-
-Handling Partial Knowledge
---------------------------
-
-How does the remote node, which doesn't know the seed value, derive
-subvalues?
-
-Once it knows the value for index 1, it can derive the value for index
-0 by flipping bit 0 of the value and hashing it. In effect, it can
-always derive a value for any index where it only needs to clear bits.
-
-So, index 1 gives index 0, but index 2 doesn't yield index 1. When
-index 3 comes along, it yields 2, 1, and 0.
-
-How many hash values will we have to remember at once? The answer is
-equal to the number of dimensions. It turns out that the worst case
-for 8 dimensions is 254 (0b11111110), for which we will have to
-remember the following indices:
-
- 127 0b01111111
- 191 0b10111111
- 223 0b11011111
- 239 0b11101111
- 247 0b11110111
- 251 0b11111011
- 253 0b11111101
- 254 0b11111110
-
-127 lets us derive any hash value for index <= 127. Similarly, 191
-lets us derive anything > 127 but <= 191. 254 lets us derive only
-itself.
-
-When we get index 255 this collapses, and we only need to remember
-that one index to derive everything.
+A Tree Solution
+---------------
+
+A better solution is to use a binary tree, with the seed at the root.
+The left child is the same as the parent, the right child is the
+SHA256() of the parent with one bit flipped (corresponding to the
+height).
+
+This gives a tree like so:
+
+ seed
+ / \
+ / \
+ / \
+ / \
+ seed SHA256(seed^1)
+ / \ / \
+ seed SHA256(seed^2) SHA256(seed^1) SHA256(SHA256(seed^1)^2)
+Index: 0 1 2 3
+
+Clearly, giving R(2) allows you to derive R(3), giving R(1) allows you
+to derive nothing new (you still have to remember R(2)), and giving
+R(0) allows you to derive everything.
+
+In pseudocode, this looks like the following for a 64 bit tree:
+
+generate_from_seed(index):
+ value = seed
+ for bit in 0 to 63:
+ if bit set in index:
+ flip(bit) in value
+ value = SHA256(value)
+ return value
+
+
+The Receiver's Tree
+-------------------
+
+To derive the value for a index N, you need to have the root of a tree
+which contains it. That is the same as needing an index I which is N
+rounded down in binary: eg. if N is 0b001100001, you need 0b001100000,
+0b001000000 or 0b000000000.
+
+Pseudocode:
+
+# Can we derive the value for to_index from from_index?
+can_derive(from_index, to_index):
+ # to_index must be a subtree under from_index; this is the same as
+ # saying that to_index must be the same as from_index up to the
+ # trailing zeros in from_index.
+ for bit in count_trailing_zeroes(from_index)..63:
+ if bit set in from_index != bit set in to_index:
+ return false
+ return true
+
+# Derive a value from a lesser index: generalization of generate_from_seed()
+derive(from_index, to_index, from_value):
+ assert(can_derive(from_index, to_index))
+ value = from_value
+ for bit in 0..63:
+ if bit set in to_index and not bit set in from_index:
+ flip bit in value
+ value = SHA256(value)
+ return value
+
+If you are receiving values (in reverse order), you need to remember
+up to 64 of them to derive all previous values. The simplest method
+is to keep an array, indexed by the number of trailing zeroes in the
+received index:
+
+# Receive a new value (assumes we receive them in order)
+receive_value(index, value):
+ pos = count_trailing_zeroes(index)
+ # We should be able to generate every lesser value, otherwise invalid
+ for i in 0..pos-1:
+ if derive(index, value, known[i].index) != known[i].value:
+ return false
+ known[pos].index = index
+ known[pos].value = value
+ return true
+
+To derive a previous value, find an element in that array from which
+you can derive the value you want, eg:
+
+# Find an old value
+regenerate_value(index):
+ for i in known:
+ if can_derive(i.index, index):
+ return derive(i.index, i.value, index)
+ fail
+
+You can see the implementation for more optimized variants of the
+above code.
Rusty Russell <rusty@rustcorp.com.au>