From: Rusty Russell
Date: Mon, 25 May 2015 05:04:15 +0000 (+0930)
Subject: crypto/shachain: add design document.
X-Git-Url: https://git.ozlabs.org/?p=ccan;a=commitdiff_plain;h=54b56dc5b853700981e837574e88fdc0673319c9
crypto/shachain: add design document.
Signed-off-by: Rusty Russell
---
diff --git a/ccan/crypto/shachain/design.txt b/ccan/crypto/shachain/design.txt
new file mode 100644
index 00000000..02f4c6f6
--- /dev/null
+++ b/ccan/crypto/shachain/design.txt
@@ -0,0 +1,114 @@
+Efficient Chains Of Unpredictable Numbers
+=========================================
+
+The Problem
+-----------
+
+The Lightning Network wants a chain of (say 1 million) unguessable 256
+bit values; we generate them and send them one at a time to a remote
+node. We don't want the remote node to have to store all the values,
+so it's better if they can derive them once they see them.
+
+A Simple Solution
+-----------------
+
+A simple system is a hash chain: we select a random seed value, the
+hash it 1,000,000 times. This gives the first "random" number.
+Hashed 999,999 times gives the second number, etc. ie:
+
+ R(1,000,000) = seed
+ R(N-1) = SHA256(R(N))
+
+This way the remote node needs only to remember the last R(N) it was
+given, and it can calculate any R for N-1 or below.
+
+However, this means we need to generate 1 million hashes up front, and
+then do almost as many hashes to derive the next number. That's slow.
+
+A More Complex Solution
+-----------------------
+
+Instead of a one-dimensional chain, we can use two dimensions: 1000
+chains of 1000 values each. Indeed, we can set generate the "top" of
+each chain much like we generated a single chain:
+
+ Chain 1000 Chain 999 Chain 998 ...........Chain 1
+ seed SHA256(C1000) SHA256(C999) ....... SHA256(C2)
+
+Now, deriving chain 1000 from seed doesn't quite work, because it'll
+look like this chain, so we flip the lower bit to generate the chain:
+
+ Chain 1000 Chain 999 Chain 998 ...........Chain 1
+1000 seed^1 SHA256(C1000)^1 SHA256(C999)^1...... SHA256(C2)^1
+ 999 SHA256(above) SHA256(above) SHA256(above) ..... SHA256(above)
+ 998 SHA256(above) SHA256(above) SHA256(above) ..... SHA256(above)
+ ...
+
+Now, we can get the first value to give out (chain 1, position 1) with
+999 hashes to get to chain 1, and 999 hashes to get to the end of the
+chain. 2000 hashes is much better than the 999,999 hashes it would
+have taken previously.
+
+Why Stop at 2 Dimensions?
+-------------------------
+
+Indeed, the implementation uses 64 dimensions rather than 2, and a
+chain length of 2 rather than 1000, giving a worst-case of 63 hashes
+to derive any of 2^64 values. Each dimension flips a different bit of
+the hash, to ensure the chains are distinct.
+
+For simplicity, I'll explain what this looks like using 8 dimensions,
+ie. 8 bits. The seed value always sits the maximum possible index, in
+this case index 0xFF (b11111111).
+
+To generate the hash for 0xFE (b11111110), we need to move down
+dimension 0, so we flip bit 0 of the seed value, then hash it. To
+generate the hash for 0xFD (b11111101) we need to move down dimension
+1, so we flip bit 1 of the seed value, then hash it.
+
+To reach 0xFC (b11111100) we need to move down dimension 1 then
+dimension 0, in that order.
+
+Spotting the pattern, it becomes easy to derive how to reach any value:
+
+ hash = seed
+ for bit in 7 6 5 4 3 2 1 0:
+ if bit not set in index:
+ flip(bit) in hash
+ hash = SHA256(hash)
+
+Handling Partial Knowledge
+--------------------------
+
+How does the remote node, which doesn't know the seed value, derive
+subvalues?
+
+Once it knows the value for index 1, it can derive the value for index
+0 by flipping bit 0 of the value and hashing it. In effect, it can
+always derive a value for any index where it only needs to clear bits.
+
+So, index 1 gives index 0, but index 2 doesn't yield index 1. When
+index 3 comes along, it yields 2, 1, and 0.
+
+How many hash values will we have to remember at once? The answer is
+equal to the number of dimensions. It turns out that the worst case
+for 8 dimensions is 254 (0b11111110), for which we will have to
+remember the following indices:
+
+ 127 0b01111111
+ 191 0b10111111
+ 223 0b11011111
+ 239 0b11101111
+ 247 0b11110111
+ 251 0b11111011
+ 253 0b11111101
+ 254 0b11111110
+
+127 lets us derive any hash value for index <= 127. Similarly, 191
+lets us derive anything > 127 but <= 191. 254 lets us derive only
+itself.
+
+When we get index 255 this collapses, and we only need to remember
+that one index to derive everything.
+
+Rusty Russell