From 54b56dc5b853700981e837574e88fdc0673319c9 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Mon, 25 May 2015 14:34:15 +0930 Subject: [PATCH] crypto/shachain: add design document. Signed-off-by: Rusty Russell --- ccan/crypto/shachain/design.txt | 114 ++++++++++++++++++++++++++++++++ 1 file changed, 114 insertions(+) create mode 100644 ccan/crypto/shachain/design.txt diff --git a/ccan/crypto/shachain/design.txt b/ccan/crypto/shachain/design.txt new file mode 100644 index 00000000..02f4c6f6 --- /dev/null +++ b/ccan/crypto/shachain/design.txt @@ -0,0 +1,114 @@ +Efficient Chains Of Unpredictable Numbers +========================================= + +The Problem +----------- + +The Lightning Network wants a chain of (say 1 million) unguessable 256 +bit values; we generate them and send them one at a time to a remote +node. We don't want the remote node to have to store all the values, +so it's better if they can derive them once they see them. + +A Simple Solution +----------------- + +A simple system is a hash chain: we select a random seed value, the +hash it 1,000,000 times. This gives the first "random" number. +Hashed 999,999 times gives the second number, etc. ie: + + R(1,000,000) = seed + R(N-1) = SHA256(R(N)) + +This way the remote node needs only to remember the last R(N) it was +given, and it can calculate any R for N-1 or below. + +However, this means we need to generate 1 million hashes up front, and +then do almost as many hashes to derive the next number. That's slow. + +A More Complex Solution +----------------------- + +Instead of a one-dimensional chain, we can use two dimensions: 1000 +chains of 1000 values each. Indeed, we can set generate the "top" of +each chain much like we generated a single chain: + + Chain 1000 Chain 999 Chain 998 ...........Chain 1 + seed SHA256(C1000) SHA256(C999) ....... SHA256(C2) + +Now, deriving chain 1000 from seed doesn't quite work, because it'll +look like this chain, so we flip the lower bit to generate the chain: + + Chain 1000 Chain 999 Chain 998 ...........Chain 1 +1000 seed^1 SHA256(C1000)^1 SHA256(C999)^1...... SHA256(C2)^1 + 999 SHA256(above) SHA256(above) SHA256(above) ..... SHA256(above) + 998 SHA256(above) SHA256(above) SHA256(above) ..... SHA256(above) + ... + +Now, we can get the first value to give out (chain 1, position 1) with +999 hashes to get to chain 1, and 999 hashes to get to the end of the +chain. 2000 hashes is much better than the 999,999 hashes it would +have taken previously. + +Why Stop at 2 Dimensions? +------------------------- + +Indeed, the implementation uses 64 dimensions rather than 2, and a +chain length of 2 rather than 1000, giving a worst-case of 63 hashes +to derive any of 2^64 values. Each dimension flips a different bit of +the hash, to ensure the chains are distinct. + +For simplicity, I'll explain what this looks like using 8 dimensions, +ie. 8 bits. The seed value always sits the maximum possible index, in +this case index 0xFF (b11111111). + +To generate the hash for 0xFE (b11111110), we need to move down +dimension 0, so we flip bit 0 of the seed value, then hash it. To +generate the hash for 0xFD (b11111101) we need to move down dimension +1, so we flip bit 1 of the seed value, then hash it. + +To reach 0xFC (b11111100) we need to move down dimension 1 then +dimension 0, in that order. + +Spotting the pattern, it becomes easy to derive how to reach any value: + + hash = seed + for bit in 7 6 5 4 3 2 1 0: + if bit not set in index: + flip(bit) in hash + hash = SHA256(hash) + +Handling Partial Knowledge +-------------------------- + +How does the remote node, which doesn't know the seed value, derive +subvalues? + +Once it knows the value for index 1, it can derive the value for index +0 by flipping bit 0 of the value and hashing it. In effect, it can +always derive a value for any index where it only needs to clear bits. + +So, index 1 gives index 0, but index 2 doesn't yield index 1. When +index 3 comes along, it yields 2, 1, and 0. + +How many hash values will we have to remember at once? The answer is +equal to the number of dimensions. It turns out that the worst case +for 8 dimensions is 254 (0b11111110), for which we will have to +remember the following indices: + + 127 0b01111111 + 191 0b10111111 + 223 0b11011111 + 239 0b11101111 + 247 0b11110111 + 251 0b11111011 + 253 0b11111101 + 254 0b11111110 + +127 lets us derive any hash value for index <= 127. Similarly, 191 +lets us derive anything > 127 but <= 191. 254 lets us derive only +itself. + +When we get index 255 this collapses, and we only need to remember +that one index to derive everything. + +Rusty Russell -- 2.30.2