Slides: https://joinmarket.me/static/schnorrplus.pdf
This is supposed to be a developer meetup but this talk is going to be a little more on the theoretical side. The title is “Unfairly Linear Signatures” that is just a joke. It is talking about Schnorr signatures. They are something that could in the near future have a big practical significance in Bitcoin. I am not going to explain all the practical significance.
To give you an outline, I will explain what all these terms mean and how they fit in shortly. The theoretical basis, what is a Schnorr signature? Where does it come from? Why is it done the way it is done? Is it secure? That’s what these early theoretical parts are about. After that we will look at uses of Schnorr signatures, specifically in Bitcoin or related to things in Bitcoin. I’m also expanding this onto things called CoinSwaps and multisig with ECDSA. These later parts of the talk about ways we can use Schnorr signatures and to a lesser extent ECDSA to do interesting things. This first half which is more theoretical, I will get you to let me know how you feel about it. Do you find it interesting? Is it all gobbledegook? Or do you find this really cool and you want to learn more about it? In which case you may need to ask someone who knows more about it than me. My level of knowledge of this is ok, high level but I am not going to give you a full university lecture about all the theorems and all that stuff. It is more higher level. The reason I want to look at it is I think sometimes the things we use if we are developing on Bitcoin it all seems a bit magical. Why is the formula for ECDSA the way it is? What is an elliptic curve? This stuff is a bit magical. To the extent I can I want to give you an intuition and a feeling for what this object is. I think the Schnorr signature is particularly suitable for that because it is much simpler than a lot of other things. If we started to have a talk about the elliptic curve addition formula and writing out all the algebra all of it. We would have to go back into some incredible long spiel about mathematics. It is very complicated. We don’t need to do that. ECDSA is a little bit more complicated than Schnorr. Even though Schnorr is coming after ECDSA in Bitcoin, Schnorr is actually simpler to understand. That is what we are aiming to do here. The second half will be a bit more practical about the applications.
To go anywhere with this we first have to understand the idea of a commitment scheme and introduce the topic of a commitment. I’ve called it the telephone game, like Chinese whispers. Everyone knows about tossing coins. Heads or tails and you choose. That is very well known. If you wanted to do that over a communication channel, Chris and I were going to bet on a toss of coin, Chris is going to call it heads or tails and I’m going to toss the coin. The problem is that if we are over a telephone then we can’t see what each other is doing. It is not safe. If he calls “Heads” I can just say “I’ve tossed it and it was Tails”. He doesn’t know if I am lying, I probably am lying if there is any incentive. And vice versa, it is not safe for me if I toss the coin first and say it is “Heads” and Chris says “I chose Heads.” It is not safe. How do you avoid that problem over a communication channel? The marvelous thing is that it turns out that you can avoid that problem specifically if you have a one way function. If you can get something where given the output you can’t retroactively figure out the output you can achieve this goal. Everyone here is familiar with hash functions right? Using a hash I hope you can see that this is the right way to solve that problem. Maybe there is some tweaks that you’d need to make it a completely perfect protocol. The general idea is if Alice hashes the choice that he makes and then gives Bob the hash then Bob can safely reveal the result of tossing the coin. There are some details to that. If Alice just hashes the string “tails”, a 32 byte random string, and sends that over to Bob over the telephone wire or the communication channel, that would not really work. Do you know why that wouldn’t work? Bob can pre-compute the hash of “heads” and the hash of “tails” and compare the result. The protocol without that part is a failed protocol because it doesn’t have a property called hiding. A coin toss first of all needs to hide the result. The problem of doing just a hash of “tails” is that it doesn’t hide the result. Whereas if we include the long random string we are actually getting the hiding effect of whatever the result was. The other effect we have is we need it to be the case that having done this I can’t change the result of the coin toss. That’s what we call binding. That is where the hash function’s property comes in. Having computed the hash of “tails” + long random string Alice can’t then lie after Bob has revealed his result. If it is “heads” she can’t then make up a random string that gives the same hash as “heads”. If she puts “heads” here she can’t find a random string that gives the same hash output. That is second preimage resistance, I believe is the technical term. We want hash functions to have specific properties. That is the idea of a commitment. That solves that problem. It gives a way so that Alice and Bob both can’t cheat. Bob can’t know what Alice chose.
The toy example that we just gave illustrated two key properties, binding and hiding. A commitment scheme has three aspects. A set up, a commitment and an opening. A commitment is when Alice produces her hash result. An opening is when Alice says to Bob “This is the data I used to generate the hash.” That’s commitment schemes. It is very important for what we will discuss next. The zero knowledge proof of knowledge and Schnorr signatures. We can do the same thing using elliptic curve points. This is going to crop up increasingly frequently as I talk. Is everyone familiar with the idea of adding two elliptic curve points together. I don’t mean drawing out curves and all that stuff. Let’s talk about private and public keys because it is something we are all familiar with.
P=xG
G is a generator point, a point on the curve, a specific publicly known generator point multiplied by a scalar number giving another elliptic curve point. An elliptic curve scalar multiplication. It means G+G+G….
x times. It is not a fundamental operation, the fundamental operation is adding two points. It is a group with an addition operation defined for it. Of course you can add the same point to itself. We call that 2A = A+A
. We can do commitments using elliptic curve points instead of using cryptographic hash functions.
We take lower case letters to mean scalar numbers, actually numbers modulo the order of the group. We scalar multiply one elliptic curve point H by one scalar and another elliptic curve point G, the generator point, by another scalar. x is the message we are committing to and r the random string we had in the previous slide.
C = rH + xG
That is the basic idea. There is a lot of stuff. It is really interesting and very relevant to all kinds of things. We can’t do everything in one day so I’m not going to talk about that right now.
The next concept that I think is essential to understand the Schnorr signature is the zero knowledge proof of knowledge. Most people will at least have heard of a zero knowledge proof. A couple of months I started getting very interested in bulletproofs, this new system for confidentiality transactions. What is going on here? We can use a commitment scheme as a tool to help us prove that we know a secret. It is not exactly the same as the telephone game. We are going to prove that we know a secret value using a commitment but without revealing that secret value. It is already a strange thing to claim that you can do. If you have never thought about it before you should realize that every time you make a Bitcoin transaction you are doing that more or less. Because you are proving that you know the private key to validate the spending of the coin. It is connected but not exactly the same thing as a signature. How do we use a commitment scheme to prove that we know a secret without revealing it. Alice revealed her “heads” or “tails” so this is a step forward. This is another level of abstraction we are going to have to go up. We have got our secret here but we can introduce another secret just to be confusing. That new secret which is just going to be a random number I make up, I am going to commit to another random number. Then you are going to have give me a challenge value and I am going to take that challenge value and combine it in a certain way with my commitment and my secret to give you a new value which you know I can only have produced if I knew the secret. The general structure I have just described here where I commit to something, you give me a challenge and then I give you something back that proves is called a Sigma protocol. The nature of the interaction being this way, back, this way and at the end there is a verify. It very vaguely looks like a sigma when you draw it out as an interaction diagram. They often draw diagrams like this. The prover P does something sends something to the verifier V. The verifier does something and sends it toe the prover. And so on and so on.
Q - It sounds like a Diffie-Hellman exchange?
A - A Diffie-Hellman exchange is not a Sigma protocol but it has some similarities.
Here is our Sigma protocol. This is an example, it is not all Sigma protocols but it is the canonical example, the most important one. It is basically what we just said. Alice has a secret x. Bob has P which is the public key, the curve point corresponding to xG. <- means chooses randomly from the set of all scalars between 1 and n-1 where n is the curve order. Alice chooses a random k, sends R=kG, notice the pattern is same as the pattern there. She is doing scalar multiplication so that R although it is derived from k, it does expose k. In the same way as your public key does not expose your private key. She sends R over to Bob. Bob chooses a completely random e from scalar values. The last step is that Alice calculates s = k + ex. She takes her original random value k and takes the challenge value Bob gave her, multiplies by it her secret value and returns that to Bob. Can you see why Alice sending s to Bob proves to Bob that she knows the secret x? What could Bob do with this value s? If we put a G here crudely and we put on a G on everything else do you see what happens? Taking s and multiplying it, scalar multiplication, by the generator point G is the same as kG + exG. R=kG. exG = eP.
Let’s go back a step. We know the general idea that a private key times the generator point gives a public key A. aG = A. Note I am using capital letters for curve points and lower case letters for scalar numbers. bG =B. If we add these together we get aG+bG = A+B or (a+b)G = A+B. All the normal rules of multiplication and addition still apply in this context. The only thing you must avoid getting confused about is you can’t do things like this: AB. There is no such thing. We have a group and the group has an operation defined on it which is addition. It doesn’t have any multiplication defined on it. When we talking earlier about multiplication it was scalar multiplication. If you have a vector you can multiply it by 3 or 7 or 8. That is scalar multiplication. Same thing here because you just add the vector to itself n times. Or in this case add the point to itself n times. When we see equations like this we take this scalar equation, an equation with only numbers and we kind of lift it onto the curve. We multiply G but each one of those scalars. Then we look at what we’ve got. We’ve got sG, that is a new thing. Then we’ve got kG which is R and we’ve got exG which is eP. Why is that important? Because Bob the verifier has all these values. Bob has P, that was defined at the start. He has e, he chose it himself. He has R, notice he does not have k, that is very important. But he has R so he can add eP to R. He can verify whether or not it is equal to sG. If it is the claim is that Alice must have known the private key x. That’s what we are going to discuss next.
This is the center of the presentation. We want to reflect on it. What if we removed the first step? Remember there were three steps. Suppose Alice didn’t bother to send a R. The protocol involved only Bob sending an e. Alice can’t return that anymore, it doesn’t exist so she would be returning s = ex. Would that be a good idea? Would that be secure? There are two dangers here from both sides. The danger to Alice is that Bob learns her private key. We must have it so the protocol does not allow that. The danger to Bob is Alice can generate this value without knowing the private key. We could call that forgery. We want to make sure the protocol doesn’t allow either of those two dangers. If we changed it and wrote s = ex where e is sent from Bob to Alice do you see any problem with that?
Q - You could divide s by e and get x?
A - We multiply s by the modulo inverse of the value e because this is modular arithmetic. It is basically the same as dividing, you can do that because it is a scalar not a point. This version is not secure because Alice has not succeeded in hiding her value.
The first part cannot be removed because without the k in k + ex, x is easily extracted. k is needed to blind the value of x. What about if we removed the second step? If instead Alice generated k and then sent R = kG to Bob and Bob didn’t do anything at all. Then Alice said s = k+x. She sends R and then she sends s. This is a non-interactive protocol so it would be really cool if it worked. What is the problem there? How does Bob verify this? What would Bob’s verification step be? Bob never gets k in the normal protocol, he gets R. Bob would do sG = R+P. He would take the s and multiply by G. He would take the R he has been given and add it to P which is known at the start of the protocol and check if it is equal. This would be valid mathematically but the problem is that because Alice wasn’t forced to get s after an interaction she is able to do something like this. Firstly she makes up a R’ which she knows the k’. Then she subtracts P from the R’. This is what we call a key subtraction attack. It means that when Bob checks that sG = R+P he doesn’t realize that Alice did R = R’-P. She knows that R’ = k’G. When she puts that in there it just becomes (R-P) + P so P is irrelevant now. She has forged the proof that she knew x by subtracting x out of the equation. She can’t do that if we include this second step of a challenge. That is why it is an interactive challenge. k protects Alice and e protects Bob. You can do a lot more thought experiments about this. It is not the only way of achieving this goal but maybe it is the simplest.
What we just talked about is not the Sigma protocol, it is the Schnorr protocol which is one example of a Sigma protocol. The Sigma protocol concept is more general and it means anytime you have this pattern: commit, challenge, response. When I was looking at bulletproofs, the academic background of bulletproofs has several examples of the same structure but more complicated. Maybe you have several commitments or several challenges but it is the same pattern. What we looked at though is the specific example called the Schnorr identity protocol. An identity protocol because it is Alice is proving her identity in a way. She has a private key representing her identity, it is a bit like a password. She is proving that she knows the secret.
What about security? To talk about the security of this construction we need to talk about the definition of a zero knowledge proof of knowledge. I think this is quite self explanatory. It is an abstract term but keep in mind that example where someone is trying to prove they know a secret. It could be more complicated than that but let’s keep it simple. Obviously it is necessary if I do know the secret I should be able to prove it to you. Equally obviously it is necessary that if I don’t know the secret I should not be able to prove it. That is these two definitions here, completeness and soundness. Completeness is always really trivial to prove because it is just like saying “Does the equation actually balance?” If we wrote the equation wrong s = k+ (alpha)ex and no one knew alpha then we can’t produce an answer so it is not complete. Soundness, even though it is the flip concept it is much more difficult to prove that I can’t forge the secret. Zero knowledgeness is the really crazy one. I am going to prove it to you but I am not going to reveal any information doing so. This is critical for a digital signature scheme if we are going to use it for cryptocurrency. Hopefully we don’t reuse address but if we did or we do and we start making lots of signatures on one key, if somebody can get any information from that at all it could be a complete disaster. That is a concrete example of why this concept is so important. It has got much wider applicability than that. We want the proof to reveal absolutely nothing apart from one bit of information. One bit, either I know the secret or I don’t. I was thinking about that today, what if I know half the secret? Have I revealed half a bit of information? I remember reading about if you have non-randomness in your nonce generation then you have these lattice attacks, even from a tiny one bit of non-randomness you can get the secret.
This is a wacky part and if it doesn’t makes sense to you join the club. To try to prove that the proof that we have is sound, that nobody could produce the proof other than the genuine owner of the secret, we need to imagine the prover as being constrained or isolated. You can think of the prover as a program or algorithm. Obviously in real life the prover is an algorithm controlled by a human who happens to know the secret. But you can think of it as a function or a piece of code that contains within it the secret and follows the step: commit, challenge, response; and produces the correct answer and passes the test so to speak. We imagine the verifier trying to extract the secret. The idea of the soundness proof is that he can’t do that. Can he extract the secret if he cheats? Here cheats can only mean cheats with a Prover that executes as normal but we create different Provers in different universes. This sounds ridiculous. I have heard it described in different ways but the way I really like to think of it is the Prover is sitting there, maybe it is on my computer. It is a very small function. Inside that function somewhere must be the secret. It can’t produce the correct verification if the secret is not in there. So what if I’m a programmer and I start running the Prover and then I use break points, I debug it and go into debug mode. I make a second copy of the program, nothing is stopping me doing that. In the literature they usually describe in terms of rewinding and going back to an earlier step. I prefer to think of it as making two copies and going forward one step in one copy and keeping the first one earlier on. Think of the Prover as being isolated as a program that we can control. In that scenario can we construct the secret from that Prover? The program will do something like Step 1 generate a random k, produce R and output R. Step 2 on input of a random number e calculate k + ex and output s. There are a set of steps there. Parts of it don’t have to be deterministic because you generate a random number k. The point is if you can run two copies of it separately…
The three step protocol is Prover chooses k and sends R to the Verifier. The Verifier generates e and sends e to the Prover. s = k + ex is calculated and s is sent. The verification was sG = R + eP where P = xG. The Prover commits, generates k and sends R, the Prover has done Step 1. Verifier branches the Universe, he makes two copies of the program at that point. He sets the break point. He stops the program after the sending of R and makes two copies. Verifier challenges in both. Instead of this flow being one flow there are now two flows, a second flow where he sends a second e and the Prover following its algorithm will calculate a second s. s_2 = k + (e_2)x. It won’t be the same because although the k is the same, the commitment step is the same, and the x is the same, the challenge is different.
You imagine the Prover as a dumb program, it doesn’t know what it is doing. It has got a secret inside it which I represent as a dollar sign, $. The cheating Verifier, we call it the extractor, has total control of its environment. Stops it, copies it and runs it twice. If you do this protocol twice with the same k value and the same private key x you can trivially extract x and k. Take one equation s_2 = k + e_2(x) away from the other equation s = k + ex. This is a very well known weakness both in Schnorr and in ECDSA which is that you can extract the private key if you use the same k value which we we call a nonce. If we use a nonce twice on the same private key. What does that all mean? It means that if we could control it like that we could extract the secret. We can’t control like that because if we could this whole thing wouldn’t work. Every time you sign the thing I could take the private key and I control you. But I don’t control you. Why am I saying this? In a scenario that is not real. There is no God, there is no controlling the Prover. What does it prove? It proves that x is known by this machine. Imagine you have 100 dollars, God can steal 100 dollars from you because he is omnipotent. But can God steal 100 dollars from you if you don’t have 100 dollars? Even though he is omnipotent he can’t. He could imagine 100 dollars, put it in your pocket and then take it out but then he wouldn’t be stealing it from you. This is the basis of this argument. The basis is that you can only extract something that is already there. No matter how ludicrous or ridiculous the set up of the proof is the fact that the secret comes out means that the secret was in there. What this does is it proves that that protocol that we described does actually have knowledge soundness. It is impossible to forge the signature or the identity of Alice. k is essentially a private key, it is just an ephemeral private key that we use once and throw it away. Are you with familiar with the term nonce? n once, a number used once.
This is also weird but the opposite problem. To remind you we showed how to prove soundness, vaguely but we gave the general idea. Zero knowledgeness, we want to prove now that doing this protocol Alice doesn’t reveal any information to Bob. This is very important. I am going to sketch it. What we want to do is produce transcripts of the protocol that look as if they are genuine. What am I talking about transcripts? Think of this as a conversation between Prover and Verifier, Alice and Bob. Prover sends R to Verifier. Verifier sends e to Prover. Prover sends s to Verifier. You can summarize this as (R, e, s). The conversation that the Prover and the Verifier have with each other. This is a transcript of the conversation. The great insight of the original zero knowledge paper written by Shafi Goldwasser and others in the mid 1980s was imagine you have a lot of different protocol executions, lots of conversations with the same private key. It doesn’t follow by the way that if you do lots you get this problem as long as you don’t use the same k. If we have k_2 then we don’t know k, k_2 or x.
Q - What is the probability of generating the same k?
A - It depends on the number of bits. It has to be a scalar so if the group is size of 256 bits then it is 2^256. But if the group is tiny… Bitcoin is slightly less than 256 bits. It is the same level of security you have with private keys. You generate private keys randomly, you can generate as many addresses as you like in your wallet. Nobody is going to tell you “You might accidentally collide” it is not going to happen.
Notice even if you make two signatures or two conversations with the same private key as long as you use different k you are fine. Because you have three unknowns and two equations there. It will always be overdetermined. You can’t solve it. In normal operation we makes lots of these conversations. It is interesting that the values in the conversation (R, e, s) are all random numbers. Even though R is a point not a scalar it is still a random value. What if you forge lots of conversations that looked similar? Could I make another conversation (R’, e’, s’) in such a way that it would like a proper conversation to an outsider. Is it possible to do that?
Q - Do you mean convince them that the proof is valid?
A - The reason there is confusion here is because we are talking about an interactive protocol. If you and I engage in the protocol we just discussed I can prove to you that I own the private key. But if you take the conversation we had and gave it to someone else it doesn’t prove it does it? Because you had to be the one who chose the challenge.
What they said in the paper was if we can set it up so that we can create a whole set of conversations that look valid specifically in the sense of sG = R + eP being validated then it must be the case that no knowledge is exposed. If an adversary, an attacker can make lots of conversation transcripts that look exactly the same at least in terms of distribution, obviously they are not literally the same, as fake transcripts then he can’t be receiving any information when he receives real transcripts because he doesn’t know the difference. It is a very weird argument. How it works in practice that we have the notion of a Simulator, someone who simulates these conversations. All the Simulator needs to do is work backwards. That’s the trick. What makes this work is causality. The fact that I can only do that after you do that. But if we can violate causality, create s randomly, then create e randomly what would we do next? Solve for R. We can solve for R because we have s and e. You can do sG - eP = R. Notice interestingly you don’t get k but you do get R and that is all you need to make a conversation transcript that looks like a real one. If I can make an infinite number of fake conversation transcripts that look exactly the same to an attacker as real ones… By look the same I mean statistically look the same, obviously not the same numbers. You can make lots of R, s and e. If the distribution of fake transcripts is indistinguishable from the distribution of genuine ones it proves that zero information is conveyed. What statistical distribution are we talking about here? The uniform random distribution, I’m not going to try to prove that.
Q - Are you saying that in the proof of this statistics comes into it?
A - I think it could. I glaze over it when it gets to that level of detail. It wouldn’t be distributions like exponential or Poisson.
Q - You could have all these transcripts, real and fake transcripts, but there is no way to convince someone that a transcript is fake. Therefore there is no knowledge inside it. The fact that you can’t convince someone that it is safe. It could be real or fake. It is indistinguishable.
A - Absolutely.
This is an annoying add-on but it is really vitally important in practice, what is called a Fiat-Shamir transform. We have been talking about Sigma protocols and specifically the Schnorr identity protocol as an example. If we want to make it non-interactive, it is no use just having protocols where I prove to Chris, I prove to you etc. We need something where I can prove and make the proof transferable. Prove it to the whole world so everyone can see that I know the secret. To do that we need to make the protocol non-interactive, to remove this step. We just spent an hour explaining why we need that step so how the hell does that work? The way we do it is very ingenious. We use again a hash function. We need a random oracle, that is the technical term. But let’s think of it as a hash function. What we do is we hash the conversation up to the point of the challenge. In this case that is very simple. That is just the value R. If I hash R I create a random value output which we could not have predicted. Because we could not have predicted it we can’t cheat. Remember we had the problem that if you removed this step we found that it was not secure because we had k + ex and you could subtract the public key easily. But if you hash this you still do get a number here but it is not a number you can predict at the start. That is the key. We don’t know e in advance and so we can’t do an attack like that to get s without knowing x. I am trying to give you a really concrete explanation of the concept. The concept is the hash of the conversation up to the challenge and make that the challenge. That is something that Alice can do on our own. Bob doesn’t need to be there. She just hashes R. It is like simulating the challenge. So now the Schnorr signature. What is it? It is exactly that, what we have just discussed. Using that Fiat-Shamir trick because otherwise we can’t make a signature non-interactively. Of course we have to put the message somewhere. We include the message in the hash. We may or may not, it is better if we do, also include the pubkey. It doesn’t actually hurt if you think about it. You can put anything in there. We put the message that we are signing. Now it is s = k + H(m | P | R) x. e has changed from being a random number to being a hash of the message and crucially the point R and possibly the point P. We have changed our identity protocol into a signature scheme. Hash one-wayness enforces ordering of steps in absence of Verifier enforcement. A | B represents concatenation of A and B. m and P and R are all hashed together.
We’ve got most of the theory out of the way. These next few points are a bit vague. We talked about zero knowledge but we didn’t really talk about the fact that that proof is really limited. It is what is called an Honest Verifier’s zero knowledge because we are assuming that the e value, the challenge given by the Verifier is random and not deliberately malformed in some way. The way we get round the fact that all these proofs were based on interactivity is we program the random oracle. The hash function is what is called a random oracle. Does that phrase mean anything to anyone? Forget about hash functions for a minute. Our task is to create is something that creates a random output given any input but does it deterministically. Imagine I’m the random oracle, you give me an input string “Hello”. I’ve got this massive table and I write in my table “Hello”, output “03647….”. The point is that if anyone says “Hello” I am going to respond with the same thing. But if someone says “Goodbye” I’m going to make up a new random value. A hash function is an attempt to create that effect using some mathematical tricks.
Q - Is the hash function secret? Are you announcing what the hash function is?
A - When they go through all these security proofs they use this concept of the random oracle model. The thing about hash functions is they are actual concrete mathematical functions. We don’t know for sure that they behave exactly the way we intend them to. From a theorist point of view they define a random oracle as an ideal of what the hash function should do which is to output deterministic random values for any input. There are no secrets involved. It is just that you can’t predict what the output will be given the input.
Q - You are able to calculate the output because you know the hash function. But does the person who is giving you the input know what the hash function is?
A - It is the same as any hash function in that regard. Anyone can do it. The point is that they can’t predict the output so you get this forced causality.
The Simulator programs the random oracle. Program the random oracle to output the e that you want it to output in order to make this proof still work even though we are not interacting anymore. Matthew Green has got two really good blog posts on zero knowledge proofs. In the second one he talks about this and how it is a really dodgy proof. I’ll let the experts argue about that.
So what security do we have? We have some concept that all of this is sort of secure. We half proved that you can’t forge it and you can’t extract the private key. But it was all based on a very basic assumption which is the very first thing I wrote on the whiteboard was P = xG, the public key and the private key. The idea is that you can’t get x from P. In the elliptic curve context that is called the elliptic curve discrete logarithm problem. The problem of extracting the elliptic curve discrete log of P which is x given only P. It is not impossible because you could simply look through every possible value of x and you could find it. We call it that computationally hard because we don’t have algorithms that calculate it quickly. We treat it as a one way function but it is not perfectly true. Given enough computational power you could find the input simply by exhaustive search. There are algorithms that allow you to do it. ECDLP is lost to a sufficiently powerful quantum computer. It can be shown that if an attacker can extract the private key from a Schnorr signature, they can also solve the ECDLP. We can say under a certain set of assumptions if the elliptic curve discrete logarithm problem is not cracked, solved in sub exponential time then the Schnorr signature is secure.
The claim is based on an argument along the lines of what we’ve just discussed. If you have an adversary that is able to crack the signature scheme, an adversary that with one of this type of protocol can extract the x just by verifying or looking at it. Let’s say you are able to do that with probability epsilon, you are not going to crack it every time, just one time in ten let’s say. If we have such an adversary that adversary can crack the elliptic curve discrete logarithm problem. There’s two slightly different things. Here’s a Schnorr signature. Let’s say I claim that I am the holder of the private key and I can make a signature. And let’s say Chris actually holds the private key and I don’t. If I can impersonate Chris by making a Schnorr signature it doesn’t necessarily mean that I have the key. It might be that there is something broken. What this argument says is that if I can impersonate Chris it means I can solve the elliptic curve discrete logarithm problem because I can extract the discrete log by impersonating twice. This same argument we used before that if you use two different challenges on the same commitment you end up being able to extract x. Roughly your chance of success in extracting the discrete log is the square of epsilon. That is less. Instead of one tenth we are talking about one hundredth. They talk about these arguments as security reductions. The signature scheme is secure because we can’t solve the elliptic curve discrete log. If the signature scheme was not secure then I could do this and I could solve ECDLP. I don’t think anyone can solve ECDLP therefore the signature scheme is secure. This is what they call a reduction.
Q - You are proving they are equivalent problems. If you can solve one you can solve another.
A - But the annoying thing is that they are not literally equivalent.
Q - You have the complexity classes like P, NP. Non deterministic polynomial classes which are NP problems and we are simulating them in exponential time nowadays. You have the concept of reduction. What this means is if you solve one problem in a NP class and all the problems have the property of reduction with one problem you can solve all of them.
I’m not sure if I’ve got much to say about this. I was reading about what is digital signature security. There seems to be a lot of different concepts. It makes sense but this is getting complicated now. You can talk about obvious things like is the signature scheme against someone extracting the private key? But also is this secure against forgery? Also is this secure against forgery if I’ve already got several examples of the signature? Also is it secure against forgery if I manage to do several signatures in parallel? I don’t know much about that stuff.
Q - It would need to be secure against all of those?
A - I think this in bold is the strongest definition of security. “Security against existential forgery under chosen message attack.” If I can get you to output signatures for messages that I choose you still can’t create a new signature.
I want to make a few notes about ECDSA. I’ll stick to the same notation. P = xG. k = nonce. For ECDSA we have a different equation for the signature. Anyone want to tell me? s = k^(-1) (H(m) + rx) We publish the signature as two values, r and s. With the Schnorr signature we publish as either e and s or R and s. (r , s) is an ECDSA signature. What is r? r is the x coordinate of R (R_x). R = kG. This is the same as Schnorr but we then do something very different. We take the x coordinate. If I take the signature I have been given and take the modular inverse of it which is annoying, and multiply it by the hash of the message times G plus the x coordinate of the R value, which is published as part of the signature, multiplied by the public key and then take the x coordinate it should be equal to r. That is how we verify a ECDSA signature.
s^(-1) (H(m)G + rP)|_x = r
As you can see the formula is more complicated. It uses modular inverse when you are creating it and when you are verifying it. It also does this weird thing of taking the x coordinate of a point. There are two or three different things that are a bit strange about it. This was created I think in 1992. It was created because shortly before that Claus Schnorr had patented the Schnorr signature and so the simpler version that we just studied they were not able to use. They had to use a different formula. I don’t know how they came up with this exact form. I am sure it has some good properties and some bad properties. I will note a couple of bad properties. Firstly because r is a x coordinate, there are going to be two points on the curve with the same x coordinate. This can be explained easily without pictures. The formula for our elliptic curve is y^2 = x^3 + 7. Given a particular x we have two different y values. Two different points (x,y) and (x,-y) on the curve for which this verification is possible. All we are checking is a x coordinate. That creates what you might call an intrinsic malleability in the signature scheme because it means that if (r,s) is a valid signature so is (r,-s). BIP 66 made sure that only one of these is standard in Bitcoin now. If you have a really old wallet it might generate signatures which don’t broadcast properly on the network. ECDSA fundamentally has that property, we can’t get around it because it is using the x coordinate. You could argue it is just a technical problem. Cryptographers have a serious problem with that because it is a bad property for a signature scheme to have. You shouldn’t have a situation where given a signature I can make another one that is different even though I don’t have the key. That is not desirable. There is a security reduction which we mentioned on the previous slide. It is difficult but as far as I understand it the cryptographers are not able to find anywhere as clear a security reduction for ECDSA. It makes sense because of this weird x coordinate thing. I have a link to a paper (Vaudenay) if anyone is interested.
Q - The safety of Bitcoin shows that it actually works?
A - I think that is a fair point. Maybe it depends on the way you use it and if you change the way you use it… If we are in a position of not knowing it is not good. I am not trying to claim that the Schnorr signature is perfect. Given a certain set of assumptions it seems that you can construct a very elegant proof.
I do want to mention this. No linearity (especially over nonces due to funky use of x-coordinate). If you look at the Schnorr equation, it is very nice and linear. If you add the two equations together because there are no non-linear terms you get something that is consistent.
What if we add signatures (s = k + ex) together? Alice and Bob can do a signature, add them together, we won’t know each other nonce values or private keys but we’ll be adding them. For that to be linear the e’s can’t be different. e must commit to both nonces like e = H(R_A + R_B | P_A + P_B | m). But this is insecure. It is insecure for the reason that we touched right at the start. Why is that insecure?
Q - Alice can choose her public key to cancel out Bob’s public key.
A - That’s right.
If keys P produced ephemerally, open to direct key subtraction attack, last player can delete everyone else’s key, disaster for multisig. This attack only applies if we don’t have the public keys fixed in advance. Or if the Verifier doesn’t know the separate the public keys. Generally speaking it is not a secure approach because we can subtract other people’s pubkeys. We will go round the room and add our public keys together. The last person will make a public key that subtracts all the other ones. This is rather annoying because then you can spend the multisig on its own. It is supposed to be five people agree and he does it all on his own. That’s terrible, absolute disaster. This is called key aggregation. What you want is to make a signature where you publish on the blockchain only one public key. That public key I’m calling the aggregated public key here. In order to do that we are going to have to interact with each other unfortunately. Interact, share each other’s R values and share each other’s public keys and then build this rather complicated equation for the signature. What I am writing here is specifically the construction that they are calling Musig. We are aggregating keys and the Verifier is going to be able to look at it as just one key. That is very good in terms of saving space and privacy. But it does require that all of us interact in order to create the signature. There is more than one way to do it. It turned out about a month ago that there was a flaw in the security proof of one way of doing it. But it was ok because there is another way of doing it changing it from two rounds of interaction to three rounds of interaction. That doesn’t make that much difference. You should check the latest version of the paper because they changed something. The basic gist of it is that you combine and interact your keys and make it safe from this attack so it is not the case that someone can do it on their own. And you have this property that it is private because you only have one signature and public key on the chain. Other people can’t tell even that it is a multisig. Not only do they not know who they don’t even know it is a multisig. They only see a Schnorr signature. That is why this is so exciting to a lot of Bitcoin developers. They have been really focused on it for a long time.
Q - You could put a MAST script in that as well?
A - Taproot also uses that linearity. I have deliberately avoided talking about it here but it is a very important additional point.
As well as Musig there is something called Bellare-Neven which existed before Musig which is another way of doing the same thing. It doesn’t aggregate the keys in the same way because it would require you to publish all the public keys. Have people used a multisig before? If you ever do it in Bitcoin you will know you will see that all the public keys are published and however many signatures you need are also published. It is a lot of data. Bellare-Neven has a cleaner security proof but since it requires your keys for verification you would have to publish them. Remember with Bitcoin verification is not one person it is everyone. You would have to publish all the keys. The latest version of the Musig paper talks about three rounds of interaction because you commit to these R values first. Everyone doesn’t just handover their R values, they hash them and they send the hash of the R values. That forces you to fix your R value before knowing anyone else’s R value. There is a link here. There are different ways we can use one or both of these constructions (Musig and Bellare-Neven) which are slightly different but have this goal of allowing us to aggregate either the keys or the signatures. We could have multiple signatures aggregated together using these clever constructions. It is all based on the Schnorr signature and combining multiple Schnorr signatures together.
Q - The per transaction aggregation would mean there is just one transaction in a block?
A - A weird detail that you might not immediately think of is each input, it is not the same message that you are signing. Even though you are signing the same transaction you are signing a slightly different version of it. I think it is with Bellare-Neven that you can make one signature for all the inputs. Per block aggregation sounds intriguing.
Q - That sounds like it is batch validation. You add up all the signatures in a block and validate them at once. It says whether the block is valid or not.
A - That is something I forgot to include in all this. Batch validation of Schnorr signatures is possible but it needs a trick that is similar to what we talked about where you have the problem of things being subtracted. If you use the verification equation for lots of different signatures and add them altogether it has a similar security problem. It turns out it is easily solved by adding random terms to each part of the equation. That is another topic that I don’t know much about.
I don’t think we have time to do CoinSwap. We’ve got how you can use a Schnorr signature to make CoinSwap better. Something called an adaptor signature. So what is a CoinSwap?
Q - You can think of it as Alice and Bob want to trade coins and they do it by way of putting their coins in escrow and releasing the escrow securely. In practice it used for privacy. Alice’s coins can end up being possessed by Bob and the other way round. A bit like a mixer that can’t run off with your coins.
Q - A private atomic swap?
Q - An atomic swap with the intention of privacy instead of trading.
Here is an approach that I and several other people worked on last year mostly although it has been talked about for many years. Greg Maxwell came up with the idea in 2013. The blue and orange is input and output. It is Alice and Carol, not Alice and Bob. The interesting part is at the bottom here where you have a HTLC (hash timelocked contract). The idea is that this output can be spent in one of two ways. Either somebody knows a secret value or after a timeout someone can spend it.
Q - Is that inside a script?
A - Yes inside a Bitcoin script. You have CLTV (CheckLockTimeVerify) or CSV. With CLTV you put a block number in there and it checks that against the nLockTime field. The nLockTime field only allows you to broadcast… The details don’t matter. The point is you put a block height in the script and that branch of the transaction is not valid until after that block.
What we are doing is linking two transaction outputs together and trying to create a situation where if one gets spent the other one gets spent. If it doesn’t get spent at all then it times out and the money goes back to the start, whoever paid it. We set up these weird complicated scripts but they are not private because they have a special secret value which links them. What we do is create an alternative output which looks like a normal output and even both parties agree then the money goes to the normal outputs. It is only if one of the parties don’t cooperate then we have to use the non-secret. It is non-secret because it has a preimage of a hash which is a very special value that anyone can link. The only reason I included CoinSwap in this presentation is that it is one example of how the linearity in the Schnorr signature allows us to do clever things.
We are by now familiar with the idea that s = k + ex where e is the hash of the message, the R value and possibly the P value. What you do here is you embed a secret into the nonce. You have some secret value which we are calling t and you add it to k. Effectively the nonce is now k + t and not k. Apart from that it is the same before notice we also put T here. This is the same pattern, we have just added an extra variable.
s = k + H(m|R|P)x -> s = k + t + H(m|R+T|P)x
We are working with a counterparty and we want to swap coins let’s say. That’s the obvious application. I am going to give Chris not this secret t value, but its corresponding public key T. I am also going to give him something else. This is the clever idea and it comes from Andrew Poelstra. What you do is not just give him that public value but give him something which is like a signature but is not a valid signature. We call that an adaptor signature.
s’ = k + H(m|R+T|P)x
Why is that not a valid signature? Because there is a T here. The rule of the signature is that this hash is supposed to be the message, the nonce point R and the public key. Here it is not the nonce point, it doesn’t correspond. If you went through those equations it would not work out because this k doesn’t correspond to the private key of that. I send him the adaptor signature. Even though it is not valid signature he can verify that it is an adaptor signature because if he multiplies by G he gets what he expects to get. He takes s’ which I have given him and multiplies by G and he asks himself is it or is not equal to:
s’G ?= R + H(m|R+T|P)P
I gave him T before. He is doing the verification equation but he is missing out that there. Having verified that what does it tell him? It tells him that if the Prover gives t then you can build a real signature. The starting step was I as Prover create t, set T=tG, send to Verifier T and s’ the adaptor signature. The Verifier checks the adaptor signature fits this equation which is not a proper signature verification equation because there is no +T
which there should be to match. But he can still check it and he knows that if that check passes and it later the Prover gives him t, the Verifier can construct the signature himself. All he does is take s’ and add t s+t
. He doesn’t know the private key. s’ blinds the private key in the same way as an ordinary signature does. By adding this t he will be able to create a valid signature using the Prover’s private key. That’s the basic idea. I won’t go through the details, it is complicated. The idea is we cooperate to make a multisig Schnorr key. The idea is that if we take two different adaptor signatures with the same t then when I spend my one I am going to reveal t. You are going to take t and apply it to your signature. There are two transactions. One is paying out to me and one is paying out to you. If you are the one who has t, if you spend your one and publish the signature it exposes t and allows me to add t to my signature and broadcast my transaction. Functionally it is exactly the same as this thing here. This says “Pay to Carol if she knows the preimage of a hash value.” It is the same thing. Here we are using this like a hash function. t is like the preimage and T is like the hash value. This is all coming from the linearity. The idea that signatures are added together or subtracted to do interesting things. There is another idea called discreet log contracts which is different but it is a similar idea where you use the fact that Schnorr signatures are linear. When one gets published on the chain you can add or subtract it and get some new secret. Or maybe claim some coins. There are many variants of that idea. There is a blog post about it here. It is worth mentioning deniability. The cool thing is when you publish a signature like this to the chain to claim the coins my counterparty is able to subtract s’ to get t. Nobody else in the world has any idea, it is just a random number. A very powerful idea.
You can try to do something similar to what I just described using ECDSA but it requires a different cryptographic structure called a Paillier crypto system. It is completely separate to Bitcoin. It is more similar to RSA than to elliptic curves. It has some special property that might allow us to do something similar to that using ECDSA. Then maybe we can do completely private CoinSwaps today without Schnorr signatures. I have no idea when Schnorr signatures are coming to Bitcoin.
Q - MuSig was invented specifically for Bitcoin. It is being built from the ground up.
A - I wouldn’t say from the ground up. That Bellare-Neven scheme is similar. If you look at the MuSig paper it has a tonne of references to people who have tried to do this kind of thing before. All of these schemes were broken due to variants to that thing where you subtract other people’s keys away. It is not just a question of being broken, it is also a question of is the security proof actually valid? If we don’t have a proper proof of security it is scary. They have a detailed security proof for MuSig, it is really complicated. About one month ago Gregory Neven plus some other people wrote a paper explaining why the security proof of MuSig with only two rounds is insecure. Two rounds means we will make our R values and send them. Then we use the formula to make our keys and signature. We hand out our signatures and we add them together to get the result. This paper from Gregory Neven et al showed that the security proof is not valid. It was an argument against MuSig and a threshold signature scheme. It is not that we know the two round version of the protocol is insecure, it is that we don’t have a security proof of it that is valid. The three round has a solid security proof.
Q - Is Schnorr sufficiently battle tested?
A - The basic Schnorr signature is a much stronger security proof than ECDSA. Doing this clever multisig stuff. That is another matter. It is a heavily studied problem.
Q - It needs exposure to the real world?
Q - The simpler the better?
A - You would think so but with key subtraction the simplest possible way to combine signatures is completely insecure.
Community-maintained archive to unlocking knowledge from technical bitcoin transcripts