Signing with the Private Key — Bitcoin Wallet Architecture


Now that you know how to derive a public key from a private key, you can already receive bitcoins by using this pair of keys as a spending condition. But how do you spend them? To spend bitcoins, you need to unlock the scriptPubKey attached to your UTXO to prove that you are indeed its legitimate owner. To do this, you must produce a signature $s$ that matches the public key $K$ present in the scriptPubKey, using the private key $k$ that was initially used to calculate $K$. The digital signature is thus irrefutable proof that you are in possession of the private key associated with the public key you claim.

Elliptic Curve Parameters

To perform a digital signature, all participants must first agree on the parameters of the elliptic curve used. In the case of Bitcoin, the parameters of secp256k1 are as follows:

The finite field $\mathbb{Z}_p$ defined by:

$$ p = 2^{256} - 2^{32} - 977 $$

p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F

$p$ is a very large prime number slightly less than $2^{256}$.

The elliptic curve $y^2 = x^3 + ax + b$ over $\mathbb{Z}_p$ defined by:

$$ a = 0, \quad b = 7 $$

The generator point or origin point $G$:

G = 0x0279BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798

This number is the compressed form, which only gives the abscissa of point $G$. The prefix 02 at the beginning determines which of the two points having this abscissa $x$ is to be used as the generator point (here, the one whose $y$ coordinate is even).

The order $n$ of $G$ (the number of points in the group generated by $G$) and the cofactor $h$:

n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

$n$ is a very large number slightly less than $p$.

$$ h=1 $$

$h$ is the cofactor, that is, the ratio between the total number of points on the curve and the order $n$ of the group generated by $G$. I will not elaborate on what this represents here, as it’s quite complex, and in the case of Bitcoin, we do not need to take it into account since it is equal to $1$.

All this information is public and known to all participants. With it, users are able to make a digital signature and verify it.
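As a quick sanity check on these parameters, here is a minimal Python sketch (not taken from any wallet implementation) that decompresses $G$ and confirms it satisfies the curve equation. The helper name `decompress` is purely illustrative, and the square-root shortcut relies on the fact that $p \bmod 4 = 3$ for this particular prime.

```python
# secp256k1 domain parameters, copied from the values listed above.
p = 2**256 - 2**32 - 977
a, b = 0, 7
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G_COMPRESSED = bytes.fromhex(
    "0279BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798"
)


def decompress(point):
    """Recover (x, y) from a 33-byte compressed point (illustrative helper)."""
    prefix, x = point[0], int.from_bytes(point[1:], "big")
    if prefix not in (2, 3):
        raise ValueError("expected prefix 02 or 03")
    y_squared = (pow(x, 3, p) + a * x + b) % p
    # Because p % 4 == 3, a square root of y_squared is y_squared^((p+1)/4) mod p.
    y = pow(y_squared, (p + 1) // 4, p)
    if y * y % p != y_squared:
        raise ValueError("x is not the abscissa of a point on the curve")
    # Prefix 02 selects the even y, prefix 03 the odd one.
    if y % 2 != prefix - 2:
        y = p - y
    return x, y


Gx, Gy = decompress(G_COMPRESSED)

assert p % 4 == 3 and p.bit_length() == 256
assert (Gy * Gy - (Gx**3 + a * Gx + b)) % p == 0  # G lies on the curve
assert n < p
```

The same function would return the other candidate point if the prefix were 03, which is exactly the role the text attributes to that first byte.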

Signature with ECDSA

The ECDSA algorithm allows a user to sign a message using their private key, in such a way that anyone knowing the corresponding public key can verify the validity of the signature, without the private key ever being revealed. In the context of Bitcoin, the message to be signed depends on the sighash chosen by the user. It is this sighash that will determine which parts of the transaction are covered by the signature. I will talk more about this in the next chapter.

Here are the steps to generate an ECDSA signature:

First, we calculate the hash ($e$) of the message that needs to be signed. The message $m$ is thus passed through a cryptographic hash function, generally SHA256 or double SHA256 in the case of Bitcoin:

$$ e = \text{HASH}(m) $$

Next, we calculate a nonce. In cryptography, a nonce is simply a number generated in a random or pseudo-random manner that is used only once. Each time a new digital signature is made with this pair of keys, it is essential to use a different nonce: if the same nonce is used to sign two different messages, anyone who sees both signatures can compute the private key. We therefore pick a random and unique integer $r$ such that $1 \leq r \leq n-1$, where $n$ is the order of the generator point $G$ of the elliptic curve.

Then, we will calculate the point $R$ on the elliptic curve with the coordinates $(x_R, y_R)$ such that:

$$ R = r \cdot G $$

We extract the value of the abscissa of the point $R$ ($x_R$). This value represents the first part of the signature. And finally, we calculate the second part of the signature $s$ in this manner:

$$ s = r^{-1} \left( e + k \cdot x_R \right) \mod n $$

where:

$r^{-1}$ is the modular inverse of $r$ modulo $n$, that is, an integer such that $r \cdot r^{-1} \equiv 1 \mod n$;
$k$ is the user's private key;
$e$ is the hash of the message;
$n$ is the order of the generator point $G$ of the elliptic curve.

The signature is then simply the concatenation of $x_R$ and $s$:

$$ \text{SIG} = x_R \Vert s $$
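The signing steps above can be sketched in Python as follows. This is a teaching sketch under stated assumptions, not production code: the curve arithmetic is a naive affine implementation (not constant-time), it needs Python 3.8+ for `pow(x, -1, n)`, the message is hashed with double SHA256, and the function names, demo key and demo nonce are all hypothetical. The last function illustrates why a nonce must never be reused: two signatures sharing the same $r$ are enough to solve for $k$.

```python
import hashlib
import secrets

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(scalar, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while scalar:
        if scalar & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        scalar >>= 1
    return result


def hash_message(m):
    """e = HASH(m), here double SHA256 interpreted as a big-endian integer."""
    return int.from_bytes(hashlib.sha256(hashlib.sha256(m).digest()).digest(), "big")


def ecdsa_sign(k, m, r=None):
    """Return (x_R, s) for private key k; r is the nonce (random if omitted)."""
    e = hash_message(m)
    if r is None:
        r = secrets.randbelow(N - 1) + 1  # 1 <= r <= n - 1
    x_R = point_mul(r, G)[0] % N
    s = pow(r, -1, N) * (e + k * x_R) % N
    if x_R == 0 or s == 0:
        raise ValueError("degenerate signature, retry with another nonce")
    return x_R, s


def recover_key_from_reused_nonce(sig1, m1, sig2, m2):
    """Solve the two signing equations for k when both used the same nonce."""
    (x_R, s1), (_, s2) = sig1, sig2
    e1, e2 = hash_message(m1), hash_message(m2)
    r = (e1 - e2) * pow((s1 - s2) % N, -1, N) % N
    return (s1 * r - e1) * pow(x_R, -1, N) % N


k = 123456789          # hypothetical demo private key
r = 987654321          # hypothetical nonce, deliberately reused below
msg1, msg2 = b"pay Alice", b"pay Bob"
sig1 = ecdsa_sign(k, msg1, r)
sig2 = ecdsa_sign(k, msg2, r)
assert recover_key_from_reused_nonce(sig1, msg1, sig2, msg2) == k
```

In normal use, `ecdsa_sign(k, m)` would be called without the third argument so that a fresh nonce is drawn for every signature.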

Verification of the ECDSA Signature

To verify a signature $(x_R, s)$, anyone knowing the public key $K$ and the parameters of the elliptic curve can proceed in this way:

First, verify that $x_R$ and $s$ are within the interval $[1, n-1]$. This ensures that the signature respects the mathematical constraints of the elliptic group. If this is not the case, the verifier immediately rejects the signature as invalid.

Then, calculate the hash of the message:

$$ e = \text{HASH}(m) $$

Calculate the modular inverse of $s$ modulo $n$:

$$ s^{-1} \mod n $$

Calculate two scalar values $u_1$ and $u_2$ in this way:

$$ \begin{align*} u_1 &= e \cdot s^{-1} \mod n \\ u_2 &= x_R \cdot s^{-1} \mod n \end{align*} $$

And finally, calculate the point $V$ on the elliptic curve such that:

$$ V = u_1 \cdot G + u_2 \cdot K $$

The signature is valid only if $x_V \equiv x_R \mod n$, where $x_V$ is the $x$ coordinate of the point $V$. Indeed, since $K = k \cdot G$ and $s = r^{-1} \left( e + k \cdot x_R \right) \mod n$, combining $u_1 \cdot G$ and $u_2 \cdot K$ gives:

$$ V = s^{-1} \cdot \left( e + k \cdot x_R \right) \cdot G = r \cdot G = R $$

So if the signature is valid, the point $V$ is exactly the point $R$ used during the signature.
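The verification steps can be sketched in Python in the same spirit. As before, this is only an illustration: the curve arithmetic is naive, double SHA256 is assumed for the message hash, Python 3.8+ is required, and the names `ecdsa_verify` and `hash_message` as well as the demo key and nonce are hypothetical. A signature is built inline with the signing formula so that the check can be run end to end.

```python
import hashlib

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(scalar, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while scalar:
        if scalar & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        scalar >>= 1
    return result


def hash_message(m):
    """e = HASH(m), here double SHA256 interpreted as a big-endian integer."""
    return int.from_bytes(hashlib.sha256(hashlib.sha256(m).digest()).digest(), "big")


def ecdsa_verify(K, m, sig):
    """Return True if sig = (x_R, s) is valid for public key point K and message m."""
    x_R, s = sig
    if not (1 <= x_R < N and 1 <= s < N):
        return False                      # range check on both components
    e = hash_message(m)
    s_inv = pow(s, -1, N)
    u1, u2 = e * s_inv % N, x_R * s_inv % N
    V = point_add(point_mul(u1, G), point_mul(u2, K))
    return V is not None and V[0] % N == x_R


# Hypothetical demo values; a real nonce must be random and never reused.
k, r, msg = 123456789, 987654321, b"hello"
K = point_mul(k, G)
x_R = point_mul(r, G)[0] % N
s = pow(r, -1, N) * (hash_message(msg) + k * x_R) % N

assert ecdsa_verify(K, msg, (x_R, s))
assert not ecdsa_verify(K, b"another message", (x_R, s))
```

Note that the verifier only ever handles $K$, $m$ and the signature: the private key $k$ appears in the demo solely to produce something to verify.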

Signature with the Schnorr Protocol

The Schnorr signature scheme is an alternative to ECDSA that offers many advantages. It has been possible to use it in Bitcoin since 2021 and the introduction of Taproot, with the P2TR script template. Like ECDSA, the Schnorr scheme allows signing a message using a private key, in such a way that the signature can be verified by anyone knowing the corresponding public key.

In the case of Schnorr, the exact same curve as ECDSA is used, with the same parameters. However, public keys are represented slightly differently: they are designated only by the $x$ coordinate of the point on the elliptic curve. Unlike ECDSA, where compressed public keys are represented by 33 bytes (with the prefix byte indicating the parity of $y$), Schnorr uses 32-byte public keys, corresponding only to the $x$ coordinate of the point $K$, and it is assumed that $y$ is even by default. This simplified representation saves one byte per public key and facilitates certain optimizations in the verification algorithms. The public key is then the $x$ coordinate of the point $K$:

$$ \text{pk} = K_x $$

The first step to generate a signature is to hash the message. But unlike ECDSA, the message is hashed together with other values, and a labeled hash function is used to avoid collisions between different contexts. A labeled hash function (called a "tagged hash" in BIP340) simply mixes a context-specific label into the hash function's inputs alongside the message data.


In addition to the message, two other values are passed into the labeled function: the $x$ coordinate of the public key, $K_x$, and the point $R = r \cdot G$ calculated from the nonce $r$. This nonce is a unique integer for each signature, calculated deterministically from the private key and the message to avoid the vulnerabilities related to nonce reuse. Just like for the public key, only the $x$ coordinate of the nonce point, $R_x$, is retained to describe the point.

The result of this hashing noted $e$ is called the "challenge":

$$ e = \text{HASH}(\text{``BIP0340/challenge''}, R_x \Vert K_x \Vert m) \mod n $$

Here, $\text{HASH}$ is the SHA256 hash function, and $\text{``BIP0340/challenge''}$ is the specific tag for the hashing.
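For reference, BIP340 defines its tagged hash as SHA256 applied to the SHA256 digest of the tag, repeated twice, followed by the data. A minimal Python sketch of the challenge computation could look like this. The function names `tagged_hash` and `challenge` are my own, and I assume $R_x$ and $K_x$ are each serialized as 32 big-endian bytes.

```python
import hashlib

N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def tagged_hash(tag, data):
    """SHA256(SHA256(tag) || SHA256(tag) || data), as specified in BIP340."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + data).digest()


def challenge(R_x, K_x, m):
    """e = HASH("BIP0340/challenge", R_x || K_x || m) mod n."""
    data = R_x.to_bytes(32, "big") + K_x.to_bytes(32, "big") + m
    return int.from_bytes(tagged_hash("BIP0340/challenge", data), "big") % N
```

Because the tag is hashed into the input, the same data hashed under two different tags gives unrelated digests, which is what keeps the different contexts from colliding.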

Finally, the parameter $s$ is calculated from the private key $k$, the nonce $r$, and the challenge $e$ as follows:

$$ s = (r + e \cdot k) \mod n $$

The signature is then simply the pair $R_x$ and $s$.

$$ \text{SIG} = R_x \Vert s $$

Verification of the Schnorr Signature

The verification of a Schnorr signature is simpler than that of an ECDSA signature. Here are the steps to verify the signature $(R_x, s)$ with the public key $K_x$ and the message $m$. First, we verify that $K_x$ is a valid integer less than $p$. If this is the case, we retrieve the corresponding point on the curve with $K_y$ being even. We also extract $R_x$ and $s$ by splitting the signature $\text{SIG}$. Then, we check that $R_x < p$ and $s < n$ (the order of the curve). Next, we calculate the challenge $e$ in the same way as the issuer of the signature:

$$ e = \text{HASH}(\text{``BIP0340/challenge''}, R_x \Vert K_x \Vert m) \mod n $$

Then, we calculate a reference point on the curve in this way:

$$ R' = s \cdot G - e \cdot K $$

Finally, we verify that $R'_x = R_x$. If the two x-coordinates match, then the signature $(R_x, s)$ is indeed valid with the public key $K_x$.
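Putting the signing and verification steps together, here is a simplified Python sketch in the style of BIP340. Several points are assumptions or simplifications on my part: the nonce is derived from a tagged hash of the private key, public key and message without the auxiliary randomness that BIP340 also mixes in; the private key and the nonce are negated when their points have an odd $y$, so that the "even $y$" convention holds; and the verifier also checks that $R'$ has an even $y$, a detail of BIP340 that the steps above do not spell out. The curve arithmetic is naive and not constant-time, and all function names are illustrative.

```python
import hashlib

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(scalar, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while scalar:
        if scalar & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        scalar >>= 1
    return result


def tagged_hash(tag, data):
    t = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(t + t + data).digest()


def b32(x):
    return x.to_bytes(32, "big")


def challenge(R_x, K_x, m):
    return int.from_bytes(tagged_hash("BIP0340/challenge", b32(R_x) + b32(K_x) + m), "big") % N


def lift_x(x):
    """Return the curve point with abscissa x and even y, or None if there is none."""
    if x >= P:
        return None
    y_sq = (pow(x, 3, P) + 7) % P
    y = pow(y_sq, (P + 1) // 4, P)
    if y * y % P != y_sq:
        return None
    return x, (y if y % 2 == 0 else P - y)


def schnorr_sign(k, m):
    K = point_mul(k, G)
    if K[1] % 2:
        k = N - k                         # use the key whose point has even y
    # Simplified deterministic nonce (assumption: no auxiliary randomness).
    r = int.from_bytes(tagged_hash("BIP0340/nonce", b32(k) + b32(K[0]) + m), "big") % N
    if r == 0:
        raise ValueError("nonce is zero, cannot sign")
    R = point_mul(r, G)
    if R[1] % 2:
        r = N - r                         # same convention for the nonce point
    s = (r + challenge(R[0], K[0], m) * k) % N
    return b32(R[0]) + b32(s)             # SIG = R_x || s


def schnorr_verify(K_x, m, sig):
    K = lift_x(K_x)
    if K is None or len(sig) != 64:
        return False
    R_x, s = int.from_bytes(sig[:32], "big"), int.from_bytes(sig[32:], "big")
    if R_x >= P or s >= N:
        return False
    e = challenge(R_x, K_x, m)
    R_prime = point_add(point_mul(s, G), point_mul(N - e, K))  # s*G - e*K
    return R_prime is not None and R_prime[1] % 2 == 0 and R_prime[0] == R_x


k = 123456789                                   # hypothetical demo private key
m = hashlib.sha256(b"hello schnorr").digest()   # 32-byte message
K_x = point_mul(k, G)[0]
sig = schnorr_sign(k, m)
assert schnorr_verify(K_x, m, sig)
```

Because the nonce derivation is simplified, signatures produced by this sketch should not be expected to match the official BIP340 test vectors byte for byte, even though they pass the verification procedure described above.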

Why does this work?

The signer has calculated $s = r + e \cdot k \mod n$, so $R' = s \cdot G - e \cdot K$ should be equal to the original point $R$, because:

$$ s \cdot G = (r + e \cdot k) \cdot G = r \cdot G + e \cdot k \cdot G $$

Since $K = k \cdot G$, we have $e \cdot k \cdot G = e \cdot K$. Thus:

$$ R' = s \cdot G - e \cdot K = r \cdot G + e \cdot K - e \cdot K = r \cdot G = R $$

Therefore, we have:

$$ R'_x = R_x $$

The advantages of Schnorr signatures

The Schnorr signature scheme offers several advantages for Bitcoin over the original ECDSA algorithm. First, Schnorr allows for the aggregation of keys and signatures. This means that multiple public keys can be combined into a single key.


And similarly, multiple signatures can be aggregated into a single valid signature. Thus, in the case of a multisignature transaction, a set of participants can sign with a single signature and a single aggregated public key. This significantly reduces storage and computation costs for the network, as each node only needs to verify a single signature.
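The reason aggregation is possible is that the Schnorr equation $s = r + e \cdot k$ is linear in the nonce and the key. The Python sketch below illustrates this with two signers. It is deliberately naive and should not be used as is: simply adding public keys is vulnerable to rogue-key attacks, which is why real protocols such as MuSig2 add per-key coefficients and a nonce exchange. It also ignores the even-$y$ convention and hashes full points with plain SHA256, so it is not BIP340-compatible. All keys, nonces and names are hypothetical.

```python
import hashlib

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(scalar, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while scalar:
        if scalar & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        scalar >>= 1
    return result


def challenge(R, K, m):
    """Illustrative challenge: SHA256 over both full points and the message."""
    data = b"".join(c.to_bytes(32, "big") for c in (*R, *K)) + m
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


k1, k2 = 111111, 222222      # hypothetical private keys of the two signers
r1, r2 = 333333, 444444      # hypothetical nonces (random and unique in practice)
m = b"2-of-2 spend"

K = point_add(point_mul(k1, G), point_mul(k2, G))   # aggregated key K1 + K2
R = point_add(point_mul(r1, G), point_mul(r2, G))   # aggregated nonce R1 + R2
e = challenge(R, K, m)

s1 = (r1 + e * k1) % N       # partial signature of signer 1
s2 = (r2 + e * k2) % N       # partial signature of signer 2
s = (s1 + s2) % N            # aggregated signature scalar

# A verifier only sees (R, s) and K, and checks the usual equation s*G = R + e*K.
assert point_mul(s, G) == point_add(R, point_mul(e, K))
```

From the verifier's point of view, the pair $(R, s)$ checked against $K$ looks exactly like a signature made by a single key.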


Moreover, signature aggregation improves privacy. With Schnorr, it becomes impossible to distinguish a multisignature transaction from a standard single-signature transaction. This homogeneity makes chain analysis more difficult, as it limits the ability to identify wallet fingerprints.

Finally, Schnorr also offers the possibility of batch verification. By verifying multiple signatures simultaneously, nodes can gain efficiency, especially for blocks containing many transactions. This optimization reduces the time and resources needed to validate a block.

Also, Schnorr signatures are not malleable, unlike signatures produced with ECDSA. This means that an attacker cannot modify a valid signature to create another valid signature for the same message and the same public key. This vulnerability was previously present in Bitcoin and notably prevented the secure implementation of the Lightning Network. It was resolved for ECDSA with the SegWit soft fork in 2017, which moves signatures out of the data used to compute the transaction identifier and into a separate witness structure, so that altering a signature no longer changes the identifier of the transaction.
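To make this ECDSA malleability concrete, here is a short worked derivation based on the verification equations given earlier in this chapter. Take a valid signature $(x_R, s)$ and consider $s' = n - s$. Since $s' \equiv -s \mod n$, the two scalars computed by the verifier simply change sign:

$$
\begin{align*}
u_1' &= e \cdot s'^{-1} \equiv -u_1 \mod n \\
u_2' &= x_R \cdot s'^{-1} \equiv -u_2 \mod n \\
V' &= u_1' \cdot G + u_2' \cdot K = -(u_1 \cdot G + u_2 \cdot K) = -V
\end{align*}
$$

On an elliptic curve, $V$ and $-V$ share the same $x$ coordinate and differ only by the sign of $y$, so $x_{V'} = x_V \equiv x_R \mod n$. The pair $(x_R, n - s)$ is therefore also a valid signature for the same message and the same public key, even though whoever produced it never needed the private key.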

Why did Satoshi choose ECDSA?

As we have seen, Satoshi initially chose to implement ECDSA for digital signatures in Bitcoin. Yet, we have also seen that Schnorr is superior to ECDSA in many aspects, and this protocol was created by Claus-Peter Schnorr in 1989, 20 years before the invention of Bitcoin.

Well, we don't really know why Satoshi didn't choose it, but a likely hypothesis is that this protocol was under patent until 2008. Although Bitcoin was created a year later, in January 2009, no open-source standardization for Schnorr signatures was available at that time. Perhaps Satoshi deemed it safer to use ECDSA, which was already widely used and tested in open-source software and had several recognized implementations (notably the OpenSSL library used until 2015 in Bitcoin Core, then replaced by libsecp256k1 in version 0.10.0). Or maybe he simply wasn't aware that this patent was going to expire in 2008. In any case, the most probable hypothesis seems related to this patent and the fact that ECDSA had a proven history and was easier to implement.