Chapter 4: Cryptography - Public Keys - 《Mastering Ethereum》

Public Keys

Public Keys

An Ethereum public key is a point on an elliptic curve, meaning it is a set of x and y coordinates that satisfy the elliptic curve equation.

In simpler terms, an Ethereum public key is two numbers, joined together. These numbers are produced from the private key by a calculation that can only go one way. That means that it is trivial to calculate a public key if you have the private key, but you cannot calculate the private key from the public key.

Warning

MATH is about to happen! Don’t panic. If you start to get lost at any point in the following paragraphs, you can skip the next few sections. There are many tools and libraries that will do the math for you.

The public key is calculated from the private key using elliptic curve multiplication, which is practically irreversible: K = k * G, where k is the private key, G is a constant point called the generator point, K is the resulting public key, and * is the special elliptic curve “multiplication” operator. Note that elliptic curve multiplication is not like normal multiplication. It shares functional attributes with normal multiplication, but that is about it. For example, the reverse operation (which would be division for normal numbers), known as “finding the discrete logarithm”—i.e., calculating k if you know K—is as difficult as trying all possible values of k (a brute-force search that will likely take more time than this universe will allow for).

In simpler terms: arithmetic on the elliptic curve is different from “regular” integer arithmetic. A point (G) can be multiplied by an integer (k) to produce another point (K). But there is no such thing as division, so it is not possible to simply “divide” the public key K by the point G to calculate the private key k. This is the one-way mathematical function described in Public Key Cryptography and Cryptocurrency.

Note

Elliptic curve multiplication is a type of function that cryptographers call a “one-way” function: it is easy to do in one direction (multiplication) and impossible to do in the reverse direction (division). The owner of the private key can easily create the public key and then share it with the world, knowing that no one can reverse the function and calculate the private key from the public key. This mathematical trick becomes the basis for unforgeable and secure digital signatures that prove ownership of Ethereum funds and control of contracts.

Before we demonstrate how to generate a public key from a private key, let’s look at elliptic curve cryptography in a bit more detail.

Elliptic Curve Cryptography Explained

Elliptic curve cryptography is a type of asymmetric or public key cryptography based on the discrete logarithm problem as expressed by addition and multiplication on the points of an elliptic curve.

A visualization of an elliptic curve is an example of an elliptic curve, similar to that used by Ethereum.

Note	Ethereum uses the exact same elliptic curve, called secp256k1, as Bitcoin. That makes it possible to reuse many of the elliptic curve libraries and tools from Bitcoin.

Figure 1. A visualization of an elliptic curve

Ethereum uses a specific elliptic curve and set of mathematical constants, as defined in a standard called secp256k1, established by the US National Institute of Standards and Technology (NIST). The secp256k1 curve is defined by the following function, which produces an elliptic curve:

y 2 = ( x 3 + 7 ) over ( 𝔽 p )

or:

y 2 mod p = ( x 3 + 7 ) mod p

The mod p (modulo prime number p) indicates that this curve is over a finite field of prime order p, also written as \(\( \mathbb{F}_p \)\), where p = 2256 – 232 – 29 – 28 – 27 – 26 – 24 – 1, which is a very large prime number.

Because this curve is defined over a finite field of prime order instead of over the real numbers, it looks like a pattern of dots scattered in two dimensions, which makes it difficult to visualize. However, the math is identical to that of an elliptic curve over real numbers. As an example, Elliptic curve cryptography: visualizing an elliptic curve over F(p), with p=17 shows the same elliptic curve over a much smaller finite field of prime order 17, showing a pattern of dots on a grid. The secp256k1 Ethereum elliptic curve can be thought of as a much more complex pattern of dots on an unfathomably large grid.

Figure 2. Elliptic curve cryptography: visualizing an elliptic curve over F(p), with p=17

So, for example, the following is a point Q with coordinates (x,y) that is a point on the secp256k1 curve:

Q =
(49790390825249384486033144355916864607616083520101638681403973749255924539515,
59574132161899900045862086493921015780032175291755807399284007721050341297360)

[example_1] shows how you can check this yourself using Python. The variables x and y are the coordinates of the point Q, as in the preceding example. The variable p is the prime order of the elliptic curve (the prime that is used for all the modulo operations). The last line of Python is the elliptic curve equation (the % operator in Python is the modulo operator). If x and y are indeed the coordinates of a point on the elliptic curve, then they satisfy the equation and the result is zero (0L is a long integer with value zero). Try it yourself, by typing **python** on a command line and copying each line (after the prompt >>>) from the listing.

Using Python to confirm that this point is on the elliptic curve

Python 3.4.0 (default, Mar 30 2014, 19:23:13)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> p = 115792089237316195423570985008687907853269984665640564039457584007908834 \
671663
>>> x = 49790390825249384486033144355916864607616083520101638681403973749255924539515
>>> y = 59574132161899900045862086493921015780032175291755807399284007721050341297360
>>> (x ** 3 + 7 - y**2) % p
0L

Elliptic Curve Arithmetic Operations

A lot of elliptic curve math looks and works very much like the integer arithmetic we learned at school. Specifically, we can define an addition operator, which instead of jumping along the number line is jumping to other points on the curve. Once we have the addition operator, we can also define multiplication of a point and a whole number, which is equivalent to repeated addition.

Elliptic curve addition is defined such that given two points P1 and P2 on the elliptic curve, there is a third point P3 = P1 + P2, also on the elliptic curve.

Geometrically, this third point P3 is calculated by drawing a line between P1 and P2. This line will intersect the elliptic curve in exactly one additional place (amazingly). Call this point P3’ = (x, y). Then reflect in the x-axis to get P3 = (x, –y).

If P1 and P2 are the same point, the line “between” P1 and P2 should extend to be the tangent to the curve at this point P1. This tangent will intersect the curve at exactly one new point. You can use techniques from calculus to determine the slope of the tangent line. Curiously, these techniques work, even though we are restricting our interest to points on the curve with two integer coordinates!

In elliptic curve math, there is also a point called the “point at infinity,” which roughly corresponds to the role of the number zero in addition. On computers, it’s sometimes represented by x = y = 0 (which doesn’t satisfy the elliptic curve equation, but it’s an easy separate case that can be checked). There are a couple of special cases that explain the need for the point at infinity.

In some cases (e.g., if P1 and P2 have the same x values but different y values), the line will be exactly vertical, in which case P3 = the point at infinity.

If P1 is the point at infinity, then P1 + P2 = P2. Similarly, if P2 is the point at infinity, then P1 + P2 = P1. This shows how the point at infinity plays the role that zero plays in “normal” arithmetic.

It turns out that + is associative, which means that (A + B) + C = A + (B + C). That means we can write A + B + C (without parentheses) without ambiguity.

Now that we have defined addition, we can define multiplication in the standard way that extends addition. For a point P on the elliptic curve, if k is a whole number, then k * P = P + P + P + … + P (k times). Note that k is sometimes (perhaps confusingly) called an “exponent” in this case.

Generating a Public Key

Starting with a private key in the form of a randomly generated number k, we multiply it by a predetermined point on the curve called the generator point G to produce another point somewhere else on the curve, which is the corresponding public key K:

K = k * G

The generator point is specified as part of the secp256k1 standard; it is the same for all implementations of secp256k1, and all keys derived from that curve use the same point G. Because the generator point is always the same for all Ethereum users, a private key k multiplied with G will always result in the same public key K. The relationship between k and K is fixed, but can only be calculated in one direction, from k to K. That’s why an Ethereum address (derived from K) can be shared with anyone and does not reveal the user’s private key (k).

As we described in the previous section, the multiplication of k * G is equivalent to repeated addition, so G + G + G + … + G, repeated k times. In summary, to produce a public key K from a private key k, we add the generator point G to itself, k times.

Tip	A private key can be converted into a public key, but a public key cannot be converted back into a private key, because the math only works one way.

Let’s apply this calculation to find the public key for the specific private key we showed you in Private Keys:

Example private key to public key calculation

K = f8f8a2f43c8376ccb0871305060d7b27b0554d2cc72bccf41b2705608452f315 * G

A cryptographic library can help us calculate K, using elliptic curve multiplication. The resulting public key K is defined as the point:

K = (x, y)

where:

x = 6e145ccef1033dea239875dd00dfb4fee6e3348b84985c92f103444683bae07b
y = 83b5c38e5e2b0c8529d7fa3f64d46daa1ece2d9ac14cab9477d042c84c32ccd0

In Ethereum you may see public keys represented as a serialization of 130 hexadecimal characters (65 bytes). This is adopted from a standard serialization format proposed by the industry consortium Standards for Efficient Cryptography Group (SECG), documented in Standards for Efficient Cryptography (SEC1). The standard defines four possible prefixes that can be used to identify points on an elliptic curve, listed in Serialized EC public key prefixes.

Table 1. Serialized EC public key prefixes

Prefix	Meaning	Length (bytes counting prefix)
0x00	Point at infinity	1
0x04	Uncompressed point	65
0x02	Compressed point with even y	33
0x03	Compressed point with odd y	33

Ethereum only uses uncompressed public keys; therefore the only prefix that is relevant is (hex) 04. The serialization concatenates the x and y coordinates of the public key:

04 + x-coordinate (32 bytes/64 hex) + y-coordinate (32 bytes/64 hex)

Therefore, the public key we calculated earlier is serialized as:

046e145ccef1033dea239875dd00dfb4fee6e3348b84985c92f103444683bae07b83b5c38e5e2b0 \
c8529d7fa3f64d46daa1ece2d9ac14cab9477d042c84c32ccd0

Elliptic Curve Libraries

There are a couple of implementations of the secp256k1 elliptic curve that are used in cryptocurrency-related projects:

OpenSSL

The OpenSSL library offers a comprehensive set of cryptographic primitives, including a full implementation of secp256k1. For example, to derive the public key, the function EC_POINT_mul can be used.

libsecp256k1

Bitcoin Core’s libsecp256k1 is a C-language implementation of the secp256k1 elliptic curve and other cryptographic primitives. It was written from scratch to replace OpenSSL in Bitcoin Core software, and is considered superior in both performance and security.