We shall start by presenting the basic definitions and facts about elliptic curves. We shall include only the minimal amount of background necessary to understand the applications to cryptology.

Def: An elliptic curve over K is the set of points (x,y,z) in the projective plane PG(2,K) which satisfy the equation:

The fact that makes elliptic curves useful is that the points of the curve form an additive abelian group with O as the identity element. To see this most clearly, we consider the case that K = **R**, and the elliptic curve has an equation of the form given in (3). For a point P = (x,y) (not equal to O) on the curve, we define -P to be the point with coordinates (x,-y), which by (3) is also a point of the curve. Geometrically, recalling that the lines through O are the vertical lines, we see that -P is obtained as the third point of the curve on the line determined by P and O. If P and Q are two points of the curve, then let R be the third point of the curve on the line determined by P and Q (if the line is tangent at P, let R = P, and if it is tangent at Q, let R = Q). We then define the "sum" P + Q to be the point -R. If P and Q are the same point, then take, as the line determined by the two points, the tangent at P, and let R be the other point of the curve on this line (and in the case that P is a point of inflection, R = P). By defining -O = O, we see that O + P = P and P + (-P) = O, for all points P on the curve, so O acts as an identity element. There are several ways to prove that this definition of P + Q makes the points of the elliptic curve an abelian group. One can use a projective geometry argument, a complex analytic argument with doubly periodic functions, or an algebraic argument involving divisors on curves ... but we shall not prove this here.

To obtain a formula in terms of coordinates in this case, consider the general case of P and Q being distinct and not on the same vertical line. Let P = (x_{1},y_{1}), Q = (x_{2},y_{2}) and P+Q = (x_{3},y_{3}). The line through P and Q has an (affine) equation of the form y = mx + k, where m = (y_{2} -y_{1})/(x_{2} - x_{1}) and k = y_{1} - mx_{1}. A point on this line with coordinates (x, mx+k) lies on the curve given by (2) iff (mx+k)^{2} = x^{3} + b'x + c'. Thus, x must be a solution of 0 = x^{3} -m^{2}x^{2} + (b' -2mk)x + c'-k^{2}. Since a cubic equation over the reals must have 1 or 3 real roots, and we know (because P and Q are on the curve) that x_{1} and x_{2} are both real roots, a unique third real root exists (i.e., x_{3}). As the sum of the roots of a cubic equation is the negative of the coefficient of the square term, we have that:

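x_{3} = m^{2} - x_{1} - x_{2},

and, since P + Q is the reflection (x_{3}, -(mx_{3} + k)) of the third point of intersection,

y_{3} = -(mx_{3} + k) = m(x_{1} - x_{3}) - y_{1}.

When P = Q, the same formulas hold with x_{2} = x_{1} and with m replaced by the slope of the tangent at P, namely m = (3x_{1}^{2} + b')/(2y_{1}).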

While these formulas were derived over the field **R**, they remain valid (although the arguments are slightly different) for all fields except those of characteristic 2 or 3.
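As a concrete illustration, the chord-and-tangent formulas can be sketched over a prime field GF(p) with p > 3. This is a minimal sketch, not a definitive implementation: the function names `ec_add` and `on_curve`, the use of `None` for the point O, and the toy curve y^{2} = x^{3} + 2x + 3 over GF(97) are all assumptions made for the example.

```python
def ec_add(P, Q, b, p):
    """Add two points of y^2 = x^3 + b*x + c over GF(p), p > 3 prime.

    None stands for the point O (assumption of this sketch).
    The constant term c is not needed by the formulas.
    """
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                      # vertical line: P + (-P) = O
    if P == Q:
        # slope of the tangent at P
        m = (3 * x1 * x1 + b) * pow(2 * y1, -1, p) % p
    else:
        # slope of the chord through P and Q
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)


def on_curve(P, b, c, p):
    """Check that P satisfies y^2 = x^3 + b*x + c over GF(p)."""
    if P is None:
        return True
    x, y = P
    return (y * y - (x**3 + b * x + c)) % p == 0


P = (3, 6)                               # 6^2 = 27 + 6 + 3
print(ec_add(P, P, 2, 97))               # (80, 10)
```

The modular inverse `pow(d, -1, p)` (Python 3.8 or later) plays the role of the division in the slope m.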

First observe that the formula (2) does not work well in characteristic 2. Consider the slope of the tangent lines to the curve calculated in the last section. Since 2 = 0, these slopes are all "infinite", i.e., all tangent lines pass through the point O (note the similarity to the knot of an oval in characteristic 2). To avoid this difficulty, we will work exclusively with elliptic curves of the form

Y^{2} + Y = X^{3} + aX + b.

On such a curve the vertical line through a point P = (x,y) meets the curve again at (x, y+1), so we take -P = (x, y+1).

With these modifications, the addition law is the same as in the other cases.

We can also derive the corresponding formulas for the coordinates of the sum of P and Q, points of Y^{2} + Y = X^{3} + aX + b. Let P = (x_{1},y_{1}), Q = (x_{2},y_{2}) and P+Q = (x_{3},y_{3}). The same analysis as in the real case gives us,

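x_{3} = m^{2} + x_{1} + x_{2},

y_{3} = m(x_{1} + x_{3}) + y_{1} + 1,

where m = (y_{1} + y_{2})/(x_{1} + x_{2}) if P ≠ Q, and m = x_{1}^{2} + a (the slope of the tangent) if P = Q. Signs are immaterial in characteristic 2, and the extra 1 in y_{3} appears because the negative of a point (x,y) on this curve is (x, y+1).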
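The characteristic 2 addition law can be tried out on a very small field. The following is only a sketch under stated assumptions: it uses GF(16) represented as 4-bit integers modulo the irreducible polynomial x^{4} + x + 1, the names `gf_mul`, `gf_inv`, `add2` and `on_curve2` are made up for the example, and `None` stands for O. It takes -(x,y) = (x, y+1), x_{3} = m^{2} + x_{1} + x_{2} and y_{3} = m(x_{1} + x_{3}) + y_{1} + 1, where addition in the field is bitwise XOR.

```python
M = 4
MOD = 0b10011          # x^4 + x + 1, irreducible over GF(2)


def gf_mul(u, v):
    """Multiply two elements of GF(2^M) (shift-and-add with reduction)."""
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if u >> M:     # degree reached M: reduce modulo MOD
            u ^= MOD
    return r


def gf_inv(u):
    """Inverse of a non-zero element: u^(2^M - 2)."""
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, u)
    return r


def on_curve2(P, a, b):
    """Check Y^2 + Y = X^3 + a*X + b; None is the point O."""
    if P is None:
        return True
    x, y = P
    return gf_mul(y, y) ^ y == gf_mul(gf_mul(x, x), x) ^ gf_mul(a, x) ^ b


def add2(P, Q, a):
    """Add two points of Y^2 + Y = X^3 + a*X + b over GF(2^M)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y2 == y1 ^ 1:
        return None                          # Q = -P
    if P == Q:
        m = gf_mul(x1, x1) ^ a               # tangent slope
    else:
        m = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2)) # chord slope
    x3 = gf_mul(m, m) ^ x1 ^ x2
    y3 = gf_mul(m, x1 ^ x3) ^ y1 ^ 1
    return (x3, y3)


# On Y^2 + Y = X^3 + X the line Y = 0 through (0,0) and (1,0)
# is tangent at (1,0), so the sum is -(1,0) = (1,1).
print(add2((0, 0), (1, 0), 1))               # (1, 1)
```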

Let X be the quadratic character of GF(q), that is, the map X: GF(q) -> {0,1,-1} defined by X(0) = 0, X(u) = 1 if u is a non-zero square and X(u) = -1 if u is a non-square. The number of solutions of y^{2} = u in GF(q) is thus 1 + X(u) [2 if u is a non-zero square, 0 if u is a non-square and 1 if u = 0]. Writing the equation of our elliptic curve as y^{2} = f(x) and counting the point O, the number of points on the curve is thus

N = 1 + Σ_{x in GF(q)} (1 + X(f(x))) = q + 1 + Σ_{x in GF(q)} X(f(x)).

**Hasse's Theorem**: *Let N be the number of points on an elliptic curve over GF(q). Then*

|N - (q + 1)| ≤ 2√q.
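For a small prime field both the character-sum count and Hasse's bound |N - (q + 1)| ≤ 2√q can be checked directly. The sketch below assumes q = p is an odd prime and a curve of the form y^{2} = x^{3} + ax + b; the function names are hypothetical, and the quadratic character is evaluated with Euler's criterion.

```python
def chi(u, p):
    """Quadratic character of GF(p), p an odd prime (Euler's criterion)."""
    u %= p
    if u == 0:
        return 0
    return 1 if pow(u, (p - 1) // 2, p) == 1 else -1


def count_points(a, b, p):
    """N = 1 + sum over x of (1 + chi(x^3 + a*x + b)); the leading 1 counts O."""
    return 1 + sum(1 + chi(x**3 + a * x + b, p) for x in range(p))


def satisfies_hasse(n_points, q):
    """|N - (q + 1)| <= 2*sqrt(q), tested without floating point."""
    return (n_points - (q + 1)) ** 2 <= 4 * q


N = count_points(1, 1, 5)       # y^2 = x^3 + x + 1 over GF(5)
print(N, satisfies_hasse(N, 5)) # 9 True
```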

Throughout this section we will assume that the finite field does not have characteristic 2 or 3 so that we may represent our elliptic curves in the form y^{2} = x^{3} + ax + b. This will make the underlying ideas easier to see, but in practice, computer implementations of these algorithms are usually much easier if the field has characteristic 2.

Suppose that we have written our plaintext message as a series of integers m, with 0 ≤ m < M. Fix a positive integer k (the number of attempts allowed per message unit), and choose a finite field GF(q), with q = p^{r}, p not 2 or 3, and q > Mk. We set up a bijection between the integers from 0 to Mk and a subset of the elements of GF(q). An easy way to do this is to write the integer in its base-p form and associate the integer with the element of GF(q) that corresponds to the vector of coefficients of this form. That is, write s = a_{0} + a_{1}p + a_{2}p^{2} + ... + a_{r-1}p^{r-1} (each a_{i} in **Z**_{p}) and associate s with the field element (a_{0},a_{1},a_{2}, ..., a_{r-1}). Now, to each message unit m, we will associate all the integers of the form m_{j} = mk + j, with 0 ≤ j < k. Using the bijection we associate each m_{j} with a field element (and by abuse of notation we will also call the field element m_{j}), and calculate f(m_{j}) where f(x) = x^{3} + ax + b (the right hand side of the equation of our elliptic curve). We do this successively for each j until we obtain a square. When we get a square, we associate the message unit m with both of the points on the elliptic curve that have first coordinate m_{j}. Since about half of the elements of GF(q) are squares, the probability that we fail to find a square (and hence fail to associate m to a point) is about 1/2^{k}. Should failure occur for any m, we would start again with a different q. To decode a point, convert the x-coordinate of the point back to an integer m_{j}, then divide by k and drop the remainder, that is, m = [m_{j}/k], where [..] is the greatest integer function.
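The embedding can be sketched as follows for the simplest case r = 1, where the bijection between integers and field elements is the identity. To keep the square-root step short, the sketch assumes a prime p with p ≡ 3 (mod 4), for which a square u has the root u^{(p+1)/4}; this restriction, the function names, and the parameters p = 10007, k = 20 are assumptions of the example and not part of the method itself.

```python
def encode(m, k, a, b, p):
    """Map the message unit m to a point of y^2 = x^3 + a*x + b over GF(p).

    Assumes p is prime, p % 4 == 3, and (m + 1) * k <= p.
    Returns None if none of the k candidates m*k + j gives a square.
    """
    for j in range(k):
        x = m * k + j
        fx = (x**3 + a * x + b) % p
        y = pow(fx, (p + 1) // 4, p)     # candidate square root of f(x)
        if y * y % p == fx:              # f(x) really is a square
            return (x, y)
    return None                          # happens with probability about 2**-k


def decode(point, k):
    """Recover m = [m_j / k] from the x-coordinate."""
    return point[0] // k


p, a, b, k = 10007, 1, 1, 20             # allows message units 0 <= m < 500
point = encode(123, k, a, b, p)
if point is not None:
    print(decode(point, k))              # 123
```

Only one of the two points with first coordinate m_{j} is returned; the other is its negative (m_{j}, p - y), and both decode to the same m.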

To send a message P_{m} (a point on E) to Bob, Alice chooses a random integer c and sends the pair of points (cA, P_{m} + c(b_{B}A)), where b_{B}A is Bob's published public point. Upon receiving this pair, Bob multiplies the first point by his secret integer b_{B} and subtracts it from the second point obtaining P_{m} + c(b_{B}A) - b_{B}(cA) = P_{m}. The security of the system lies in Oscar's inability to find b_{B} knowing only b_{B}A.
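The exchange can be traced on a toy curve. The sketch below is an illustration only: the curve y^{2} = x^{3} + 2x + 3 over GF(97), the base point A = (3,6), the secret b_{B} = 29, the range of the random integer c, and all function names are assumptions chosen for the example, and parameters this small offer no security.

```python
import secrets

P_MOD, A_COEF = 97, 2          # toy curve y^2 = x^3 + 2x + 3 over GF(97)
A = (3, 6)                     # public base point (6^2 = 27 + 6 + 3)


def add(P, Q):
    """Chord-and-tangent addition; None stands for the point O."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + A_COEF) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        m = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    return (x3, (m * (x1 - x3) - y1) % P_MOD)


def neg(P):
    """-(x, y) = (x, -y)."""
    return None if P is None else (P[0], -P[1] % P_MOD)


def mul(n, P):
    """n*P by double-and-add."""
    result = None
    while n:
        if n & 1:
            result = add(result, P)
        P = add(P, P)
        n >>= 1
    return result


def encrypt(P_m, bob_public):
    """Alice: pick a random c and send (cA, P_m + c(b_B A))."""
    c = secrets.randbelow(1000) + 1
    return (mul(c, A), add(P_m, mul(c, bob_public)))


def decrypt(pair, b_B):
    """Bob: subtract b_B times the first point from the second."""
    first, second = pair
    return add(second, neg(mul(b_B, first)))


b_B = 29                       # Bob's secret integer
bob_public = mul(b_B, A)       # Bob's published point b_B A
P_m = (80, 10)                 # a message point on the curve
print(decrypt(encrypt(P_m, bob_public), b_B) == P_m)   # True
```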

If Alice wishes to sign a message m (which may be the hash of a longer message) she first chooses a random integer k with 1 ≤ k < N and gcd(k,N) = 1. She computes the point R = kA = (x,y) and the integer s ≡ k^{-1}(m - a_{A}x) mod N. The signed message is then the triple (m,R,s). To verify this signature, Bob calculates V = x(a_{A}A) + sR and W = mA. He will declare the signature valid iff V = W. This follows from the calculation:

V = x(a_{A}A) + sR = xa_{A}A + k^{-1}(m - a_{A}x)(kA) = (xa_{A} + m - a_{A}x)A = mA = W.
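A toy run of the signing and verification steps is sketched below. It is an illustration under assumptions: the curve y^{2} = x^{3} + 2x + 3 over GF(97) and the point A = (3,6) are chosen for the example, N is taken to be the order of A (found by brute force, which is only feasible for tiny curves), and the function names are hypothetical.

```python
import math
import secrets

P_MOD, A_COEF = 97, 2          # toy curve y^2 = x^3 + 2x + 3 over GF(97)
A = (3, 6)                     # public base point


def add(P, Q):
    """Chord-and-tangent addition; None stands for the point O."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + A_COEF) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        m = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    return (x3, (m * (x1 - x3) - y1) % P_MOD)


def mul(n, P):
    """n*P by double-and-add (0*P is O)."""
    result = None
    while n:
        if n & 1:
            result = add(result, P)
        P = add(P, P)
        n >>= 1
    return result


def point_order(P):
    """Smallest n >= 1 with n*P = O, by repeated addition."""
    n, Q = 1, P
    while Q is not None:
        Q = add(Q, P)
        n += 1
    return n


N = point_order(A)             # assumption: N is the order of A


def sign(m, a_A):
    """Return the triple (m, R, s) with R = kA and s = k^-1 (m - a_A x) mod N."""
    while True:
        k = secrets.randbelow(N - 1) + 1
        if math.gcd(k, N) == 1:
            break
    R = mul(k, A)
    s = pow(k, -1, N) * (m - a_A * R[0]) % N
    return (m, R, s)


def verify(signature, alice_public):
    """Accept iff V = x(a_A A) + sR equals W = mA."""
    m, R, s = signature
    V = add(mul(R[0], alice_public), mul(s, R))
    return V == mul(m, A)


a_A = 17                       # Alice's secret integer
alice_public = mul(a_A, A)     # Alice's published point a_A A
print(verify(sign(42, a_A), alice_public))   # True
```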

Lenstra's algorithm is an analogue of Pollard's p-1 method, but whereas Pollard's method can get stuck if p-1 has a large prime factor, the elliptic curve version can get around that problem by changing the curve. Given enough curves, you are almost guaranteed to find one that gives a factorization.

The elliptic curves used in this factoring algorithm are defined over the field of rationals, **Q**. Thus, we may assume that they have equations given by y^{2} = x^{3} + ax + b, where we will always take a and b to be integers. For any integer n, we consider the mod n reduction of such an elliptic curve. That is, if E is the elliptic curve and P = (x,y) a point of the curve, with integer coordinates, then the modulo n reduction of E, denoted by E mod n, contains the point P mod n = (x mod n, y mod n). Every time we compute a multiple of P, we will really only be concerned with the reduction of its coordinates mod n. The formulas we have developed for curves of this form are still valid when we reduce them mod n. There is only one caveat to treating the numbers in this way, and that is, whenever we calculate a division (such as in finding a slope of a line) we must have the divisor be relatively prime to n. The calculation of the inverses to do the divisions is done with the extended Euclidean algorithm, so in doing the calculation, determining whether or not the divisor is relatively prime to n is automatic.

The essential idea behind the algorithm is very simple. Starting with an elliptic curve (possibly chosen arbitrarily) and any point P on it with integer coordinates, calculate the integer multiples of P mod n up to a predetermined bound, where n is the number you wish to factor. At each step, there will be a division, and so the gcd of the divisor and n will be calculated. If this gcd always turns out to be 1, you will be able to obtain the multiple of P mod n that you sought, and will have failed to factor n. If the gcd is ever not 1, then either it is n, and you can again stop since you have failed to factor n, or the gcd is neither 1 nor n, and therefore is a proper factor of n (success!). Whenever the method fails to produce a factor, just pick a new elliptic curve and point and start over.

The only real issues to settle are how to pick a supply of elliptic curves and points, and what a good bound for the computation would be. In selecting the elliptic curve, we could randomly select the integers a and b and then locate a point P on the curve, but this is not very efficient. Rather, select the point P = (x,y) first, then randomly pick an integer a and finally calculate b = y^{2} -x^{3} -ax to obtain the elliptic curve which contains the point P. Since we need to have non-degenerate elliptic curves, we must check that the discriminant of the cubic function is not 0 (compare this to the quadratic function case). The discriminant of our cubic is, up to sign, 4a^{3} + 27b^{2}. Actually, we want to have that the discriminant is not 0 mod n, so we will first calculate gcd(4a^{3} + 27b^{2}, n). If this gcd is n, then we must select another elliptic curve. If it is 1, we can continue, and if it has any other value, we have found a factor of n and may stop.

The bound is essentially the point at which we give up on any elliptic curve we are examining. If the bound is too small then we will be giving up too soon and will have to examine many elliptic curves before finding one that gives a factor. If the bound is too large, then the chance of finding a factor from any elliptic curve is much higher but the computational effort becomes too great. Thus the bound needs to be a compromise value, not too small and not too large. The right order of magnitude for the bound depends on the size of the expected factor (generally an unknown quantity) and so is often set experimentally.
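The whole procedure fits in a short sketch. Several choices below are assumptions of the example rather than part of the description above: the multiples are organised as 2!P, 3!P, ..., B!P (one common way of reaching highly composite multiples quickly), each multiple is computed by double-and-add, a failed inversion is signalled with a hypothetical `FactorFound` exception, and the default bound and number of curves are arbitrary.

```python
import math
import random


class FactorFound(Exception):
    """Raised when a divisor is not relatively prime to n."""
    def __init__(self, g):
        self.g = g


def inv_mod(d, n):
    """Inverse of d mod n; raises FactorFound(gcd) if it does not exist."""
    g = math.gcd(d % n, n)
    if g != 1:
        raise FactorFound(g)
    return pow(d % n, -1, n)


def add(P, Q, a, n):
    """Addition on y^2 = x^3 + a*x + b with coordinates reduced mod n."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * inv_mod(2 * y1, n) % n
    else:
        m = (y2 - y1) * inv_mod(x2 - x1, n) % n
    x3 = (m * m - x1 - x2) % n
    return (x3, (m * (x1 - x3) - y1) % n)


def mul(k, P, a, n):
    """k*P mod n by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, P, a, n)
        k >>= 1
        if k:
            P = add(P, P, a, n)
    return result


def ecm_one_curve(n, a, P, bound):
    """Compute 2!P, 3!P, ..., bound!P mod n; return a proper factor or None."""
    try:
        for k in range(2, bound + 1):
            P = mul(k, P, a, n)
            if P is None:
                return None              # reached O mod n: this curve failed
    except FactorFound as hit:
        return hit.g if hit.g < n else None
    return None


def ecm(n, bound=50, curves=100, seed=1):
    """Try random curves through random points until a factor appears."""
    rng = random.Random(seed)
    for _ in range(curves):
        x, y, a = rng.randrange(n), rng.randrange(n), rng.randrange(n)
        b = (y * y - x**3 - a * x) % n   # curve through the chosen point
        g = math.gcd(4 * a**3 + 27 * b * b, n)
        if g == n:
            continue                     # degenerate mod n: pick another curve
        if g > 1:
            return g                     # the discriminant already gives a factor
        factor = ecm_one_curve(n, a, (x, y), bound)
        if factor:
            return factor
    return None


# y^2 = x^3 + 10x - 2 with P = (1, 3) and n = 4453 = 61 * 73
print(ecm_one_curve(4453, 10, (1, 3), 3))    # 61
```

In the example, 2P = (4332, 3230) is computed without incident, and the next addition requires inverting a difference of x-coordinates that is divisible by 61 but not by 73, which exposes the factor.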

The elliptic curve method seems to be best suited for factoring numbers of medium size, say around 40 or 50 decimal digits. Numbers of this size are no longer used for the security of factoring-based systems such as RSA. For larger numbers, the quadratic sieve and number field sieve are superior.