*Pf*: Write n = m_{1}m_{2} with m_{1} ≤ m_{2} (in the worst case, m_{1} = 1 will work). Since n is odd, each of m_{1} and m_{2} is odd. Let a = ½(m_{1} + m_{2}) and b = ½(m_{2} - m_{1}). Because m_{1} and m_{2} are both odd, a and b are integers. Then m_{1} = a-b and m_{2} = a+b, so n = m_{1}m_{2} = (a-b)(a+b) = a^{2} - b^{2}.

Now suppose we wish to find a factor of the odd integer n (> 1). Examine, in turn, the numbers n, n+1^{2}, n+2^{2}, n+3^{2}, ... until you find a square (one is guaranteed to exist by the theorem), say n + b^{2} = a^{2}. Then n = a^{2} - b^{2} = (a+b)(a-b), and so factors of n have been located.

**Example:** Find a factor of n = 152398989. Looking for a square in the sequence 152398989, 152398990, 152398993, 152398998, 152399005, 152399014, 152399025, 152399038, ... we have (12344.998541...)^{2}, (12344.99858...)^{2}, (12344.99870...)^{2}, (12344.99890...)^{2}, (12344.99918...)^{2}, (12344.99955...)^{2}, **(12345)^{2}**. Thus, n + 6^{2} = (12345)^{2}, so n = (12345)^{2} - 6^{2} = (12345 - 6)(12345 + 6) = (12339)(12351).
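The search just described can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation. The function name `fermat_naive` is chosen here for illustration, and `math.isqrt` is used so that the square test is exact for large integers.

```python
from math import isqrt

def fermat_naive(n):
    """Search n, n+1^2, n+2^2, ... for a square; return (a-b, a+b)."""
    if n <= 1 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 1")
    b = 0
    while True:
        m = n + b * b
        a = isqrt(m)           # integer square root, no floating point
        if a * a == m:         # m is a perfect square, so n = a^2 - b^2
            return a - b, a + b
        b += 1

if __name__ == "__main__":
    print(fermat_naive(152398989))   # (12339, 12351)
```

For a prime n the loop only stops at b = (n-1)/2, returning the trivial pair (1, n), which is why the method is practical only when n has two factors that are close together.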

The method can be sped up a bit by observing that the last digit of a square must be 0, 1, 4, 5, 6 or 9. However, taking square roots to determine whether a number is a square is a slow operation, and this naive approach is therefore not very fast. A better algorithm to search for squares would be to examine the sequence of integers given by *([k] + i)^{2} - n* for a square, where k = √n, [k] denotes the greatest integer not exceeding k, and i = 1, 2, 3, ....
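A hedged sketch of this second search follows. It assumes the sequence starts at the smallest integer a with a^{2} ≥ n, applies the last-digit test before the exact square test, and updates a^{2} - n incrementally using (a+1)^{2} - n = (a^{2} - n) + 2a + 1. The name `fermat_from_sqrt` is illustrative only.

```python
from math import isqrt

SQUARE_LAST_DIGITS = {0, 1, 4, 5, 6, 9}

def fermat_from_sqrt(n):
    """Step a upward from sqrt(n) until a^2 - n is a square b^2."""
    if n <= 1 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 1")
    a = isqrt(n)
    if a * a < n:
        a += 1                     # smallest a with a^2 >= n
    r = a * a - n
    while True:
        if r % 10 in SQUARE_LAST_DIGITS:   # cheap filter first
            b = isqrt(r)
            if b * b == r:
                return a - b, a + b
        r += 2 * a + 1             # (a+1)^2 - n from a^2 - n
        a += 1

if __name__ == "__main__":
    print(fermat_from_sqrt(152398989))   # (12339, 12351)
```

For n = 152398989 the very first candidate, a = 12345, already gives a^{2} - n = 36 = 6^{2}.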

Factors which are nearly equal will be found fairly quickly by this procedure. Thus, in the RSA application, one must make sure that the two primes are not too close together.

**Example**: Let n = 15770708441. Choose B = 180. Then a = 11620221425 and we compute d = 135979. We get the factorization 15770708441 = (135979)(115979). The reason the factorization worked is that d-1 = 135978 = 2(3)(131)(173) has only small prime factors. Any B ≥ 173 would have worked for this n.
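The computation in this example can be sketched as follows. The sketch assumes the usual form of the method, which this section does not spell out: a = 2^{B!} mod n followed by d = gcd(a-1, n). The base 2, the exponent B!, and the name `pollard_p_minus_1` are assumptions made for illustration.

```python
from math import gcd

def pollard_p_minus_1(n, bound, base=2):
    """Return a nontrivial factor of n found with smoothness bound `bound`,
    or None if gcd(base^(bound!) - 1, n) is trivial."""
    a = base % n
    for j in range(2, bound + 1):
        a = pow(a, j, n)           # after this step a = base^(j!) mod n
    d = gcd(a - 1, n)
    return d if 1 < d < n else None

if __name__ == "__main__":
    # The text reports d = 135979 for this input with B = 180.
    print(pollard_p_minus_1(15770708441, 180))
    # Small check: 299 = 13 * 23, and 13 - 1 = 12 divides 4! = 24.
    print(pollard_p_minus_1(299, 4))     # 13
```

A return value of `None` covers both failure modes: a bound too small to capture any prime factor (d = 1) and a bound so large that every prime factor is captured at once (d = n).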

The choice of B is crucial in this algorithm. If B is small, the algorithm will run quickly, but the chance of success is small. On the other hand, if B is large, the algorithm will find a factor, but the runtime will be prohibitively slow (comparable to trial division).

In the RSA application, one must ensure that the primes p and q have the property that p-1 and q-1 have at least one large prime factor to avoid an attack by this method. We shall see a generalization of this method later when we consider elliptic curves.

**Example:** Suppose we want to factor n = 4633. If we notice that 118^{2} ≡ 25 = 5^{2} mod 4633, then a = gcd(118+5, 4633) = gcd(123, 4633) = 41 and we have 4633 = (41)(113).
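As a small illustration of this step, the sketch below takes a congruence x^{2} ≡ y^{2} mod n and tries both gcd(x+y, n) and gcd(x-y, n). The function name `split_from_squares` is hypothetical.

```python
from math import gcd

def split_from_squares(x, y, n):
    """Given x^2 = y^2 (mod n), return a nontrivial factor of n or None."""
    if (x * x - y * y) % n != 0:
        raise ValueError("x^2 and y^2 are not congruent mod n")
    for candidate in (gcd(x + y, n), gcd(x - y, n)):
        if 1 < candidate < n:
            return candidate
    return None                    # happens when x = +y or -y (mod n)

if __name__ == "__main__":
    print(split_from_squares(118, 5, 4633))   # 41
```

The congruence is useless exactly when x ≡ ±y mod n, since then one gcd is n and the other is 1.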

The factoring problem is then reduced to finding a congruence of this type. To manufacture such a congruence we use the concept of a factor base. A **factor base** is simply a set of small primes which is not too big. If B is a factor base, then a number all of whose prime factors lie in B is said to be *B-smooth*.

**Example**: Factor n = 2043221 using the factor base B = {2,3,5,7,11}. We find, by means to be discussed below, the following B-smooth squares:

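| b_{i} | b_{i}^{2} mod n |
|---|---|
| 1439 | 27500 = 2^{2} 5^{4} 11 |
| 2878 | 110000 = 2^{4} 5^{4} 11 |
| 3197 | 4704 = 2^{5} 3 7^{2} |
| 3199 | 17496 = 2^{3} 3^{7} |
| 3253 | 365904 = 2^{4} 3^{3} 7 11^{2} |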

Consider the 3rd and 4th numbers; we see that [(3197)(3199)]^{2} ≡ 2^{8} 3^{8} 7^{2} mod 2043221. Thus, t = (3197)(3199) mod 2043221 = 11098 and s = 2^{4} 3^{4} 7 mod 2043221 = 9072. Now gcd(t+s,n) = gcd(11098+9072, 2043221) = 2017 and we have 2043221 = (2017)(1013).

This example also illustrates what can go wrong with the procedure. Had we taken the first two numbers, we would have obtained [(1439)(2878)]^{2} ≡ 2^{6} 5^{8} 11^{2} mod n, so t = (1439)(2878) mod n = 55000 and s = 2^{3} 5^{4} 11 mod n = 55000, i.e. t = s. This does not lead to a factorization, since gcd(t-s, n) = n and gcd(t+s, n) = 1.
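The arithmetic in this example can be checked with a short sketch. It factors b^{2} mod n over the factor base by trial division, adds the exponent vectors of the chosen b_{i}'s, and forms t and s when every exponent is even. The helper names `exponents_over_base` and `combine` are illustrative, not part of any library.

```python
from math import gcd, prod

def exponents_over_base(m, base):
    """Exponents of m over the primes in base, or None if m is not B-smooth."""
    if m <= 0:
        return None
    exps = []
    for p in base:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
    return exps if m == 1 else None

def combine(bs, n, base):
    """Return (t, s) with t^2 = s^2 (mod n) built from the chosen b's, or None."""
    total = [0] * len(base)
    for b in bs:
        exps = exponents_over_base(b * b % n, base)
        if exps is None:
            return None                       # b^2 mod n is not B-smooth
        total = [u + v for u, v in zip(total, exps)]
    if any(e % 2 for e in total):
        return None                           # product is not a square
    t = prod(bs) % n
    s = prod(pow(p, e // 2, n) for p, e in zip(base, total)) % n
    return t, s

if __name__ == "__main__":
    n, base = 2043221, [2, 3, 5, 7, 11]
    t, s = combine([3197, 3199], n, base)
    print(t, s, gcd(t + s, n))                # 11098 9072 2017
    print(combine([1439, 2878], n, base))     # (55000, 55000): t = s, no factor
```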

In the example we found the appropriate subset of b_{i}'s to multiply by inspection, but we can do this systematically and, at the same time, answer the question of how many b_{i}'s we need to find. We form a 0-1 matrix with one row for each of the B-smooth squares and |B| columns, each column corresponding to one prime in the factor base B. In each row, the entry in the jth column is 1 if the jth prime of B appears to an odd power and 0 otherwise. For the last example this matrix would look like:

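| b_{i} | 2 | 3 | 5 | 7 | 11 |
|---|---|---|---|---|---|
| 1439 | 0 | 0 | 0 | 0 | 1 |
| 2878 | 0 | 0 | 0 | 0 | 1 |
| 3197 | 1 | 1 | 0 | 0 | 0 |
| 3199 | 1 | 1 | 0 | 0 | 0 |
| 3253 | 0 | 1 | 0 | 1 | 0 |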
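A set of rows whose sum is the zero vector mod 2 picks out b_{i}'s whose squares multiply to a number in which every prime of B occurs to an even power. Since the matrix has |B| columns, any |B| + 1 rows must contain such a dependency. The sketch below finds dependencies by Gaussian elimination over the integers mod 2, storing each row as a bitmask. The function name `find_dependencies` is an assumption for illustration.

```python
def find_dependencies(rows):
    """Return lists of row indices whose rows sum to the zero vector mod 2."""
    basis = {}            # pivot bit -> (reduced vector, bitmask of rows used)
    dependencies = []
    for i, row in enumerate(rows):
        vec = 0
        for bit in row:                   # pack the 0-1 row into an integer
            vec = (vec << 1) | (bit & 1)
        combo = 1 << i                    # remembers which rows were added in
        while vec:
            pivot = vec.bit_length() - 1  # position of the leading 1
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            bvec, bcombo = basis[pivot]
            vec ^= bvec                   # addition mod 2 is XOR
            combo ^= bcombo
        else:
            # vec was reduced to zero: the rows recorded in combo are dependent
            dependencies.append(
                [j for j in range(len(rows)) if (combo >> j) & 1]
            )
    return dependencies

if __name__ == "__main__":
    matrix = [
        [0, 0, 0, 0, 1],
        [0, 0, 0, 0, 1],
        [1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 1, 0, 1, 0],
    ]
    print(find_dependencies(matrix))      # [[0, 1], [2, 3]]
```

Counting rows from 0, the first dependency is the pair 1439, 2878 that failed in the example and the second is the pair 3197, 3199 that produced the factor 2017. A dependency guarantees only a congruence of squares, not a nontrivial factor.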

It should now be clear why we were a little vague in the definition of a factor base. If the factor base is small, we will need only a few B-smooth squares to get a linear dependency; however, a small factor base means that B-smooth squares are rare, so finding them will be hard. On the other hand, a large factor base means that there are many more B-smooth squares, so they will be easier to find, but we will then need to find many more of them. A good algorithm based on these considerations would therefore be one for which the factor base is not too big and which has an efficient way of finding B-smooth squares.

One could try randomly selecting the b_{i}, and if n is not too large this will be effective, but for large n it isn't. A more effective procedure would be to select the b_{i}'s to be integers near the square root of kn for different choices of k. The squares of these b_{i}'s will be near kn, so, when reduced mod n, they should be small and thus more likely to be made up of only small primes. Another procedure, due to Pomerance, is to start with a large interval of integers around the square root of n, and then systematically remove integers based on a quadratic relationship with each prime in the factor base. The remaining integers have a high probability of being B-smooth. This method is known as the *Quadratic Sieve*. A more recent algorithm, known as the *Number Field Sieve*, finds the B-smooth squares by means of computations in rings of algebraic integers.
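The simpler of these ideas, taking the b_{i} just above the square root of kn, can be sketched as below. This is not the quadratic sieve itself, only the direct search it improves on. The names `is_smooth` and `candidates_near_sqrt_kn`, and the parameters `max_k` and `width`, are hypothetical.

```python
from math import isqrt

def is_smooth(m, base):
    """True if every prime factor of m lies in base."""
    if m <= 0:
        return False
    for p in base:
        while m % p == 0:
            m //= p
    return m == 1

def candidates_near_sqrt_kn(n, base, max_k, width):
    """Collect b just above sqrt(k*n), k = 1..max_k, with b^2 mod n B-smooth."""
    found = []
    for k in range(1, max_k + 1):
        start = isqrt(k * n) + 1          # first integer above sqrt(k*n)
        for b in range(start, start + width):
            if b not in found and is_smooth(b * b % n, base):
                found.append(b)
    return found

if __name__ == "__main__":
    print(candidates_near_sqrt_kn(2043221, [2, 3, 5, 7, 11], 5, 60))
```

With n = 2043221, k up to 5, and a window of 60 integers, the result includes the five values 1439, 2878, 3197, 3199 and 3253 used in the earlier example.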

For factoring RSA moduli, the quadratic sieve has been the most successful algorithm. In April 1994, a 129-digit number known as RSA-129 was factored by Atkins, Graff, Lenstra and Leyland using the quadratic sieve. The numbers RSA-100, RSA-110, ..., RSA-500 were a list of RSA moduli publicized on the Internet (RSA Labs) as "challenge" numbers for factoring algorithms. Each number RSA-d was a d-digit number that is the product of two primes of approximately the same length. The numbers RSA-100, RSA-110, RSA-120, RSA-129, RSA-130, RSA-140, RSA-155 and RSA-160 have all been factored (the last of these on April 1, 2003). In 2001, RSA Labs renamed and reissued the "challenge" numbers and assigned specific monetary rewards for their factoring. The new list (available at RSA Labs) uses the number of digits in the binary representation in the name, starting at RSA-576 (worth $10K) and going up to RSA-2048 ($200K).

The number field sieve seems to have great potential since its asymptotic running time is faster than that of other known algorithms. It is still in the developmental stages, but many researchers feel that it might prove to be faster for numbers having more than about 125-130 digits. In 1990, the number field sieve was used by Lenstra, Lenstra, Manasse and Pollard to factor 2^{512} + 1. On December 3, 2003 the factoring of RSA-576 (174 digits) was announced by a group at the German Federal Agency for Information Technology Security (BSI). They used a number field sieve to obtain the two 87-digit prime factors. The smallest challenge number is now RSA-640, worth $20K.