Monday, November 25, 2013

The Sieve of Eratosthenes

The algorithm known as the Sieve of Eratosthenes solves the following abstract problem: Generate all of the primes less than a number n.
pseudocode taken from wiki:
Input: an integer n > 1
 
Let A be an array of Boolean values, indexed by integers 2 to n,
initially all set to true.
 
 for i = 2, 3, 4, ..., not exceeding √n:
  if A[i] is true:
    for j = i2, i2+i, i2+2i, ..., not exceeding n :
      A[j] := false
 
Now all i such that A[i] is true are prime.

The simple idea is this: 1.write out all numbers less than n 2.now starting with i=2 do the following: 3. cross out all multiples of i less than n 4. increment until you find another uncrossed number, 5. set i equal to it. and repeat Here's a java program that does this and just a little bit more:
import java.util.Scanner;

public class SieveOfEratosthenes {
  public static void main(String[] args) {
    Scanner input = new Scanner(System.in);
    System.out.print("Find all prime numbers <= n, enter n: ");
    int n = input.nextInt();
    
    boolean[] primes = new boolean[n + 1]; // Prime number sieve
    
    /* Initialize primes[i] to true; essentially true means unmarked and false means
     marked.
     */
    for (int i = 0; i < primes.length; i++) {
      primes[i] = true; 
    }
    
    for (int k = 2; k <= n / k; k++) {
      if (primes[k]) {
        for (int i = k; i <= n / k; i++) {
          primes[k * i] = false; // k * i is not prime
        }
      }
    }
    
    final int NUMBER_PER_LINE = 10; // Display 10 per line
    int count = 0; // Count the number of prime numbers found so far
    // Print prime numbers
    for (int i = 2; i < primes.length; i++) {
      if (primes[i]) {
        count++;
        if (count % 10 == 0) 
          System.out.printf("%7d\n", i);
        else
          System.out.printf("%7d", i);          
      }
    }
    
    System.out.println("\n" + count + 
      " prime(s) less than or equal to " + n);
  }
}
Analysis

 in wiki it says this: "Time complexity of calculating all primes below n is $$O(loglog(n))$$, a direct consequence of the fact that the prime harmonic series approaches loglog n." This just begs the question what is the "prime harmonic series". The harmonic series is this \(\sum_{k=1}^{\infty} \frac{1}{k}\). Here, the number k can be any number. But in the "prime" harmonic series, k can only be a prime and to emphasize that, the subscript used is p: \(\sum_{p=1}^{\infty}\frac{1}{p}\). Now what is the connection?
Well, assume, every line of code takes seconds to complete. Now if you ll notice line 17, for each prime, the algorithm sets primes[k*i] to false. That is done n/k-k+1 times. This leads to estimation of the time spent in that for loop:
\[
n/2 -2+1 + n/3/-3+1 + n/5 -5+1....(why is 4 skipped?)
\]
Since we are dealing with big O, we can ignore constants and thus we see
\[
n/2+n/3+n/5+...=n \sum_{p=2} \frac{1}{p}
\]where n is a prime.
To finish the proof that the Sieve of Eratosthenes takes O(n*log log n), you need a few more theoretical results which are in the wiki.
I really just wanted to emphasize the connection between that abstract and concrete.

No comments:

Post a Comment