Broad Network


Sieve of Eratosthenes for app.codility.com/programmers Explained in C++

Category of Article : Technology | Computers and Software | Algorithm

By: Chrysanthus Date Published: 5 Sep 2025

The Sieve of Eratosthenes is a very simple and popular technique for finding all the prime numbers in the range from 2 to a given number n. The algorithm takes its name from the process of sieving.

When a number is multiplied by an integer, the result is called a multiple. Multiplication is continuous addition. Examples:

Multiples of 2: 2+2=4; 4+2=6; 6+2=8; 8+2=10; 10+2=12; 12+2=14; 14+2=16; etc.

Multiples of 3: 3+3=6; 6+3=9; 9+3=12; 12+3=15; 15+3=18; 18+3=21; 21+3=24; etc.

Multiples of 4: 4+4=8; 8+4=12; 12+4=16; 16+4=20; 20+4=24; 24+4=28; 28+4=32; etc.

Multiples of 5: 5+5=10; 10+5=15; 15+5=20; 20+5=25; 25+5=30; 30+5=35; 35+5=40; etc.

Consider the following table, where n is 17, and multiples of 2, 3, 4, and 5 are indicated:

No’s

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

Mults.

of 2

 

 

2x2

 = 4

 

2x3

= 6

 

2x4

= 8

3x3

= 9

2x5

= 10

 

2x6

=12

 

2x7

=14

 

2x8

=16

 

Mults.

of 3

 

 

 

 

3x2

= 6

 

 

3x3

= 9

 

 

3x4

= 12

 

 

3x5

= 15

 

 

Mults.

of 4

 

 

 

 

 

 

4x2

= 8

 

 

 

4x3

= 12

 

 

 

4x4

= 16

 

Mults.

of 5

 

 

 

 

 

 

 

 

5x2

= 10

 

 

 

 

5x3

= 15

 

 

No’s

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

Note that the integers, 0 and 1 are not in the table, for No's (numbers). The blunt approach for Sieve of Eratosthenes, is that, beginning from 2, remove all the multiples of consecutive numbers. The last row (bottom) in the table has all the multiples that have been removed, for the given number of 17; their backgrounds are gray. All the numbers in the last row (bottom) with white backgrounds are prime numbers.

The multiples of 2, greater than 2 are 4, 6, 8, 10, 12, 14, 16.

The next consecutive number is 3.

The multiples of 3, greater than 3 are 6, 9, 12, 15.

The next consecutive number is 4, whose multiples and itself, have already been removed, because they are a subset of the multiples of 2 (after 2). Four itself had been removed, because it is a multiple of 2. It is still good to know the multiples of 4.

The multiples of 4, greater than 4 are 8, 12, 16.

The next consecutive number is 5, which happens to be a prime number, like 2 and 3.

The multiples of 5, greater than 5 are 10, 15. Ten which is a multiple of 5 had already been removed as a multiple of 2. Fifteen which is a multiple of 5 had already been removed as a multiple of 3.

In fact, given n = 17, the numbers for whose multiples had to be considered, had to be from 2 to 4. After 4, as a number considered, all the multiples had been removed. So considering the multiples of 5 was a waste of time. All this is because, the square root of 17 (i.e. 17) is 4.12 approximately. And so the integers to consider should not go above 17 = 4.12 approximately. In fact, the integers to consider are just 2 and 3, for n = 17, since the square root of 17 is not a whole number. If n were 16, a perfect square, then the integers to consider should have included 4, but should not have gone above 4, because 16 = 4, and 4 just happens to be a whole number (integer).

Note: The reader should not forget that the first multiple of a number is that number itself; e.g. 2x1 = 2, 3x1=3, 4x1=4

Slow Algorithm for Sieve of Eratosthenes

The Sieve of Eratosthenes is a technique for finding all the prime numbers in the range from 2 to a given number, n. The following table shows the number of operations in a brute force approach, for the given integer, n = 17. In this approach, all the numbers from 2 to 17 are considered, instead of considering the integers from 2 to 17 = 4.12 :

No’s

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

Mults.

of 2

2

 

4

 

6

 

8

 

10

 

12

 

14

 

16

 

Mults.

of 3

 

3

 

 

6

 

 

9

 

 

12

 

 

15

 

 

Mults.

of 4

 

 

4

 

 

 

8

 

 

 

12

 

 

 

16

 

Mults.

of 5

 

 

 

5

 

 

 

 

10

 

 

 

 

15

 

 

Mults.

of 6

 

 

 

 

6

 

 

 

 

 

12

 

 

 

 

 

Mults.

of 7

 

 

 

 

 

7

 

 

 

 

 

 

14

 

 

 

Mults.

of 8

 

 

 

 

 

 

8

 

 

 

 

 

 

 

16

 

Mults.

of 9

 

 

 

 

 

 

 

9

 

 

 

 

 

 

 

 

Mults.

of 10

 

 

 

 

 

 

 

 

10

 

 

 

 

 

 

 

Mults.

of 11

 

 

 

 

 

 

 

 

 

11

 

 

 

 

 

 

Mults.

of 12

 

 

 

 

 

 

 

 

 

 

12

 

 

 

 

 

Mults.

of 13

 

 

 

 

 

 

 

 

 

 

 

13

 

 

 

 

Mults.

of 14

 

 

 

 

 

 

 

 

 

 

 

 

14

 

 

 

Mults.

of 15

 

 

 

 

 

 

 

 

 

 

 

 

 

15

 

 

Mults.

of 16

 

 

 

 

 

 

 

 

 

 

 

 

 

 

16

 

Mults.

of 17

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

17

No’s

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

In the table, all the spaces between multiples and at the right end of each row, are operations, as they will be iterated over. Of course, the multiples too are operations. So the total number of operations in the table, beginning from the top, are 16 + 15 + 14 + 13 + 12 + 11 + 10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = 136.

162 = 256. 16xlog2(16) = 64. The number of operations above, is greater than n.log2n operations and less than n2 operations. So the time complexity for the brute force approach of the table above is O(n2); maximum possible number of operations is chosen for time complexity. The C++ program for this is:

    #include <iostream>
    #include <vector>
    #include <queue>
    using namespace std;
    
    queue<int> slowSieve(int n) {
        vector<int> allInts(n+1, 0);
        queue<int> qu;
        int noOperations = 0;

        for (int i=2; i <= n; i++) {
            int multiplier = 2;
            for (int j=i+1; j<=n; j++) {
                if (i * multiplier == j) {
                    allInts[j] = -1;
                    multiplier++;
                }
                noOperations++;
            }
        }
        
        for (int i=2; i <= n; i++) {
            if (allInts[i] != -1)
                qu.push(i);
        }

        cout << "No. of operations: "<< noOperations << endl;
        return qu;
    }

    int main(int argc, char **argv)
    {
        queue<int> que = slowSieve(17);
        
        while (que.size() > 0) {
            int prime = que.front(); que.pop();
            cout << prime << ", ";
        }
        cout << endl;

        return 0;
    }

The C++ iostream, vector and queue libraries have to be included. Always use the standard name-space (using namespace std;) unless there is a good reason not to do so. The function of interest, slowSieve begins with the declaration of the vector, allInts. Each index of allInts corresponds to an integer from 1 to n. The number of elements in allInts is (n+1) to account for 0, since index counting of a vector begins from 0. Note that the integers 0 and 1, are not taken into account in the algorithm. The vector, allInts is initialized to 0, for all its elements. A queue, qu to receive the prime numbers is declared.

The algorithm is actually performed by the two for-loops, where one is nested. The cell for the index of the vector, allInts, that is a multiple of a previous number from 1 to n (inclusive), is given the value, -1.

The separate for-loop pushes the index of the vector, allInts whose corresponding value is not -1, to the queue. The elements of the queue are the prime numbers. The output is:

    No. of operations: 120

    2, 3, 5, 7, 11, 13, 17,

The number of operations is 120 and not 136, because the starting numbers whose multiples were looked for, were not checked.

The n number of operations used to initialize the allInts vector to each cell of zero, and the n number of operations to push the prime numbers to the queue, are not included in the official time complexity of O(n2). Even if they are included, it will not make much difference. For example, “n” here is sixteen. 16 + 16 + 120, which is the number of operations in the for-loops, gives 154, which is not up to 256 = 162, though it is higher than n.log2(n) = 16.log2(16) = 64, in this case. Despite the addition, the official time complexity is O(162), in this case.

It is possible to have a faster algorithm as illustrated in the following section:

O(n log log n) Time Complexity for Sieve of Eratosthenes

First, it is good to know what O[n.log2(log2n)] means. Assume that the limit of a certain number of operations, is 32. Let n be 16. 

        32 = 16 x 2 
But 2 = log2(4)
 ∴   32 = 16 x log2(4)
And 4 = log2(16)
∴    32 = 16 x log2(log2(16)) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -(substituting)

Also,

        768 = 256 x 3 
But 3 = log2(8)
 ∴   768 = 256 x log2(8)
And 8 = log2(256)    
∴    768 = 256 x log2(log2(256)) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -(substituting)

The Sieve of Eratosthenes is a technique for finding all the prime numbers in the range from 2 to a given number, n. In the above algorithm, multiples of 2 were first removed (crossed out). Multiples of 3 were next removed (cells in vector, sieve made false as -1). Multiples of 4 where next removed (crossed out). Multiples of 5 were next removed. Multiples of 6 were next removed (crossed out); and so on. This is quite a long process. 

Considering n given as 17, with 16 numbers beginning from 2 (two and seventeen included):

the multiples of 2 after 2 are: 4, 6, 8, 10, 12, 14, 16.

The multiples of 3 after 3 are: 6, 9, 12, 15.

In the above algorithm, 6 and 12 had already been crossed out (cells in vector, allInts given -1), when the multiples of 2 were crossed out. So there was no need to waste an operation to cross out, 6 and waste another operation to cross out 12. Out of 16 numbers, all the numbers crossed out so far are:

4, 6, 8, 9, 10, 12, 14, 15, 16.

The multiples of 4 after 4 are: 8, 12, 16.

All these numbers, including 4 had already been crossed out, when the multiples of 2 were crossed out. So there was no need wasting operations crossing out 8, 12 and 16. There was also no need wasting an operation to cross out 4, which is less than √17 = 4.12 approximately (though 4 in particular, has to be crossed out in the fast algorithm below).

The square root of 17 is 4.12. From the whole list, the numbers (indexes) remaining that were not crossed out are:

    2, 3, 5, 7, 11, 13, 17

These are already the answers. They are already the prime numbers being looked for. So the rest of crossing out are wasting of operations. This means that after √n, the function just has to read through the rest of the sieve vector.

If at the beginning, all the elements of the sieve vector are initialized to true, and then all the crossed out cells are given the value of false, then the indexes from 2 onward, with the value true, are the prime numbers. That is the fast algorithm.

Determining the Time Complexity

Above, the number of elements is 16 and not 17, because there are actually 16 integers, since 0 and 1 are not included. The algorithm begins with 2. Two is a prime number. Multiples of 2 (even numbers), greater than 2, are crossed out, in approximately n/2 operations. With n effectively 16, these are 16/2 = 8 operations. 3 is a prime number. Next, multiples of 3 greater than 3 are crossed out, in n/3 operations. With n effectively 16, these are 16/3 = 5 operations approximately.

The next number to check would be 4. Four is a multiple of 2 and has already been crossed out. All multiples of 4 are multiples of the prime number 2, and have already been crossed out. The square root of 17 is 4.12. As soon as the square root is passed, (n - √n) scan operations have to be made, to pick up the rest of the prime numbers that have not been crossed out (not given false in the vector). Note that for n = 17, one operation has to be used to cross out 4, which had already been crossed out.

So, a fast algorithm to obtain all the prime numbers from 2 to n, begins with a vector called sieve, where all the elements are initialized to true. Beginning from 2, cross out all the multiples of prime numbers (do not cross out any first prime number); until the square root of n (√n) is reached. From there, the function should just continue to obtain the prime numbers remaining.

Crossing out a number, means giving false, to the cell value whose index is the number. The indexes for the whole list, whose values have not been crossed out, are read out as the prime numbers. The time to initialized the sieve vector elements with true, is not included in the official time complexity. The time to read out the prime numbers is also not included in the official time complexity.

The algorithm of O(n log log n) obtained, is much faster than an algorithm of (n2).

With n effectively 16, the number of operations are 8 +  5 + (16 - √16) = 8 + 5 + 12 = 25. So, a good number for the upper limit, for the number of operations, for this, should be 32, for O(n log log n) time complexity.

The sieve array (vector) becomes:

primes

true

true

false

true

false

true

false

false

false

true

false

True

false

false

false

true

index

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

The values for indexes 0 and 1 are false and false; though they have not been included in this table.

Fast Function for Sieve of Eratosthenes

The following program shows a fast function for Sieve of Eratosthenes (read through the code and comments):

    #include <iostream>
    #include <vector>
    #include <queue>
    using namespace std;
    
    queue<int> sieve(int n) {
        vector<bool> sieve(n+1, true);
        sieve[0] = false; sieve[1] = false; 
        queue<int> primes;
        int noOperations = 0;

        for (int i=2; i*i <= n; i++) {  //up to square root of n
            if (sieve[i] == true) {
                for (int j = i * i; j <= n; j = j+i) {  //all multiples up to n beginning from square of i, even before square root of n
                    sieve[j] = false;
                }
            }
        }

        for (int i=2; i <= n; i++) {
            if (sieve[i] != false)
                primes.push(i);
        }

        return primes;
    }

    int main(int argc, char **argv)
    {
        queue<int> que = sieve(17);
        
        while (que.size() > 0) {
            int prime = que.front(); que.pop();
            cout << prime << ", ";
        }
        cout << endl;

        return 0;
    }

The sieve() function iterates from 2 to the square root of n. For each prime number between 2 and √n (inclusive), all its multiples up to n are crossed out. This includes multiples below √n (such as 4 above) and including √n, if √n is an integer. There are two for-loops in question: the outer for-loop iterates up to √n, one-by-one; the inner for-loop iterates for each prime number, in multiples of the prime number, up to n. There is no need for the outer for-loop to iterate one-by-one beyond √n; because by the time it reaches √n, all the multiples in the list have been crossed out.

The first and second elements of the sieve vector are initialized to false, for the sake of consistency.

Observe the contents of the parentheses of the for-loops in question.

Note: the n+1 number of cells for the sieve vector provides for the extra one cell for index 0, since n means n cells from 1 to n.

Factorization

Factorization is the process of decomposition into prime factors.

4 = 2 x 2;    the prime factors are 2, 2

6 = 2 x 3;    the prime factors are 2, 3

8 = 2 x 2 x 2;    the prime factors are 2, 2, 2

9 = 3 x 3;    the prime factors are 3, 3

10 = 2 x 5;    the prime factors are 2, 5

12 = 2 x 2 x 3;    the prime factors are 2, 2, 3

15 = 3 x 5;    the prime factors are 3, 5

16 = 2 x 2 x 2 x 2;    the prime factors are 2, 2, 2, 2

18 = 2 x 3 x 3;    the prime factors are 2, 3, 3

20 = 2 x 2 x 5;    the prime factors are 2, 2, 5

60 = 2 x 2 x 3 x 5;    the prime factors are 2, 2, 3, 5

108 = 2 x 2 x 3 x 3 x 3;    the prime factors are 2, 2, 3, 3, 3

240 = 2 x 2 x 2 x 2 x 3 x 5;    the prime factors are 2, 2, 2, 2, 3, 5

and so on.

n = 20 will be used in this section of the lesson (tutorial) to develop the fast algorithm to obtain all the factorization prime factors of a given number, n (this includes the repeated factors). Here, n is not a prime number.

Smart Way to factorize

The mathematics division statement is:

    Dividend ÷ Divisor = Quotient

To factorize a number (e.g. 20) by programming (a fast way), there are two stages. The first stage produces an array (vector) with indexes from 2 to n. This array, has only the prime factors of the number, n. The second stage does divisions. It begins by dividing n by one of its prime factors. Next it divides the quotient from the previous division with the same prime factor or some other prime factor. This repeats until the divisor is 0. In that way, all the prime factors for n are obtained - see details below.

Producing the Factors Array

The array produced by the first stage, is done in a similar way, that the sieve array is produced. However, instead of having false in positions of multiples, the prime factors that caused the multiples are present there, from the square of an early prime factor. There are zeros in the other cells, including the cells for the early prime factor indexes. This array can be called the divisors array, because it has the possible divisors for the divisions of the next (second) stage.

When n = 20, the factors array is as follows:

factors

0

0

2

0

2

0

2

3

2

0

2

0

2

3

2

0

2

0

2

index

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

2 is a prime factor less than 20. 2 is an early prime factor. Its cell has 0. The square of 2 is 4. The rest of the cells for the multiples of 2, beginning from the square of 2, which is 4, have 2: cell 4 has 2, cell 6 has 2, cell 8 has 2, cell 10 has 2, cell 12 has 2, cell 14 has 2, cell 16 has 2, cell 18 has 2, cell 20 has 2.

3 is a prime factor less than 20. It is an early prime factor. Its cell has 0. The square of 3 is 9. The rest of the cells for the multiples of 3, beginning from the square of 3, which is 9, have 3: cell 9 has 3, cell 12 should have had 3 but 2 is already there, cell 15 has 3, cell 18 should have had 3 but 2 is already there.

The algorithm to produce the divisors array has O[n.log2(log2n)] time complexity. The function for this is:

    vector<int> divisorsFn(int n) {
        vector<int> divisors(n+1, 0);
        divisors[0] = 0; divisors[1] = 0; 

        for (int i=2; i*i <= n; i++) {  //up to square root of n
            if (divisors[i] == 0) {
                for (int j = i * i; j <= n; j = j+i) {  //all multiples up to n
                    if (divisors[j] == 0)
                        divisors[j] = i;
                }
            }
        }

        return divisors;
    }

The output is:

        0, 0, 0, 0, 2, 0, 2, 0, 2, 3, 2, 0, 2, 0, 2, 3, 2, 0, 2, 0, 2,

as expected. The C++ main() function that produced the output has not been shown. This vector consists of the possible divisors and zeros.

Determining the Prime Factors for n

This second stage is the effective factorization stage. It uses the above divisors array. It begins by dividing n by one of its prime factors. Next it divides the quotient from the previous division with the same prime factor or some other prime factor. This repeats until the divisor is 0. In that way, all the factorization prime factors for n are obtained. Details are as follows:

When n equals 20, twenty is the index. The factorization function will check if the cell value for the index, 20 is greater than zero (is not 0). In this case, it is 2 and greater than 0. If it were zero, it would have meant that 20 itself is a prime factor. This index 20 is divided by the 2, to give 10. This 2 is the first prime factor for twenty. The quotient is 10. The function will then go on to the index of 10, the quotient. There it will check if the cell value for the index, 10 is greater than 0. In this case it is greater than 0 with a value of 2. The function divides 10 by this new 2 to have 5. This 2 is the second prime factor of twenty. 5 is the quotient. The function will then go on to the index of 5, the quotient. There it will check if the cell value for the index, 5 is greater than 0. At this point, it is not greater than 0; it is actually 0. The algorithm stops here. At this last point, 5 and not the cell value for the index, 5 is the next and last prime factor of 20. So, all the prime factors for 20 are 2, 2, and 5, and correct, because 20 = 2 x 2 x 5.

This algorithm has O(log2n) time complexity. The function for this algorithm is:

    vector<int> factorization (int n, vector<int> F) {
        vector<int> primeFactors;
        while (F[n] > 0) {
            primeFactors.push_back(F[n]);
            n = n / F[n];
        }
        primeFactors.push_back(n);
        
        return primeFactors;
    } 

Here, F is the divisors vector returned by the previous function.

Both functions working together gives a time complexity of O[n.log2(log2n) + log2n] . A program that employs both functions is as follows:

    #include <iostream>
    #include <vector>
    using namespace std;
    
    vector<int> divisorsFn(int n) {
        vector<int> divisors(n+1, 0);
        divisors[0] = 0; divisors[1] = 0; 
        int noOperations = 0;

        for (int i=2; i*i <= n; i++) {  //up to square root of n
            if (divisors[i] == 0) {

                for (int j = i * i; j <= n; j = j+i) {  //all primes up to n
                    if (divisors[j] == 0)
                        divisors[j] = i;
                }
            }
        }

        return divisors;
    }
    
    vector<int> factorization (int n, vector<int> F) {
        vector<int> primeFactors;
        while (F[n] > 0) {
            primeFactors.push_back(F[n]);
            n = n / F[n];    //division or previous quotient and new value
        }
        primeFactors.push_back(n);
        
        return primeFactors;
    }

    int main(int argc, char **argv)
    {
        vector<int> F1 = divisorsFn(20);
        vector<int> vtr1 = factorization(20, F1);
        for (int i=0; i < vtr1.size(); i++) {
            cout << vtr1[i] << ", ";
        }
        cout << endl;
        
        vector<int> F2 = divisorsFn(240);
        vector<int> vtr2 = factorization(240, F2);
        for (int i=0; i < vtr2.size(); i++) {
            cout << vtr2[i] << ", ";
        }
        cout << endl;

        return 0;
    }

Note how the divisors vector has been used as an argument for the factorization() function. The output is:

    2, 2, 5,

    5, 3, 2, 2, 2, 2,

The first line gives the prime factors for 20 and the second line gives the prime factors for 240. The prime factors for a number do not have to appear in ascending order (with this code).

Conclusion

In order to produce the set of prime numbers fast, for a given integer, n, an array called sieve has to be produced, with all values initialized to true. The indexes of the array are from 0 to n. Set the values of sieve[0] and sieve[1] to false. Iterate through the array from index 2 to the square root of index n (inclusive). For each iteration, give the value of false to the multiples in the whole array, greater than x1 of the prime numbers between 2 and the square root of n (inclusive). At this point, any index with the value true, is a prime number between 2 and n (inclusive). The indexes of this array with value, true, can now be read onto a queue. The time complexity for this is O[n.log2(log2n)].

In order to factorize fast, the factorization function has to use a modified sieve array. The modified sieve array, called divisors above, has as values, the prime numbers, less than or equal to the square root of n, in positions of their multiples greater than or equal to prime2, in the whole array.

The factorization process is done in two stages: The array produced by the first stage, is done in a similar way that the sieve array is produced. However, instead of having false in positions of multiples, there present, is the prime factor that caused its multiples, from the square of the prime factor. There are zeros in the other cells, including the cells for the prime factor indexes. This array can be called the divisors array, because it has the possible divisors for the divisions of the next (second) stage. n does not have to be a prime number. The time complexity for this first stage is O[n.log2(log2n)].

The second stage does divisions. This second stage is the effective factorization. It uses the above divisors array. It begins by dividing n by one of its prime factors. Next it divides the quotient from the division with the same prime factor or some other prime factor. This repeats until the divisor is 0. In that way, all the prime factors for n are obtained. The time complexity for this second stage is O(log2n). The time complexity for the first and second stages together is: O[n.log2(log2n) + log2n] .

The rest of the 17 lessons, with explanation of lessons, explanation of tasks, and explanation of solutions, can be found at (click): Explanations in C++ for the 17 Algorithm Lessons at https://app.codility.com/programmers with Problems and Solutions





Related Links

More Related Links

Cousins

NEXT

Comments