CountSemiprimes at app.codility.com/programmers in C++ Explained: Count the semiprime numbers in the given range [a..b].
Task score : 100% - Correctness : 100% ; Performance : 100%
Detected time complexity : O(N * log(log(N)) + M)
Lesson 11 at app.codility.com/programmers
The solution and its explanation are given below.
Category of Article : Technology | Computers and Software | Algorithm
By: Chrysanthus Date Published: 5 Sep 2025
Problem
A prime is a positive integer X that has exactly two distinct divisors: 1 and X. The first few prime integers are 2, 3, 5, 7, 11 and 13.
A semiprime is a natural number that is the product of two (not necessarily distinct) prime numbers. The first few semiprimes are 4, 6, 9, 10, 14, 15, 21, 22, 25, 26.
You are given two non-empty arrays P and Q, each consisting of M integers. These arrays represent queries about the number of semiprimes within specified ranges.
Query K requires you to find the number of semiprimes within the range (P[K], Q[K]), where 1 <= P[K] <= Q[K] <= N.
For example, consider an integer N = 26 and arrays P, Q such that:
P[0] = 1 Q[0] = 26
P[1] = 4 Q[1] = 10
P[2] = 16 Q[2] = 20
The number of semiprimes within each of these ranges is as follows:
· (1, 26) is 10,
· (4, 10) is 4,
· (16, 20) is 0.
Write a function:
vector<int> solution(int N, vector<int> &P, vector<int> &Q);
that, given an integer N and two non-empty arrays P and Q consisting of M integers, returns an array consisting of M elements specifying the consecutive answers to all the queries.
For example, given an integer N = 26 and arrays P, Q such that:
P[0] = 1 Q[0] = 26
P[1] = 4 Q[1] = 10
P[2] = 16 Q[2] = 20
the function should return the values [10, 4, 0], as explained above.
Write an efficient algorithm for the following assumptions:
· N is an integer within the range [1..50,000];
· M is an integer within the range [1..30,000];
· each element of arrays P and Q is an integer within the range [1..N];
· P[i] ≤ Q[i].
Note:
In this lesson, a divisor is a factor.
The first ten semiprimes and first ten composite numbers are:
Semiprimes |
4 |
6 |
9 |
10 |
14 |
15 |
21 |
22 |
25 |
26 |
Composite Numbers |
4 |
6 |
8 |
9 |
10 |
12 |
14 |
15 |
16 |
18 |
Once a number is not a prime number, it is a composite number. A semiprime is a composite number, but a composite number is not necessarily a semiprime number. In other words, semiprime numbers form a subset of composite numbers.
A semiprime number is the product of at least two prime numbers, while a composite number is the product of two or more prime numbers. 12 is a composite number but it is not a semiprime. 12 = 2 x 2 x 3.
The lesson for this problem is on Sieve of Eratosthenes. A section of the lesson explains how to produce the factors (divisors) vector (list). In the factors vector, if the value of an index is not 0, then that value is the smallest of its factors (so far). Remember that with the factors vector, the index is the dividend. A composite index would divide that smallest factor to give a quotient that might divide its own smallest factor down in the factors vector (without a remainder). However, a semiprime index would divide its smallest factor, and its quotient would not be able to divide its own smallest factor, which unfortunately is 0 in the table (and division by 0 is not allowed). In other words, the quotient here is a prime number and cannot divide any further.
Strategy
The given data list is used to explain the strategy for the solution. First, with N=26, the factors vector is created as follows:
f |
0 |
0 |
0 |
0 |
2 |
0 |
2 |
0 |
2 |
3 |
2 |
0 |
2 |
0 |
2 |
3 |
2 |
0 |
2 |
0 |
2 |
3 |
2 |
0 |
2 |
5 |
2 |
i |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
There is a function, different from the solution function, to create this factors vector.
Next, in one scan of the whole factors vector, the semiprimes are determined, while the accumulating number (prefix sums) of semiprimes is counted, one-by-one. The following table shows the factors data, indications(c) of the semiprimes, and the prefix sums (without the preceding 0) of the indicators.
f |
0 |
0 |
0 |
0 |
2 |
0 |
2 |
0 |
2 |
3 |
2 |
0 |
2 |
0 |
2 |
3 |
2 |
0 |
2 |
0 |
2 |
3 |
2 |
0 |
2 |
5 |
2 |
i |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
c |
0 |
0 |
0 |
0 |
1 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
0 |
0 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
0 |
0 |
1 |
1 |
p |
0 |
0 |
0 |
0 |
1 |
1 |
2 |
2 |
2 |
3 |
4 |
4 |
4 |
4 |
5 |
6 |
6 |
6 |
6 |
6 |
6 |
7 |
8 |
8 |
8 |
9 |
10 |
If an index is a semiprime, the cell for its indicator, c has 1, otherwise the cell has 0.
The indicator vector, c, and prefix sums vector for the indicators, p and the determination of semiprimes, are all done in the one scan, in order to reduce the number of main operations.
Note that prefix summing here, is actually,
p[i] = p[i-1] + c[i];
without the preceding zero for prefix sums. The query is answered for each pair of P[i] and Q[i], by doing prefix sums slice (subtraction). Remember that prefix sums slice, subtracts the sum at the beginning of the range, from the sum just above the range. However, prefix sums slice here is actually:
result[i] = p[Q[i]] - p[P[i]-1]; (see below)
That is: subtract the sum just below the beginning of the slice, from the sum at the last position of the slice.
Smart Solution
A solution() function is in the following program (read the code and comments):
#include <iostream>
#include <vector>
using namespace std;
vector<int> factorsV(int n) {
vector<int> divisors(n+1, 0);
divisors[0] = 0; divisors[1] = 0;
for (int i=2; i*i <= n; i++) { //up to square root of n
if (divisors[i] == 0) {
for (int j = i * i; j <= n; j = j+i) { //all multiples up to n
if (divisors[j] == 0)
divisors[j] = i;
}
}
}
return divisors; //vector of smallest prime factor for each index
}
vector<int> solution(int N, vector<int> &P, vector<int> &Q) {
vector<int> F = factorsV(N);
vector<int> c(N+1, 0);
vector<int> p(N+1, 0);
for (int i=4; i<N+1; i++) {//checks if no. divides prime and if quotient is prime
if (F[i] != 0) {
int z = i / F[i];
if (F[z] == 0)
c[i] = 1;
}
p[i] = p[i-1] + c[i];
}
vector<int> result(P.size(), 0);
for (unsigned int i=0; i<P.size(); i++) {
result[i] = p[Q[i]] - p[P[i]-1];
}
return result;
}
int main(int argc, char **argv)
{
vector<int> P{1, 4, 16};
vector<int> Q{26, 10, 20};
vector<int> ret = solution(26, P, Q);
cout << ret[0] <<' ' << ret[1] <<' ' << ret[2] << endl;
return 0;
}
The output is:
10 4 0
The +1 in N+1, is for 0, which cannot be a divisor (division of 0 is not allowed).
The main for-loop begins at i=4, because semiprimes begin at 4 (composite numbers also begin at 4).
At app.codility.com/programmers the scoring and detected time complexity for this solution() is:
Task score : 100% - Correctness : 100% ; Performance : 100%
Detected time complexity : O(N * log(log(N)) + M)
Note that the solution to this problem does not only involve what is in the lesson for this problem; it also involves prefix sums (in a special way), which was studied in lesson five. The prefix sum is done with the indicators.
Though the c and p vectors are N+1 long, the +1 is not a preceding zero and it is not the convenience zero (for counting sort). It is for the zero of zero-based-indexing.