Broad Network


MissingInteger at app.codility.com/programmers in C Explained: Find the smallest positive integer that does not occur in a given sequence.

Task score : 100% - Correctness : 100% ; Performance : 100%
Detected time complexity : O(N) or O(N.log(N))
Lesson 4 at app.codility.com/programmers
The solution and its explanation are given below.

Category of Article : Technology | Computers and Software | Algorithm

By: Chrysanthus Date Published: 28 May 2025

Problem

This is a demo task.

Write a function:

    int solution(int A[], int N);

that, given a array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A.

For example, given A = [1, 3, 6, 4, 1, 2], the function should return 5.

Given A = [1, 2, 3], the function should return 4.

Given A = [−1, −3], the function should return 1.

Write an efficient algorithm for the following assumptions:

        N is an integer within the range [1..100,000];
        each element of array A is an integer within the range [−1,000,000..1,000,000].

Note:

One of the given arrays above is:

    [1, 3, 6, 4, 1, 2]

If this array is sorted, it would be:

    [1, 1, 2, 3, 4, 6]

It can clearly be seen that the missing positive integer greater than 0 is 5. If 5 were included, then the missing integer would be 7 = 6+1, according to the above criteria.

After a possible sort, if there are negative integers, then the search has to begin from 1, excluding 0.

The lesson for this problem is Counting Elements. Counting elements is also a form of sorting. This should give the candidate the idea, that a count array should be used, and the missing integer, searched from count index 1, which is just after count index 0. The missing integer (count index), should have a value (number of occurrences) of 0.

If the missing integer is not found, then the missing integer should be taken as the maximum number in the array, A, plus 1; according to the above criteria.

Illustration with given Example Data

When the given array, A is

Value:136412
Index:012345

then the count array is:

No. of
Occurrences:
0211101
Index:0123456

The number of occurrences of the value 5 in A is 0 in the count array. 5 is the missing integer in A. The number of elements in A here, is approximately the number of elements in the count array. This gives a time complexity of O(N+N). Remember that time complexity is maximum possible number of operations.

When the given array, A is:

Value:123
Index:012

then the count array is:

No. of
Occurrences:
0111
Index:0123

Remember that the indexes of the count array are the values of the given array. There may be a convenience 0, which is not found in the given array, as in this case. There is no index with number of occurrences of 0 in this case (the convenience 0 index is not considered).

The indexes to consider in the count array, are in the range, 1 to 3 (inclusive). The maximum integer in A is 3. According to the above criteria, the answer has to be 4 = 3 + 1. In the count array, 4 could have been the next index after the highest index of 3.

The time complexity for this algorithm is O(N+N): N time to scan the given array, A and produce the count array; and N time to “read” through the count array.

When the given array, A is [−1, −3], the smallest positive integer is taken (considered) as +1, because all the elements in A are negative. If the count array is to be produced from this given array, based on possible positive integers (values) in the given array, then the count array will be:

No. of Occurrences:0
Index:0

where the index of 0 here is the convenience 0.

Since there is no index from 1 upwards, in the count array, with 0 as number of occurrences, then there is supposedly no missing smallest positive integer within A, in the range, 1 to infinity (inclusive). 1 could have been the next index after the highest index of 0 here, in the count array. So the answer is taken (considered) as 1 = 0 + 1; according to the above criteria (If the missing integer is not found, then the missing integer should be taken as the maximum number in the array, A, plus 1. In this case, maximum value should be considered as 0).

The number of elements (length of given array) in the given array, [−1, −3] is 2. The algorithm takes approximately 2 = N operations to scan array, A to produce the count array (of 1 element). The algorithm then takes 1 operation to “read” through the count array.

The number of operations here, is O(2+1), better written as O(N+1). Remember that in this case, all the given values (integers) in A, are negative.

The Code

The algorithm scans the given array, A to produce the count array. After that it “reads” through the count array, looking for any index from 1 to m (inclusive) that has 0 as number of occurrence. m is the maximum number in the given array, A. The first index found, beginning from the left of the count array (after index 0), is the missing smallest positive integer. If no such index is found, then the answer is m+1. The code is as follows (read through it and the comments):

#include <stdio.h> 

int solution(int A[], int N) {
    int count[N+1];
    for (int i=0; i < N+1; i++)
        count[i] = 0;
    
    for (int i=0; i< N; i++){
        if ( A[i] <= N && A[i] >=0) {  //considering only elements in A, from 0 to m (inclusive)
            count[A[i]] += 1;   
        }
    }
    
    for (int i=1; i<=N; i++){  //missing integer is less than m
        if (count[i]==0)
            return i;
    }

    return N+1;  //if A consists of elements from 1 to m
}

int main() 
{ 
    int P[] = {1, 3, 6, 4, 1, 2};
    int N = sizeof(P) / sizeof(P[0]);    //divide size of array by size of one (first) element
    int ret1 = solution(P, N);
    printf("%i\n", ret1);

    int Q[] = {1, 2, 3};
    N = sizeof(Q) / sizeof(Q[0]);    //divide size of array by size of one (first) element
    int ret2 = solution(Q, N);
    printf("%i\n", ret2);

    int R[] = {-1, -3};
    N = sizeof(R) / sizeof(R[0]);    //divide size of array by size of one (first) element
    int ret3 = solution(R, N);
    printf("%i\n", ret3);

    int S[] = {2};
    N = sizeof(S) / sizeof(S[0]);    //divide size of array by size of one (first) element
    int ret4 = solution(S, N);
    printf("%i\n", ret4);

    return 0; 
} 

The output is:

    5
    4
    1
    1

The very last return statement, “return N+1;” will be executed, if all the numbers in A are from 1 to m (inclusive). This means that duplicates are allowed. If the missing smallest positive integer is within the range, 1 to m (inclusive), then the previous return statement, “return i;” will return the index from the count array. There are only two return statements in the solution() function above.

The code has been presented in such a way that, in the case of only negative numbers, the number of occurrences of 1 will be 0.

The result from app.codility.com/programmers is:

    Task score : 100% -  Correctness : 100% ; Performance : 100%
    Detected time complexity : O(N) or O(N.log(N))

This does not include the time to initialize all the elements of the count array to 0, just after creation. In the author’s opinion, the time complexity should be O(N+N).

Conclusion

The efficient algorithm scans the given array, A in N time, to produce the count array. After that it “reads” through the count array, in a maximum of an additional N time, looking for any index from 1 to m (inclusive) that has 0 as number of occurrence. m is the maximum number in the given array, A. The first index found, beginning from the left (after index 0) of the count array, is the missing smallest positive integer. If no such index is found, then the answer is m+1.

This efficient algorithm is applicable to the case, where array, A has both positive and negative integers; as long as the count array is produced beginning from index 0. The 0 may still be a convenience 0.





Related Links

More Related Links

Cousins

BACK

Comments