Broad Network


5.2 Find first non-repeating character of given string.

5. Basic Visited-array Algorithm Problems in C

Full Course on Data Structures and Algorithms in C

By: Chrysanthus Date Published: 3 Feb 2026

The reader is advised to read all the lessons (tutorials) in this full course, in the order presented.

Illustration

Reading from the left and in the following string,

    "The quick brown fox jumps over the lazy dog."

'q' is the first non-repeating character. The first character, t (T) in the given array (string), repeats two times. The second character, h repeats two times. The third character, e repeats three times. The fourth character, space repeats seven times. However, the space character is not considered in this problem.

'q' first appears in the fifth position of the given string (array) and does not appear anywhere again in the string. For the rest of the given string, there are characters that repeat. There are a few characters like b, that appear only once, in the rest of the string. However, only the first of such characters, which is q in this case, is asked for.

Making Uppercase and Lowercase of Same Letter, Same

There are 26 letters of the English alphabet. In this context, 'a' means A, b means B, c means C. etc. In the program below, the letters A to Z are given the integer values (codes) from 0 to 25 as in the following explanation:

In the ASCII character set and corresponding coding, the whole numbers that represent A-to-Z, occur consecutively. Similarly, in the ASCII character set, whole numbers that represent a-to-z, which are different from those of A-to-Z, also occur consecutively.

In the ASCII character set, if the ASCII whole number for A is subtracted from itself, 0 will result; if the ASCII whole number for A is subtracted from the ASCII whole number for B, 1 will result; if the ASCII whole number for A is subtracted from the ASCII whole number for C, 2 will result; and so on. Similarly, if the ASCII whole number for 'a' is subtracted from itself, 0 will result; if the ASCII whole number for 'a' is subtracted from the ASCII whole number for b, 1 will result; if the ASCII whole number for 'a' is subtracted from the ASCII whole number for c, 2 will result; and so on.

In this way, both the uppercase letters and lowercase letters, will become in the range, 0-to-25, accordingly, in the program below.

Visited Array

A visited array is an array whose indexes are all the possible whole number values of another array, the given array (or given string). This means that the given array might have fewer or more values (elements), with some elements repeating (even more than once). On the other hand, each of the element values of the visited array, is either 0 or above 0, depending on whether its index occurs at least once, in the given array.

So, in a given array (given string) of the alphabet, whose values (codes) range from 0 to 25 (26 in total), the indexes of the visited array range from 0 to 25, ending at 25 (zero based indexing), while the values of the given array are mixed and do not really follow any order, and they may be less or more than 25. If they are more than 25, it means some of the values repeat. Any value in the given array is an index in the visited array. Any value that occurs at least once in the given array, is represented by its number of occurrences (through out the given array), in the visited array, in the element for the visited array's index that is the value in the given array. The absence of a visited array index in the given array (given string), has its value as 0, in the visited array.

The visited array has a fixed size, whose indexes are all the possible whole numbers that can be a value (a character) in the given array. The given array can be shorter or same or longer than the visited array. If it is shorter, it means some indexes of the visited array are absent as values, in the given array. If it is the same, it might mean that some indexes of the visited array are absent as values in the given array and some indexes of the visited array repeat as values, in the given array; or each index in the visited array occurs only once in the given array. If it is longer, it might mean that some indexes of the visited array are absent as values in the given array, and some indexes of the visited array repeat as values, in the given array; or each index in the visited array is present at least once, in the given array, as value.

So, for the above given array (string sentence), the visited array will have all the 26 letters of the alphabet as indexes, from 0 to 25. The values of the alphabet will also appear to be from 0 to 25 in the given array (but disordered ). The length of the given array may be shorter, or same, or longer than that of the visited array. Do not forget that an uppercase letter and its corresponding lowercase letter have the same integer whole number code, for the program below.

A visited array like the one described above, is also called a frequency count array, because, after one run (scan) through the given array, the visited array has the number of occurrences of each element, in the given array.

Problem

Find the first non-repeating character of a given string.

Employ O(m+n) time and O(n+m) space.

Brute Force (naive) Approach

Strategy

The idea is to use two nested loops, the outer loop for picking a character and the inner loop for finding another occurrence of the picked character in the given string. As soon as a character which has only one occurrence in the input string, is seen, return it. If all characters have multiple occurrences, return null ('\0').

This involves O(n2) time and O(n) space, where n is the number of elements in the given string (array).

It is possible to have O(m+n) time, which is much shorter than O(n2) time. So, the brute force approach is not addressed any further, in this tutorial.

Smart Approach

Strategy

First create an array called, visited of length 26 and all its values initialized to 0. Next, iterate over the given array (given string). For any element encountered in the given string, increase its corresponding index value in the visited array by one (for number of occurrences). If an element occurs more than once in the given array, its same index element will have a value greater than 1, in the visited array. If an element does not occur at all in the given array, its indexed value in the visited array will remain at 0. If an element occurs once in the given string, its indexed value in the visited array will be 1. It is the first of such elements in the given array that is required (asked for).

So the given array (string) is scanned once, the first time, while not all the indexes of the visited array might be visited from the given array, or all might be visited once; or some might be visited more than once, with or without addressing all the indexes (of the visited array).

Finally, iterate over the given array (given string) again, not the visited array, while checking its indexed value in the visited array in O(1) time. The first character in the given array having a corresponding indexed value of 1 in the visited array, is the required character (character asked for).

The Smart Program

The following program has the smart algorithm (read through the code and comments):

#include <stdio.h>

const int MAX_CHAR = 26;
char nonRep(char s[], int n) {
    //create and initialize visited
    int visited[MAX_CHAR];
    for (int i=0; i < MAX_CHAR; i++)
        visited[i] = 0;

    for (int i = 0; i < n; i++) {
        //generate whole number codes from 0 to 25 and address visited array.
        // for uppercases
        if ('A' <= s[i] && s[i] <= 'Z') 
            visited[s[i] - 'A'] = visited[s[i] - 'A']  + 1; 
        // for lowercases
        else if ('a' <= s[i] && s[i] <= 'z') 
            visited[s[i] - 'a'] = visited[s[i] - 'a']  + 1;
    }

    //Look for first char in given array with value of 1 in visited array
    //consider 'A' or 'a' accordingly
    for (int i = 0; i < n; i++) {
        if ('A' <= s[i] && s[i] <= 'Z') 
            if (visited[s[i] - 'A'] == 1)  
                return s[i];
        if ('a' <= s[i] && s[i] <= 'z') 
            if (visited[s[i] - 'a'] == 1)  
                return s[i];
    }

    return '\0';
}

int main(int argc, char **argv) {
    int N = 0;    //length (size) of string to be determined in function below
    char stri[] = "The quick brown fox jumps over the lazy dog.";    //given string
    
    char *ptr = stri;
    while (*ptr != '\0') {    //in C a string ends with '\0', which should not be counted
        ptr++; 
        N += 1;
    }

    char ret = nonRep(stri, N);
    
    if (ret != '\0')
        printf("First non-repeating character is: %c.\n", ret);  
    else
        printf("Given string length is 0 or all chars repeat.\n");  
        
    return 0;
}

The output is:

    First non-repeating character is: q.

The complexities are actually O(m+3n) time and O(m+n) space, where m is the length of the visited array and n is the length of the given string. m time is for the creation of the visited array and its initialization. 3n time is for the determination of the length of the given input array (string) and for its two scan iterations (though the second may not complete) in the called function.

The complexities are officially given as O(m+n) time and O(m+n) space, since coefficients (multipliers of 2 or 3 or 1/2, etc.) are normally omitted.

Thanks for reading.





Related Links

More Related Links

Cousins

BACK NEXT

Comments


Note: You can use the Search Box above to find articles and discussions of interest.