Broad Network


Perl Predefined Functions for Scalars and Strings

Commonly Used Perl Predefined Functions – Part 1

Perl Course

Foreword: In this part of the series, I talk about Perl Predefined Functions for Scalars and Strings that are commonly used.

By: Chrysanthus Date Published: 19 Oct 2015

Introduction

This is part 1 of my series, Commonly Used Perl Predefined Functions. It covers the commonly used predefined functions for scalars and strings.

Pre-Knowledge
This is part of the volume, Perl Course. At the bottom of this page, you will find links to the different series you should have read before coming here.

Scalar
All data in Perl is a scalar, an array of scalars, or a hash of scalars. A scalar may contain one single value in any of three different flavors: a number, a string, or a reference. Although a scalar may not directly hold multiple values, it may contain a reference to an array or hash, which in turn contains multiple values.

Character Literal
In Perl, the character literal can be typed with single or double quotes, e.g. 'A' for single quotes or "A" for double quotes. However, it is conventional in programming to type character literals in single quotes. That is what I recommend you do in Perl (for characters).

String Literals
A string literal can be typed in single quotes as in:

    'one two three'

A string literal can also be typed in double quotes as in:

   "four five six"

A number typed within a string is stored as text (a sequence of characters). Perl converts it back to a number only when it is used in a numeric context, so "10" + 5 still gives 15. A variable within a double quoted string expands (is replaced by its value). A variable within a single quoted string does not expand (is not replaced by its value). An escape sequence such as \n within a double quoted string has its effect (e.g. \n would send the rest of the text on its right to the next line); within a single quoted string it is kept literally.
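
The difference between the two kinds of quotes can be seen in a minimal sketch such as the following. The variable names ($count, $double, $single) are made up for illustration and do not come from the text above.

```perl
use strict;
use warnings;

my $count = 3;

# Double quotes: $count is replaced by its value and \n becomes a newline.
my $double = "I have $count apples\n";

# Single quotes: the text is kept literally, including $count and \n.
my $single = 'I have $count apples\n';

print $double;          # I have 3 apples
print $single, "\n";    # I have $count apples\n
```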

The chop Function
The chop function is used to remove the last character of a string or the last character of each string in an array of strings. It can also be used to remove the last character of each value in a hash, but not the keys. The syntaxes are:

    chop VARIABLE

    chop( LIST )
    chop

If the argument is a string variable, the last character is removed and returned. The argument must be a variable and not a string literal, because chop modifies it in place. In the case of a list, every element is chopped, but only the character removed from the last element is returned. At the end, the original string or array or hash is modified. Before we continue, remember, parentheses around function arguments are optional in Perl. Try the following code:

use strict;

    my $str = "Well, I am there";
    my $ret = chop($str);    # removes and returns the last character
    print $ret, "\n";        # e
    print $str, "\n";        # Well, I am ther

For this string, the final e is removed and returned. The string in $str is now one character shorter: "Well, I am ther".

If there is no argument, the string of the special variable, $_ is chopped. The no-argument action corresponds to the last syntax above.
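
The no-argument form is most often met inside a loop, where $_ holds the current element. The following is a small sketch of that idea; the array names @words and @removed are only illustrative.

```perl
use strict;
use warnings;

my @words = ("cats", "dogs", "birds");
my @removed;

foreach (@words) {
    # With no argument, chop works on $_, which aliases the current element.
    push @removed, chop;
}

print "@words\n";      # cat dog bird (the array itself has been modified)
print "@removed\n";    # s s s (the characters that chop returned)
```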

If the argument is an array of strings, the last character of each string is removed. However, here, it is the last character of the value of the last element that is returned. Read and try the following code:

use strict;

    my @arr = ("one", "two", "three", "four");
    my $ret = chop(@arr);    # every element loses its last character
    print $ret, "\n";        # r (chopped from "four", the last element)
    print "@arr\n";          # on tw thre fou

The chop operation here, corresponds to the second syntax above.

The chomp Function
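This function removes the trailing newline (\n) from a string, typically a line (record) read from a text file. More precisely, it removes any trailing string that matches the input record separator, $/, which is \n by default. It returns the number of characters removed. Try the following code: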

use strict;

    my $str = "This is the man.\n";
    my $ret = chomp($str);
    print $ret, "\n";
    print $str, "\n";

The syntaxes for the chomp function are:

    chomp VARIABLE

    chomp( LIST )
    chomp

The above code corresponds to the first syntax where the argument is a variable. In the absence of the variable, the $_ special variable is used; and that corresponds to the third syntax.

You can have an array of text lines from a file where each line ends in \n. The second syntax above will remove the \n from the end of all the string elements (lines) of the array. Read and try the following code:

use strict;

    my @arr = ("This is the man.\n", "This is the woman.\n", "This is the child.\n", "This is the boy.\n");
    my $ret = chomp(@arr);
    print $ret, "\n";
    print @arr, "\n";

The return value is the total number of characters removed from all the elements, which is 4 here (one \n from each of the four strings). This chomp operation corresponds to the second syntax above.
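
To contrast chomp with chop, here is a short sketch under the default setting of the record separator. The variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $with    = "first line\n";
my $without = "second line";

# chomp removes a trailing newline only if one is present.
my $n1 = chomp($with);       # 1: one character removed
my $n2 = chomp($without);    # 0: nothing to remove

# chop always removes the last character, whatever it is.
my $copy    = "second line";
my $removed = chop($copy);   # "e"; $copy is now "second lin"

print "$n1 $n2\n";
print "$with|$without|$copy|$removed\n";
```

This is why chomp is the safer choice for lines read from a file: a last line without a trailing newline is left intact.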

String Length
The Perl length() function returns the number of characters in a string (not the number of bytes, which can differ for non-ASCII text). Try the following code:

use strict;

    my $str = "This is a string. Ha ha ha";
    my $len = length($str);
    print $len;

The output is 26. Counting the characters in the string, including the spaces and the full stop, confirms this.

The index Function
Before I continue, remember, index counting begins from 0 and not 1. So, for the string:

    "abcdefghijklmopq"

a is at index 0, b is at index 1, c is at index 2 and so on.

Assume that you want to know the index at which the sub-string, "ghijk" starts (is found) in the above string, searching from index 2 (for c) onwards. You would type:

    index($str, $substr, 2)

where $str has the main string, $substr has the sub string and 2 is the index position at which to start the search. Try the following code:

use strict;

    my $str = "abcdefghijklmopq";
    my $substr = "ghijk";
    my $indx = index($str, $substr, 2);
    print $indx;

The output is 6.

Note: if the sub-string is not found in the main string, the index function returns -1.

If you want the search to begin from the beginning of the string, you just omit the last argument (2) in the index function. In this case, the index function becomes:

    index($str, $substr);

You can also use this function to find the index of a character. Read and try the following code:

use strict;

    my $str = "abcdefghijklmopq";
    my $substr = 'k';
    my $indx = index($str, $substr);
    print $indx;

The output is 10.

If there are multiple occurrences of the character (or sub-string), then the first occurrence is the one found. Try the following code for the space character:

use strict;

    my $str = "a b c d e f g h i j k l m o p q";
    my $substr = ' ';
    my $indx = index($str, $substr);
    print $indx;

The output is 1, because the first occurrence of the space character is at index 1 (index counting begins from 0).

Now, here are the syntaxes for the index function from the Perl specification:

    index STR,SUBSTR,POSITION

    index STR,SUBSTR
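
The POSITION argument and the -1 return value can be combined to find every occurrence of a sub-string, not just the first. The following is one possible sketch; the names @positions and $pos are illustrative.

```perl
use strict;
use warnings;

my $str    = "a b c d e f g h i j k l m o p q";
my $substr = ' ';
my @positions;

# Start from the beginning, then restart each search just after the last hit.
my $pos = index($str, $substr);
while ($pos != -1) {
    push @positions, $pos;
    $pos = index($str, $substr, $pos + 1);
}

print scalar(@positions), "\n";    # 15
print "@positions\n";              # 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29
```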

The rindex Function
The rindex function works just like index, except that it returns the position of the last occurrence of SUBSTR in STR, whereas index returns the position of the first occurrence. If POSITION is specified, rindex returns the last occurrence beginning at or before that position. The syntaxes for the rindex function are:

    rindex STR,SUBSTR,POSITION

    rindex STR,SUBSTR

Read and try the following code for the rindex function for the space character:

use strict;

    my $str = "a b c d e f g h i j k l m o p q";
    my $substr = ' ';
    my $indx = rindex($str, $substr);
    print $indx;

The output is 29, because the last space is at index position 29.
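
The effect of the POSITION argument on rindex can be sketched as follows, using the same string. The variable names are made up for the example; in this string the letter k sits at index 20.

```perl
use strict;
use warnings;

my $str = "a b c d e f g h i j k l m o p q";

my $last     = rindex($str, ' ');        # search the whole string
my $before_k = rindex($str, ' ', 20);    # last space at or before index 20
my $missing  = rindex($str, 'z');        # 'z' does not occur

print "$last $before_k $missing\n";      # 29 19 -1
```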

The substr Function
This function is used to extract a portion of a string. The extracted portion can be replaced. It returns the extracted portion. If the extracted portion is not replaced the original string remains unchanged. This function has three syntaxes. One of them is:

    substr EXPR,OFFSET

where EXPR is the main string to extract a portion from and OFFSET is the index in the string where the extraction starts. Remember, index counting in a string begins from zero. Read and try the following code:

use strict;

    my $str = "one two three four five";
    my $ret = substr($str, 14);
    print $ret, "\n";
    print $str, "\n";

With the above syntax, all the characters from the offset index to the end of the string are extracted, so the output is "four five" followed by the unchanged original string.

It is possible to extract from the offset point to a point before the end of the string. You achieve this by indicating the length of sub-string to be extracted, in number of characters, as in the following syntax:

    substr EXPR,OFFSET,LENGTH

Read and try the following code:

use strict;

    my $str = "one two three four five";
    my $ret = substr($str, 14, 5);
    print $ret, "\n";
    print $str, "\n";

Remember, a space is a character, so the five characters extracted above are ‘f’, ‘o’, ‘u’, ‘r’ and ‘ ’.

You can also replace the extracted sub-string using the following syntax:

    substr EXPR,OFFSET,LENGTH,REPLACEMENT

where REPLACEMENT is the new sub-string to replace the extracted sub-string. Try the following code:

use strict;

    my $str = "one two three four five";
    my $ret = substr($str, 14, 4, "ffff");
    print $ret, "\n";
    print $str, "\n";
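
In practice the offset is often not known in advance. As a sketch, index can supply it, and the replacement need not have the same length as the portion it replaces. The variable names below are assumptions for illustration.

```perl
use strict;
use warnings;

my $str  = "one two three four five";
my $word = "four";

# Locate the word first, then replace exactly that stretch.
my $offset = index($str, $word);    # 14
my $old    = '';
if ($offset != -1) {
    $old = substr($str, $offset, length($word), "4");
}

print "$old\n";    # four
print "$str\n";    # one two three 4 five
```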

The chr() Function
This function takes a character code as argument and returns the corresponding character. The argument can be an ASCII code or, more generally, a Unicode code point. The ASCII code for A is 65. The Unicode code point for a smiley face is 0x263a (your system may not display Unicode characters well). The syntaxes are:

    chr NUMBER
    chr

If NUMBER is omitted, the value of $_ is used.

Try the following code:

use strict;

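    # Assumption: the terminal expects UTF-8. Without this layer, Perl
    # warns "Wide character in print" for code points above 255.
    binmode(STDOUT, ":encoding(UTF-8)");

    my $char1 = chr(65);        # A
    my $char2 = chr(0x263a);    # smiley face

    print $char1, "\n";
    print $char2, "\n";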

The following code prints all ASCII characters from 32 to 126:

use strict;

    print chr($_), " " foreach (32..126);

The ord() Function
This function does the reverse of chr(). The syntaxes are:

    ord EXPR
    ord

It returns the numeric value (code) of the first character of EXPR. If EXPR is an empty string, it returns 0. If EXPR is omitted, it uses $_.

Try the following code:

use strict;

    my $num = ord("Booking");
    print $num;


The output is:

    66

A string as argument should be in quotes.
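
Since ord and chr are inverses of each other, a round trip returns the original character. The following sketch shows that, along with the empty-string case; the variable names are illustrative.

```perl
use strict;
use warnings;

my $code  = ord("A");              # 65
my $char  = chr($code);            # A
my $next  = chr(ord("A") + 1);     # B, the character after A
my $first = ord("Booking");        # 66, only the first character counts
my $empty = ord("");               # 0

print "$code $char $next $first $empty\n";    # 65 A B 66 0
```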

Changing Character Case

The \l Escape Sequence
The \l operator changes the next character in a string to lowercase. For example, assume that you have the string,

    "This is the Man in charge."

You can change ‘M’ in “Man” to lowercase to have the string,

    "This is the man in charge."

as in the following code:

use strict;

    my $str = "This is the \lMan in charge.";
    print $str;

Because of the \l before M, the output is,

    This is the man in charge.

The lcfirst Function
This function changes the first character of a string to lower case. So the code segment,

    my $str = "This phone text msg is 2 tell yu that I will not do yr thing again.";
    my $strl = lcfirst($str);
    print $strl;


will output,

    this phone text msg is 2 tell yu that I will not do yr thing again.

Note how the function has been used in the code: the whole string is the argument to the function.

The \u Operator
The \u operator changes the next character in a string to uppercase. For example, assume that you have the string,

    "The queen of england."

You can change ‘e’ in “england” to uppercase to have the string,

    "The queen of England."

as in the following code:

use strict;

    my $str = "The queen of \uengland.";
    print $str;

Because of the \u before the e, the output is,

    The queen of England.

The ucfirst Function
This function changes the first character of a string to uppercase. So the code segment,

    my $str = ucfirst("this is a sentence.");
    print $str;

will output,

    This is a sentence.

Note how the function has been used in the code: the whole string is the argument to the function.

The \E Indicator
The \E indicator is an escape sequence that can be placed anywhere within a double quoted string. It marks the point at which a case-changing operator such as \L or \U stops acting: the operator changes the characters from its own position in the string up to the indicator. If there is no \E, the operator acts up to the end of the string.

The \L Operator
This operator changes all characters to lowercase, starting from the point where it is embedded in the string and continuing up to the \E indicator, or to the end of the string if there is no \E. Any character in the stretch that was already in lowercase remains in lowercase. Read and try the following code:

use strict;

    my $str = "this \LSENTENCE NEEDS \Ecorrection.";
    print $str;

Note that \L is placed in the string where you want the change to start. The output is:

    this sentence needs correction.

The lc Function
This function changes all the characters in a string to lowercase. The characters that were already in lowercase remain in lowercase. Try the following code:

use strict;
    my $str = lc("THIS IS HIS THING.");
    print $str;

The output is:

    this is his thing.

The \U Operator
This operator changes all characters to uppercase, starting from the point where it is embedded in the string and continuing up to the \E indicator, or to the end of the string if there is no \E. Any character in the stretch that was already in uppercase remains in uppercase. Read and try the following code:

use strict;

    my $str = "The countries: \Uusa, uk\E are important.";
    print $str;

Note that \U is placed in the string where you want the change to start. The output is:

    The countries: USA, UK are important.

The uc Function
This function changes all the characters in a string to uppercase. The characters that were already in uppercase remain in uppercase. Try the following code:

use strict;

    my $str = uc("usa, uno, unesco, ussr, uk");
    print $str;

The output is:

    USA, UNO, UNESCO, USSR, UK
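
The case functions can be nested, and each of them gives the same result as its escape-sequence counterpart. The following sketch shows one common combination, turning an all-uppercase sentence into normal sentence case; the variable names are only illustrative.

```perl
use strict;
use warnings;

my $shout = "THIS IS HIS THING.";

# Lowercase everything first, then capitalize the first character.
my $sentence = ucfirst(lc($shout));

# \U ... \E inside a string and uc() on the same text agree.
my $escaped  = "\Uusa, uk\E";
my $function = uc("usa, uk");

print $sentence, "\n";                                        # This is his thing.
print $escaped eq $function ? "same" : "different", "\n";    # same
```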

The oct Function
The oct function takes an octal number as argument and returns the decimal equivalent. Read and try the following code:

use strict;

    my $str = "013";
    my $ret = oct($str);
    print $ret, "\n";

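In the code, the octal number is 13 preceded by zero and the decimal equivalent output is 11. If you do not want to type the octal number within quotes, you have to omit the preceding zero. The reason is that Perl already reads an unquoted number with a leading zero as an octal literal (013 is 11), and oct would then convert the digits of that result a second time, giving 9. Read and try the following code: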

use strict;

    my $str = 13;
    my $ret = oct($str);
    print $ret, "\n";

If the oct function has no argument then the number in the special variable, $_ is used as the octal number.
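
The quoted and unquoted cases can be compared side by side, as in this sketch. It also shows that oct accepts strings with a 0x or 0b prefix, which it interprets as hexadecimal and binary respectively. The variable names are assumptions for the example.

```perl
use strict;
use warnings;

my $from_string  = oct("013");      # 11
my $no_zero      = oct("13");       # 11
my $hex_prefixed = oct("0x13");     # 19, read as hexadecimal
my $bin_prefixed = oct("0b101");    # 5, read as binary

# Unquoted 013 is already the number 11, so oct sees the digits "11".
my $double = oct(013);              # 9

print "$from_string $no_zero $hex_prefixed $bin_prefixed $double\n";
# 11 11 19 5 9
```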

The hex Function
The hex function takes a hexadecimal number as argument and returns the decimal equivalent. Read and try the following code:

use strict;

    my $str = "0x13";
    my $ret = hex($str);
    print $ret, "\n";

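In the code, the hexadecimal number is 13 preceded by zero and x, and the decimal equivalent output is 19. If you do not want to type the hexadecimal number within quotes, you have to omit the preceding 0x. The reason is that Perl already reads an unquoted 0x13 as the number 19, and hex would then treat the digits 19 as hexadecimal, giving 25. Note that omitting the quotes only works when the number has no letter digits (a to f); a value such as ff must be quoted. Read and try the following: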

use strict;

    my $str = 13;
    my $ret = hex($str);
    print $ret, "\n";

If the hex function has no argument then the number in the special variable, $_ is used as the hexadecimal number.
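
The following sketch compares the forms of the argument discussed above. The last line goes in the opposite direction with sprintf, a built-in function that is not covered in this part of the series; it is included only to show that the conversion can be reversed. The variable names are illustrative.

```perl
use strict;
use warnings;

my $with_prefix = hex("0x13");    # 19
my $without     = hex("13");      # 19
my $letters     = hex("ff");      # 255

# Unquoted 0x13 is already the number 19, so hex sees the digits "19".
my $double = hex(0x13);           # 25

# Going back from a number to its hexadecimal digits.
my $back = sprintf("%x", 19);     # 13

print "$with_prefix $without $letters $double $back\n";
# 19 19 255 25 13
```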

Well, we really must take a break here. We continue in the next part of the series.

Chrys

Related Links

Perl Basics
Perl Data Types
Perl Syntax
Perl References Optimized
Handling Files and Directories in Perl
Perl Function
Perl Package
Perl Object Oriented Programming
Perl Regular Expressions
Perl Operators
Perl Core Number Basics and Testing
Commonly Used Perl Predefined Functions
Line Oriented Operator and Here-doc
Handling Strings in Perl
Using Perl Arrays
Using Perl Hashes
Perl Multi-Dimensional Array
Date and Time in Perl
Perl Scoping
Namespace in Perl
Perl Eval Function
Writing a Perl Command Line Tool
Perl Insecurities and Prevention
Sending Email with Perl
Advanced Course
Miscellaneous Features in Perl
Perl Two-Dimensional Structures
Advanced Perl Regular Expressions
Designing and Using a Perl Module
More Related Links
Perl Mailsend
PurePerl MySQL API
Perl Course - Professional and Advanced
Major in Website Design
Web Development Course
Producing a Pure Perl Library
MySQL Course
