Broad Network


White Space in Perl

Perl Basics – Part 16

Perl Course

Foreword: In this part of the series, I explain what whitespace is and how it is used in Perl.

By: Chrysanthus
Date Published: 29 Mar 2015

Introduction

This is part 16 of my series, Perl Basics. In this part of the series, I explain what whitespace is and how it is used in Perl. When an ordinary person looks at a printed document or a web page, he or she may call the areas that have no text or pictures white space. These areas are really blank spaces; they do not have to be white in color. On a computer screen, these spaces are produced by characters, and a programmer has to be conscious of which character creates which type of white space. These characters and the kinds of white space they produce are listed and explained in this article. You should have read the previous parts of the series before reaching here, as this is a continuation.

Types of Characters
The ordinary person considers A as a character, B as another character, C as another character, 5 as a character, 7 as a character, and so on. On the computer keyboard, you also see less commonly used characters such as the asterisk, *. In the source code of many computer languages, a white space character is written with two items: a backslash followed by a letter. The two items together stand for a single white space character in the string. For example, \n is written this way in a good number of computer languages; it has been mentioned before. These two-item notations are better called Escape Sequences. They are also called Special Characters.
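
To make the point concrete, here is a minimal sketch showing that an escape sequence is two keystrokes in the source code but only one character in the string. The variable names are my own, chosen for illustration, and the numeric codes printed assume an ASCII-based system.

```perl
use strict;
use warnings;

# Each escape sequence is two keystrokes in the source code,
# but it stands for exactly one character in the string.
my $newline = "\n";
my $tab     = "\t";

# length() counts characters; ord() gives the character's numeric code.
printf "newline: length %d, code %d\n", length($newline), ord($newline);
printf "tab:     length %d, code %d\n", length($tab),     ord($tab);

# In single quotes the backslash is not interpreted,
# so '\n' really is two characters: a backslash and the letter n.
my $literal = '\n';
printf "single-quoted: length %d\n", length($literal);
```

Note that the escape sequences are only interpreted inside double quotes; inside single quotes you get the backslash and the letter as they are typed.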

The Horizontal Tab
When somebody writing with a pen on a piece of paper wants to start a new paragraph, he does not start at the left margin; he shifts a bit to the right (indents). That indentation can be considered a horizontal tab. There is an escape sequence that can be used to achieve this in Perl. It is \t: a backslash followed by ‘t’ in lower case. This special character is called the horizontal tab. Try the following code:

use strict;

print "\tAnd the sentence begins";

In the output, you should see a horizontal space in front of the text. The horizontal tab character can be placed anywhere in the string, and you can have more than one of them in a string. You can use the horizontal tab character (\t) to line up columns when printing text that is arranged as a table. However, there are better ways of formatting output.
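
As a rough illustration of the table idea, the following sketch joins the fields of each row with \t. The data and the variable names (@rows, $table) are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical sample data: each row is [name, score].
my @rows = (
    [ 'Name', 'Score' ],
    [ 'Jane', 92 ],
    [ 'John', 85 ],
);

# Join the fields of each row with a tab and end each row with a newline.
my $table = '';
for my $row (@rows) {
    $table .= join("\t", @$row) . "\n";
}

print $table;
```

The columns line up only as long as every field is shorter than the tab width of the console. That is one reason why tabs are not the best way of formatting a table; printf with field widths, for example, gives more control.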

The Space
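While typing in a text editor, when the computer user presses the spacebar of the keyboard, a space is created on the screen. In a Perl double-quoted string, you can also write this space character as \ , that is, a backslash followed by a space. Strictly speaking, this is not a dedicated escape sequence: in a double-quoted string, a backslash in front of any non-alphanumeric character simply produces that character. Try the following code: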

use strict;

print "\tAnd the winner\ is Jane";

The backslash is not printed. However, the space is printed. You can still achieve the same effect by just pressing the spacebar key.
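
If you want to convince yourself that the backslash makes no difference here, you can compare the two strings directly. This is only a small check; the variable names are my own.

```perl
use strict;
use warnings;

my $escaped = "winner\ is";   # backslash followed by a space
my $plain   = "winner is";    # ordinary space typed with the spacebar

# Both strings contain the same single space character.
print "The two strings are identical\n" if $escaped eq $plain;
printf "length %d vs %d\n", length($escaped), length($plain);
```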

Form Feed
A form feed is more of an instruction than a blank space character. It is called a white space character because it can cause a blank space. Imagine a document with about ten lines of text, and imagine that in the middle of this text you have the escape sequence \f, which is what many languages, Perl included, use for the form feed character. While the page is being printed, when the printer reaches this escape sequence, it should not print the rest of the text on the current page; it should advance the paper, leaving the remainder of the page blank, and then start printing the rest of the text on the next page. Form feed means: feed in the next page and print the rest of the text there. If the printer meets this character at the very end of the current page, then no blank space is produced, as the rest of the text would have gone to the next page anyway. On a screen, the effect depends on the terminal or program displaying the text; many do not start a new page at all.
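
Since a console may not show anything special for \f, one way to see the character at work is to treat it as a page separator in a string. The sketch below does that; the document text and the variable names are hypothetical.

```perl
use strict;
use warnings;

# A hypothetical two-page document: the form feed marks the page break.
my $document = "Line 1\nLine 2\n\fLine 3\nLine 4\n";

# Split the text at each form feed to get one string per page.
my @pages = split /\f/, $document;

for my $i (0 .. $#pages) {
    print "--- page ", $i + 1, " ---\n", $pages[$i];
}
```

Here the program itself plays the role of the printer: everything before the form feed goes to the first page and everything after it goes to the second.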

Line Terminators
Two escape sequences are described below as whitespace characters, but they do not really produce blank spaces. However, they affect where the next line or text would be printed or displayed.

The horizontal tab (\t), the space and the vertical tab are by themselves blank spaces. (The vertical tab is written \v in many languages, but \v is not a string escape in Perl; in a Perl string you can write it as \x0B.) The form feed character can produce a blank space depending on its position in the current page; in itself, it is more of an instruction than a blank space character. The two escape sequences below are not blank characters by themselves. They are actually line terminators, but in many forums they are called white space characters.

Carriage Return
Imagine that a line of text is to be displayed (printed) and there is the escape sequence \r in the middle of the line. \r is known as the Carriage Return character in many languages. When the printer or screen reaches this point, it sends the ink (or cursor) back to the beginning of the current line. If printing continues, the text that follows \r is written over the text already on that line. The carriage return escape sequence is normally used in conjunction with the Newline escape sequence (see below).
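
A common use of the overwriting effect is a counter that updates in place on the console. The following is a minimal sketch of that idea; the message text and the variable names are my own, and the exact appearance depends on your console.

```perl
use strict;
use warnings;

$| = 1;   # turn off output buffering so each update appears immediately

# Print a counter that overwrites itself: \r returns to the start of
# the line, and the next print writes over what was there.
my $last = '';
for my $percent (0, 50, 100) {
    $last = sprintf "\rProgress: %3d%%", $percent;
    print $last;
}
print "\n";   # finish by moving to the next line
```

In a real program there would be some work done between the updates; here the loop runs so fast that you will probably see only the final value.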

The Newline
Imagine that a line of text is to be displayed and there is the escape sequence \n in the middle of the line. \n is known as the Newline character in many languages. When the printer or screen reaches this point, it sends the ink (or cursor) to the next line. On a bare printer or terminal, a line feed by itself only moves down one line; it does not say whether printing should continue at the beginning, middle or end of that line. That is why \r and \n are traditionally used together (i.e. \r\n) when printing must continue at the beginning of the next line. With some languages (compilers or interpreters), \n alone serves the purpose of both \r and \n. Perl is one of them: \n is a logical newline, and when you print to a file or the console in text mode, Perl and the operating system produce whatever line ending the platform expects. All the escape sequences given in this tutorial should work with the console (monochrome text output).
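
Because text can arrive with either \n or \r\n at the end of each line, a program that reads it often has to accept both. Here is a small sketch of one way to do that with split; the sample text and the variable names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input that mixes the two common line endings.
my $text = "first\r\nsecond\nthird\n";

# Split on an optional carriage return followed by a newline.
my @lines = split /\r?\n/, $text;

print scalar(@lines), " lines\n";
print "[$_]\n" for @lines;
```

The square brackets in the output make it easy to see that no stray carriage return is left at the end of a line.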

Note
The escape sequences for the whitespace characters are not displayed as \t, \f, etc. The user sees only their effects. The space, horizontal tab and vertical tab characters can be considered as pure white space characters. The form feed and especially the line terminators can be considered as indirect white space characters.
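
Since the characters themselves are invisible, one way to tell them apart is to print their numeric codes. The sketch below does this for the characters discussed in this article. It assumes an ASCII-based system, the names in the list are my own labels, and the vertical tab is written as \x0B because Perl strings have no \v escape.

```perl
use strict;
use warnings;

# The characters discussed in this article, with a label for each.
# The vertical tab is written "\x0B" here because Perl strings
# have no \v escape.
my @whitespace = (
    [ 'space',           ' '    ],
    [ 'horizontal tab',  "\t"   ],
    [ 'vertical tab',    "\x0B" ],
    [ 'form feed',       "\f"   ],
    [ 'carriage return', "\r"   ],
    [ 'newline',         "\n"   ],
);

# Record and print the numeric code of each character.
my %code_for;
for my $entry (@whitespace) {
    my ($name, $char) = @$entry;
    $code_for{$name} = ord $char;
    printf "%-16s code %2d\n", $name, $code_for{$name};
}
```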

We can stop here for now. See you in the next part of the series.

Chrys
