Broad Network


Converting Bytes to Bit Characters and Vice Versa with Perl

Perl pack and unpack Functions – Part 4


Foreword: In this part of the series, I explain how to convert bytes to bit characters and vice-versa with Perl.

By: Chrysanthus Date Published: 27 Jan 2015

Introduction

This is part 4 of my series, Perl pack and unpack Functions. In this part of the series, I explain how to convert bytes to bit characters and vice-versa with Perl. You should have read the previous parts of the series before reaching here, as this is a continuation.

Template
Everything discussed about the template in the first part of the series applies here, but instead of A, you use B or b. With the B or b meta characters, consecutive bytes (characters) in memory are not only separated (identified) or joined; they are also converted to bit characters and vice-versa. For example, the byte 01001110 would be converted to the characters '0''1''0''0''1''1''1''0', and the characters '0''1''0''0''1''1''1''0' would be converted to the byte 01001110. Note that '0''1''0''0''1''1''1''0' is a string of characters, while 01001110 is a string of bits. In code, '0''1''0''0''1''1''1''0' would be typed as "01001110".
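As a minimal sketch of this two-way conversion, the byte 01001110 happens to be the ASCII letter 'N', so it can be typed directly. The variable names here are only illustrative.

```perl
use strict;
use warnings;

# The byte 01001110 is the ASCII character 'N' (hexadecimal 4e).
my $byte = "N";

# Byte to bit characters: one byte becomes a string of 8 characters.
my $bit_chars = unpack('B8', $byte);

# Bit characters back to a byte.
my $restored = pack('B8', $bit_chars);

print $bit_chars, "\n";   # 01001110
print $restored, "\n";    # N
```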

In the conversion from byte to bit characters, the bytes in memory are not changed. Yes, the bytes are converted into bit characters, but these resulting bit characters are stored elsewhere in memory. The bit characters are displayed from the new storage area, leaving the original bytes unchanged.

Now, a string of bit characters, displayed and stored somewhere in memory, can be converted into bytes and stored elsewhere in memory. Note: each bit of a byte is displayed as a character, which is stored somewhere as a byte in its own right; and conversion between a bit of a byte and a bit character can be made.
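The point that the bit characters occupy their own storage can be checked with length(). This is a small sketch with illustrative variable names: the original string is still one byte long after unpacking, while the result is 8 characters long.

```perl
use strict;
use warnings;

my $original  = "I";
my $bit_chars = unpack('B8', $original);

print length($original), "\n";    # 1  (one byte, untouched)
print length($bit_chars), "\n";   # 8  (one character per bit)
print $original, "\n";            # I
```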

Bytes to Bits

Converting Bytes to Bit Characters
I start with an example. Consider the following program:

    use strict;

    my $str_B = "I love";
    # 'B48' asks for 48 bit characters, most significant bit of each byte first
    my $str_Bits = unpack('B48', $str_B);

    print $str_Bits;

The output is:

    010010010010000001101100011011110111011001100101

In hexadecimal characters, the output would have been 49206c6f7665, where 49 is for 01001001, 20 is for 00100000, 6c is for 01101100, etc.
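The hexadecimal form can be obtained with the H meta character, which is a standard unpack template character but is not the subject of this part of the series. The sketch below is only meant to show that every hexadecimal digit stands for 4 bit characters.

```perl
use strict;
use warnings;

my $str_B = "I love";

my $hex  = unpack('H*', $str_B);   # hexadecimal digits, high nybble first
my $bits = unpack('B*', $str_B);   # bit characters, MSB first

print $hex, "\n";                  # 49206c6f7665
print $bits, "\n";

# Each hexadecimal digit corresponds to 4 bit characters.
print length($bits) / length($hex), "\n";   # 4
```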

Now, in the variable, $str_B there are 6 characters including the space character. Each of these characters is stored as a byte in memory. To convert consecutive bytes to displayable bit characters, use the unpack function. Each byte results in 8 bit characters, so for the 6 characters, the template has 48 after B (from 6 X 8).
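The count after B does not have to be typed by hand. As a sketch, assuming the string holds one byte per character (as it does here), the count can be computed from length() and the template built as an ordinary string. The names $count and $template are illustrative.

```perl
use strict;
use warnings;

my $str_B = "I love";

my $count    = 8 * length($str_B);   # 6 bytes x 8 bits = 48
my $template = "B$count";            # "B48"

my $str_Bits = unpack($template, $str_B);

print $template, "\n";   # B48
print $str_Bits, "\n";   # 010010010010000001101100011011110111011001100101
```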

Note that for the output, the space byte has been displayed correctly, as 00100000, which is 20 in hexadecimal. ‘I’ has been displayed in bit characters as 01001001. Lowercase ‘l’ has been displayed in bit characters as 01101100. Do not confuse the bit character equivalent of a byte with the actual bits themselves. The converted bit characters are stored elsewhere in the memory and the original bytes of $str_B are not touched.

When you do not know the number of characters in the variable or in the memory region, use the * in the template, which means everything left, instead of a number like 48. The following statement illustrates this:

    my $str_Bits = unpack('B*', $str_B);

If the template is just 'B', only one bit of a byte will be unpacked to give one character. B means one bit character. B and B1 mean the same thing, but any other number means that number of bit characters.
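The effect of the count can be seen by unpacking the same string with different templates. This is a short sketch; the variable names are illustrative.

```perl
use strict;
use warnings;

my $str_B = "I love";

my $one       = unpack('B',  $str_B);   # first bit of 'I' only
my $one_again = unpack('B1', $str_B);   # same as 'B'
my $three     = unpack('B3', $str_B);   # first three bits of 'I'
my $all       = unpack('B*', $str_B);   # every bit of every byte

print $one, "\n";           # 0
print $one_again, "\n";     # 0
print $three, "\n";         # 010
print length($all), "\n";   # 48
```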

The unpack function takes a variable as input and can return a list of values. The pack function can take a list as input and will return one value. So, the template can be modified to something like,

    'B8B8B32'

to return three values: one for B8, one for the second B8 and one for B32. The following code illustrates this:

    use strict;

    my $str_B = "I love";
    # Three template items, so unpack returns a list of three strings
    my ($str_Bits0, $str_Bits1, $str_Bits2) = unpack('B8B8B32', $str_B);

    print $str_Bits0, "\n", $str_Bits1, "\n", $str_Bits2;

The output is:

    01001001
    00100000
    01101100011011110111011001100101

where 01001001 is for ‘I’, 00100000 is for a single space and 01101100011011110111011001100101 is for the characters, 'l''o''v''e'.
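When one value per byte is wanted and the number of bytes is not known, the template can use a repeated group, (B8)*, which is standard pack/unpack template syntax though not otherwise covered in this part. A minimal sketch, with an illustrative array name:

```perl
use strict;
use warnings;

my $str_B = "I love";

# (B8)* repeats the B8 item for as long as there are bytes left,
# so the list has one 8-character string per byte.
my @byte_strings = unpack('(B8)*', $str_B);

print scalar(@byte_strings), "\n";      # 6
print join("\n", @byte_strings), "\n";
```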

Least Significant Bit First
The hexadecimal code for the letter ‘I’ is 49. The byte for the letter ‘I’ is 01001001. For this byte, it is the Most Significant Bit (MSB) that has been written first (on the left). The order of the bits in the byte can be reversed so that the Least Significant Bit (LSB) comes first. With LSB first, the byte is written 10010010. Perl gives you the choice of displaying MSB first or LSB first for a byte. For MSB first, you use B in the template, and for LSB first, you use b. The meta characters of the template are case sensitive.
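For a single byte, the relationship between the two meta characters can be sketched as follows: the b8 result is the B8 result read backwards. The variable names are illustrative.

```perl
use strict;
use warnings;

my $msb_first = unpack('B8', 'I');
my $lsb_first = unpack('b8', 'I');

# reverse in scalar context reverses the characters of a string.
my $reversed = reverse $msb_first;

print $msb_first, "\n";   # 01001001
print $lsb_first, "\n";   # 10010010
print $reversed eq $lsb_first ? "same\n" : "different\n";   # same
```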

The following code displays the bit characters of the bytes of the string, "I love". The bytes are displayed in the order in which the letter characters have been typed in the string, but each byte is displayed with LSB first. The second code segment repeats the earlier MSB-first example, so that you can compare the two outputs.

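    use strict;

    # First segment: b gives the least significant bit of each byte first
    my $str_B = "I love";
    my $str_Bits = unpack('b48', $str_B);
    print $str_Bits, "\n";

    # Second segment: B gives the most significant bit of each byte first
    my $str_BMSB = "I love";
    my $str_BitsMSB = unpack('B48', $str_BMSB);
    print $str_BitsMSB;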

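The output, shown here with a space inserted after every 8 characters for readability (the program itself prints each line without spaces), is:

    10010010 00000100 00110110 11110110 01101110 10100110
    01001001 00100000 01101100 01101111 01110110 01100101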

So, if you want LSB first for each byte, use b in the template instead of B. The byte order for the string remains the same but the bit order within each byte is reversed.
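A byte-by-byte display with a space between the groups can be produced by unpacking one value per byte and joining the list. This is a sketch that uses the repeated-group template syntax, (b8)* and (B8)*; the variable names are illustrative.

```perl
use strict;
use warnings;

my $str_B = "I love";

# One 8-character string per byte, then joined with single spaces.
my $lsb_line = join ' ', unpack('(b8)*', $str_B);
my $msb_line = join ' ', unpack('(B8)*', $str_B);

print $lsb_line, "\n";   # 10010010 00000100 00110110 11110110 01101110 10100110
print $msb_line, "\n";   # 01001001 00100000 01101100 01101111 01110110 01100101
```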

Bits to Bytes

Conversion from Bit Characters to Bytes
I start with an example. Consider the following program:

    use strict;

    my $str_Bits = "010010010010000001101100011011110111011001100101";
    # 'B48' packs 48 bit characters (MSB first) into 6 bytes
    my $str_B = pack('B48', $str_Bits);

    print $str_B;

The output is:

    I love

Now, in the variable, $str_Bits there are 48 characters with 8 consecutive characters for the space. Each of these characters is stored as a byte in memory, but that is not the focus here. To convert consecutive bit characters to bytes (the focus), use the pack function. This function contracts 8 consecutive bit characters to 8 bits forming a byte. Each byte is from 8 bit characters, so you have the number, 48 after B. The 48 bit characters will result in 6 bytes (48 / 8). A byte is printed as a character.
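The arithmetic above (48 bit characters giving 6 bytes) can be checked with length(). In this sketch the count for the template is taken from the length of the bit string rather than typed by hand.

```perl
use strict;
use warnings;

my $str_Bits = "010010010010000001101100011011110111011001100101";

# Build the template from the number of bit characters: "B48".
my $str_B = pack('B' . length($str_Bits), $str_Bits);

print length($str_Bits), "\n";   # 48 bit characters
print length($str_B), "\n";      # 6 bytes (48 / 8)
print $str_B, "\n";              # I love
```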

Note that for the output, the space byte has been displayed correctly, from 00100000, which is 20 in hexadecimal. ‘I’ has been displayed from 01001001. Lowercase ‘l’ has been displayed from 01101100; all characters displayed correctly. Do not confuse the bit character equivalent of a byte with the actual bits themselves. The bit characters, which are not the focus, are stored elsewhere in the memory while the bytes of interest are stored in memory as the value of the variable, $str_B.

When you do not know the number of characters in the variable, use * in the template, which means everything left, instead of a number like 48. The following statement illustrates this:

    my $str_B = pack('B*', $str_Bits);

If the template is just 'B', only one bit character is packed. Because the result of pack is always a whole number of bytes, that single bit becomes the most significant bit of one byte and the remaining 7 bits are filled with zeros; this is rarely what you want, since the computer works in bytes (groups of 8 bits) and hardly in bits. B means one bit character. B and B1 mean the same thing, but any other number means that number of bit characters.
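What pack does with fewer than 8 bit characters can be sketched as follows: the result is still a whole byte, with the missing low-order bits set to zero. Unpacking the result with B8 makes the padding visible. The variable names are illustrative.

```perl
use strict;
use warnings;

my $one_bit   = pack('B',  '1');      # one bit character
my $four_bits = pack('B4', '0100');   # four bit characters

print length($one_bit), "\n";            # 1  (still a whole byte)
print unpack('B8', $one_bit), "\n";      # 10000000
print unpack('B8', $four_bits), "\n";    # 01000000
print $four_bits, "\n";                  # @  (hexadecimal 40)
```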

The unpack function takes a variable as input and can return a list of values. The pack function can take a list as input and will return one value. So, the template can be modified to something like,

    'B8B8B32'

to join and convert three variables into one. The following code illustrates this:

    use strict;

    my $str_Bits0 = "01001001";
    my $str_Bits1 = "00100000";
    my $str_Bits2 = "01101100011011110111011001100101";
    # One list element is consumed per template item: B8, B8, then B32
    my $str_B = pack('B8B8B32', $str_Bits0, $str_Bits1, $str_Bits2);

    print $str_B;

The output is:

    I love

where 01001001 is for ‘I’, 00100000 is for a single space and 01101100011011110111011001100101 is for the characters, 'l''o''v''e'.
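When the bit strings are held in an array, one per byte, a repeated group in the template, (B8)*, packs however many there are. This is a sketch using standard template group syntax; the array name is illustrative.

```perl
use strict;
use warnings;

# One 8-character bit string per byte of "I love".
my @byte_strings = (
    '01001001',   # I
    '00100000',   # space
    '01101100',   # l
    '01101111',   # o
    '01110110',   # v
    '01100101',   # e
);

# (B8)* applies B8 to each element of the list in turn.
my $str_B = pack('(B8)*', @byte_strings);

print $str_B, "\n";   # I love
```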

Least Significant Bit First
B is for MSB and b is for LSB. The b template meta character can be used for the same purpose with the pack function.

The following code displays "I love", but from a sequence of bit characters in which each group of 8 characters is typed least significant bit first. The letters are displayed in the order in which the bytes (as groups of bit characters) were typed. The second code segment repeats the earlier MSB-first example, so that you can compare the two outputs.

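    use strict;

    # First segment: each group of 8 bit characters is typed LSB first, so use b
    my $str_Bits = "100100100000010000110110111101100110111010100110";
    my $str_B = pack('b48', $str_Bits);
    print $str_B, "\n";

    # Second segment: each group of 8 bit characters is typed MSB first, so use B
    my $str_BitsMSB = "010010010010000001101100011011110111011001100101";
    my $str_BMSB = pack('B48', $str_BitsMSB);
    print $str_BMSB;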

The output is:

    I love
    I love

So, if each 8-character group of the bit string is written LSB first, use b in the template instead of B. The byte order for the string remains the same but the bit order within each byte is reversed. In this code, the bit order is reversed for the first code segment.
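The meta character used to pack must match the one used to unpack. As a sketch, a round trip with the same meta character returns the original string, while mixing b and B produces bytes whose bits are reversed. The hexadecimal display uses the H meta character, which is not covered in this part; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "I love";

# Same meta character both ways: the original string comes back.
my $round_trip = pack('b*', unpack('b*', $text));

# Unpacked LSB first but packed MSB first: every byte is bit-reversed.
my $mixed = pack('B*', unpack('b*', $text));

print $round_trip, "\n";            # I love
print unpack('H*', $text), "\n";    # 49206c6f7665
print unpack('H*', $mixed), "\n";   # 920436f66ea6
```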

That is it for this part of the series. Let us take a break here and continue in the next part.

Chrys

