Broad Network


URL Encoding

HTML Character Sets - Part 4

Foreword: In this part of my series, I talk about URL Encoding.

By: Chrysanthus Date Published: 31 Jul 2012

Introduction

This is part 4 of my series, HTML Character Sets. In this part of my series, I talk about URL Encoding.

Note: If you cannot see the code or if you think anything is missing (broken link, image absent, etc.), just contact me at forchatrans@yahoo.com. That is, contact me for the slightest problem you have about what you are reading.

Description
What is typed in the address bar of a browser is a URL. A URL can also be used as the value of the href attribute of an a element, and in a few other places in a web page. When you type an address in the address bar of the web browser and click Go, the URL is sent across the Internet.

A form data set is also sent across the Internet in a similar way. If the value of the form's method attribute is "get", then the form information is sent as part of a long URL, something like:

    http://fine-hosting.com?firstname=Juan+Mary&lastname=Jones

This URL sends the first name, “Juan Mary” and last name, “Jones” of a woman through the Internet to a server.
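
As a rough illustration of how such a URL comes about, here is a minimal Python sketch that builds the same query string with the standard urllib.parse module. The dictionary name form_data is my own choice for this example; the field names and values come from the URL above.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Form fields taken from the example URL above (form_data is an illustrative name).
form_data = {"firstname": "Juan Mary", "lastname": "Jones"}

# urlencode() joins name=value pairs with & and, by default, turns each space into +.
query = urlencode(form_data)
url = "http://fine-hosting.com?" + query
print(url)

# On the receiving side the process is reversed; parse_qs() turns + back into a space.
decoded = parse_qs(urlsplit(url).query)
print(decoded)
```

Note that parse_qs() returns a list for each field, because a form may send the same field name more than once.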

A URL is sent using only characters from the ASCII character set; in form data, any space is replaced by a + sign. All the characters in the above URL are in the ASCII character set. There are two problems. First, characters such as :, /, ?, = and & have special meanings in a URL; if these characters occur in the data area (e.g. a name) of the URL, they have to be coded. Second, characters that are not in the ASCII character set may also have to be sent within the URL. Because of these two problems there is URL encoding, also called percent-encoding. It is not really a separate character set; it is an encoding scheme that represents a character by the hexadecimal value of its byte, written with ASCII characters.
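
The two problems can be seen in a short Python sketch using quote_plus() and quote() from the standard urllib.parse module. The sample strings are my own, chosen only for illustration. The second half assumes the single-byte Latin-1 encoding, in which each non-ASCII character becomes one % code; with UTF-8, which is Python's default here, the same character becomes two codes.

```python
from urllib.parse import quote, quote_plus

# Problem 1: a data value containing characters that have special meaning in a URL.
value = "Q&A: what/why?"
encoded_value = quote_plus(value)   # quote_plus() also turns the space into +
print(encoded_value)

# Problem 2: a data value containing a character outside the ASCII character set.
name = "José"
single_byte = quote(name, encoding="latin-1")  # one byte per character
utf8 = quote(name)                             # UTF-8 is the default encoding
print(single_byte)
print(utf8)
```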

In URL encoding, a space is %20 (or + in form data); ? is %3F; a is %61; b is %62; c is %63. The special characters of the URL and the non-ASCII characters are called unsafe characters. In a URL, unsafe characters must be coded; the ordinary ASCII letters and digits can remain un-coded. A character code begins with % followed by 2 hexadecimal digits. The codes above %7F in the first table below are the single-byte values of the Windows-1252 character set; a page encoded in UTF-8 sends such a character as two or more % codes. The following tables list the characters and their URL encodings:
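
The rule "% followed by 2 hexadecimal digits" can be sketched in a few lines of Python. The function name percent_encode_char is hypothetical, made up for this example; it handles only characters whose code fits in one byte.

```python
from urllib.parse import unquote

def percent_encode_char(ch: str) -> str:
    """Return % followed by the two-digit hexadecimal code of a single-byte character."""
    code = ord(ch)
    if code > 0xFF:
        raise ValueError("character does not fit in one byte")
    return "%{:02X}".format(code)

# The examples from the paragraph above.
for ch in ["?", "a", "b", "c", " "]:
    print(repr(ch), percent_encode_char(ch))

# Decoding goes the other way: each % code is turned back into its character.
print(unquote("%61%62%63"))
```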

URL Encoding Reference for Printable Characters

Character URL-encoding
space %20
! %21
" %22
# %23
$ %24
% %25
& %26
' %27
( %28
) %29
* %2A
+ %2B
, %2C
- %2D
. %2E
/ %2F
0 %30
1 %31
2 %32
3 %33
4 %34
5 %35
6 %36
7 %37
8 %38
9 %39
: %3A
; %3B
< %3C
= %3D
> %3E
? %3F
@ %40
A %41
B %42
C %43
D %44
E %45
F %46
G %47
H %48
I %49
J %4A
K %4B
L %4C
M %4D
N %4E
O %4F
P %50
Q %51
R %52
S %53
T %54
U %55
V %56
W %57
X %58
Y %59
Z %5A
[ %5B
\ %5C
] %5D
^ %5E
_ %5F
` %60
a %61
b %62
c %63
d %64
e %65
f %66
g %67
h %68
i %69
j %6A
k %6B
l %6C
m %6D
n %6E
o %6F
p %70
q %71
r %72
s %73
t %74
u %75
v %76
w %77
x %78
y %79
z %7A
{ %7B
| %7C
} %7D
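~ %7E
DEL %7F
€ %80
(unused) %81
‚ %82
ƒ %83
„ %84
… %85
† %86
‡ %87
ˆ %88
‰ %89
Š %8A
‹ %8B
Œ %8C
(unused) %8D
Ž %8E
(unused) %8F
(unused) %90
‘ %91
’ %92
“ %93
” %94
• %95
– %96
— %97
˜ %98
™ %99
š %9A
› %9B
œ %9C
(unused) %9D
ž %9E
Ÿ %9F
(no-break space) %A0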
¡ %A1
¢ %A2
£ %A3
¤ %A4
¥ %A5
¦ %A6
§ %A7
¨ %A8
© %A9
ª %AA
« %AB
¬ %AC
(soft hyphen) %AD
® %AE
¯ %AF
° %B0
± %B1
² %B2
³ %B3
´ %B4
µ %B5
¶ %B6
· %B7
¸ %B8
¹ %B9
º %BA
» %BB
¼ %BC
½ %BD
¾ %BE
¿ %BF
À %C0
Á %C1
 %C2
à %C3
Ä %C4
Å %C5
Æ %C6
Ç %C7
È %C8
É %C9
Ê %CA
Ë %CB
Ì %CC
Í %CD
Î %CE
Ï %CF
Ð %D0
Ñ %D1
Ò %D2
Ó %D3
Ô %D4
Õ %D5
Ö %D6
× %D7
Ø %D8
Ù %D9
Ú %DA
Û %DB
Ü %DC
Ý %DD
Þ %DE
ß %DF
à %E0
á %E1
â %E2
ã %E3
ä %E4
å %E5
æ %E6
ç %E7
è %E8
é %E9
ê %EA
ë %EB
ì %EC
í %ED
î %EE
ï %EF
ð %F0
ñ %F1
ò %F2
ó %F3
ô %F4
õ %F5
ö %F6
÷ %F7
ø %F8
ù %F9
ú %FA
û %FB
ü %FC
ý %FD
þ %FE
ÿ %FF
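
Rows like those in the table above can be produced by a short Python sketch. It assumes the Windows-1252 (cp1252) character set for the byte values above %7F; the function name table_row and the "(unused)" label are my own choices for this illustration.

```python
def table_row(byte_value: int) -> str:
    """Build one 'character code' row, assuming the Windows-1252 character set."""
    try:
        ch = bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        # A few byte values have no character assigned in Windows-1252.
        ch = "(unused)"
    return "{} %{:02X}".format(ch, byte_value)

# A few sample rows: a letter, two accented characters and an unassigned value.
for value in (0x41, 0xE9, 0xFF, 0x81):
    print(table_row(value))
```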

URL Encoding Reference for Control Characters

ASCII Character Description URL-encoding
NUL null character %00
SOH start of header %01
STX start of text %02
ETX end of text %03
EOT end of transmission %04
ENQ enquiry %05
ACK acknowledge %06
BEL bell (ring) %07
BS backspace %08
HT horizontal tab %09
LF line feed %0A
VT vertical tab %0B
FF form feed %0C
CR carriage return %0D
SO shift out %0E
SI shift in %0F
DLE data link escape %10
DC1 device control 1 %11
DC2 device control 2 %12
DC3 device control 3 %13
DC4 device control 4 %14
NAK negative acknowledge %15
SYN synchronize %16
ETB end transmission block %17
CAN cancel %18
EM end of medium %19
SUB substitute %1A
ESC escape %1B
FS file separator %1C
GS group separator %1D
RS record separator %1E
US unit separator %1F
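
Control characters are coded in the same way as any other unsafe character. The following Python sketch, with a sample string of my own, shows a carriage return, a line feed and a horizontal tab being coded by quote() from the standard urllib.parse module and then restored by unquote().

```python
from urllib.parse import quote, unquote

# Sample text containing CR, LF and HT control characters (illustrative only).
text = "line one\r\nline two\tend"

encoded = quote(text)      # control characters and spaces become % codes
print(encoded)

restored = unquote(encoded)
print(restored == text)
```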

That is it for this part of the series. We stop here and continue in the next part.

Chrys
