Page 154 - Programming With Python 3

P. 154

An Introduction to STEM Programming with Python — 2019-09-03a Page 141
Chapter 12 — String Encoding

BIN 000 001 010 011 100 101 110 111
Free NUL DLE SP 0 @ Q a ` p
0
HEX
2
4
1
3
BIN
6
7
5
0000
P
0
!
q
0001
A
SOH
DC1
1
1
0010
B
S
3
ETX
C
c
0011 2 STX DC2 " # 2 D R b r s t
3
DC3
eBook % 5 E U e f u
4
T
d
DC4
$
EOT
0100
4
NAK
ENQ
5
0101
&
ACK
v
SYN
0110
F
6
6
V
0111
ETB
1000 7 BEL CAN ( ) ' 7 G W g i w
h
x
8
X
8
H
BS
Edition * : ; K Z [ k j z
9
Y
EM
9
I
y
1001
HT
J
LF
SUB
A
1010
VT
{
1011
B
ESC
+
1100
L
FF
l
=
1101 C CR FS , - < M \ ] m } |
GS
D
Please support this work at
N
n
^
.
~
>
1110
E
S0
RS
/
US
_
o
S1
O
?
F
1111
DEL
Table 9: ASCII Character Encoding Table
http://syw2l.org
Free
Unicode
In the late 1980s and early 1990s Xerox, Apple, Microsoft, and others begin working on a new way to
represent characters. The early idea was to widen the existing character set to 16 bits, to allow 65,536
code points. It was originally thought that this would cover the vast majority of characters in modern
languages. This technique was known as UCS-2 but was found to be too large and limiting. This
eBook
required creating a better and more flexible method for encoding characters, we call that UTF-8.
The Unicode Consortium is a collection of many of the largest companies in the tech world. Members
include: Apple, Oracle, IBM, Microsoft, Google and others. The Unicode specification is a living
document that is being revised on a regular basis. The Unicode 11.0 specification even defines code
points for 1644 emojis.
Edition
UTF-8
UTF-8 was initially specified in 1996, and by 2009 had become the dominant character encoding for
Copyright 2019 — James M. Reneau Ph.D. — http://www.syw2l.org — This work is licensed
under a Creative Commons Attribution-ShareAlike 4.0 International License.

149 150 151 152 153 154 155 156 157 158 159