ASCII
ASCII holds a mapping between numbers and human-readable characters.
Do you remember the Index term from the previous post? Now it’s time to show you a real-world example of how useful indexing can be.
Smart Usage of Numbers
At last, it’s time for some real action. Imagine two people, Anna and Mark, who have been given an important task:
to send a text message over a network. Since this involves communication between two ends, Anna and Mark are on one
end, while Joe and Jane are on the other. But how do they send the word HELLO? Don’t computers only speak in numbers?
They do. But we humans need something more readable than that.
This problem was not solved by one vendor inventing a private table and calling it a day. Bob Bemer helped push the work in 1961, and the standardization effort led to ASCII (American Standard Code for Information Interchange) in the early 1960s. In ASCII, each numeric Index is mapped to a character that is easily understandable by humans. This system was a huge step forward in standardizing data processing and communication between computers from different vendors. Let’s explore it further.
Take a look at the ASCII table below. It contains the letters of the English alphabet, digits, punctuation, and a set of control codes. In the early 1960s, when computing was still in its infancy, this was more than enough, and it gave computers from different vendors a unified way to store and process text.
Please note that the first element in ASCII has an Index/Position equal to 0. The printable part does not start
there, though. Printable ASCII starts with space at index 32, then ! at 33, and ends with ~ at 126. The values
from 0 to 31 are control codes, and 127 is DEL. They are not normal printable characters.
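The ranges above can be checked with a few lines of Python. This is only a sketch: the `classify` helper is a name I made up for illustration, and it relies on Python's built-in `chr`, which agrees with ASCII for values 0 through 127.

```python
def classify(code: int) -> str:
    """Return the ASCII category for a numeric code (0 through 127)."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII only defines values 0 through 127")
    # 0..31 are control codes, and 127 is DEL
    if code <= 31 or code == 127:
        return "control"
    # Everything from 32 (space) to 126 (~) is printable
    return "printable"


for code in (0, 31, 32, 33, 126, 127):
    category = classify(code)
    # Only show the character itself when it is printable
    shown = repr(chr(code)) if category == "printable" else "(no glyph)"
    print(code, category, shown)
```

Running it shows that 32 is the space character, 33 is `!`, and 126 is `~`, while 0, 31, and 127 fall into the control group.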
Helping Anna and Mark
Now, the guys need our help. Can you guess how we can send this message through the wire? I’m sure you can, it’s no
big deal for you. What you need to do is look up the four distinct characters H, E, L, and O in the ASCII table.
Don’t be confused by the different numbering systems in the ASCII table. They’re there to make your life easier. Have
you noticed how often I mention making life easy? Yes, programmers are lazy by nature, and they don’t like repetitive
tasks. That’s why many genius solutions exist.
Back to our message. If you look up the ASCII table, at Index = 72, you’ll get the letter H, which is the first
letter of the message you’re about to send. Let’s find the rest of the numbers that map to the message.
| Character | Index | BIN |
|---|---|---|
| H | 72 | 01001000 |
| E | 69 | 01000101 |
| L | 76 | 01001100 |
| L | 76 | 01001100 |
| O | 79 | 01001111 |
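If you would rather not look up every letter by hand, the same lookup can be sketched in Python. I am assuming here that the built-in `ord` is an acceptable stand-in for the ASCII table, which holds for plain English letters.

```python
message = "HELLO"

# For each character: the ASCII index and its 8-digit binary form
rows = [(ch, ord(ch), format(ord(ch), "08b")) for ch in message]

for ch, index, bits in rows:
    print(f"{ch} | {index} | {bits}")
```

The first line printed is `H | 72 | 01001000`, matching the first row of the table.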
At last, the time has come for some fun. Below is an ASCII Messenger, an interactive transmitter and receiver. To send a character, just type the corresponding number and press the Send button on the transmitter panel. When done, it will appear on the receiver’s display. Try to help Anna and Mark transmit the message.
It was fun, right? Great job helping the guys. Now you have an idea of how we can send messages that are understandable by both humans and computers.
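What happens on the receiving side can be sketched in a few lines. This is not the code behind the ASCII Messenger, just my assumption of the simplest possible receiver; the `receive` function is a hypothetical name.

```python
def receive(codes):
    """Turn a sequence of ASCII indexes back into text."""
    chars = []
    for code in codes:
        if not 0 <= code <= 127:
            raise ValueError(f"{code} is not an ASCII index")
        # chr() maps the number back to its character
        chars.append(chr(code))
    return "".join(chars)


sent = [72, 69, 76, 76, 79]   # what Anna and Mark transmit
received = receive(sent)      # what Joe and Jane see
print(received)               # HELLO
```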
7 Bits of Surprise
Have you noticed that the largest ASCII value is 127? That number is the largest value you can represent with seven bits. That is why ASCII is called a 7-bit character encoding.
Modern systems usually store ASCII text in 8-bit bytes anyway. The extra bit was historically used in different ways, for example as a parity bit or by later “extended ASCII” encodings. So no, we do not magically save one bit in every modern byte. The important lesson is simpler: ASCII only defines 128 values.
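The arithmetic behind the 7-bit limit is easy to verify. In this small sketch, the variable names are mine, and the check only looks at our HELLO message.

```python
# Seven bits give 2**7 = 128 combinations, numbered 0 through 127
max_7bit = 2**7 - 1
print(max_7bit, format(max_7bit, "07b"))   # 127 1111111

# Stored in 8-bit bytes, every ASCII character leaves the top bit at 0
encoded = "HELLO".encode("ascii")
high_bit_clear = all(byte & 0b10000000 == 0 for byte in encoded)
print(high_bit_clear)                      # True
```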
More on Bytes
Our message in DEC is a sequence of five numbers: 72, 69, 76, 76, 79. Each number fits inside one byte, which has 8 bits. The order in which they are presented reflects the order of the letters in the message.
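Python has a `bytes` type that fits this picture well: one item per byte, kept in order. A short sketch, with `payload` being a name chosen just for this example:

```python
payload = bytes([72, 69, 76, 76, 79])

print(payload)         # b'HELLO'
print(len(payload))    # 5, one byte per letter

# Order matters: the same bytes reversed spell a different message
print(payload[::-1])   # b'OLLEH'
```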
Bigger Numbers
So far, you’ve learned how to process values ranging from 0 to 127, but the needs are growing. We would like to map diacritics and letters from non-English alphabets so that the rest of the world can accept it as a standard. For this, we need more than 128 positions. But how do we communicate with older machines that only support the ASCII table?
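To see the limit in practice, here is a sketch using the letter é. One assumption to keep in mind: Python's `ord` returns a Unicode code point, not an ASCII index. The two agree from 0 to 127, and anything above 127 simply has no place in ASCII.

```python
letter = "\u00e9"          # the letter é, written as an escape

code_point = ord(letter)   # number Python assigns to this letter
print(code_point)          # 233, which is above 127

# Asking for a strict ASCII encoding fails for such letters
try:
    letter.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

print(fits_in_ascii)       # False
```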
Words
If more space is needed for representation, one common trick is to use more than one byte. A group of bytes can represent a larger number than a single byte can. Different processors and protocols use different terms for such groups, so do not treat WORD as one universal size. In this lesson, we only need the idea that multiple bytes can represent one larger value.
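Here is a minimal sketch of how two bytes combine into one larger value. The bytes 0x12 and 0x34 are example values picked for illustration, and treating the first one as the "high" byte is an assumption of this sketch.

```python
high, low = 0x12, 0x34

# Shift the high byte left by 8 bits, then put the low byte in the gap
combined = (high << 8) | low
print(combined, hex(combined))   # 4660 0x1234

# One byte tops out at 255, two bytes together reach 65535
print(2**8 - 1, 2**16 - 1)       # 255 65535
```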
From now on, next to DEC and BIN, I’ll also use HEX notation, which is great for dealing with bytes. Below is a table representing the same values with three numbering systems. You’re already familiar with these.
| BIN | HEX | DEC |
|---|---|---|
| 01001000 | 0x48 | 72 |
| 01000101 | 0x45 | 69 |
| 01001100 | 0x4C | 76 |
| 01001100 | 0x4C | 76 |
| 01001111 | 0x4F | 79 |
HEX does not let us store bigger numbers than DEC or BIN. It is just a more compact way to write the same values. One HEX digit stands for exactly four bits, so any byte fits in two HEX digits, from 0x00 to 0xFF. Take a look at the HEX values in the table. They are placed in the same order as their BIN equivalents, so reading them is straightforward.
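The three notations can be produced from the same numbers, which shows they are only different spellings of one value. A sketch using Python's built-in formatting:

```python
codes = [72, 69, 76, 76, 79]

# Same number, three notations: 8-digit BIN, 2-digit HEX, DEC
table = [(format(n, "08b"), f"0x{n:02X}", n) for n in codes]
for bin_text, hex_text, dec in table:
    print(f"| {bin_text} | {hex_text} | {dec} |")

# Each HEX digit covers four bits: 0x4C is 0100 (4) and 1100 (C = 12)
high_digit, low_digit = 0x4C >> 4, 0x4C & 0xF
print(high_digit, low_digit)   # 4 12

# Reading HEX text back gives the original number
print(int("4C", 16))           # 76
```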
When a larger value uses multiple bytes, the order in which those bytes are stored or sent may vary. This ordering is called Endianness.
Endianness
The ordering of bytes in a multi-byte value is called Endianness. The term was introduced by computer scientist Danny Cohen in 1980. Take the two-byte value 0x1234 as an example. It contains two bytes: 0x12, the Most Significant Byte (MSB), and 0x34, the Least Significant Byte (LSB). If the byte order is Big Endian, the MSB 0x12 is the first element of the sequence.
On the other hand, if Little Endian ordering is used, the LSB 0x34 comes first, and the MSB 0x12 goes to the last position in the sequence. The value itself does not change, and neither does the significance of its bytes. Only their order differs.
| Ordering | First byte | Second byte |
|---|---|---|
| Big Endian | 0x12 (MSB) | 0x34 (LSB) |
| Little Endian | 0x34 (LSB) | 0x12 (MSB) |
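Python can lay out the same value in both orders, which makes the difference easy to see. This sketch uses the standard `int.to_bytes` and `int.from_bytes` methods, with a fixed size of two bytes chosen for this example.

```python
value = 0x1234

big = value.to_bytes(2, byteorder="big")
little = value.to_bytes(2, byteorder="little")

print([hex(b) for b in big])      # ['0x12', '0x34']
print([hex(b) for b in little])   # ['0x34', '0x12']

# Reading the bytes back with the matching order restores the value
restored = int.from_bytes(little, byteorder="little")
print(hex(restored))              # 0x1234

# Reading them with the wrong order gives a different number
mixed_up = int.from_bytes(little, byteorder="big")
print(hex(mixed_up))              # 0x3412
```

The last two lines are the practical lesson: both ends of a connection have to agree on the byte order, or the receiver ends up with a different number.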
Summary
Now you know that ASCII maps numbers to characters, and that multi-byte values can have different byte ordering. You also have an understanding of how we prepare bits to process human-readable messages like “HELLO”.