
What is ASCII?

The story of the encoding standard that defined how computers represent text, from telegraph codes to the foundation of modern character encoding.

Definition

ASCII (American Standard Code for Information Interchange) is a character encoding standard that assigns numeric codes to 128 characters: 33 control characters (codes 0-31 and 127), a space character (code 32), and 94 printable characters including uppercase and lowercase English letters, digits 0-9, and common punctuation marks.

Each ASCII character fits in 7 bits, meaning it can be stored in a single byte with one bit to spare. The letter "A" is code 65 (binary 1000001), the digit "0" is code 48 (binary 0110000), and the space character is code 32 (binary 0100000).
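The 7-bit patterns quoted above can be reproduced with a few lines of C. This is a minimal sketch; the function name ascii_to_binary7 is a hypothetical helper introduced for illustration, not part of any standard library.

```c
/* Hypothetical helper: writes the 7-bit binary form of an ASCII code
 * into out, which must have room for 7 digits plus a terminating NUL.
 * Returns 0 on success, or -1 if code is outside the ASCII range 0-127. */
int ascii_to_binary7(int code, char out[8])
{
    if (code < 0 || code > 127)
        return -1;

    /* Walk from the most significant of the 7 bits (bit 6) down to bit 0. */
    for (int bit = 6; bit >= 0; --bit)
        out[6 - bit] = ((code >> bit) & 1) ? '1' : '0';

    out[7] = '\0';
    return 0;
}
```

Called with 'A' on a system whose execution character set is ASCII-compatible, the buffer would hold "1000001", matching the value given above.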

A Brief History

ASCII has its roots in telegraph communication. Before computers, telegraph systems used codes like Baudot and Murray to transmit text over wires. When computers emerged in the 1950s and 1960s, dozens of incompatible character codes were in use, making data exchange between systems nearly impossible.

In 1960, the American Standards Association (now ANSI) began work on a unified code. The first edition of ASCII was published in 1963, and the standard was revised significantly in 1967 to add lowercase letters and refine the control characters. The 1967 revision (USAS X3.4-1967, later reissued under the ANSI name) is essentially the ASCII we use today.

By the 1970s, ASCII had become dominant in the United States and much of the English-speaking world. Its adoption was cemented by its use in Unix, the C programming language, and the early Internet (ARPANET).

The ASCII Table Structure

The 128 characters are organized deliberately:

Control Characters (0-31, 127)

The first 32 codes and code 127 (DEL) are control characters. They do not represent visible symbols but instead control how text is processed. Many were designed for teletype machines and serial communication:

  • NUL (0): Null character, used as a string terminator in C
  • BEL (7): Rings the terminal bell
  • BS (8): Backspace
  • HT (9): Horizontal tab (the Tab key)
  • LF (10): Line feed (Unix newline)
  • CR (13): Carriage return (used with LF on Windows: CR+LF)
  • ESC (27): Escape, used to start terminal escape sequences
  • DEL (127): Delete, originally used to "punch out" errors on paper tape
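The control ranges and the CR+LF convention listed above can be sketched in C as follows. Both function names are hypothetical helpers chosen for this example, and the line-ending conversion is only one possible approach.

```c
#include <stddef.h>

/* Returns 1 if code is an ASCII control character (0-31 or 127), else 0. */
int is_ascii_control(int code)
{
    return (code >= 0 && code <= 31) || code == 127;
}

/* Hypothetical helper: rewrites Windows-style CR+LF pairs as a single LF,
 * in place. A CR that is not followed by LF is left untouched.
 * Returns the new length of the buffer contents. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t write = 0;

    for (size_t read = 0; read < len; ++read) {
        /* Drop a CR (13) only when the next byte is LF (10). */
        if (buf[read] == '\r' && read + 1 < len && buf[read + 1] == '\n')
            continue;
        buf[write++] = buf[read];
    }
    return write;
}
```

Counting the codes from 0 to 127 for which is_ascii_control returns 1 gives 33, the number of control characters in the table.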

Printable Characters (32-126)

The 95 printable characters include space (32), digits 0-9 (48-57), uppercase A-Z (65-90), lowercase a-z (97-122), and punctuation. The arrangement was intentional: each uppercase letter and its lowercase counterpart differ by exactly one bit (bit 5, counting from 0, with value 32), making case conversion a simple bitwise operation.

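// Case conversion via bit manipulation (c must be an ASCII letter)
// 'A' = 1000001 in binary (65)
// 'a' = 1100001 in binary (97)
// The two differ only in bit 5 (value 32, 0x20)

// Hex masks are used because 0b... literals are only standard since C23
char upper = c & 0x5F;  // Clear bit 5 -> uppercase
char lower = c | 0x20;  // Set bit 5   -> lowercase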
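The bit trick is only meaningful for letters: applied to '{' (123) or '@' (64), it silently produces a different character. A safer variant checks the range first. The sketch below assumes an ASCII-compatible execution character set, and the function names are hypothetical rather than standard library calls.

```c
/* Range checks for the two letter blocks, 65-90 and 97-122. */
int ascii_is_upper(int c) { return c >= 'A' && c <= 'Z'; }
int ascii_is_lower(int c) { return c >= 'a' && c <= 'z'; }

/* Clear bit 5 only for lowercase letters; return everything else unchanged. */
int ascii_to_upper(int c)
{
    return ascii_is_lower(c) ? (c & ~0x20) : c;
}

/* Set bit 5 only for uppercase letters; return everything else unchanged. */
int ascii_to_lower(int c)
{
    return ascii_is_upper(c) ? (c | 0x20) : c;
}
```

Unlike toupper and tolower from <ctype.h>, these helpers ignore the current locale, which may or may not be what a given program wants.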

Clever Design Choices

The designers made several decisions that simplified computing for decades:

  • Digits 0-9 at codes 48-57: The lower 4 bits of each digit code equal its numeric value ('5' & 0x0F = 5)
  • Alphabetical order: Letters are in sequential order, so comparing codes ('A' < 'B', 'B' < 'C') sorts letters alphabetically within each case
  • Space at 32: Lower than every other printable character, so strings sort naturally with spaces first
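The digit layout described above is what makes hand-written number parsing so short. The following is a rough sketch with hypothetical function names; it assumes an ASCII-compatible character set and deliberately omits overflow checking.

```c
/* Returns the numeric value 0-9 of an ASCII digit, or -1 for any other code.
 * The low four bits of codes 48-57 are the digit's value. */
int ascii_digit_value(int c)
{
    return (c >= '0' && c <= '9') ? (c & 0x0F) : -1;
}

/* Hypothetical helper: parses an unsigned decimal string.
 * Returns -1 for an empty string or one containing a non-digit.
 * No overflow checking is done, so keep inputs short. */
long parse_decimal(const char *s)
{
    if (*s == '\0')
        return -1;

    long value = 0;
    for (; *s != '\0'; ++s) {
        int digit = ascii_digit_value((unsigned char)*s);
        if (digit < 0)
            return -1;
        value = value * 10 + digit;
    }
    return value;
}
```

The ordering properties need no helper at all: a byte-wise comparison such as strcmp already sorts "a b" before "ab", because the space (32) is lower than 'b' (98).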

ASCII in Modern Computing

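Despite being over 60 years old, ASCII remains foundational. The first 128 code points of Unicode are identical to ASCII, and byte-oriented encodings such as UTF-8 and the ISO 8859 family encode those characters as the same single bytes, so any valid ASCII text is also valid UTF-8. UTF-16 shares the code points but not the byte representation: it stores each ASCII character in a 16-bit unit, so ASCII files are not valid UTF-16 as-is.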
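Because every ASCII code fits in 7 bits, checking whether a buffer is pure ASCII (and therefore also valid UTF-8) comes down to testing the high bit of each byte. A minimal sketch, with is_pure_ascii as a hypothetical helper name:

```c
#include <stddef.h>

/* Returns 1 if every byte in data[0..len) is in the ASCII range 0-127,
 * i.e. has its high bit (0x80) clear; returns 0 otherwise.
 * An empty buffer counts as pure ASCII. */
int is_pure_ascii(const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        if (data[i] & 0x80)
            return 0;
    }
    return 1;
}
```

A buffer holding "Hello" passes the check, while the two-byte UTF-8 encoding of "é" (0xC3 0xA9) fails it, since both bytes have the high bit set.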

ASCII is still used directly in:

  • Programming languages: Identifiers, keywords, and operators are ASCII in virtually every language
  • Network protocols: HTTP headers, SMTP, DNS, and FTP use ASCII for commands and metadata
  • File formats: CSV, JSON, XML, and YAML use ASCII delimiters and structure
  • Configuration: Environment variables, file paths, and command-line arguments are typically ASCII

Limitations

ASCII's 128-character set covers only English. It has no accented characters (é, ü, ñ), no CJK ideographs, no Arabic or Hebrew scripts, and no emoji. These limitations led to "extended ASCII" variants (ISO 8859 series, Windows-1252) in the 1980s and ultimately to Unicode in the 1990s.

Learn more about how Unicode solved ASCII's limitations in our guide on Unicode vs. ASCII.

Try It Yourself

Explore the complete ASCII table interactively with our ASCII Table & Unicode Explorer. Click any character to see its decimal, hex, binary, and HTML representations.

Further Reading