HTML Charsets

What is Character Encoding?

A character encoding (charset) maps characters to the numeric byte values that computers store and transmit. Different encodings cover different sets of characters. The HTML standard requires UTF-8 for HTML documents. UTF-8 can represent every character in Unicode, which covers the world's writing systems as well as symbols and emoji.
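
To make the mapping concrete, here is a minimal Python sketch (Python is used only for illustration; the helper name describe is hypothetical). It shows the Unicode code point for a character and the bytes UTF-8 uses to store it:

```python
def describe(ch):
    """Return the Unicode code point of one character and its UTF-8 bytes."""
    return ord(ch), ch.encode("utf-8")


# Print the code point (as U+XXXX) and the encoded bytes in hex.
for ch in ["A", "é", "€"]:
    code_point, encoded = describe(ch)
    print(f"{ch!r}: U+{code_point:04X} -> {encoded.hex(' ')}")
```

The code point is the character's number in Unicode; the encoding decides which bytes represent that number in a file or on the network.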

The Charset Meta Tag

The charset meta tag specifies the character encoding for the HTML document. It must be placed in the <head> section:

Example
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>My Page</title>
</head>
<body>
    <p>Hello, World!</p>
</body>
</html>

UTF-8 Encoding (Recommended)

UTF-8 (Unicode Transformation Format - 8-bit) is the recommended encoding for HTML5. It can represent every Unicode character, so one encoding covers text in any language, symbols, and emoji:

Example
<head>
    <meta charset="UTF-8">
</head>
<body>
    <p>English: Hello, World!</p>
    <p>Spanish: ¡Hola, Mundo!</p>
    <p>French: Bonjour, le monde!</p>
    <p>Chinese: 你好世界</p>
    <p>Japanese: こんにちは世界</p>
    <p>Symbols: © ® € £ ¥</p>
    <p>Emojis: 😀 😊 ❤️ 🎉</p>
</body>
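
UTF-8 is a variable-width encoding: each character takes between one and four bytes. The following Python sketch (illustrative only; utf8_length is a hypothetical helper) measures a few characters taken from the example above:

```python
def utf8_length(text):
    """Return the number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


samples = {
    "ASCII letter": "H",
    "Latin punctuation": "¡",
    "Chinese character": "你",
    "Emoji": "😀",
}

# Characters further from the ASCII range need more bytes.
for label, ch in samples.items():
    print(f"{label}: {utf8_length(ch)} byte(s)")
```

Plain English text stays one byte per character, which is one reason UTF-8 files remain compact for ASCII-heavy content such as HTML markup.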

Other Character Encodings

While UTF-8 is recommended, browsers still recognize several legacy encodings:

ASCII

ASCII (American Standard Code for Information Interchange) defines only 128 characters: unaccented English letters, digits, basic punctuation, and control codes:

Example - Not Recommended
<!-- Don't use - Very limited character set -->
<meta charset="ASCII">

ISO-8859-1

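ISO-8859-1 (Latin-1) is a single-byte encoding (256 code points) that covers most Western European languages. It lacks some common characters such as the euro sign (€), and modern browsers treat the ISO-8859-1 label as Windows-1252: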

Example - Not Recommended
<!-- Not recommended - Limited character set -->
<meta charset="ISO-8859-1">

Windows-1252

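Windows-1252 is a single-byte encoding that originated on Windows. It covers Western European languages and extends ISO-8859-1 with extra printable characters such as € and curly quotes: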

Example - Not Recommended
<!-- Not recommended - Windows-specific -->
<meta charset="Windows-1252">
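
The limits of these legacy encodings can be checked directly. The sketch below is illustrative: it uses Python's strict codecs, which are an assumption here and are not identical to how browsers interpret the same labels. The helper can_encode is hypothetical.

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


ENCODINGS = ["ascii", "iso-8859-1", "windows-1252", "utf-8"]

# List which encodings can store each sample string.
for sample in ["Hello", "é", "€", "你好"]:
    supported = [name for name in ENCODINGS if can_encode(sample, name)]
    print(f"{sample!r}: {supported}")
```

Only UTF-8 can store all four samples, which is why a page that mixes languages cannot rely on a single-byte encoding.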

HTML5 Default

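The HTML standard requires documents to be encoded as UTF-8. Browsers do not reliably assume UTF-8 when no encoding is declared, however, and may fall back to a legacy encoding, so it's best practice to always specify it explicitly: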

Example
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">  <!-- Always include this -->
    <title>My Page</title>
</head>
</html>

HTML4 Charset Declaration

In HTML4, the charset was declared with a longer http-equiv form. That form is still valid in HTML5, but the shorter charset attribute is preferred:

Example - HTML4 (Legacy)
<!-- HTML4 way - Still works, but verbose -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

Example - HTML5 (Recommended)
<!-- HTML5 way - Use this -->
<meta charset="UTF-8">
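
Both forms declare the same thing, so a tool that inspects pages needs to read either one. Here is a minimal sketch using Python's standard html.parser module. The class CharsetFinder and the function declared_charset are hypothetical names, and the parsing of the content attribute is deliberately simplified:

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the first charset declared by a <meta> tag, in either form."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # Short form: <meta charset="UTF-8">
            self.charset = attributes["charset"].strip().lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Long form: content="text/html; charset=UTF-8"
            for part in (attributes.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip().strip("\"'").lower()


def declared_charset(html):
    """Return the declared charset in lowercase, or None if there is none."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


print(declared_charset('<meta charset="UTF-8">'))
print(declared_charset(
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
))
```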

Why UTF-8 is Recommended

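UTF-8 is recommended because it can represent every Unicode character, so one encoding covers any language, symbol, or emoji; it is backward compatible with ASCII; and it is the encoding the HTML standard requires.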

Charset Placement

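The charset meta tag should be placed as early as possible in the <head> section, ideally as the first element, so the browser knows the encoding before it reads any other text. The HTML standard requires the declaration to fall within the first 1024 bytes of the document: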

Example - Good (Recommended)
<head>
    <meta charset="UTF-8">  <!-- First element -->
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>My Page</title>
</head>
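
Placement can be checked programmatically. The sketch below relies on the HTML standard's rule that the declaration must sit entirely within the first 1024 bytes. It is a simplification: the regular expression only matches the short form with charset as the first attribute, and charset_in_first_bytes is a hypothetical helper name.

```python
import re

# Matches <meta charset="..."> with optional quotes and optional "/>" ending.
META_CHARSET = re.compile(
    rb"<meta\s+charset\s*=\s*[\"']?[\w-]+[\"']?\s*/?>",
    re.IGNORECASE,
)


def charset_in_first_bytes(document, limit=1024):
    """Return True if a charset meta tag ends within the first `limit` bytes."""
    match = META_CHARSET.search(document)
    return match is not None and match.end() <= limit


early = b'<head><meta charset="UTF-8"><title>My Page</title></head>'
late = b"<head><!-- " + b"x" * 2000 + b' --><meta charset="UTF-8"></head>'

print(charset_in_first_bytes(early))
print(charset_in_first_bytes(late))
```

Long comments, inline scripts, or large style blocks placed before the meta tag are the usual reasons a declaration ends up too late.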

Best Practices

Server Configuration

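Your web server should also declare UTF-8 in the HTTP Content-Type response header. When the header specifies a charset, it takes precedence over the meta tag, so the two should agree. Many servers can be configured to send this by default; check that the response includes a header like this: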

Example - HTTP Header
Content-Type: text/html; charset=UTF-8
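
Once you have the header value (for example, copied from your browser's developer tools), the charset can be extracted with Python's standard library. This is a minimal sketch; charset_from_content_type is a hypothetical helper, and it parses a header string you supply instead of making a network request:

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    message = Message()
    message["Content-Type"] = header_value
    # get_content_charset() returns the charset in lowercase, or None.
    return message.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

A result of None means the server sent no charset, in which case the browser relies on the meta tag.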

Common Issues

Common charset-related issues:
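
One issue of this kind is garbled text (often called mojibake), which appears when a file saved as UTF-8 is decoded with a different encoding. The Python sketch below reproduces the effect; the helper misdecode is hypothetical, and Windows-1252 is assumed as the wrong encoding purely for illustration:

```python
def misdecode(text, actual="utf-8", assumed="windows-1252"):
    """Encode text with its real encoding, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed, errors="replace")


# Each non-ASCII character turns into two or more unrelated characters.
for sample in ["café", "¡Hola, Mundo!"]:
    print(f"{sample!r} is displayed as {misdecode(sample)!r}")
```

Seeing sequences such as Ã© in place of é usually means the declared encoding does not match the encoding the file was actually saved in.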