HTML Charsets
What is Character Encoding?
Character encoding (charset) is a system that maps characters to the numeric byte values that computers store and transmit. Different encodings cover different sets of characters. The HTML standard requires documents to be encoded as UTF-8, which can represent every character in Unicode, covering virtually all written languages and symbols.
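To make the mapping concrete, here is a minimal Python sketch (Python is used only for illustration, and the helper name `describe` is hypothetical). It shows the Unicode code point assigned to a character and the bytes UTF-8 stores for it:

```python
def describe(char: str) -> tuple:
    """Return the Unicode code point and the UTF-8 bytes for one character."""
    code_point = ord(char)             # the number Unicode assigns to the character
    utf8_bytes = char.encode("utf-8")  # the bytes stored in a UTF-8 encoded file
    return code_point, utf8_bytes


for ch in ["A", "é", "€"]:
    code_point, utf8_bytes = describe(ch)
    # Print the code point in U+XXXX notation and the bytes as hex pairs
    print(f"{ch!r}: U+{code_point:04X} -> {utf8_bytes.hex(' ')}")
```

The same character always has the same code point, but the bytes depend on which encoding is used to store it.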
The Charset Meta Tag
The charset meta tag specifies the character encoding for the HTML document. It must be placed in the <head> section:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>My Page</title>
</head>
<body>
<p>Hello, World!</p>
</body>
</html>
UTF-8 Encoding (Recommended)
UTF-8 (Unicode Transformation Format - 8-bit) is the recommended encoding for HTML5. It can encode every Unicode character, including the scripts of virtually all languages, symbols, and emojis:
<head>
<meta charset="UTF-8">
</head>
<body>
<p>English: Hello, World!</p>
<p>Spanish: ¡Hola, Mundo!</p>
<p>French: Bonjour, le monde!</p>
<p>Chinese: 你好世界</p>
<p>Japanese: こんにちは世界</p>
<p>Symbols: © ® € £ ¥</p>
<p>Emojis: 😀 😊 ❤️ 🎉</p>
</body>
Other Character Encodings
While UTF-8 is recommended, browsers still recognize older encodings for existing documents:
ASCII
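ASCII (American Standard Code for Information Interchange) defines only 128 characters: unaccented English letters, digits, basic punctuation, and control codes. Browsers treat the "ASCII" label as an alias for Windows-1252: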
<!-- Don't use - Very limited character set -->
<meta charset="ASCII">
ISO-8859-1
ISO-8859-1 (Latin-1) covers Western European languages with 256 code points. Browsers also treat this label as an alias for Windows-1252:
<!-- Not recommended - Limited character set -->
<meta charset="ISO-8859-1">
Windows-1252
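Windows-1252 is a legacy encoding for Western European languages that originated on Windows. It extends ISO-8859-1 with printable characters such as € and curly quotes:
<!-- Not recommended - Legacy encoding with a limited character set -->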
<meta charset="Windows-1252">
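The limits of these encodings can be checked with a short Python sketch. The helper name `can_encode` is hypothetical, and `cp1252` is Python's codec name for Windows-1252. Python's codecs follow the strict definition of each encoding, so the results show what each one can represent in principle:

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


samples = ["Hello", "café", "€", "你好"]
for encoding in ["ascii", "iso-8859-1", "cp1252", "utf-8"]:
    # Map each sample string to whether this encoding can represent it
    results = {s: can_encode(s, encoding) for s in samples}
    print(encoding, results)
```

Only UTF-8 handles all four samples; each legacy encoding fails on at least one of them.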
HTML5 Default
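The HTML standard requires documents to be encoded as UTF-8, but browsers do not assume UTF-8 when no encoding is declared and may fall back to a legacy encoding instead. It's best practice to always specify it explicitly: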
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8"> <!-- Always include this -->
<title>My Page</title>
</head>
</html>
HTML4 Charset Declaration
In HTML4, the charset was declared with a longer http-equiv form. HTML5 still accepts it, but the shorter charset attribute is preferred:
<!-- HTML4 way - Still valid, but verbose -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<!-- HTML5 way - Use this -->
<meta charset="UTF-8">
Why UTF-8 is Recommended
UTF-8 is recommended because:
- Universal support: Can encode every Unicode character, covering virtually all languages
- Emoji support: Supports emojis and special symbols
- Web standard: UTF-8 is the standard encoding for web content
- Backward compatible: ASCII characters are encoded the same in UTF-8
- Efficient: Variable-length encoding (1-4 bytes per character), with ASCII characters taking a single byte
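The variable-length and backward-compatibility points above can be illustrated with a small Python sketch (the helper name `utf8_length` is hypothetical):

```python
def utf8_length(char: str) -> int:
    """Return how many bytes UTF-8 uses to store a single character."""
    return len(char.encode("utf-8"))


# One sample each from ASCII, Latin accented letters, CJK, and emoji
for ch in ["A", "é", "你", "😀"]:
    print(ch, utf8_length(ch))

# ASCII-only text produces identical bytes in ASCII and in UTF-8
print("Hello".encode("ascii") == "Hello".encode("utf-8"))
```

The four sample characters take 1, 2, 3, and 4 bytes respectively, and the final comparison confirms that a pure-ASCII file is already a valid UTF-8 file.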
Charset Placement
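The charset meta tag should be placed as early as possible in the <head> section, ideally as the first element. The HTML standard requires the declaration to fall entirely within the first 1024 bytes of the document: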
<head>
<meta charset="UTF-8"> <!-- First element -->
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>My Page</title>
</head>
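One way to check placement is to look for the declaration in the first 1024 bytes of the file, which is the window the HTML standard allows for it. The Python sketch below is a simplified check and not a full HTML parser; the names `CHARSET_PATTERN` and `declared_charset` are hypothetical:

```python
import re

# Matches both <meta charset="..."> and the longer http-equiv form,
# since each contains "charset=" followed by an encoding label.
# A simplified pattern for illustration only.
CHARSET_PATTERN = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE
)


def declared_charset(html: bytes, limit: int = 1024):
    """Return the charset label declared in the first `limit` bytes, or None."""
    match = CHARSET_PATTERN.search(html[:limit])
    return match.group(1).decode("ascii").lower() if match else None


page = b'<!DOCTYPE html><html><head><meta charset="UTF-8"><title>My Page</title></head></html>'
print(declared_charset(page))
```

A declaration pushed past the limit, for example by a long comment or inline script placed before it, is reported as missing by this check.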
Best Practices
- Always use UTF-8: UTF-8 is the recommended encoding for HTML5
- Place charset first: Place the charset meta tag as the first element in <head>
- Always declare charset: Browsers may guess a legacy encoding when no charset is declared, so always declare it explicitly
- Save files as UTF-8: Save your HTML files with UTF-8 encoding
- Server configuration: Ensure your server sends UTF-8 in HTTP headers
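For the "save files as UTF-8" practice, a file that was saved in a legacy encoding can be converted once its original encoding is known. The Python sketch below assumes the source encoding is Windows-1252 (`cp1252` in Python); both helper names are hypothetical:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


legacy = "<p>café</p>".encode("cp1252")   # simulates a file saved as Windows-1252
print(is_valid_utf8(legacy))              # the legacy bytes are not valid UTF-8
converted = reencode_to_utf8(legacy, "cp1252")
print(is_valid_utf8(converted))           # the converted bytes are valid UTF-8
```

The conversion is only correct when the source encoding is identified correctly; guessing wrong silently produces the wrong characters.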
Server Configuration
Your web server should also declare UTF-8 in the Content-Type HTTP header, because a charset in the header takes precedence over the meta tag. Not every server sends a charset by default, so check that responses include:
Content-Type: text/html; charset=UTF-8
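The header can also be inspected programmatically. The Python sketch below uses only the standard library; the function names are hypothetical, and `fetch_charset` expects a URL that you supply:

```python
from email.message import Message
from urllib.request import urlopen


def charset_from_content_type(header_value: str):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # Returns the charset in lower case, or None when none is present
    return msg.get_content_charset()


def fetch_charset(url: str):
    """Request a page and return the charset its server declares, if any."""
    with urlopen(url) as response:
        return response.headers.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

A result of `None` means the server sent no charset, in which case the browser relies on the meta tag or its own detection.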
Common Issues
Common charset-related issues:
- Missing charset declaration: The browser has to guess the encoding, which can garble special characters
- Mismatched encoding: File saved in one encoding but declared as another
- Server encoding: Server sends a different charset in HTTP headers, which overrides the meta tag
- Editor encoding: Text editor not saving files with UTF-8 encoding
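The effect of a mismatched encoding can be reproduced in a few lines of Python. This is a sketch of what happens to the bytes, with a hypothetical helper name `misread`; `cp1252` is Python's name for Windows-1252:

```python
def misread(text: str, saved_as: str, read_as: str) -> str:
    """Encode text one way, then decode it another, as a mismatched reader would."""
    # errors="replace" substitutes U+FFFD for bytes that cannot be decoded
    return text.encode(saved_as).decode(read_as, errors="replace")


# UTF-8 file interpreted as Windows-1252: each accented letter becomes two characters
print(misread("café", "utf-8", "cp1252"))

# Windows-1252 file interpreted as UTF-8: the accented letter becomes a replacement character
print(misread("café", "cp1252", "utf-8"))
```

Sequences such as "Ã©" in place of "é" usually point to UTF-8 content being read as a legacy encoding, while replacement characters point to the opposite mismatch.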