How do I handle different character sets (e.g., Unicode)?

UTF-8 encoding is your universal solution. Declare it explicitly in your email headers (Content-Type: text/html; charset=UTF-8) and in your HTML meta tag (<meta charset="UTF-8">). UTF-8 can represent virtually every character from every modern writing system: Latin, Cyrillic, Arabic, Chinese, Japanese, Korean, and the full emoji catalog. There's no legitimate reason to use older encodings like ISO-8859-1 for new email projects.
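As a rough illustration of declaring UTF-8 in both places, here is a minimal sketch using Python's standard-library email package. The function name build_html_email, the addresses, and the HTML skeleton are assumptions made for the example, not part of any particular ESP's API.

```python
from email.message import EmailMessage


def build_html_email(subject: str, body_html: str) -> EmailMessage:
    """Build an HTML email that declares UTF-8 in the header and in the markup."""
    # Hypothetical HTML skeleton; the meta tag mirrors the header declaration.
    html = (
        "<!DOCTYPE html>\n"
        "<html>\n"
        '<head><meta charset="UTF-8"></head>\n'
        f"<body>{body_html}</body>\n"
        "</html>\n"
    )
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = "news@example.com"        # placeholder address
    msg["To"] = "subscriber@example.com"    # placeholder address
    # Produces a Content-Type header of text/html with charset utf-8.
    msg.set_content(html, subtype="html", charset="utf-8")
    return msg


message = build_html_email("Bonjour François", "<p>Bonjour François, 田中さん!</p>")
print(message["Content-Type"])
```

Most ESPs and mail libraries set the header for you once the charset is specified; the point is to state it explicitly rather than rely on a default.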

Consistency throughout your pipeline is critical. Your database must store data in UTF-8, your application must handle it correctly, and your ESP must send it properly encoded. A character set mismatch anywhere in this chain produces the dreaded "mojibake": garbled characters that make your email look broken. Common symptoms include question marks, empty boxes, or strange multi-character sequences where simple text should appear.
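To make the symptoms concrete, the sketch below reproduces two typical failures in Python: UTF-8 bytes read back as ISO-8859-1 (the multi-character garbage) and text forced through an ASCII-only step (the question marks). The helper names misdecode and repair are made up for the illustration.

```python
def misdecode(text: str, wrong_charset: str = "latin-1") -> str:
    """Simulate a stage that reads UTF-8 bytes using the wrong charset."""
    return text.encode("utf-8").decode(wrong_charset)


def repair(garbled: str, wrong_charset: str = "latin-1") -> str:
    """Undo a single misdecode step, assuming no bytes were lost on the way."""
    return garbled.encode(wrong_charset).decode("utf-8")


name = "François"

garbled = misdecode(name)
print(garbled)            # FranÃ§ois  (two characters where one should be)
print(repair(garbled))    # François

# A stage that only understands ASCII replaces the character outright.
lossy = name.encode("ascii", errors="replace").decode("ascii")
print(lossy)              # Fran?ois  (the original character is unrecoverable)
```

The first kind of damage can sometimes be reversed; the second cannot, which is why the mismatch is worth fixing at its source rather than patching downstream.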

Special characters in personalization fields need particular attention. A subscriber named François or 田中 should see their name rendered correctly, not corrupted. Test with actual international names in your system, not just ASCII placeholders. UTF-8 has been the dominant encoding on the web for well over a decade. If your email stack doesn't support it cleanly end-to-end, that's technical debt worth addressing immediately.
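One way to test with real names is a small round-trip check: render a message for each name, serialize it to bytes, parse it back, and confirm the name survived in both the subject and the body. The sketch below uses Python's standard email package; render_email is a hypothetical stand-in for your own rendering step, and the name list is only a starting point.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Sample names covering several scripts; extend with names from your own audience.
TEST_NAMES = ["François", "田中", "Олена", "محمد", "Zoë"]


def render_email(name: str) -> bytes:
    """Hypothetical stand-in for the real template and serialization step."""
    msg = EmailMessage()
    msg["Subject"] = f"Welcome, {name}"
    msg["From"] = "news@example.com"        # placeholder address
    msg["To"] = "subscriber@example.com"    # placeholder address
    msg.set_content(f"<p>Hello, {name}!</p>", subtype="html", charset="utf-8")
    return msg.as_bytes()


def corrupted_names(names, render=render_email):
    """Return the names that did not survive rendering and re-parsing intact."""
    failures = []
    for name in names:
        parsed = message_from_bytes(render(name), policy=policy.default)
        subject_ok = name in str(parsed["Subject"])
        body_ok = name in parsed.get_content()
        if not (subject_ok and body_ok):
            failures.append(name)
    return failures


def ascii_only_render(name: str) -> bytes:
    """Simulates a broken stage that cannot represent non-ASCII characters."""
    return render_email(name.encode("ascii", errors="replace").decode("ascii"))


print(corrupted_names(TEST_NAMES))                            # healthy pipeline
print(corrupted_names(TEST_NAMES, render=ascii_only_render))  # every name fails
```

Swapping render_email for a function that calls your actual templating and sending code (or fetches the message from a test inbox) turns this into an end-to-end check that can run in CI.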