If you parse text or process old non-English websites, you have probably run into strings like Ã¤. Such strings are symptoms of a character encoding mix-up: the text was written in one encoding and read in another. This article lists common symptoms together with the character each one most likely stood for. Use it as a reference to quickly identify the issue you’re facing: press CTRL+F and search for the symptom you’re seeing.
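
To make the mechanism concrete, here is a minimal Python sketch of how one such symptom arises and how it can be undone. It assumes the text was stored as UTF-8 and then read as ISO-8859-1; the variable names are only illustrative.

```python
original = "ä"

# Writing: the text is stored as UTF-8, which needs two bytes for "ä".
raw = original.encode("utf-8")  # b'\xc3\xa4'

# Reading: the bytes are decoded with the wrong codec, one character per byte.
symptom = raw.decode("iso-8859-1")
print(symptom)  # Ã¤

# Repair: undo the wrong decode, then decode with the right codec.
repaired = symptom.encode("iso-8859-1").decode("utf-8")
print(repaired)  # ä
```

The repair only works if no bytes were lost or replaced along the way, for example by a `?` substitution.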

The selection of issues is highly subjective. Send me an e-mail if you think something is missing.

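In the rows below, bytes 0x80–0x9F appear as their Windows-1252 characters, which is how browsers display them. Characters without a visible glyph are written as numeric character references such as `&#129;`.

| Symptom | Original | Original encoding | Wrongly decoded as |
| --- | --- | --- | --- |
| Ã„ | Ä | UTF-8 | ISO-8859-1 |
| Ã– | Ö | UTF-8 | ISO-8859-1 |
| Ãœ | Ü | UTF-8 | ISO-8859-1 |
| Ã‰ | É | UTF-8 | ISO-8859-1 |
| Ã€ | À | UTF-8 | ISO-8859-1 |
| Ãˆ | È | UTF-8 | ISO-8859-1 |
| Ã™ | Ù | UTF-8 | ISO-8859-1 |
| Ã‡ | Ç | UTF-8 | ISO-8859-1 |
| Ã‚ | Â | UTF-8 | ISO-8859-1 |
| ÃŠ | Ê | UTF-8 | ISO-8859-1 |
| ÃŽ | Î | UTF-8 | ISO-8859-1 |
| Ã” | Ô | UTF-8 | ISO-8859-1 |
| Ã› | Û | UTF-8 | ISO-8859-1 |
| Ã‹ | Ë | UTF-8 | ISO-8859-1 |
| Ã&#143; | Ï | UTF-8 | ISO-8859-1 |
| Ã&#129; | Á | UTF-8 | ISO-8859-1 |
| Ã&#141; | Í | UTF-8 | ISO-8859-1 |
| Ã‘ | Ñ | UTF-8 | ISO-8859-1 |
| Ã“ | Ó | UTF-8 | ISO-8859-1 |
| Ãš | Ú | UTF-8 | ISO-8859-1 |
| Ã¤ | ä | UTF-8 | ISO-8859-1 |
| Ã¶ | ö | UTF-8 | ISO-8859-1 |
| Ã¼ | ü | UTF-8 | ISO-8859-1 |
| ÃŸ | ß | UTF-8 | ISO-8859-1 |
| Ã© | é | UTF-8 | ISO-8859-1 |
| Ã&#160; | à | UTF-8 | ISO-8859-1 |
| Ã¨ | è | UTF-8 | ISO-8859-1 |
| Ã¹ | ù | UTF-8 | ISO-8859-1 |
| Ã§ | ç | UTF-8 | ISO-8859-1 |
| Ã¢ | â | UTF-8 | ISO-8859-1 |
| Ãª | ê | UTF-8 | ISO-8859-1 |
| Ã® | î | UTF-8 | ISO-8859-1 |
| Ã´ | ô | UTF-8 | ISO-8859-1 |
| Ã» | û | UTF-8 | ISO-8859-1 |
| Ã« | ë | UTF-8 | ISO-8859-1 |
| Ã¯ | ï | UTF-8 | ISO-8859-1 |
| Ã¡ | á | UTF-8 | ISO-8859-1 |
| Ã&#173; | í | UTF-8 | ISO-8859-1 |
| Ã± | ñ | UTF-8 | ISO-8859-1 |
| Ã³ | ó | UTF-8 | ISO-8859-1 |
| Ãº | ú | UTF-8 | ISO-8859-1 |
| Â¡ | ¡ | UTF-8 | ISO-8859-1 |
| Â¿ | ¿ | UTF-8 | ISO-8859-1 |
| â€™ | ’ | UTF-8 | ISO-8859-1 |
| â€“ | – | UTF-8 | ISO-8859-1 |
| â€” | — | UTF-8 | ISO-8859-1 |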
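
| Symptom | Original | Original encoding | Wrongly decoded as |
| --- | --- | --- | --- |
| ÿþÄ&#0; | Ä | UTF-16 | ISO-8859-1 |
| ÿþÖ&#0; | Ö | UTF-16 | ISO-8859-1 |
| ÿþÜ&#0; | Ü | UTF-16 | ISO-8859-1 |
| ÿþÉ&#0; | É | UTF-16 | ISO-8859-1 |
| ÿþÀ&#0; | À | UTF-16 | ISO-8859-1 |
| ÿþÈ&#0; | È | UTF-16 | ISO-8859-1 |
| ÿþÙ&#0; | Ù | UTF-16 | ISO-8859-1 |
| ÿþÇ&#0; | Ç | UTF-16 | ISO-8859-1 |
| ÿþÂ&#0; | Â | UTF-16 | ISO-8859-1 |
| ÿþÊ&#0; | Ê | UTF-16 | ISO-8859-1 |
| ÿþÎ&#0; | Î | UTF-16 | ISO-8859-1 |
| ÿþÔ&#0; | Ô | UTF-16 | ISO-8859-1 |
| ÿþÛ&#0; | Û | UTF-16 | ISO-8859-1 |
| ÿþË&#0; | Ë | UTF-16 | ISO-8859-1 |
| ÿþÏ&#0; | Ï | UTF-16 | ISO-8859-1 |
| ÿþÁ&#0; | Á | UTF-16 | ISO-8859-1 |
| ÿþÍ&#0; | Í | UTF-16 | ISO-8859-1 |
| ÿþÑ&#0; | Ñ | UTF-16 | ISO-8859-1 |
| ÿþÓ&#0; | Ó | UTF-16 | ISO-8859-1 |
| ÿþÚ&#0; | Ú | UTF-16 | ISO-8859-1 |
| ÿþä&#0; | ä | UTF-16 | ISO-8859-1 |
| ÿþö&#0; | ö | UTF-16 | ISO-8859-1 |
| ÿþü&#0; | ü | UTF-16 | ISO-8859-1 |
| ÿþß&#0; | ß | UTF-16 | ISO-8859-1 |
| ÿþé&#0; | é | UTF-16 | ISO-8859-1 |
| ÿþà&#0; | à | UTF-16 | ISO-8859-1 |
| ÿþè&#0; | è | UTF-16 | ISO-8859-1 |
| ÿþù&#0; | ù | UTF-16 | ISO-8859-1 |
| ÿþç&#0; | ç | UTF-16 | ISO-8859-1 |
| ÿþâ&#0; | â | UTF-16 | ISO-8859-1 |
| ÿþê&#0; | ê | UTF-16 | ISO-8859-1 |
| ÿþî&#0; | î | UTF-16 | ISO-8859-1 |
| ÿþô&#0; | ô | UTF-16 | ISO-8859-1 |
| ÿþû&#0; | û | UTF-16 | ISO-8859-1 |
| ÿþë&#0; | ë | UTF-16 | ISO-8859-1 |
| ÿþï&#0; | ï | UTF-16 | ISO-8859-1 |
| ÿþá&#0; | á | UTF-16 | ISO-8859-1 |
| ÿþí&#0; | í | UTF-16 | ISO-8859-1 |
| ÿþñ&#0; | ñ | UTF-16 | ISO-8859-1 |
| ÿþó&#0; | ó | UTF-16 | ISO-8859-1 |
| ÿþú&#0; | ú | UTF-16 | ISO-8859-1 |
| ÿþ¡&#0; | ¡ | UTF-16 | ISO-8859-1 |
| ÿþ¿&#0; | ¿ | UTF-16 | ISO-8859-1 |
| ÿþ&#25;&#32; | ’ | UTF-16 | ISO-8859-1 |
| ÿþ&#19;&#32; | – | UTF-16 | ISO-8859-1 |
| ÿþ&#20;&#32; | — | UTF-16 | ISO-8859-1 |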
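
| Symptom | Original | Original encoding | Wrongly decoded as |
| --- | --- | --- | --- |
| &#33987; | Ä | UTF-8 | UTF-16 |
| &#38595; | Ö | UTF-8 | UTF-16 |
| &#40131; | Ü | UTF-8 | UTF-16 |
| &#35267; | É | UTF-8 | UTF-16 |
| &#32963; | À | UTF-8 | UTF-16 |
| &#35011; | È | UTF-8 | UTF-16 |
| &#39363; | Ù | UTF-8 | UTF-16 |
| &#34755; | Ç | UTF-8 | UTF-16 |
| &#33475; | Â | UTF-8 | UTF-16 |
| &#35523; | Ê | UTF-8 | UTF-16 |
| &#36547; | Î | UTF-8 | UTF-16 |
| &#38083; | Ô | UTF-8 | UTF-16 |
| &#39875; | Û | UTF-8 | UTF-16 |
| &#35779; | Ë | UTF-8 | UTF-16 |
| &#36803; | Ï | UTF-8 | UTF-16 |
| &#33219; | Á | UTF-8 | UTF-16 |
| &#36291; | Í | UTF-8 | UTF-16 |
| &#37315; | Ñ | UTF-8 | UTF-16 |
| &#37827; | Ó | UTF-8 | UTF-16 |
| &#39619; | Ú | UTF-8 | UTF-16 |
| &#42179; | ä | UTF-8 | UTF-16 |
| &#46787; | ö | UTF-8 | UTF-16 |
| &#48323; | ü | UTF-8 | UTF-16 |
| &#40899; | ß | UTF-8 | UTF-16 |
| &#43459; | é | UTF-8 | UTF-16 |
| &#41155; | à | UTF-8 | UTF-16 |
| &#43203; | è | UTF-8 | UTF-16 |
| &#47555; | ù | UTF-8 | UTF-16 |
| &#42947; | ç | UTF-8 | UTF-16 |
| &#41667; | â | UTF-8 | UTF-16 |
| &#43715; | ê | UTF-8 | UTF-16 |
| &#44739; | î | UTF-8 | UTF-16 |
| &#46275; | ô | UTF-8 | UTF-16 |
| &#48067; | û | UTF-8 | UTF-16 |
| &#43971; | ë | UTF-8 | UTF-16 |
| &#44995; | ï | UTF-8 | UTF-16 |
| &#41411; | á | UTF-8 | UTF-16 |
| &#44483; | í | UTF-8 | UTF-16 |
| &#45507; | ñ | UTF-8 | UTF-16 |
| &#46019; | ó | UTF-8 | UTF-16 |
| &#47811; | ú | UTF-8 | UTF-16 |
| &#41410; | ¡ | UTF-8 | UTF-16 |
| &#49090; | ¿ | UTF-8 | UTF-16 |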
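
| Symptom | Original | Original encoding | Wrongly decoded as |
| --- | --- | --- | --- |
| Ă„ | Ä | UTF-8 | Windows 1250 |
| Ă– | Ö | UTF-8 | Windows 1250 |
| Ăś | Ü | UTF-8 | Windows 1250 |
| Ă‰ | É | UTF-8 | Windows 1250 |
| Ă€ | À | UTF-8 | Windows 1250 |
| Ă™ | Ù | UTF-8 | Windows 1250 |
| Ă‡ | Ç | UTF-8 | Windows 1250 |
| Ă‚ | Â | UTF-8 | Windows 1250 |
| ĂŠ | Ê | UTF-8 | Windows 1250 |
| ĂŽ | Î | UTF-8 | Windows 1250 |
| Ă” | Ô | UTF-8 | Windows 1250 |
| Ă› | Û | UTF-8 | Windows 1250 |
| Ă‹ | Ë | UTF-8 | Windows 1250 |
| ĂŹ | Ï | UTF-8 | Windows 1250 |
| ĂŤ | Í | UTF-8 | Windows 1250 |
| Ă‘ | Ñ | UTF-8 | Windows 1250 |
| Ă“ | Ó | UTF-8 | Windows 1250 |
| Ăš | Ú | UTF-8 | Windows 1250 |
| Ă¤ | ä | UTF-8 | Windows 1250 |
| Ă¶ | ö | UTF-8 | Windows 1250 |
| ĂĽ | ü | UTF-8 | Windows 1250 |
| Ăź | ß | UTF-8 | Windows 1250 |
| Ă© | é | UTF-8 | Windows 1250 |
| Ă&#160; | à | UTF-8 | Windows 1250 |
| Ă¨ | è | UTF-8 | Windows 1250 |
| Ăą | ù | UTF-8 | Windows 1250 |
| Ă§ | ç | UTF-8 | Windows 1250 |
| Ă˘ | â | UTF-8 | Windows 1250 |
| ĂŞ | ê | UTF-8 | Windows 1250 |
| Ă® | î | UTF-8 | Windows 1250 |
| Ă´ | ô | UTF-8 | Windows 1250 |
| Ă» | û | UTF-8 | Windows 1250 |
| Ă« | ë | UTF-8 | Windows 1250 |
| ĂŻ | ï | UTF-8 | Windows 1250 |
| Ăˇ | á | UTF-8 | Windows 1250 |
| Ă&#173; | í | UTF-8 | Windows 1250 |
| Ă± | ñ | UTF-8 | Windows 1250 |
| Ăł | ó | UTF-8 | Windows 1250 |
| Ăş | ú | UTF-8 | Windows 1250 |
| Âˇ | ¡ | UTF-8 | Windows 1250 |
| Âż | ¿ | UTF-8 | Windows 1250 |
| â€™ | ’ | UTF-8 | Windows 1250 |
| â€“ | – | UTF-8 | Windows 1250 |
| â€” | — | UTF-8 | Windows 1250 |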
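
| Symptom | Original | Original encoding | Wrongly decoded as |
| --- | --- | --- | --- |
| √Ñ | Ä | UTF-8 | Mac Roman |
| √ñ | Ö | UTF-8 | Mac Roman |
| √ú | Ü | UTF-8 | Mac Roman |
| √â | É | UTF-8 | Mac Roman |
| √Ä | À | UTF-8 | Mac Roman |
| √à | È | UTF-8 | Mac Roman |
| √ô | Ù | UTF-8 | Mac Roman |
| √á | Ç | UTF-8 | Mac Roman |
| √Ç | Â | UTF-8 | Mac Roman |
| √ä | Ê | UTF-8 | Mac Roman |
| √é | Î | UTF-8 | Mac Roman |
| √î | Ô | UTF-8 | Mac Roman |
| √õ | Û | UTF-8 | Mac Roman |
| √ã | Ë | UTF-8 | Mac Roman |
| √è | Ï | UTF-8 | Mac Roman |
| √Å | Á | UTF-8 | Mac Roman |
| √ç | Í | UTF-8 | Mac Roman |
| √ë | Ñ | UTF-8 | Mac Roman |
| √ì | Ó | UTF-8 | Mac Roman |
| √ö | Ú | UTF-8 | Mac Roman |
| √§ | ä | UTF-8 | Mac Roman |
| √∂ | ö | UTF-8 | Mac Roman |
| √º | ü | UTF-8 | Mac Roman |
| √ü | ß | UTF-8 | Mac Roman |
| √© | é | UTF-8 | Mac Roman |
| √† | à | UTF-8 | Mac Roman |
| √® | è | UTF-8 | Mac Roman |
| √π | ù | UTF-8 | Mac Roman |
| √ß | ç | UTF-8 | Mac Roman |
| √¢ | â | UTF-8 | Mac Roman |
| √™ | ê | UTF-8 | Mac Roman |
| √Æ | î | UTF-8 | Mac Roman |
| √¥ | ô | UTF-8 | Mac Roman |
| √ª | û | UTF-8 | Mac Roman |
| √´ | ë | UTF-8 | Mac Roman |
| √Ø | ï | UTF-8 | Mac Roman |
| √° | á | UTF-8 | Mac Roman |
| √≠ | í | UTF-8 | Mac Roman |
| √± | ñ | UTF-8 | Mac Roman |
| √≥ | ó | UTF-8 | Mac Roman |
| √∫ | ú | UTF-8 | Mac Roman |
| ¬° | ¡ | UTF-8 | Mac Roman |
| ¬ø | ¿ | UTF-8 | Mac Roman |
| ‚Äô | ’ | UTF-8 | Mac Roman |
| ‚Äì | – | UTF-8 | Mac Roman |
| ‚Äî | — | UTF-8 | Mac Roman |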

I used the following Python script to generate the rows above.

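# Each pair is (original encoding, encoding it was wrongly decoded as).
# Note: "UTF-16" uses the machine's byte order; the ÿþ rows come from a little-endian machine.
pairs = [
    ("UTF-8", "ISO-8859-1"),
    ("UTF-16", "ISO-8859-1"),
    ("UTF-8", "UTF-16"),
    ("UTF-8", "Windows 1250"),
    ("UTF-8", "Mac Roman"),
]
chars = "ÄÖÜÉÀÈÙÇÂÊÎÔÛËÏÁÍÑÓÚäöüßéàèùçâêîôûëïáíñóú¡¿’–—"

for orig, wrong in pairs:
    for c in chars:
        try:
            # Encode correctly, then decode with the wrong codec to get the symptom.
            enc = c.encode(orig).decode(wrong)
        except UnicodeDecodeError:
            # The wrong codec cannot decode these bytes at all, so there is no row.
            continue
        # Print a Markdown table row; the symptom is written as numeric character references.
        symptom = "".join(f"&#{ord(d)};" for d in enc)
        print(f"| {symptom} | {c} | {orig} | {wrong} |")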
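
The same pairs can also be run in the opposite direction to guess what a symptom originally was. The following is a minimal sketch, not part of the script above: `candidates` is a hypothetical helper, and the codec names are the Python spellings I assume for the encodings in the table.

```python
# (original encoding, encoding it was wrongly decoded as)
PAIRS = [
    ("utf-8", "iso-8859-1"),
    ("utf-16", "iso-8859-1"),
    ("utf-8", "utf-16"),
    ("utf-8", "cp1250"),
    ("utf-8", "mac_roman"),
]


def candidates(symptom):
    """Return (repaired_text, original_encoding, wrong_encoding) guesses."""
    found = []
    for orig, wrong in PAIRS:
        try:
            # Undo the wrong decode, then decode with the assumed original codec.
            repaired = symptom.encode(wrong).decode(orig)
        except (UnicodeEncodeError, UnicodeDecodeError):
            # This pair cannot have produced the symptom.
            continue
        if repaired != symptom:
            found.append((repaired, orig, wrong))
    return found


for guess in candidates("Ă¶"):
    print(guess)
```

A symptom can match more than one pair, so treat the result as a list of guesses to review by eye rather than a definitive answer.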