Flags, Skin Tones, and ZWJ Sequences

Some of the most surprising character counts come from three special Unicode features: regional indicator symbols that make flags, Fitzpatrick skin tone modifiers, and Zero Width Joiners (ZWJ) that glue separate emoji together. Each of these was designed to let people express themselves more richly, but each also makes the number of code points, code units, and bytes diverge from the number of characters a reader actually sees.

Country flags are not single characters stored in a font. Instead, each flag is encoded as a pair of regional indicator symbols, a set of 26 dedicated code points (U+1F1E6 to U+1F1FF) that stand in for the letters A to Z. For example, the flag of Japan is the regional indicator for J followed by the regional indicator for P, not the ordinary letters J and P. When a renderer that supports the flag sees this pair, it draws a single flag glyph; one that does not may show the two indicator letters instead. The result is one visible flag, two code points, four UTF-16 code units, and eight UTF-8 bytes, all for what every user sees as one character.
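The arithmetic above can be checked directly. The following is a minimal Python sketch, not a reference implementation; the helper names flag_from_country_code and measure are made up for this illustration.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag_from_country_code(code: str) -> str:
    """Map a two-letter ASCII country code to its regional indicator pair."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError("expected two ASCII letters")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in code
    )


def measure(text: str) -> dict:
    """Count the same string in three different units."""
    return {
        "code_points": len(text),
        # utf-16-le avoids the 2-byte byte-order mark that "utf-16" prepends
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }


japan = flag_from_country_code("JP")
print(measure(japan))
# {'code_points': 2, 'utf16_units': 4, 'utf8_bytes': 8}
```

Note that Python's len() counts code points, so even it reports 2 for a single flag.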

Skin Tone Modifiers

When emojis representing people were first introduced, they used a default yellow tone. To support real human diversity, Unicode added five skin tone modifiers (U+1F3FB to U+1F3FF, based on the Fitzpatrick scale) that can be attached to hands, faces, and full-body figures. Each modifier is a separate code point that immediately follows the base emoji. A single person with dark skin tone therefore becomes two code points that must stay together to display correctly.
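A small Python sketch shows the base-plus-modifier pairing. The constant and function names are chosen for this example only, and the base emoji used here (U+1F9D1, the gender-neutral person) is just one of many that accept a modifier.

```python
PERSON = "\U0001F9D1"          # base emoji: person (adult)
DARK_SKIN_TONE = "\U0001F3FF"  # darkest of the five modifiers


def is_skin_tone_modifier(ch: str) -> bool:
    """True for the five emoji modifiers U+1F3FB..U+1F3FF."""
    return 0x1F3FB <= ord(ch) <= 0x1F3FF


person_dark = PERSON + DARK_SKIN_TONE  # the modifier directly follows the base

print([f"U+{ord(ch):04X}" for ch in person_dark])
# ['U+1F9D1', 'U+1F3FF']
print(len(person_dark))                   # 2 code points
print(len(person_dark.encode("utf-8")))   # 8 bytes
```

Truncating this string after the first code point would not raise an error; it would silently turn the person back into the default yellow figure, which is why the pair has to be kept together.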

Combining Everything

The most complex emojis layer these mechanisms on top of each other. A woman with medium-dark skin tone holding hands with a man with light skin tone uses ZWJ to connect the two people through a handshake element, plus a separate skin tone modifier for each person. What appears as one image is actually seven code points: woman, modifier, ZWJ, handshake, ZWJ, man, modifier.
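To make the count concrete, here is a Python sketch that assembles one such sequence, the woman-and-man-holding-hands emoji with two different skin tones, from its parts. The variable names are illustrative; the code points are the standard ones for each component.

```python
ZWJ = "\u200D"              # ZERO WIDTH JOINER
WOMAN = "\U0001F469"
MAN = "\U0001F468"
HANDSHAKE = "\U0001F91D"
MEDIUM_DARK = "\U0001F3FE"  # skin tone modifier
LIGHT = "\U0001F3FB"        # skin tone modifier

# woman + tone, ZWJ, handshake, ZWJ, man + tone
couple = ZWJ.join([WOMAN + MEDIUM_DARK, HANDSHAKE, MAN + LIGHT])

print(len(couple))                           # 7 code points
print(len(couple.encode("utf-16-le")) // 2)  # 12 UTF-16 code units
print(len(couple.encode("utf-8")))           # 26 UTF-8 bytes
```

Whether this renders as a single image depends on the font and platform; the underlying sequence is the same either way.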

Why Joiners Are Invisible

The Zero Width Joiner (U+200D) has no width and no visible glyph. Its only job is to tell the text rendering engine: “treat these separate emojis as one single unit.” Take a family emoji built from a man, a woman, a girl, and a boy. Without the joiners, or on a system whose font lacks the combined glyph, the family appears as four separate people standing apart. With them, they become one family and one grapheme cluster.
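The effect of the joiner is easy to see by taking it away. This Python sketch assumes the family is the man, woman, girl, boy sequence; the variable names are made up for the example.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER: no glyph of its own

MAN, WOMAN, GIRL, BOY = "\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"

family = ZWJ.join([MAN, WOMAN, GIRL, BOY])

# Splitting on the joiner recovers the four individual people.
members = family.split(ZWJ)
print(len(family))   # 7 code points: 4 people + 3 joiners
print(len(members))  # 4

# Stripping the joiners leaves four standalone emoji.
apart = family.replace(ZWJ, "")
print(len(apart))    # 4 code points
```

A sanitizer that strips "invisible" characters would therefore break the family into four people without any visible warning in the source text.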

The Result

These innovations make digital communication more inclusive and expressive than ever before. But they also mean that any system that still counts “characters” as bytes, UTF-16 code units, or code points will overcount, sometimes by a wide margin. A single flag, a person with a skin tone, or a joined family is one thing to a human, and software should count each as one grapheme cluster.
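As a rough illustration of counting what a human sees, the Python sketch below groups code points using only the three mechanisms described in this section. It is a deliberate simplification, not an implementation of the Unicode grapheme cluster rules (UAX #29): it assumes well-formed emoji sequences and ignores combining accents, Hangul, and other cases. The function name count_visible_units is hypothetical. Real software should rely on a library that implements UAX #29.

```python
ZWJ = "\u200D"
VS16 = "\uFE0F"  # emoji presentation selector, often present inside sequences


def is_regional_indicator(ch: str) -> bool:
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def is_skin_tone_modifier(ch: str) -> bool:
    return 0x1F3FB <= ord(ch) <= 0x1F3FF


def count_visible_units(text: str) -> int:
    """Simplified count: flags, modifiers, and ZWJ sequences count as one."""
    count, i, n = 0, 0, len(text)
    while i < n:
        ch = text[i]
        i += 1
        count += 1
        if is_regional_indicator(ch):
            # Regional indicators pair up two at a time to form a flag.
            if i < n and is_regional_indicator(text[i]):
                i += 1
            continue
        # Absorb everything that attaches to the current base.
        while i < n:
            nxt = text[i]
            if is_skin_tone_modifier(nxt) or nxt == VS16:
                i += 1
            elif nxt == ZWJ:
                # Assumption: whatever follows a joiner belongs to this unit.
                i += 2 if i + 1 < n else 1
            else:
                break
    return count


flag = "\U0001F1EF\U0001F1F5"
person = "\U0001F9D1\U0001F3FF"
family = ZWJ.join(["\U0001F468", "\U0001F469", "\U0001F467", "\U0001F466"])

for sample in (flag, person, family):
    print(len(sample), "code points ->", count_visible_units(sample), "visible")
# 2 code points -> 1 visible
# 2 code points -> 1 visible
# 7 code points -> 1 visible
```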

True inclusivity in text isn’t just about what we display — it’s about measuring it correctly too.