Why JavaScript string.length Lies to You

For over twenty years, JavaScript developers have used string.length to count characters. It works perfectly for English text that uses only basic Latin letters. But the moment someone types an emoji, a letter built with a combining accent, or a flag, the result no longer matches what is on screen. This isn’t a bug in JavaScript; it’s a deliberate design based on how text encoding worked in the 1990s.

JavaScript strings are stored as sequences of UTF-16 code units, and string.length counts those units, not characters. Each unit is 16 bits, which was enough for every character in the earliest versions of Unicode. But Unicode grew far beyond that limit. Characters outside the Basic Multilingual Plane, including most emoji, require two code units (a surrogate pair) instead of one. And because many emoji are sequences of several such characters, a single visible emoji can take up two, four, or even eleven positions in a JavaScript string.
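A minimal sketch of the difference between code units and code points, runnable in a browser console or Node.js. The variable names are only illustrative, and the strings are written as escapes so the exact code points are unambiguous.

```javascript
const latin = "A";            // U+0041, inside the Basic Multilingual Plane
const thumbsUp = "\u{1F44D}"; // U+1F44D THUMBS UP SIGN, outside the BMP

console.log(latin.length);    // 1
console.log(thumbsUp.length); // 2 (one surrogate pair)

// The two 16-bit code units that make up the pair:
console.log(thumbsUp.charCodeAt(0).toString(16)); // d83d (high surrogate)
console.log(thumbsUp.charCodeAt(1).toString(16)); // dc4d (low surrogate)

// String iteration walks code points, so spreading gives one entry here:
console.log([...thumbsUp].length); // 1
```

Counting code points with the spread operator fixes this particular case, but it still overcounts flags, joined emoji, and combining accents, which are built from several code points each.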

The Most Common Surprise

Type a flag emoji, in quotes, into any browser console and read its .length. You’ll get 4, even though you clearly see only one flag. Do the same with a four-person family emoji and you’ll get 11, and more if skin tone modifiers are added. Even the simple letter e followed by a combining accent returns 2 instead of 1, while the precomposed é returns 1. These results feel like lies, but they’re actually following the rules of UTF-16 encoding perfectly.
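The numbers above can be reproduced with a short sketch like the following. The flag of France is an arbitrary choice and the variable names are illustrative. Escapes are used instead of literal emoji so the underlying sequences are visible.

```javascript
// A flag is two regional indicator symbols (here F + R), each a surrogate pair.
const flag = "\u{1F1EB}\u{1F1F7}";

// Man, woman, girl, boy, glued together by three zero-width joiners (U+200D).
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

// "e" followed by U+0301 COMBINING ACUTE ACCENT, versus the precomposed letter.
const combined = "e\u0301";
const precomposed = "\u00E9";

console.log(flag.length);        // 4
console.log(family.length);      // 11 (4 people x 2 units + 3 joiners)
console.log(combined.length);    // 2
console.log(precomposed.length); // 1

// Normalizing to NFC merges this base letter and accent into one code unit.
console.log(combined.normalize("NFC").length); // 1
```

Normalization helps with accented letters that have a precomposed form, but it does nothing for flags or joined emoji.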

What Developers Get Wrong

Many applications still use string.length for validation: limiting usernames to 15 characters, truncating tweets, or enforcing message length. When users with international names or emoji in their display name hit these limits unexpectedly, they blame the platform — but the real issue is using the wrong measurement for modern text.
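As a rough illustration of how this goes wrong, here is a sketch of a length check and a truncation that both rely on string.length. The 15-unit limit mirrors the username example above; the function and variable names are hypothetical.

```javascript
// Hypothetical limit and helper, for illustration only.
const MAX_NAME_LENGTH = 15;

function isValidNameNaive(name) {
  // Counts UTF-16 code units, not visible characters.
  return name.length <= MAX_NAME_LENGTH;
}

// "Maria " followed by a four-person family emoji: 7 visible characters.
const displayName = "Maria \u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";
console.log(displayName.length);            // 17
console.log(isValidNameNaive(displayName)); // false, despite only 7 visible characters

// Truncating by code units can cut a surrogate pair in half.
const message = "Hi \u{1F44D}";             // "Hi " plus a thumbs-up emoji
const truncated = message.slice(0, 4);

// The last unit is now a lone high surrogate, which typically renders as a
// replacement glyph instead of the emoji.
console.log(truncated.charCodeAt(3).toString(16)); // d83d
```

The second case is the more damaging one: the check does not just miscount, it produces a string that is no longer well-formed text.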

The Correct Modern Approach

Today, the web has a proper solution: the Internationalization API’s Segmenter, exposed in code as Intl.Segmenter. This built-in feature of modern browsers and Node.js implements the Unicode grapheme cluster rules and, with its granularity set to "grapheme", returns a count that matches what a human would see in nearly every case. It knows that a base character plus three combining marks is still one visible character. It understands that a man, woman, girl, boy, and three invisible joiners form a single family emoji.
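A minimal sketch of grapheme-aware counting and truncation with Intl.Segmenter follows. The helper names countGraphemes and truncateGraphemes are assumptions made for this example, not part of any API, and the default locale is used for simplicity.

```javascript
// One segmenter instance can be reused across calls.
const graphemeSegmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

// Count user-perceived characters (grapheme clusters) in a string.
function countGraphemes(text) {
  let count = 0;
  for (const _segment of graphemeSegmenter.segment(text)) count += 1;
  return count;
}

// Keep at most maxGraphemes clusters, never splitting one in the middle.
function truncateGraphemes(text, maxGraphemes) {
  let result = "";
  let count = 0;
  for (const { segment } of graphemeSegmenter.segment(text)) {
    if (count === maxGraphemes) break;
    result += segment;
    count += 1;
  }
  return result;
}

const familyEmoji = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";
console.log(familyEmoji.length);                    // 11
console.log(countGraphemes(familyEmoji));           // 1
console.log(countGraphemes("e\u0301\u0302\u0303")); // 1 (base letter plus three combining marks)
console.log(countGraphemes("\u{1F1EB}\u{1F1F7}"));  // 1 (a flag)

// Truncation keeps the whole emoji instead of half a surrogate pair.
console.log(truncateGraphemes("Hi \u{1F44D}!", 4).length); // 5 code units: "Hi " + emoji
```

A validation rule such as "at most 15 characters" can then compare countGraphemes(name) against the limit instead of name.length. If the text is also stored in a size-limited database column, the byte or code unit limit still needs its own separate check.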

Moving Forward

Accurate character counting is no longer optional. As text becomes more diverse and expressive, systems that rely on outdated methods will increasingly fail their users. The good news is that the solution is now built into every modern browser and easy to use. Tools like this one exist to help developers and users alike understand the true length of text in the Unicode era.

The future of text handling is here — and it counts characters the way people do.