Emoticons and Emoji in SMS messages

Publication date: 2019-06-03
Last update: 2019-12-24
Author: Victor Celer

SMS messages can include smileys, flags and other pictograms, for example:

SMS with emoticons

Most Unicode-capable phones, especially those manufactured in 2010 or later, support at least the most common emoticons and emoji. If a phone can display non-Latin characters (e.g. Chinese or Cyrillic), it is very likely to support emoticons as well. The following table summarizes emoticon support in various smartphones:

Platform        Emoticon and emoji support
Android         Most emoticons and emoji, since Android 4.3
iOS             Most emoticons and emoji, since iPhone 4
Windows Phone   Most emoticons and emoji, except flags

To carry emoticons, the SMS encoding must be set to UCS2 Big Endian. Every character in the message is then encoded in 16-bit units. Latin characters and all other characters of the Basic Multilingual Plane take 2 bytes each, even if most of the message is plain Latin text. Most emoji lie outside that plane and are sent as a surrogate pair, which takes 4 bytes. The maximum SMS length is 140 bytes in any GSM/UMTS/LTE network, so a message containing emoticons can hold at most 140 / 2 = 70 16-bit units, and fewer than 70 visible characters if some of them are emoji encoded as surrogate pairs. Keep in mind that a flag is encoded as 2 characters, the "regional indicator" symbols that spell the two-letter country code defined in ISO 3166-1. Each regional indicator takes 4 bytes, so a flag takes 8 bytes. If the handset supports Unicode but not the specific flag, the regional indicators are displayed instead.
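The byte budget can be checked with a few lines of Python. This is only a minimal sketch: the helper names encode_sms_text and fits_in_single_sms are made up for illustration, and it relies on Python's "utf-16-be" codec, which writes big-endian 16-bit units without a BOM and encodes characters outside the Basic Multilingual Plane (most emoji, and each regional indicator of a flag) as a 4-byte surrogate pair.

```python
SMS_MAX_BYTES = 140  # maximum SMS payload size in bytes


def encode_sms_text(text: str) -> bytes:
    """Encode text as big-endian 16-bit units, without a BOM."""
    return text.encode("utf-16-be")


def fits_in_single_sms(text: str) -> bool:
    """Return True if the encoded text fits into one 140-byte SMS."""
    return len(encode_sms_text(text)) <= SMS_MAX_BYTES


if __name__ == "__main__":
    samples = [
        "Hi",                      # two Latin letters
        "\u263a",                  # smiley inside the Basic Multilingual Plane
        "\U0001F600",              # emoji outside it (surrogate pair)
        "\U0001F1EB\U0001F1F7",    # flag: regional indicators F + R
    ]
    for sample in samples:
        payload = encode_sms_text(sample)
        # Show the text, its size in bytes and the hex dump of the payload.
        print(repr(sample), len(payload), payload.hex().upper())
```

Running the script prints the encoded size of each sample, which makes it easy to see how quickly emoji and flags use up the 140 bytes.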

If the message length exceeds the maximum SMS capacity, it is possible to use concatenation. Sometimes, especially in roaming, a concatenated Unicode SMS may appear garbled on screen. This happens because most phones start to display the message as soon as the first fragment is received, which is not necessarily the first fragment of the sequence. The network can change the order of the fragments during delivery, so some fragments may arrive out of order, be duplicated or even be dropped. To prevent this kind of problem it can be useful to avoid concatenation. For example, a long text message can be split into several independent (non-concatenated) messages.
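Splitting a long text into independent messages could look like the sketch below. The function name split_message and the limit constant are assumptions made for this example. The code counts 16-bit units the same way the encoding does (2 units for a character above U+FFFF, 1 otherwise) and never cuts a surrogate pair in half. It does not try to break at word boundaries, and it may still separate the two regional indicators of a flag, so a real implementation would need extra care there.

```python
from typing import List

MAX_UNITS = 70  # 140 bytes / 2 bytes per 16-bit unit


def utf16_units(ch: str) -> int:
    """Number of 16-bit units needed for one character."""
    return 2 if ord(ch) > 0xFFFF else 1


def split_message(text: str, max_units: int = MAX_UNITS) -> List[str]:
    """Split text into parts that each fit into one non-concatenated SMS."""
    parts: List[str] = []
    current: List[str] = []
    used = 0
    for ch in text:
        need = utf16_units(ch)
        # Start a new part when the next character would not fit.
        if used + need > max_units and current:
            parts.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += need
    if current:
        parts.append("".join(current))
    return parts


if __name__ == "__main__":
    long_text = "Hello \U0001F600 " * 20
    for number, part in enumerate(split_message(long_text), start=1):
        print(number, len(part.encode("utf-16-be")), "bytes")
```

Each part can then be submitted as a separate SMS, so a lost or reordered fragment affects only that part and not the whole text.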

If you want to compose an SMS containing emoticons in your favorite text editor, make sure the editor supports Unicode and the encoding is set to UCS2 Big Endian. For example, the freeware editor Notepad++ supports this feature:

Notepad++ screen

There is a clickable table containing various emoticons. Just click the emoticon and use Ctrl+C / Ctrl+V to paste it into the text editor.

If you view the file in your favorite hex editor, you will notice a leading FEFF, which is the Byte Order Mark (BOM). The BOM must not be included in the SMS message! The rest of the file is exactly what should be posted as the SMS payload.

UCS2 with BOM in hexa
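Removing the BOM from a saved file can be automated. The following is a minimal sketch, assuming the file was saved as UCS2 Big Endian with a BOM. The function names and the file name message.txt are placeholders chosen for this example, while codecs.BOM_UTF16_BE is the standard-library constant for the bytes FE FF.

```python
import codecs


def sms_payload_from_file_bytes(data: bytes) -> bytes:
    """Return the file content without a leading big-endian BOM (FE FF)."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):]
    return data


def payload_hex(data: bytes) -> str:
    """Hex dump of the payload, in upper case like most hex editors show it."""
    return sms_payload_from_file_bytes(data).hex().upper()


if __name__ == "__main__":
    # "message.txt" is a hypothetical file saved from the text editor.
    with open("message.txt", "rb") as handle:
        raw = handle.read()
    print(payload_hex(raw))
```

A file that has no BOM is returned unchanged, so the helper is safe to apply to payloads that were already cleaned.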

Remember to test your messages on a real handset, preferably on several different ones, because the look & feel of the emoticons is very handset-dependent.