Emoji Validation
Introduction
I have never given much thought to emoji validation before, so I will summarize the specifications and validation methods for emojis.
About Emojis
Emojis are special characters used to display pictures within text.
😳
For example, as shown above, you can display a face illustration. Incidentally, the term “emoji” is also used in English and is widely recognized worldwide as a word of Japanese origin.
Emojis can be represented using Unicode. Furthermore, by using ZWJ sequences, you can create emojis consisting of various combinations.
For instance, the family emoji 👨👩👦 is created by combining the man 👨 (U+1F468), ZWJ (U+200D), and woman 👩 (U+1F469).
The Zero Width Joiner (ZWJ) is represented in Unicode as U+200D.
Validation Methods
You can detect emojis by retrieving their Unicode using regular expressions.
Each emoji category has a somewhat defined Unicode range, and these ranges can be handled with regular expressions. However, since emojis may include not only simple code points but also modifiers and joiners, it’s essential to consider these elements for complete detection.
For example, you can specify a particular range in a regular expression as shown below:
const emojiRegex = /[\u2600-\u26FF\u1F300-\u1F5FF\u1F600-\u1F64F\u1F680-\u1F6FF\u1F1E6-\u1F1FF\u1F900-\u1F9FF]/;
Additionally, Unicode provides “Emoji Properties” to define the classification and characteristics of emojis. These properties help identify emojis and their modifiers, enabling filtering and operations tailored to specific use cases.
\p{Emoji} matches characters with the “Emoji Property” in Unicode.
// Example Usage:
const emojiRegex = /[\p{Emoji}]/u;
function containsEmoji(text) {
return emojiRegex.test(text);
}
console.log(containsEmoji("こんにちは🌟")); // true
console.log(containsEmoji("こんにちは")); // false
There are also other properties, such as Emoji_Presentation, which represents emoji expressions, and Emoji_Modifier, which handles emoji modifiers. However, the Emoji Property may only be supported in modern browsers and might fail to detect certain emojis.
Emoji Validation Libraries
Here are some commonly used JavaScript libraries for emoji validation:
- emoji-regex Provides regular expressions compatible with Unicode emojis.
const emojiRegex = require('emoji-regex');
const regex = emojiRegex();
console.log(regex.test('😀')); // true
console.log(regex.test('A')); // false
- zod The validation library zod can also validate emojis.
const emojiSchema = z.string().emoji();
console.log(emojiSchema.parse('😀')); // true
console.log(emojiSchema.parse('A')); // false
Conclusion
Until now, I’ve been using emojis without much thought, but upon researching, I discovered many intricate aspects and gained a deeper understanding of Unicode. It was a very educational experience.