Beyond Just Text: Semantic Emoji Similarity Modeling to Support Expressive Communication
Abstract
Emoji, a set of pictographic Unicode characters, have seen strong uptake over the last couple of years. All common mobile platforms and many desktop systems now support emoji entry, and users have embraced their use. Yet, we currently know very little about what makes for good emoji entry. While soft keyboards for text entry are well optimized, based on language and touch models, no such information exists to guide the design of emoji keyboards.
In this article, we investigate the problem of emoji entry, starting with a study of the current state of the emoji keyboard implementation in Android. To enable novel emoji keyboard designs, we then explore a model of emoji similarity that can inform such designs. This semantic model is based on data from 21 million collected tweets containing emoji. We compare this model against a solely description-based model of emoji in a crowdsourced study. Our model shows good performance in capturing detailed relationships between emoji.