Teaching Machines to Feel: Emoji & Deep Learning

Lei Feng network editor's note (compiled by CSDN editor Zhou Jianding): The chat assistant Dango is built on a neural network. Its developers trained the network on millions of examples so that the program could better understand what emoji mean.

Neural networks have recently become the tool of choice for a whole range of computer science problems: Facebook uses them to identify faces in images, and Google uses them to identify everything in images. Apple uses them to understand what you say to Siri, and IBM uses them to operationalize business-unit synergies.

This is all very impressive. But what about real problems? Can a neural network help you find the right emoji when you need it?

Oh, yes. They can.

This article outlines some of the engineering behind Dango: how we learn automatically from hundreds of millions of real-world uses of emoji, and how that yields a model light and fast enough to predict emoji for you in real time on a mobile phone.


What is Dango?

Dango is a floating assistant that runs on your phone and predicts emoji, stickers, and GIFs based on what you and your friends are writing in any app. This lets you have the same rich conversations everywhere: Messenger, Kik, WhatsApp, Snapchat, and more. (Making this work across all apps is a big challenge in itself, but it is outside the scope of this article.)



Recommending emoji is hard: Dango must understand the meaning of what you are writing and then suggest the emoji you would want to use. At its core, Dango's predictions come from a neural network. A neural network is a computational structure with millions of tunable parameters, connected to one another in a way loosely inspired by the connections between neurons in the human brain.

We train the network by initializing these parameters randomly and then showing it millions of real-world examples of emoji use gathered from across the internet, such as:


At first the network just guesses at random, but with each new training example it adjusts its millions of parameters slightly so that it performs better on that example. After a few days of training on a top-end GPU, the network begins to output more meaningful recommendations:


What we learned from emoji

This data-driven approach to emoji prediction means that Dango knows emoji better than we do. Dango teaches us new slang and new ways in which people around the world use emoji to tell stories.

For example, if you write "Kanye is the", Dango will predict the goat emoji. The goat, of course, stands for Greatest of All Time (GOAT), a title Kanye claimed earlier this year:

He said this on realizing that he was the greatest artist of his time and of all time.

KANYE WEST (@kanyewest), 2016-02-14


Dango can also express an idea with several emoji at once. For example, if you live in British Columbia or Colorado and are enjoying life, Dango will suggest a fitting combination:

Or, if you are angry with someone and want them to get out, Dango will happily show them the door:

Dango has also picked up a great deal from internet culture. It understands memes and trends. For example, if you have seen the picture of Kermit the Frog sipping tea, captioned "but that's none of my business", you will recognize this one:




Dango also understands many other obscure references and jokes, and it keeps learning in order to stay on top of trends:


Of course, there are many more that we have not discovered yet.

More than just emoji

Because Dango is trained on emoji, it might at first seem that the number of concepts it can understand and represent is very small. At the time of writing, the Unicode Consortium had standardized 1,624 emoji. That number is a headache for font designers, but it is still relatively small.

However, this does not mean that there are only 1,624 meanings. When you use emoji, their meaning depends on both their appearance and the context in which they are used, and these meanings vary widely.

One emoji may mean "raise your hand", "thank you", or "please".

Another may mean, quite literally, eggplant.


In addition, emoji can be combined to express new concepts. For example, one emoji on its own means a kiss on the cheek, but one combination means whistling, and another means blowing out smoke.


These combinations can become quite elaborate:


This means that Dango can represent far more semantic concepts than there are individual emoji. This is powerful because it gives Dango a way to understand all sorts of general concepts, whether or not the Unicode Consortium has recognized them.

This is why Dango can also recommend stickers and GIFs. As mentioned above, Dango understands the idea of telling someone to get out:

So it can also recommend a GIF for you:



Going deeper

Let us look more closely at how this works.

A naive approach to recommending emoji (and the first one we tried in Dango) is to map individual words directly to emoji, along these lines:

However, this approach is limited, and it does not reflect how emoji (and language) are actually used. Many subtle combinations of words cannot be captured by a simple mapping.
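To make the limitation concrete, here is a minimal sketch of such a word-to-emoji lookup. The table `WORD_TO_EMOJI` and the function `suggest_by_lookup` are hypothetical names invented for this illustration; they are not Dango's actual data or code.

```python
# Hypothetical word-to-emoji table; the entries are illustrative only.
WORD_TO_EMOJI = {
    "happy": "😀",
    "pizza": "🍕",
    "party": "🎉",
}


def suggest_by_lookup(text):
    """Return the emoji of every known word, ignoring context entirely."""
    suggestions = []
    for word in text.lower().split():
        word = word.strip(".,!?")  # drop surrounding punctuation
        emoji = WORD_TO_EMOJI.get(word)
        if emoji is not None and emoji not in suggestions:
            suggestions.append(emoji)
    return suggestions


# Both sentences get the same suggestion although they mean opposite things.
print(suggest_by_lookup("I am very happy"))
print(suggest_by_lookup("I am not happy"))
```

Because the lookup treats each word in isolation, the word "not" has no effect on the result, which is exactly the kind of case a simple mapping cannot handle.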


To handle these cases, Dango uses recurrent neural networks (RNNs). An RNN is a particular neural network architecture that is well suited to sequential input, so it is widely used in natural language processing, speech processing, and financial time-series analysis. Here I will go over what an RNN is only briefly. For a deeper understanding, see Andrej Karpathy's excellent overview.


An RNN handles sequential input by maintaining an internal state, a kind of memory that lets it keep track of the data it has already seen. This is essential for telling apart sentences such as "I am very happy" and "I am not happy".
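The idea of an internal state can be sketched in a few lines. This is a toy recurrent update with hand-picked weights and made-up two-number word encodings, all of which are assumptions for illustration; Dango's real network learns its parameters and is far larger.

```python
import math

# Hand-picked toy weights (hypothetical; a real network learns these).
W_IN = [[0.5, -0.3], [0.8, 0.2]]    # input -> hidden
W_REC = [[0.1, 0.7], [-0.6, 0.4]]   # previous hidden state -> hidden

# Hypothetical two-number encodings of three words.
WORD_VECTORS = {"not": [1.0, 0.0], "very": [0.0, 1.0], "happy": [1.0, 1.0]}


def rnn_step(state, x):
    """One recurrent update: new_state = tanh(W_IN @ x + W_REC @ state)."""
    new_state = []
    for i in range(len(state)):
        total = sum(W_IN[i][j] * x[j] for j in range(len(x)))
        total += sum(W_REC[i][j] * state[j] for j in range(len(state)))
        new_state.append(math.tanh(total))
    return new_state


def encode(words):
    """Feed the words in order; the final state summarizes the sequence."""
    state = [0.0, 0.0]
    for word in words:
        state = rnn_step(state, WORD_VECTORS[word])
    return state


print(encode(["very", "happy"]))
print(encode(["not", "happy"]))
```

Both sequences end with the same word, yet the final states differ, because the state carried over from the earlier word changes how "happy" is processed.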

Multiple RNNs can also be stacked on top of one another: each RNN takes the input sequence, transforms it into a new, more abstract representation, and passes that to the next RNN, and so on. The deeper the stack, the more complex the functions the network can represent. Incidentally, this is where the now-popular term "deep learning" comes from. Some major breakthroughs on hard problems have come simply from stacking more layers.

Dango's neural network ultimately outputs a list of a few hundred numbers. This list can be interpreted as a point in a high-dimensional space, just as three numbers can be interpreted as the x, y, and z coordinates of a point in three-dimensional space.

We call this high-dimensional space a semantic space and think of it as a multi-dimensional grid in which different points represent different ideas. In this space, similar ideas lie close together. Deep learning pioneer Geoff Hinton calls the points in this space "thought vectors". What Dango learns during training is how to convert both natural-language sentences and emoji into individual vectors in this space.

So, when Dango receives some text, it maps the text into this semantic space. To decide which emoji to recommend, it projects each emoji's vector onto the vector of the text. Projection is a simple operation that gives a measure of how similar two vectors are. Dango then recommends the emoji with the longest projections: these are the emoji closest in meaning to the input text.
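The ranking step described above can be sketched as follows. The three-dimensional vectors and the names `EMOJI_VECTORS` and `recommend` are hypothetical; real vectors have hundreds of learned dimensions. The sketch ranks by dot product, which gives the same order as ranking by projection length, since the two differ only by the length of the sentence vector, and that is the same for every emoji.

```python
# Hypothetical 3-dimensional emoji vectors, written by hand for illustration.
EMOJI_VECTORS = {
    "😀": [0.9, 0.1, 0.0],
    "😡": [-0.8, 0.2, 0.1],
    "🍕": [0.1, 0.0, 0.9],
}


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend(sentence_vector, top_n=2):
    """Rank emoji by the dot product of their vector with the sentence vector."""
    ranked = sorted(
        EMOJI_VECTORS,
        key=lambda emoji: dot(EMOJI_VECTORS[emoji], sentence_vector),
        reverse=True,
    )
    return ranked[:top_n]


# Pretend the network mapped a cheerful sentence to this point.
happy_sentence = [0.8, 0.2, 0.1]
print(recommend(happy_sentence))
```

An emoji whose vector points the same way as the sentence vector scores high, while one pointing the opposite way gets a negative score and falls to the bottom of the list.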

Visualizing the semantic space

For visual thinkers, this spatial metaphor is a powerful tool that helps us reason and talk about neural networks intuitively. (At Whirlscape, we are rather fond of spatial metaphors; see our earlier article on the Minuum keyboard algorithm.)

To help picture Dango's semantic space, we can use a popular technique for visualizing high-dimensional spaces called t-distributed stochastic neighbor embedding, or t-SNE. The technique tries to place each high-dimensional point in two dimensions so that points that are close together in the original space stay close together in the two-dimensional one. The mapping is not perfect, but it can still tell us a lot. Here we use t-SNE to visualize the emoji in the semantic space:

Open an interactive map and explore

Notice how semantically similar emoji automatically cluster together in this space. For example, most of the faces gather in the "face peninsula".

The happy faces sit in one region:

The angry faces sit in another:

All the hearts are gathered near the right side of the mountain, a spot we call "Point Love."

Looking elsewhere on the map, you can find other interesting clusters: basketball, rugby, volleyball, and football all sit close together, and the faces with hair are separated from the faces without hair (apparently this has to do with whether they want to go out). At the far right you can see some flag emoji along with some rarely used ones, such as the filing cabinet and the fast-forward button.

Note that Dango was never explicitly told that faces are different from hearts, or beer, or farm animals. Dango produced this semantic map by training on hundreds of millions of real-world examples of emoji use from the internet. So what do we mean by training here?

Before training, the neural network is initialized with more or less random values; it essentially starts as a blank slate. Sentences are mapped to random points in the semantic space, and the emoji are scattered randomly across it.

To train a neural network, we define an objective function, which is basically a way of scoring how well the network performs on a given example. The objective function outputs a score that tells us how well the network predicted that example; the lower the score, the better. We then use a very simple algorithm called gradient descent. For each training example, gradient descent slightly adjusts the values of the millions of parameters in the network, each in the direction that lowers the objective function.
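The training loop described above can be sketched on a deliberately tiny problem. This example fits a single parameter `w` of the model `y = w * x` with a squared-error objective; the model, the data, and the learning rate are assumptions chosen for illustration and have nothing to do with Dango's actual objective function.

```python
def objective(w, example):
    """Squared error of the one-parameter model y = w * x; lower is better."""
    x, y = example
    return (w * x - y) ** 2


def gradient(w, example):
    """Derivative of the objective with respect to w."""
    x, y = example
    return 2 * (w * x - y) * x


def train(examples, w=0.0, learning_rate=0.01, epochs=200):
    """Per-example gradient descent: one small parameter update per example."""
    for _ in range(epochs):
        for example in examples:
            # Step against the gradient, i.e. downhill on the objective.
            w -= learning_rate * gradient(w, example)
    return w


# Toy data generated by y = 3x, so training should move w toward 3.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w = train(examples)
print(w, sum(objective(w, e) for e in examples))
```

A real network repeats the same kind of small update for millions of parameters at once, with the gradients computed automatically rather than derived by hand.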

After a few days of training on a GPU, the objective function stops improving: Dango is fully trained and ready to be put to use!

The future of language

Language is becoming visual. Emoji, stickers, and GIFs are hugely popular, even though using them well still takes a lot of effort. Enthusiasts curate a personal collection of favorite images for every occasion and memorize every page of the emoji keyboard, but the rest of us rely on the "most used" menu to reach for an emoji, and only occasionally a GIF.

This visual language has matured alongside technology, and that symbiotic relationship will continue. New technology will lead to new language, and new language will in turn give rise to new technology. Future communication will use artificial intelligence tools to help you weave images and text together seamlessly. Dango is proud to be at the forefront of this.

I hope you found this inspiring and, like us, can now picture your sentences projected somewhere in the semantic space, surrounded by hundreds of emoji. Perhaps you will start experimenting with neural networks of your own. If you do, please let us know!

Finally, give Dango a try and send us your feedback. Whenever you wonder which emoji to use, Dango will have the answer.
