If you come across a word you don’t understand, a single click brings up its definition.
And it’s not just Chinese. For example, when you are about to blurt out “amazing” (rendered in Chinese internet slang as “鹅妹子嘤”) but want to know whether there is a more elegant way to put it, one click gets you that too.
Convenient enough, isn’t it? It has a bit of a “Mom no longer has to worry about me being lost for words” feel (doge).
“Reverse Dictionary” from Tsinghua University
This handy tool is called WantWords, a reverse dictionary.
The AI behind it has a strong pedigree: it comes from the Natural Language Processing and Social Humanities Computing Laboratory at Tsinghua University, and the project is advised by Professor Sun Maosong and Associate Professor Liu Zhiyuan. “Reverse” means that, unlike a conventional dictionary, you do not look up a meaning by its word; instead, you give the dictionary a description and it finds the words for you.
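To make the contrast between forward and reverse lookup concrete, here is a minimal Python sketch. The three-entry dictionary, the stopword list, and the word-overlap scoring are all illustrative assumptions; WantWords itself uses a neural model, not keyword matching.

```python
# Toy contrast between forward and reverse lookup.
# The mini-dictionary, stopword list, and overlap scoring are illustrative
# assumptions, not the WantWords model.
import re

DICTIONARY = {
    "nostalgia": "a sentimental longing for the past",
    "umbrella": "a folding canopy that protects a person from rain",
    "insomnia": "the inability to fall asleep at night",
}

STOPWORDS = {"a", "the", "for", "that", "to", "at", "from"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def forward_lookup(word):
    """Conventional dictionary: word -> definition."""
    return DICTIONARY[word]


def reverse_lookup(description, top_k=2):
    """Reverse dictionary: description -> candidate words, ranked by overlap."""
    query = tokenize(description)
    scored = []
    for word, definition in DICTIONARY.items():
        overlap = len(query & tokenize(definition))
        scored.append((overlap, word))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for overlap, word in scored[:top_k] if overlap > 0]


print(forward_lookup("umbrella"))
print(reverse_lookup("longing for the past"))
```

Keyword overlap breaks down as soon as the description uses different wording from the stored definition, which is why a learned model is needed for a practical reverse dictionary.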
On GitHub, the authors say they hope the reverse dictionary can serve three purposes:
Solving the “tip-of-the-tongue phenomenon”, where a word is right there but you suddenly cannot recall how to say it
Helping new language learners
Helping people with word-selection anomia, who struggle to find the right word
The core AI behind this reverse dictionary is called the multi-channel reverse dictionary model, and the related paper was accepted at AAAI 2020.
Specifically, the multi-channel reverse dictionary model uses a bidirectional LSTM (BiLSTM) with attention as its basic framework and adds four feature-specific predictors. Using multiple predictors to identify different features of the target word from the input query has two benefits. First, target words with poor-quality embeddings can still be picked out through their features. Second, words whose embeddings are close to the correct target word but whose features contradict the query can be filtered out.
In other words, the AI can pick words more accurately.
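The filtering effect described above can be sketched in a few lines of Python. This is a toy illustration, not the authors' implementation: the candidate words, all scores, and the weights are made-up values, and combining the channels as a weighted sum is an assumption of this sketch. The four channel names follow the features the article describes.

```python
# Toy reranking with feature channels. All scores, weights, and candidate
# words are made-up illustrative values, and the weighted-sum combination
# is an assumption of this sketch.

# Direct score: how close each candidate's embedding is to the query.
word_scores = {"run": 0.80, "runner": 0.78, "sprint": 0.60}

# Per-channel scores: how well each candidate matches the features
# predicted from the query (here the query describes a person, i.e. a noun).
channel_scores = {
    "part_of_speech": {"run": 0.10, "runner": 0.90, "sprint": 0.20},
    "morpheme": {"run": 0.50, "runner": 0.70, "sprint": 0.10},
    "hierarchy": {"run": 0.20, "runner": 0.80, "sprint": 0.20},
    "sememe": {"run": 0.30, "runner": 0.90, "sprint": 0.30},
}

# Hypothetical channel weights.
weights = {
    "word": 1.0,
    "part_of_speech": 0.5,
    "morpheme": 0.5,
    "hierarchy": 0.5,
    "sememe": 0.5,
}


def combined_score(word):
    """Weighted sum of the direct word score and every channel score."""
    total = weights["word"] * word_scores[word]
    for channel, scores in channel_scores.items():
        total += weights[channel] * scores[word]
    return total


def rank(candidates):
    """Sort candidates from highest to lowest combined score."""
    return sorted(candidates, key=combined_score, reverse=True)


print(max(word_scores, key=word_scores.get))  # best by embedding score alone
print(rank(list(word_scores)))                # ranking after adding channels
```

In this toy setup, "run" wins on embedding similarity alone, but its features contradict the query, so "runner" moves to the top once the channel scores are added.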
To make it easier for the AI to find the truly “correct” word, the authors considered two “internal features” of a word, its part of speech and its morphemes, along with two “external features”, its place in a category hierarchy and its sememes.
The hierarchy distinguishes, for example, whether a word denotes an entity or a concept, with more specific kinds of entity nested under “entity”.
In linguistics, a sememe is the smallest indivisible semantic unit. Some linguists hold that a sememe system applies across languages and is not tied to any particular one.
For example, the word “boy” can be expressed by the three sememes “human”, “male” and “child”, and “girl” by the combination of “human”, “female” and “child”.
△ Source: HowNet
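The boy/girl example maps naturally onto sets. The sketch below uses only the sememes given in the example; storing them as Python sets and comparing them with Jaccard overlap are assumptions made for illustration, not something the article or HowNet prescribes.

```python
# Sememe sets taken from the boy/girl example. Representing them as Python
# sets and comparing them with Jaccard overlap is an assumption of this sketch.
SEMEMES = {
    "boy": {"human", "male", "child"},
    "girl": {"human", "female", "child"},
}


def shared_sememes(a, b):
    """Sememes that two words have in common."""
    return SEMEMES[a] & SEMEMES[b]


def jaccard(a, b):
    """Size of the shared sememes divided by the size of their union."""
    union = SEMEMES[a] | SEMEMES[b]
    return len(SEMEMES[a] & SEMEMES[b]) / len(union)


print(sorted(shared_sememes("boy", "girl")))
print(sorted(SEMEMES["boy"] - SEMEMES["girl"]))
print(jaccard("boy", "girl"))
```

The two words share “human” and “child” and differ in a single sememe, which is the kind of signal a sememe predictor can use to tell near-synonyms apart.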
New algorithm tested, new systems under development
As mentioned above, the WantWords reverse dictionary was born in the Tsinghua NLP laboratory and was built mainly by Qi Fanchao and Zhang Lei in 2019.
Speaking with Guokr, Qi Fanchao said that they did not promote the project at first, though classmates who tried it gave good feedback. Then, in November last year, the project suddenly went viral, and a surge in traffic overwhelmed the server for a while. Since then, WantWords has received more attention, along with plenty of suggestions and technical support from volunteers.
Beyond the web version, a WeChat mini program has officially launched, and an app is under development.
△ The “WantWords” WeChat mini program
According to the R&D team’s latest announcement, a new reverse-lookup algorithm was tested before New Year’s Eve this year, and it performs significantly better than the original algorithm. In addition to the reverse dictionary, the research team has also developed a “semantic retrieval and recommendation system for famous sayings and sentences” and a “Chinese word collocation query system”.
These two systems are not yet open to the public. Interested readers can read the papers while they wait (links at the end of the article).
The R&D team also said that WantWords, as an open source project, welcomes anyone to join at any time to take part in design and development, suggest features, and report issues. If you are interested, check the announcement on the official website.
Related papers:
https://arxiv.org/abs/1912.08441
https://arxiv.org/abs/2202.13145
Reference links:
[1] Official website: https://wantwords.net/
[2] Guokr article: https://mp.weixin.qq.com/s/er-JwST7dUQjMh6VzBE1bA
[3] https://deeplang.feishu.cn/docs/doccnoH9ncCZspo2Ubx79bpZ0Lh#ijyigh