
2012-04-18

Weak signal synonym detection (in Swedish)

As we have previously discussed on this blog, Ethersource continuously learns new terminology by reading what is written on the Internet. As an example of how Ethersource picks up even weak linguistic signals, we recently noticed that it suggested the word “tutilurfräs” as a very positive Swedish term. None of us had ever encountered the term before. We looked up the source of this linguistic invention and found that it originates from a tweet by Swedish punk icon Kajsa Grytt.



A (somewhat creative) translation of the tweet into English would be something like: “Oh Pelle! Oh Hives! What tutilurfräs!! I think they are genius. That band makes me absolutely happy.”

Quite obviously, Ethersource is correct in its understanding that “tutilurfräs” is a very positive word.

There are two lessons to be drawn from this example:

  1. If you do sentiment analysis in Swedish on Twitter and your model does not automatically learn new terminology, you should re-train or update your model to include the word “tutilurfräs”.
  2. If you invent a completely new word and start blogging or tweeting about it, Ethersource will learn it. It is true that in space, no one can hear you scream, but on the Internet, even if you whisper, Ethersource will understand you.
