A brief history of word embeddings (and some clarifications)

One of the strongest trends in Natural Language Processing (NLP) at the moment is the use of word embeddings, which are vectors whose relative similarities correlate with semantic similarity. Such vectors are used both as an end in themselves (for computing similarities between terms) and as a representational basis for downstream NLP tasks like text classification, document clustering, part-of-speech tagging, named entity recognition, sentiment analysis, and so on. That this is a trend is obvious from the proceedings of the recent large NLP conferences, e.g. ACL or EMNLP. For the first time (ever), semantics was…


Gavagai and the most recent Greek election

As we have done previously, we again followed Greek editorial and social media in the days preceding last week’s parliamentary election in Greece. The opinion polls published in the weeks before suggested a neck-and-neck race between the incumbent socialist Syriza party and the main contender, the conservative Nea Demokratia. Our friend Haralampos Karatzas took our numbers for analysis on his blog on Greek politics (in Swedish). They showed a very different picture: Syriza garnered more attention. As before, Syriza, being a controversial party, was also the focus of stronger expressions of sentiment than any other party. Last time around, we…


Gavagai at EMNLP 2015

We will present three research papers at this year’s EMNLP (Empirical Methods in Natural Language Processing) in Lisbon, Portugal. The first compares a support vector classifier with a lexicon-based approach for the task of detecting the stance categories speculation, contrast and conditional in English consumer reviews. The paper is called “Detecting Speculations, Contrasts and Conditionals in Consumer Reviews”, and will be presented on Thursday, September 17 at the co-located workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA). The second paper investigates a method for factorizing distributional semantic models that produces state-of-the-art results on…


Gavagai at CLEF 2015

We presented a short paper at the 6th CLEF 2015 Conference and Labs of the Evaluation Forum in Toulouse on the deliberations of a workshop on Evaluating Learning Language Models that we held last fall with generous support from ELIAS. The presentation raised a fair bit of interest and several requests for a follow-up workshop, and we are now motivated to continue by actually implementing the evaluation metrics suggested at the workshop.


Getting Started with Gavagai Explorer

If you have a collection of related texts and you want to know what themes they contain, then Gavagai Explorer is for you. Your collection of texts could, for example, be the answers to an open-ended question in a survey. The analysis required to understand your answers typically involves a significant amount of work, since you need to read each text and either remember the recurring themes or encode them in a topical framework. If you have a lot of answers, this work will take a long time, but there are other problems as well: the coding process is error prone…