Language Technology (LT) is a sub-field of Artificial Intelligence (AI) concerned with enabling machines to 'make sense' of human language. One application of LT that has gained widespread use in recent years, both scientifically and commercially, is Opinion Mining, also known as Sentiment Analysis (SA). The task of an SA system is to automatically identify the opinions, attitudes or emotions expressed in subjective text. This technology has been successfully applied to market analysis, political opinion analysis, reputation tracking, customer relationship management, news and social media monitoring, and much more. The main objective of this project is to provide open and publicly available resources for sentiment analysis for Norwegian, which are currently lacking. The project will take advantage of a peculiarity of how reviews and critiques are typically summarized in Norwegian arts and consumer journalism, namely by an explicit rating on a scale of 1 to 6, represented as the throw of a die. We propose to use this feature to semi-automatically compile a polarity-labeled text collection, which can then be used to train and evaluate machine-learned models for document-level sentiment analysis. Some applications require models that make more granular predictions at the sentence level and identify the targets and holders of opinions ('who thinks what about whom'). To enable such models, a subset of the reviews will be manually annotated with fine-grained, in-sentence polarity information. In AI in general, and LT in particular, the use of many-layered artificial neural networks (so-called Deep Learning) has recently seen a great revival, with many successful applications, including sentiment analysis. The classifiers developed in this project will seek to push the state of the art in large-scale sentiment analysis using deep neural architectures.
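As an illustration only, the step of turning die ratings into polarity labels might be sketched as follows in Python. The project description does not say how the ratings 1-6 map to polarity, so the thresholds here (1-2 as negative, 5-6 as positive, 3-4 left out) are assumptions. The function names and the (text, rating) input format are likewise hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed thresholds; the project description does not specify them.
NEGATIVE_MAX = 2  # ratings 1-2 are treated as negative
POSITIVE_MIN = 5  # ratings 5-6 are treated as positive


def polarity_from_die(rating: int) -> Optional[str]:
    """Map a die rating (1-6) to a polarity label, or None for mid-scale ratings."""
    if not 1 <= rating <= 6:
        raise ValueError(f"die rating must be in 1..6, got {rating}")
    if rating <= NEGATIVE_MAX:
        return "negative"
    if rating >= POSITIVE_MIN:
        return "positive"
    return None  # ratings 3-4 are considered too ambiguous to label


def build_labeled_corpus(reviews: List[Tuple[str, int]]) -> List[Tuple[str, str]]:
    """Keep only reviews with a clear polarity and attach the label to each."""
    labeled = []
    for text, rating in reviews:
        label = polarity_from_die(rating)
        if label is not None:
            labeled.append((text, label))
    return labeled
```

Dropping the mid-scale ratings is one possible design choice for obtaining cleaner binary labels; a multi-class or regression setup could instead keep all six ratings as they are.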
Project leader: Erik Velldal
Institution: Institutt for informatikk