Distributional Semantics


The representation of word meaning in texts is a central problem in Computational Linguistics. When language learning technologies are applied to generalize linguistic observations of the target phenomena, information about the meaning of words plays a crucial role in the quality of the underlying statistical models. When training data is scarce, purely lexical information suffers from data sparseness, and some form of generalization is needed.

The distributional analysis of large-scale corpora is the instrument we apply to acquire and generalize lexical information. It acts as a general learning algorithm that provides effective lexical generalization without a strong dependence on hand-built resources. In this view, words are modeled geometrically, i.e. as points in a high-dimensional space, so that similar or related concepts lie close to each other in that space.
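The geometrical view can be illustrated with a minimal sketch. It is not the method used by the group: the toy corpus, the window size of 2, the use of raw co-occurrence counts, and the function names `cooccurrence_vectors` and `cosine` are all assumptions chosen for illustration. Each word is represented by the counts of the words that appear near it, and closeness in the space is measured with cosine similarity.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector of co-occurrence counts.

    The window size is an illustrative assumption, not a recommended value.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for this example only.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "the car needs fuel",
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_fuel = cosine(vectors["cat"], vectors["fuel"])
print(f"cat ~ dog:  {sim_cat_dog:.2f}")
print(f"cat ~ fuel: {sim_cat_fuel:.2f}")
```

On this toy corpus, "cat" and "dog" share most of their contexts and therefore score higher than "cat" and "fuel", which share none. Real word spaces are built from large corpora and typically apply a weighting scheme and dimensionality reduction on top of the raw counts.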


Roberto Basili, Danilo Croce

Related Projects

Wordspace page


SAG Publications

