Kernel Methods

Introduction

In machine learning, kernel methods are a class of algorithms for pattern analysis, such as Support Vector Machines or online learning algorithms. A kernel function expresses the similarity between two objects relevant to a target problem in a rich representation space. It implicitly maps each example into a new (richer) feature space, where examples may become separable, by exploiting the so-called "kernel trick": the kernel function computes the inner product in the richer (implicit) space without ever computing the coordinates of the data in that space. This operation is often computationally cheaper than the explicit computation of those coordinates. Kernel functions have been introduced for sequence data, graphs, text, and images, as well as vectors.
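The kernel trick can be illustrated with a minimal Python sketch (the function names are illustrative, not part of any cited work): for the homogeneous polynomial kernel of degree 2, evaluating (x . z)^2 in the input space gives exactly the inner product in the explicit, richer feature space.

import numpy as np

def explicit_phi(x):
    # Explicit feature map for the degree-2 homogeneous polynomial kernel
    # on 2-dimensional inputs: phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2).
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

def poly_kernel(x, z, degree=2):
    # Kernel trick: <phi(x), phi(z)> computed as (x . z)^degree,
    # without ever building the coordinates in the richer space.
    return np.dot(x, z) ** degree

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

# Both quantities coincide (16.0): the kernel evaluates the inner product
# in the implicit space at the cost of a dot product in the input space.
print(np.dot(explicit_phi(x), explicit_phi(z)))
print(poly_kernel(x, z))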

In NLP, kernel functions such as Sequence Kernels and Tree Kernels have been employed to separate the problem representation from the learning algorithm. The main idea behind these functions is that the algorithm can effectively learn the target phenomenon by focusing only on a notion of similarity between observations, even when that similarity is not fully captured by an explicit representation. A linguistic phenomenon can thus be modeled at a more abstract level, making the modeling process easier; a toy example of such a kernel is sketched below.
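As a minimal, hedged sketch (a toy word n-gram "spectrum" kernel, far simpler than the Sequence or Tree Kernels of the cited papers), the similarity between two sentences can be defined as the number of word n-grams they share; the learning algorithm then only needs the resulting Gram matrix (e.g., an SVM with a precomputed kernel), never an explicit feature vector.

from collections import Counter

def ngram_kernel(sentence_a, sentence_b, n=2):
    # Toy spectrum kernel: the inner product in the (implicit) space indexed
    # by all word n-grams, i.e., the count-weighted number of shared n-grams.
    def ngrams(sentence):
        tokens = sentence.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    a, b = ngrams(sentence_a), ngrams(sentence_b)
    return sum(count * b[gram] for gram, count in a.items())

s1 = "the cat sat on the mat"
s2 = "the dog sat on the mat"
# Shared bigrams: "sat on", "on the", "the mat" -> similarity 3
print(ngram_kernel(s1, s2))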


People

Roberto Basili, Danilo Croce


Related Projects


References

Christopher M. Bishop. 2006. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag New York, Inc., Secaucus, NJ, USA.

Michael Collins and Nigel Duffy. 2002. New ranking algorithms for parsing and tagging: kernels over discrete structures, and the voted perceptron. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL '02). Stroudsburg, PA, USA, pp. 263-270.


SAG Publications

Paolo Annesi, Danilo Croce, Roberto Basili (2013): Towards Compositional Tree Kernels. In: Joint Symposium on Semantic Processing, pp. 15, 2013.

Danilo Croce, Alessandro Moschitti, Roberto Basili (2011): Structured Lexical Similarity via Convolution Kernels on Dependency Trees. In: EMNLP, pp. 1034-1046, 2011.

Alessandro Moschitti, Daniele Pighin, Roberto Basili (2008): Tree kernels for semantic role labeling. In: Computational Linguistics, 34 (2), pp. 193–224, 2008.

Alessandro Moschitti, Daniele Pighin, Roberto Basili (2006): Tree kernel engineering in semantic role labeling systems. In: Proceedings of the Workshop on Learning Structured Information in Natural Language Applications, EACL 2006, pp. 49–56, 2006.