Archive for December 2015

The ECIR 2016 paper has been accepted!

The paper “Large-scale Kernel-based Language Learning through the Ensemble Nystrom methods” by Danilo Croce and Roberto Basili has been accepted at the 38th European Conference on Information Retrieval (ECIR 2016), to be held on 20-23 March 2016 in Padua, Italy (acceptance rate: 21%).

The list of accepted papers can be browsed at this link.

Abstract: Kernel methods have been used by many Machine Learning paradigms, achieving state-of-the-art performances in many Language Learning tasks. One drawback of expressive kernel functions, such as Sequence or Tree kernels, is the time and space complexity required both in learning and classification. In this paper, the Nystrom methodology is studied as a viable solution to face these scalability issues.
By mapping data in low-dimensional spaces as kernel space approximations, the proposed methodology positively impacts on scalability through compact linear representation of highly structured data. Computation can be also distributed on several machines by adopting the so-called Ensemble Nystrom Method.
Experimental results show that an accuracy comparable with state-of-the-art kernel-based methods can be obtained while reducing the required operations by orders of magnitude and enabling the adoption of datasets containing more than one million examples.
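To make the idea in the abstract concrete, here is a minimal sketch of a Nystrom-style mapping in plain Python. It is not the paper's implementation and not the KeLP API: the function names, the RBF kernel, the small jitter term and the uniform ensemble weights are all assumptions made for illustration. It also uses a Cholesky factor of the landmark Gram matrix in place of the eigendecomposition usually found in Nystrom implementations; both give vectors whose dot product equals the same kernel approximation.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel, used here only as a stand-in for an expensive kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def cholesky(matrix, jitter=1e-10):
    """Lower-triangular factor of a symmetric positive definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                # The jitter (an assumption) guards against round-off.
                low[i][j] = math.sqrt(matrix[i][i] + jitter - s)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def forward_substitute(low, b):
    """Solve low * z = b for z, with low lower triangular."""
    z = []
    for i, row in enumerate(low):
        s = sum(row[k] * z[k] for k in range(i))
        z.append((b[i] - s) / row[i])
    return z


def nystrom_embedder(landmarks, kernel):
    """Return a function mapping an example to a len(landmarks)-dim vector.

    The dot product of two embedded examples equals
    k(x, L) * K_LL^{-1} * k(L, y), the Nystrom approximation of k(x, y).
    """
    gram = [[kernel(a, b) for b in landmarks] for a in landmarks]
    low = cholesky(gram)

    def embed(x):
        return forward_substitute(low, [kernel(x, l) for l in landmarks])

    return embed


def ensemble_embedder(landmark_sets, kernel):
    """Concatenate several Nystrom embeddings, one per landmark set.

    Uniform mixture weights are assumed, so the dot product of two
    ensemble vectors is the average of the individual approximations.
    """
    experts = [nystrom_embedder(s, kernel) for s in landmark_sets]
    scale = math.sqrt(1.0 / len(experts))

    def embed(x):
        return [scale * v for expert in experts for v in expert(x)]

    return embed
```

Once every example is mapped this way, a linear learner can be trained on the resulting vectors, and each member of the ensemble depends only on its own landmark set, which is what makes it possible to compute the members on separate machines.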

KeLP 2.0.0 released

The 2.0.0 version of KeLP has been released. This is a major release that includes brand new features as well as a renewed architecture of the entire project.

KeLP is now organized into four Maven projects:

  1. kelp-core: it contains the infrastructure of abstract classes and interfaces needed to work with KeLP. It also includes some implementations of algorithms, kernels and representations, to provide a basic working environment.
  2. kelp-additional-kernels: it contains several kernel functions that extend the set of kernels made available in the kelp-core project.
  3. kelp-additional-algorithms: it contains several learning algorithms extending the set of algorithms provided in the kelp-core project.
  4. kelp-full: this is the complete package of KeLP. It aggregates the previous modules in one jar. It also contains a set of fully functioning examples showing how to implement a learning system with KeLP. The examples show how to use both batch and online learning algorithms, and cover standard kernels, Tree Kernels and Graph Kernels, with caching mechanisms.

Moreover, this new release adds consistency checks that run while a Dataset object is being populated, as well as:

CsvDatasetReader: it reads files in CSV format.

DCDLearningAlgorithm: an implementation of the Dual Coordinate Descent learning algorithm.
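For readers unfamiliar with dual coordinate descent, the following is a minimal sketch of the idea in plain Python. It assumes the linear SVM formulation with hinge loss and no bias term that is commonly associated with this method. It does not reproduce the DCDLearningAlgorithm class or any KeLP API: the function name dcd_train and its parameters are hypothetical.

```python
import random


def dcd_train(xs, ys, c=1.0, epochs=50, seed=0):
    """Train a linear SVM (hinge loss, no bias) by dual coordinate descent.

    xs: list of equal-length feature vectors, ys: labels in {+1, -1}.
    Returns the primal weight vector w = sum_i alpha_i * y_i * x_i.
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    w = [0.0] * dim
    alpha = [0.0] * len(xs)  # one dual variable per example
    sq_norm = [sum(v * v for v in x) for x in xs]
    order = list(range(len(xs)))

    for _ in range(epochs):
        rng.shuffle(order)  # visit the dual variables in random order
        for i in order:
            if sq_norm[i] == 0.0:
                continue  # an all-zero example cannot change w
            # Gradient of the dual objective along coordinate i.
            margin = ys[i] * sum(wj * xj for wj, xj in zip(w, xs[i]))
            grad = margin - 1.0
            # Exact one-variable minimizer, clipped to the box [0, c].
            new_alpha = min(max(alpha[i] - grad / sq_norm[i], 0.0), c)
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                # Keep w consistent with alpha at O(dim) cost per step.
                for j in range(dim):
                    w[j] += delta * ys[i] * xs[i][j]
                alpha[i] = new_alpha
    return w


def predict(w, x):
    """Sign of the linear score; ties are resolved as +1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0.0 else -1
```

Each step touches a single example and updates the weight vector in place, so the cost of one pass grows linearly with the number of examples and features.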

Check out this new version from our repositories. The API Javadoc is already available. Your suggestions are very valuable to us, so download and try KeLP 2.0.0!