KeLP natively supports multiple representations of the same example. This is useful when the same data can be described by different observable properties. For example, in NLP one can derive features of a sentence at different syntactic levels (e.g. part-of-speech, chunk, dependency) and handle each of them in a learning algorithm with a different kernel function.
As an example, consider the following representation:
    service |BV| _.:1.0 _and:1.0 _good:1.0 _is:1.0 _look:1.0 _sharp:1.0 _staff:1.0 _the:1.0 _they:1.0 _too:1.0 _very:1.0 |EV||BDV| 0.37651452,0.32109955,0.07726285,0.053550426,-0.06682896,-0.20111458,-0.14017934,... |EDV| |BS| The staff is very sharp and they look good too . |ES| |BS| 35820984#608922#3 |ES|
It is composed of:
- A label (i.e. the class to be learned; here, service).
- A sparse vector, delimited by the special tokens |BV| and |EV|; in this example it is a bag of words. Note that feature names can be strings.
- A dense vector, delimited by the special tokens |BDV| and |EDV|.
- Two string representations, each delimited by |BS| and |ES|; they can be used as comments or as input to advanced kernel functions (e.g. sequence kernels).
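To make the layout described above concrete, here is a minimal sketch that splits one such line into its label and its delimited representations. It is not KeLP's own parser: the class `RepresentationSplitter`, its nested `Block` type and the sample line are hypothetical and only illustrate how the boundary tokens pair up. It assumes the representations are numbered in their order of appearance, which matches the indices used by the kernels later in this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative splitter for the line format described above; not KeLP's own parser. */
final class RepresentationSplitter {

    /** One delimited block: its kind ("BV", "BDV" or "BS") and its raw text. */
    static final class Block {
        final String kind;
        final String content;

        Block(String kind, String content) {
            this.kind = kind;
            this.content = content;
        }
    }

    // Matches |BV| ... |EV|, |BDV| ... |EDV| and |BS| ... |ES|.
    // The back-reference forces the closing token to pair with the opening one.
    private static final Pattern BLOCK = Pattern.compile("\\|B(V|DV|S)\\|(.*?)\\|E\\1\\|");

    /** Returns the text before the first representation, i.e. the label. */
    static String label(String line) {
        Matcher m = BLOCK.matcher(line);
        return (m.find() ? line.substring(0, m.start()) : line).trim();
    }

    /** Returns the representations in their order of appearance (index 0, 1, 2, ...). */
    static List<Block> split(String line) {
        List<Block> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(line);
        while (m.find()) {
            blocks.add(new Block("B" + m.group(1), m.group(2).trim()));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Shortened, made-up line in the same layout as the example above.
        String line = "service |BV| _good:1.0 _staff:1.0 |EV||BDV| 0.5,-0.25 |EDV| "
                + "|BS| The staff is very sharp . |ES| |BS| some-id |ES|";
        System.out.println("label = " + label(line));
        List<Block> blocks = split(line);
        for (int i = 0; i < blocks.size(); i++) {
            System.out.println(i + " " + blocks.get(i).kind + " -> " + blocks.get(i).content);
        }
    }
}
```

In this sketch the sparse vector ends up at index 0 and the dense vector at index 1, while the two strings follow at indices 2 and 3.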
A multiple kernel learning algorithm can be applied to this representation. Let's look at an example of code:
The first part loads a dataset, prints some statistics and defines the basic objects for our learning procedure.
    // Read a dataset into a trainingSet variable
    SimpleDataset trainingSet = new SimpleDataset();
    trainingSet.populate("src/main/resources/multiplerepresentation/train.dat");
    // Read a dataset into a test variable
    SimpleDataset testSet = new SimpleDataset();
    testSet.populate("src/main/resources/multiplerepresentation/test.dat");

    // define the positive class
    StringLabel positiveClass = new StringLabel("food");

    // print some statistics
    System.out.println("Training set statistics");
    System.out.print("Examples number ");
    System.out.println(trainingSet.getNumberOfExamples());
    System.out.print("Positive examples ");
    System.out.println(trainingSet.getNumberOfPositiveExamples(positiveClass));
    System.out.print("Negative examples ");
    System.out.println(trainingSet.getNumberOfNegativeExamples(positiveClass));

    System.out.println("Test set statistics");
    System.out.print("Examples number ");
    System.out.println(testSet.getNumberOfExamples());
    System.out.print("Positive examples ");
    System.out.println(testSet.getNumberOfPositiveExamples(positiveClass));
    System.out.print("Negative examples ");
    System.out.println(testSet.getNumberOfNegativeExamples(positiveClass));

    // instantiate a passive aggressive algorithm
    KernelizedPassiveAggressiveClassification kPA = new KernelizedPassiveAggressiveClassification();
    // indicate to the learner what is the positive class
    kPA.setLabel(positiveClass);
    // set an aggressiveness parameter
    kPA.setAggressiveness(2f);
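To give some intuition for the aggressiveness parameter set above, the following sketch implements a single update of a Passive-Aggressive classifier in its PA-I form, where the aggressiveness caps the size of each correction. This is a simplified linear version written for illustration: the class `PassiveAggressiveSketch` is hypothetical, and whether KeLP's kernelized learner follows exactly this variant is an assumption.

```java
/** Minimal linear Passive-Aggressive (PA-I) binary classifier; a sketch, not KeLP's implementation. */
final class PassiveAggressiveSketch {
    private final double[] weights;
    private final double aggressiveness; // upper bound on the step size of each update

    PassiveAggressiveSketch(int dimension, double aggressiveness) {
        this.weights = new double[dimension];
        this.aggressiveness = aggressiveness;
    }

    /** Dot product between the current weights and x. */
    double score(double[] x) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += weights[i] * x[i];
        }
        return sum;
    }

    /** Performs one online update; y must be +1 or -1. Returns the step size that was applied. */
    double update(double[] x, int y) {
        double loss = Math.max(0.0, 1.0 - y * score(x)); // hinge loss with margin 1
        double squaredNorm = 0.0;
        for (double v : x) {
            squaredNorm += v * v;
        }
        if (loss == 0.0 || squaredNorm == 0.0) {
            return 0.0; // passive: the example is already classified with enough margin
        }
        // aggressive: move just enough to fix the example, but never more than the cap
        double tau = Math.min(aggressiveness, loss / squaredNorm);
        for (int i = 0; i < x.length; i++) {
            weights[i] += tau * y * x[i];
        }
        return tau;
    }

    public static void main(String[] args) {
        PassiveAggressiveSketch learner = new PassiveAggressiveSketch(2, 2.0);
        double[] x = {1.0, 0.0};
        System.out.println("first step:  " + learner.update(x, 1));
        System.out.println("second step: " + learner.update(x, 1));
    }
}
```

A larger aggressiveness lets a single misclassified example change the model more; a smaller one makes learning more conservative and more robust to noisy labels.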
The kernel function is the only component that knows about the representation it operates on. To use multiple representations, each with its own kernel function, we must specify which representation each kernel uses. Note that, to make the scores of different kernels comparable, we normalize each kernel by wrapping it in a NormalizationKernel.
    // Kernel for the first representation (index 0)
    Kernel linear = new LinearKernel("0");
    // Normalize the linear kernel
    NormalizationKernel normalizedKernel = new NormalizationKernel(linear);
    // Apply a polynomial kernel to the (normalized) score computed by
    // the linear kernel
    Kernel polyKernel = new PolynomialKernel(2f, normalizedKernel);

    // Kernel for the second representation (index 1)
    Kernel linear1 = new LinearKernel("1");
    // Normalize the linear kernel
    NormalizationKernel normalizedKernel1 = new NormalizationKernel(linear1);
    // Apply an RBF kernel to the (normalized) score computed by
    // the linear kernel
    Kernel rbfKernel = new RbfKernel(1f, normalizedKernel1);
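The composition used above (a base kernel, wrapped by a normalization, wrapped in turn by a polynomial or RBF kernel) can be sketched with plain functions over dense vectors. The formulas below are the textbook ones and are assumptions about what each wrapper computes; the KeLP classes may use additional coefficients. All names in `KernelCompositionSketch` are hypothetical.

```java
/** Sketch of kernel composition on dense vectors; textbook formulas, not read from KeLP. */
final class KernelCompositionSketch {

    /** A kernel is a similarity function over two examples. */
    interface Kernel {
        double apply(double[] x, double[] y);
    }

    /** Plain dot product. */
    static Kernel linear() {
        return (x, y) -> {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) {
                sum += x[i] * y[i];
            }
            return sum;
        };
    }

    /** K(x, y) / sqrt(K(x, x) * K(y, y)), so that every example has similarity 1 with itself. */
    static Kernel normalized(Kernel base) {
        return (x, y) -> {
            double denominator = Math.sqrt(base.apply(x, x) * base.apply(y, y));
            return denominator == 0.0 ? 0.0 : base.apply(x, y) / denominator;
        };
    }

    /** Assumed form: the base score raised to the given degree. */
    static Kernel polynomial(double degree, Kernel base) {
        return (x, y) -> Math.pow(base.apply(x, y), degree);
    }

    /** Assumed form: exp(-gamma * d^2), where d^2 is the squared distance induced by the base kernel. */
    static Kernel rbf(double gamma, Kernel base) {
        return (x, y) -> {
            double squaredDistance = base.apply(x, x) + base.apply(y, y) - 2.0 * base.apply(x, y);
            return Math.exp(-gamma * squaredDistance);
        };
    }

    public static void main(String[] args) {
        double[] a = {3.0, 4.0};
        double[] b = {1.0, 0.0};
        Kernel norm = normalized(linear());
        System.out.println("raw linear score:        " + linear().apply(a, b));
        System.out.println("normalized linear score: " + norm.apply(a, b));
        System.out.println("polynomial on top:       " + polynomial(2.0, norm).apply(a, b));
        System.out.println("RBF on top:              " + rbf(1.0, norm).apply(a, b));
    }
}
```

Without the normalization step, a representation with large raw scores would dominate any later combination of kernels; after it, every base score is on the same scale.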
A weighted linear combination of kernel contributions is obtained by instantiating a LinearKernelCombination and calling its addKernel method for each kernel. Finally, we set the combined kernel on the passive aggressive algorithm.
    LinearKernelCombination linearCombination = new LinearKernelCombination();
    linearCombination.addKernel(1f, polyKernel);
    linearCombination.addKernel(1f, rbfKernel);
    // normalize the weights such that their sum is 1
    linearCombination.normalizeWeights();
    // set the kernel for the PA algorithm
    kPA.setKernel(linearCombination);
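The combination above amounts to a weighted sum of kernel scores whose weights are rescaled to sum to 1. The sketch below shows that arithmetic in isolation. The class `WeightedKernelSum` and its method names are hypothetical stand-ins chosen for illustration, not the KeLP API.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a weighted sum of kernels; an illustration, not KeLP's LinearKernelCombination. */
final class WeightedKernelSum {

    /** A kernel is a similarity function over two examples. */
    interface Kernel {
        double apply(double[] x, double[] y);
    }

    private final List<Kernel> kernels = new ArrayList<>();
    private final List<Double> weights = new ArrayList<>();

    /** Registers a kernel together with its weight in the sum. */
    void add(double weight, Kernel kernel) {
        weights.add(weight);
        kernels.add(kernel);
    }

    /** Rescales the weights so that they sum to 1; leaves them untouched if they sum to 0. */
    void normalizeWeights() {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        if (total == 0.0) {
            return;
        }
        for (int i = 0; i < weights.size(); i++) {
            weights.set(i, weights.get(i) / total);
        }
    }

    double weight(int index) {
        return weights.get(index);
    }

    /** Returns the sum over all kernels of weight_i * kernel_i(x, y). */
    double apply(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < kernels.size(); i++) {
            sum += weights.get(i) * kernels.get(i).apply(x, y);
        }
        return sum;
    }

    public static void main(String[] args) {
        WeightedKernelSum combination = new WeightedKernelSum();
        // Two toy kernels with constant scores, only to make the arithmetic visible.
        combination.add(1.0, (x, y) -> 2.0);
        combination.add(1.0, (x, y) -> 4.0);
        combination.normalizeWeights();
        System.out.println(combination.apply(new double[] {0.0}, new double[] {0.0}));
    }
}
```

With two kernels both added with weight 1, as in the tutorial code, normalization gives each a weight of 0.5, so the combined score is the average of the two.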
Then we learn a prediction function and apply it to the test data.
    // learn and get the prediction function
    kPA.learn(trainingSet);
    Classifier f = kPA.getPredictionFunction();
    // classify examples and compute some statistics
    int correct = 0;
    for (Example e : testSet.getExamples()) {
        // classify the same example whose gold label is checked below
        ClassificationOutput p = f.predict(e);
        if (p.getScore(positiveClass) > 0 && e.isExampleOf(positiveClass))
            correct++;
        else if (p.getScore(positiveClass) < 0 && !e.isExampleOf(positiveClass))
            correct++;
    }
    System.out.println("Accuracy: "
            + ((float) correct / (float) testSet.getNumberOfExamples()));