In machine learning, kernel methods are a class of algorithms for pattern analysis, in which the objective is to find general types of relations in datasets. In particular, kernel methods leverage on the so called kernel functions, which enable them to operate in an implicit high-dimensional feature space without explicitly computing the coordinates of the data in that space. The kernel function is the responsible of computing the similarity between examples without mapping the data in the higher-dimensional space. This operation is often cheaper from a computational perspective and it is called the kernel trick. Kernel functions have been introduced for sequence data, graphs, trees, text, images, as well as vectors. One of the most appealing characteristics of kernels is that they can be composed and combined in order to create richer similarity metrics in which different information from different Representations can be simultaneously exploited.
KeLP supports this composition, providing three abstractions of the Kernel class:
- DirectKernel: it operates directly on a specific Representation that is automatically extracted from the Examples to be compared.
- KernelComposition: it enriches the kernel similarity provided by any another Kernel, e.g. PolynomialKernel,
- KernelCombination: it allows the combination of different Kernels in a specific function, e.g. LinearKernelCombination which applies a weighted linear combination.
In describing how to implement a new Kernel we will use the a linear kernel and a radial basis function (RBF) kernel as practical examples.
IMPLEMENTING A DIRECT KERNEL: THE LINEAR KERNEL EXAMPLE
The linear kernel is simply the dot product between two vectors \(x_i\) and \(x_j\), i.e. \(k(x_i,x_j) = x_i·x_j\). Then, in implementing LinearKernel, we need to extend a DirectKernel over Vector. Optionally the class can be annotated with @JsonTypeName in order to specify an alternative type name to be used during the serialization/deserialization mechanism.
1 2 |
@JsonTypeName("linear") public class LinearKernel extends DirectKernel { |
To make the JSON serialization/deserialization mechanism work, an empty constructor must be defined:
1 2 3 |
public LinearKernel() { } |
Finally the kernelComputation method must be implemented:
1 2 3 4 |
@Override protected float kernelComputation(Vector repA, Vector repB) { return repA.innerProduct(repB); } |
IMPLEMENTING A KERNEL COMPOSITION: THE RBF KERNEL EXAMPLE
The RBF kernel function corresponds to \(k(x_i,x_j)=e^{-\gamma \left\lVert x_i-x_j \right\rVert_\mathcal{H} ^2}\) where \(\left\lVert x_i-x_j \right\rVert_\mathcal{H}\) is the distance between two examples \(x_i\) and \(x_j\) in a Reproducing Kernel Hilbert Space \(\mathcal{H}\). Considering a generic RKHS, instead of a simple euclidean distance in \(\mathbb{R}^n\), it is possible to generalize such operation and make it applicable to any underlying kernel, operating on any representation. Then RbfKernel must extend KernelComposition. Optionally the class can be annotated with @JsonTypeName in order to specify an alternative type name to be used during the serialization/deserialization mechanism.
1 2 |
@JsonTypeName("rbf") public class RbfKernel extends KernelComposition { |
To make the JSON serialization/deserialization mechanism work, an empty constructor must be defined and all the kernel parameters must be associated to the corresponding getter and setter methods. In this case gamma is the only parameter to be serialized and the corresponding getGamma and setGamma methods must be implemented.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
private float gamma; public RbfKernel() { } /** * @return the gamma */ public float getGamma() { return gamma; } /** * @param gamma * the gamma to set */ public void setGamma(float gamma) { this.gamma = gamma; } |
Finally the kernelComputation method must be implemented containing the specific kernel similarity logic.
1 2 3 4 5 6 |
@Override protected float kernelComputation(Example exA, Example exB) { float innerProductOfTheDiff = this.baseKernel .squaredNormOfTheDifference(exA, exB); return (float) Math.exp(-gamma * innerProductOfTheDiff); } |