Language is the most powerful medium for acquiring, communicating and sharing knowledge.
It has been optimized through centuries of use, successes and failures. Although the Web of Data calls for machine-readable standards, natural language is still the preferred query language for naive users, early adopters and even experts in some knowledge domains. Question answering is thus the crucial bottleneck for a truly universal adoption of Open Linked Data as a knowledge sharing paradigm and practice.
In (Giannone et al., 2013), we presented a novel contribution to this research, which combines symbolic reasoning over the semantic constraints of an underlying Open Data repository with methods of statistical inference over linguistic data. The latter is useful for managing the ambiguity introduced by natural language, within a complex process for interpreting the question that integrates distributional semantic models of lexical information and probabilistic inference.
In this way, we aim to solve at least two problems. The first is the localization and retrieval of the ontological elements evoked by a question without relying on strict assumptions about the resource vocabulary. The second is to jointly resolve the different ambiguities arising in the interpretation by integrating the grammatical structure of the question with ontology information. The idea is to map the different inferences into a generative graphical model, i.e. a Hidden Markov Model of the question. This works as a bridge between the linguistic and the RDF structures, i.e. the syntactic dependency graph of the question on the one side and the full paths in the RDF graph on the other.
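To make the role of the Hidden Markov Model more concrete, the following is a minimal sketch of Viterbi decoding over a toy question. It is not the model of (Giannone et al., 2013): the states, the resource labels and all probabilities are invented for illustration. We assume that hidden states are candidate ontology elements, that emission scores stand in for lexical similarity between a question word and an element, and that transition scores stand in for how plausibly two elements are connected in the RDF graph.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for the observations."""
    # best[t][s] = (log-probability of the best path ending in s at step t,
    #               predecessor state on that path)
    first = observations[0]
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] + math.log(trans_p[p][s])
                 + math.log(emit_p[s][obs]), p)
                for p in states
            )
        best.append(column)
    # Backtrack from the best final state to recover the full path.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path))


# Toy setup (all labels and numbers are hypothetical).
words = ["wrote", "Hamlet"]
states = ["dbo:author", "dbr:Hamlet", "dbr:Hamlet_(film)"]
start_p = {s: 1 / 3 for s in states}
# Emission: assumed lexical similarity between an element and a word.
emit_p = {
    "dbo:author":        {"wrote": 0.9, "Hamlet": 0.1},
    "dbr:Hamlet":        {"wrote": 0.1, "Hamlet": 0.9},
    "dbr:Hamlet_(film)": {"wrote": 0.1, "Hamlet": 0.9},
}
# Transition: assumed plausibility that two elements are linked in the graph.
trans_p = {
    "dbo:author":        {"dbo:author": 0.1, "dbr:Hamlet": 0.7, "dbr:Hamlet_(film)": 0.2},
    "dbr:Hamlet":        {"dbo:author": 0.8, "dbr:Hamlet": 0.1, "dbr:Hamlet_(film)": 0.1},
    "dbr:Hamlet_(film)": {"dbo:author": 0.3, "dbr:Hamlet": 0.1, "dbr:Hamlet_(film)": 0.6},
}

path = viterbi(words, states, start_p, trans_p, emit_p)
print(path)  # ['dbo:author', 'dbr:Hamlet']
```

In this toy setting the two readings of "Hamlet" are lexically indistinguishable, and only the transition scores, which play the part of the ontology constraints, select one of them. This is the sense in which lexical and structural evidence are resolved jointly rather than one word at a time.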
(Giannone et al., 2013) Cristina Giannone, Valentina Bellomaria and Roberto Basili. A HMM-based Approach to Question Answering against Linked Data. CLEF 2013 Evaluation Labs and Workshop, Online Working Notes, 23-26 September 2013, Valencia, Spain. Eds. Pamela Forner, Roberto Navigli, Dan Tufis. ISBN 978-88-904810-5-5, ISSN 2038-4963.