Content Processing and Acquisition

The robustness recently achieved by NLP technologies makes them very promising for supporting the design and development of advanced systems. Human Language Technologies (HLT) for Content Acquisition favor the incremental, reuse-based design of systems that process unstructured data. HLTs are crucial for the robust and accurate analysis of unstructured text, and for enriching it with semantic meta-data or other implicit information. They make it possible to extract interesting semantic phenomena and to map them into a structured representation of a target domain.

When a semantic meta-model is available, for example in the form of an existing ontology, HLT makes it possible to locate concepts in the text (irrespective of the variable forms in which they appear in free text) and to mark them according to Knowledge Representation Languages (such as RDF or OWL), thus unifying different shallow representations of the same concepts. In this way, semantic annotations of the concepts in the text are obtained for the original document, making it more suitable for clustering, retrieval and browsing activities. In summary, HLT enables and simplifies advanced functionalities over text (e.g. semantic search).
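
The annotation step described above can be illustrated with a minimal sketch. It assumes a tiny hand-written lexicon in place of a real ontology and plain string matching in place of a full HLT pipeline. The example.org URIs, the "mentions" predicate and the function names are hypothetical and are used only for illustration. Different forms of the same concept ("cars", "Motor Vehicle") are mapped to one concept URI, and the result is written as N-Triples lines, a plain-text RDF serialization.

```python
import re

# Hypothetical mini-ontology: concept URI -> surface forms found in free text.
ONTOLOGY = {
    "http://example.org/onto#Automobile": ["car", "cars", "automobile", "motor vehicle"],
    "http://example.org/onto#Person": ["person", "people"],
}

# Hypothetical predicate linking a document to a concept it mentions.
MENTIONS = "http://example.org/vocab#mentions"


def build_pattern(ontology):
    """Compile one regex over all surface forms, longest forms first."""
    lookup = {form.lower(): uri for uri, forms in ontology.items() for form in forms}
    forms = sorted(lookup, key=len, reverse=True)
    alternatives = "|".join(re.escape(form) for form in forms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern, lookup


def annotate(text, ontology=ONTOLOGY):
    """Return (start, end, surface_form, concept_uri) for each mention."""
    pattern, lookup = build_pattern(ontology)
    return [
        (m.start(), m.end(), m.group(0), lookup[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


def to_ntriples(doc_uri, annotations):
    """Serialize the distinct concepts of a document as N-Triples lines."""
    uris = sorted({uri for _, _, _, uri in annotations})
    return [f"<{doc_uri}> <{MENTIONS}> <{uri}> ." for uri in uris]


if __name__ == "__main__":
    text = "Two cars and a Motor Vehicle were seen by people."
    annotations = annotate(text)
    for line in to_ntriples("http://example.org/doc/1", annotations):
        print(line)
```

A real system would replace the lexicon lookup with linguistic analysis (for example lemmatization and disambiguation), but the output has the same shape: document-level annotations that point to ontology concepts and can be used for retrieval or clustering.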

The pages of this section are listed below: