Università di Roma Tor Vergata, Università di Pisa,
Istituto di Linguistica Computazionale,
Fondazione Bruno Kessler, Università di Trento
In the "Frame Labeling over Italian Texts" (FLaIT) evaluation exercise, systems have to detect the semantic frame "evoked" by a predicate and the major semantic roles explicitly mentioned in an Italian sentence, according to the frame semantics paradigm of Fillmore (1985). In particular, the task consists in recognizing words and phrases that evoke semantic frames of the sort defined in the FrameNet project (Baker et al., 1998, http://framenet.icsi.berkeley.edu), and their semantic dependents, which are usually, but not always, their syntactic dependents.
We will refer to this problem as Semantic Role Labeling (SRL). As in previous SRL shared tasks (e.g. CoNLL-2004 and CoNLL-2005), the general goal is to come forward with representation models, inductive algorithms and inference methods which address the proposed SRL problem.
Previous campaigns (such as CoNLL-2004/2005 or SemEval-2007 (Baker et al., 2007)) focused on developing SRL systems based on partial parsing information and/or on increasing the amount of syntactic and semantic input information, aiming to boost the performance of machine learning systems on the SRL task. Accordingly, the Evalita 2011 FLaIT challenge will concentrate on the definition of different tasks, focusing on different aspects of the SRL problem.
We encourage the adoption of basic resources for Italian that are under development in the iFrame project. These resources will be made available to all groups participating in the FLaIT task.
Interested groups that cannot rely on proprietary parsing technologies will be supported in their participation: annotations of the training and test data at the morphological and syntactic level (at least lemmas, POS tags and Named Entities are expected) will be made available. The quality of this auxiliary information may not be homogeneous, as no full manual validation of the released material is expected for the 2011 EvalIta edition.
The reference semantic system is FrameNet, presented and widely discussed at the FrameNet Project home page.
Please visit the Evalita 2011 homepage, where information about the general organization and all tasks is available.
Frame Prediction
In the first subtask, we measure the accuracy in detecting the correct frame of a sentence, given the presence of a possibly ambiguous lexical unit. In this case we are interested in verifying whether the system is able to recognize the occurrence of a predicate word as the lexical unit of its corresponding frame, and whether ambiguous lexical units are properly disambiguated within sentences.
Semantic Role Labeling: Argument Detection and Classification.
In the Semantic Role Labeling task, participants will be asked to locate and annotate all the semantic arguments of a frame that are explicitly realized in a sentence, given the corresponding lexical unit. This task corresponds to the traditional Semantic Role Labeling challenge as defined by the CoNLL 2005 task.
Evaluation Metrics
The evaluation metrics will be:
The evaluation will be carried out by the organizers. Participants are required to submit the annotations for the test data and to provide a brief description of their system and a full notebook paper describing their experiments, in particular the techniques and the resources used. An analysis of the results will be possible, as the test reference data will be released after the publication of the official results.
The training corpus is distributed in two separate sets. The first set, hereafter the FBK set, has been developed at the Fondazione Bruno Kessler (Tonelli and Pianta, 2008). It includes annotation at the syntactic and semantic level under the XML Tiger format also used by the Salsa project. The reference syntactic formalism of the FBK set is a constituency-based one, obtained as output of the constituency-based parser by Corazza et al. (2007). The second set, hereafter the ILC set, has been developed at the ILC in Pisa by Alessandro Lenci and his colleagues [6]. It also includes annotation at the syntactic and semantic level under the XML Tiger format also used by the Salsa project. The grammatical formalism adopted in the development of the ILC set is a dependency-based one, based on the TANL parser.
The creation of the data was initiated by different groups, which made them available to the iFrame (Italian FrameNet) project, a collaboration between the University of Pisa, the University of Roma "Tor Vergata", the University of Trento, the Fondazione Bruno Kessler and CELI.
The description of the files <FBK>_XXX.src, produced in Trento, and <ILC>_XXX.src, annotated in Pisa, will soon be documented here.
Both resources are based on the XML format used by different FrameNet projects worldwide, including the SALSA project in Saarbrücken, where the SALTO annotation tool based on the Tiger format is adopted.
The so-called semantic data file includes a CoNLL-like tabular format that expresses all the annotations of the semantic layer, i.e. frames and frame elements for every predicate and role in the sentence. Multiple frames are possible for one sentence; one single tabular description per sentence is made available. Semantic annotations corresponding to the XXX.src file are delivered in a single XXX.sem file. The semantic file is also the format in which participants are expected to deliver their output, i.e. labeled sentences.
The following table describes the columns as they are found in the semantic annotation file (.sem):

Field Name | Description |
---|---|
Tok Counter | Token counter, incremental within each sentence |
Form | Word form or punctuation symbol, w |
PoS | Part-of-speech tag, with morphological features, based on the TANL tagset |
Frame tag | The frame F of the word w, if w is a lexical unit for F |
Frame Element tags for the first (top-down) entry F1 in the column "Frame tag" | The frame element FE of the word w, if w belongs to a semantic argument of the first frame in the column "Frame tag", whose type is FE |
... | ... |
Frame Element tags for the k-th (top-down) entry Fk in the column "Frame tag" | The frame element FE of the word w, if w belongs to a semantic argument of the frame Fk in the column "Frame tag", whose type is FE |
1 Rilevata V - - - -
2 la RD - - - -
3 presenza S Presence Target - -
4 di E - Entity - -
5 gas S - Entity - -
6 in E - Location - -
7 uno PI - Location - -
8 dei EA - Location - -
9 tubi S - Location - -
10 trasparenti A - Location - -
11 che PR - Location - -
12 compongono V - Location - -
13 l' RD - Location - -
14 opera S - Location - -
15 , FF - - - -
16 i RD - - - -
17 guardiani S - - - -
18 hanno VA - - - -
19 fatto V - - - -
20 scattare V Process_start - Target -
21 uno RI - - Event -
22 speciale A - - Event -
23 piano S - - Event -
24 d' E - - Event -
25 emergenza S - - Event -
26 e CC - - - -
27 per E - - - Duration
28 45 N - - - Duration
29 minuti S - - - Duration
30 i RD - - - Agent
31 pompieri S - - - Agent
32 hanno VA - - - -
33 isolato V Closure - - Target
34 la RD - - - Containing_object
35 sala S - - - Containing_object
36 . FS - - - -
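As an illustration, the tabular layout above can be read with a few lines of Python. The sketch below is our own, not part of the task kit; it assumes the column layout of the table above, with '-' marking empty cells and one frame-element column per frame in top-down order:

```python
def parse_sem(lines):
    """Parse one sentence in the .sem tabular format:
    Tok Counter, Form, PoS, Frame tag, then one frame-element
    column per frame in the sentence ('-' marks an empty cell).
    Returns the frame labels in top-down order and, for each
    frame column, a mapping from token id to frame-element label."""
    rows = [ln.split() for ln in lines if ln.strip()]
    n_fe_cols = len(rows[0]) - 4          # columns after Tok/Form/PoS/Frame
    frames = [None] * n_fe_cols           # i-th frame pairs with i-th FE column
    roles = [dict() for _ in range(n_fe_cols)]
    frame_idx = 0
    for row in rows:
        tok, frame = int(row[0]), row[3]
        if frame != "-":                  # this token is a lexical unit
            frames[frame_idx] = frame
            frame_idx += 1
        for col, cell in enumerate(row[4:]):
            if cell != "-":
                roles[col][tok] = cell
    return frames, roles

# A few rows of the example sentence above:
demo = [
    "3 presenza S Presence Target - -",
    "4 di E - Entity - -",
    "20 scattare V Process_start - Target -",
]
frames, roles = parse_sem(demo)
```

On this fragment, `frames` starts with Presence and Process_start, and `roles[0]` maps tokens 3 and 4 to Target and Entity respectively.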
The so-called syntactic data file includes a CoNLL-like tabular format that expresses all the annotations of the syntactic layer, i.e. lemmas, POS tags and dependencies, as output by the TANL parser. Syntactic annotations allow teams that do not use their own parser to access meaningful syntactic information for each sentence. Syntactic annotations corresponding to the sentences in the XXX.src file are delivered in a single XXX.synt file. While the original annotations are hand-validated for the entire training set, it is worth noticing that the grammatical information in the XXX.synt file is NOT validated and may be noisy.
The following table describes the columns as they are found in the syntactic annotation file (.synt):

Field Name | Description |
---|---|
Tok Counter | Token counter, incremental within each sentence |
Form | Word form or punctuation symbol, w |
Lemma | Word lemma or punctuation symbol |
PoS | Part-of-speech tag, with morphological features, based on the TANL tagset |
Head | Head of the current token w, which is either a valid token counter or zero ('0'). Note that, depending on the original treebank annotation, there may be multiple tokens with a head of zero. |
Dependency label | The dependency relation to the head triggered by w in the sentence |
1 Rilevata rilevare V 0 ROOT
2 la il RD 3 det
3 presenza presenza S 20 subj
4 di di E 3 comp
5 gas gas S 4 prep
6 in in E 20 comp
7 uno uno PI 6 prep
8 dei di EA 7 comp
9 tubi tubo S 8 prep
10 trasparenti trasparente A 9 mod
11 che che PR 12 subj
12 compongono comporre V 9 mod_rel
13 l' il RD 14 det
14 opera opera S 12 obj
15 , , FF 14 con
16 i il RD 17 det
17 guardiani guardiano S 20 subj
18 hanno avere VA 19 aux
19 fatto fare VM 20 modal
20 scattare scattare V 1 arg
21 uno uno RI 23 det
22 speciale speciale A 23 mod
23 piano piano S 20 obj
24 d' di E 23 comp
25 emergenza emergenza S 24 prep
26 e e CC 20 con
27 per per E 33 comp_temp
28 45 @card@ N 29 mod
29 minuti minuto S 27 prep
30 i il RD 31 det
31 pompieri pompiere S 33 subj
32 hanno avere VA 33 aux
33 isolato isolare V 20 conj
34 la il RD 35 det
35 sala sala S 33 obj
36 . . FS 1 punc
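As a sketch (again our own code, not provided with the data), the dependency edges encoded in a .synt sentence can be recovered by resolving each Head field against the token counters, with 0 standing for an artificial root:

```python
def parse_synt(lines):
    """Parse one sentence in the .synt tabular format:
    Tok Counter, Form, Lemma, PoS, Head, Dependency label.
    Returns (dependent_form, relation, head_form) triples;
    tokens whose Head field is 0 attach to the artificial ROOT."""
    rows = [ln.split() for ln in lines if ln.strip()]
    forms = {int(r[0]): r[1] for r in rows}
    forms[0] = "ROOT"                     # head id 0 = artificial root
    return [(r[1], r[5], forms[int(r[4])]) for r in rows]

# A toy three-token fragment (heads adjusted to stay inside the fragment):
demo = [
    "1 Rilevata rilevare V 0 ROOT",
    "2 la il RD 3 det",
    "3 presenza presenza S 1 subj",
]
edges = parse_synt(demo)
```

Each resulting triple reads "dependent, relation, head", e.g. the determiner "la" attaching to "presenza" with label det.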
The description of the TANL tagset used for the morpho-syntactic annotation of the sentences (.synt files) is available HERE. All tags of the semantic layer refer to FrameNet.
Early Training (Dry-Run) Corpus:
Download (1st version)
Training Corpus (Second Version):
Download
First Run Test Corpus:
Download
Second Run Test Corpus:
The Second Run data will be automatically sent to participants after the submission of the First Run data.
Participants that have not performed the First Run should send an e-mail to Diego De Cao.
Tasks
The tasks in which all systems will participate are Frame Prediction and Argument Detection and Classification, as described above.
Frame Detection. In the Test Corpus a significant number of sentences include a predicate word that is ambiguous across two or more frames. Note that all these ambiguous frames are represented in the training data sets made available to participants: no unseen frame is allowed for any predicate word targeted in the test corpus. The Frame Detection task thus makes a sort of closed-world assumption, whereby the dictionary of lexical units for each frame is confined to the ones already exemplified in the training data sets.
The Argument Detection (also known as Boundary Recognition) task will be measured in two different conditions, i.e. with and without information about the correct frame of the targeted sentence. This will make it possible to factor out errors propagated from earlier stages of the individual systems.
Analogously, the Argument Classification task will be evaluated in three different conditions, i.e. with no information, with information about the exact frame per predicate word, and with information about both the exact frame and the argument boundaries. For this purpose, test data will be delivered in three different stages, corresponding to three different runs, as discussed hereafter.
Releases of Data for Testing
A corpus of test sentences (hereafter the Test Data set) will be released in three different stages, or runs. Every test run aims at evaluating the ability of the participating systems on all three tasks, i.e. Frame Prediction, Argument Detection and Classification. As a consequence, the output of every Test Run follows the same format, i.e. the Semantic Annotation format (described in the SemA section), with information about frames and roles described in consecutive and inter-dependent columns.
Input syntactic information is provided in order to facilitate participants that cannot apply their own parsing technology. However, syntactic information as provided in the test data files may include parsing errors.
First Test Run. In the first run the test sentences will be released without any semantic information except the marking of the triggering predicate word. Given the presence of possibly ambiguous lexical units (as already observed in the training set), the Frame Prediction stage is a necessary step in this first run. In this stage, the sentences will be made available with an explicit marking of the predicate word (i.e. the lexical unit). The file will be released as a Semantic Annotation File (see the SemAnn section above) where the fourth column, concerning the frame label, is left blank and the other columns explicitly mark the predicate word (with the fixed, i.e. frame-independent, label Target) without any annotations for roles.
Second Test Run. In the second run the test sentences will be released with explicit information about the correct frame corresponding to the marked predicate word. Although the Frame Prediction stage is no longer a necessary step in this second run, the output format must not be changed with respect to the first test run; this will facilitate the automatic evaluation process. In this stage, the sentences will be made available with an explicit marking of the frame corresponding to a predicate word (i.e. the lexical unit). The file will thus be released as a Semantic Annotation File (see the SemAnn section above) where the fourth column explicitly marks the frame of the predicate word (i.e. the frame label itself, e.g. Judgment_communication), and the columns used for semantic annotations (i.e. predicates and roles) explicitly mark the predicate word (with the fixed label Target) without any annotations for roles.
Third Test Run. In the third run the test sentences will be released with explicit information about (1) the correct frame corresponding to the marked predicate word, as well as (2) the marked boundaries of the individual arguments. The output format of the third run is the same as in the previous two test runs; this will facilitate the automatic evaluation process. In this stage, the file will thus be released as a Semantic Annotation File (see the SemAnn section above). The frame column will explicitly mark the predicate word(s) (i.e. frame labels, e.g. Process_start), and the following columns will describe arguments in the BIO notation; different columns correspond to possibly multiple predicate words per sentence, as in the example below:
1 Rilevata V - - - -
2 la RD - - - -
3 presenza S Presence Target - -
4 di E - B - -
5 gas S - O - -
6 in E - B - -
7 uno PI - I - -
8 dei EA - I - -
9 tubi S - I - -
10 trasparenti A - I - -
11 che PR - I - -
12 compongono V - I - -
13 l' RD - I - -
14 opera S - O - -
15 , FF - - - -
16 i RD - - - -
17 guardiani S - - - -
18 hanno VA - - - -
19 fatto V - - - -
20 scattare V Process_start - Target -
21 uno RI - - B -
22 speciale A - - I -
23 piano S - - I -
24 d' E - - I -
25 emergenza S - - O -
26 e CC - - - -
27 per E - - - B
28 45 N - - - I
29 minuti S - - - O
30 i RD - - - B
31 pompieri S - - - O
32 hanno VA - - - -
33 isolato V Closure - - Target
34 la RD - - - B
35 sala S - - - O
36 . FS - - - -
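Note that in the example above a span is opened by B, continued by I, and closed by O on its last token (differing from the standard BIO reading, where O marks tokens outside any span). Under that reading, argument spans could be recovered with a small helper like the following; this is our own sketch, derived only from the example data:

```python
def bio_spans(tags):
    """Recover (start, end) token spans from one argument column of the
    third-run files, assuming 'B' opens a span, 'I' continues it and 'O'
    marks its last token (as the example above suggests; note this
    differs from standard BIO). tags: list of (token_id, tag) pairs,
    with '-' cells already filtered out by the caller."""
    spans, start = [], None
    for tok, tag in tags:
        if tag == "B":
            start = tok
        elif tag == "O" and start is not None:
            spans.append((start, tok))
            start = None
    return spans

# The Duration and Agent cells of the last column above:
spans = bio_spans([(27, "B"), (28, "I"), (29, "O"), (30, "B"), (31, "O")])
```

On this column the helper yields the two spans covering tokens 27-29 and 30-31.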
Submission Procedures
Test Runs are indexed from 1 (the first) to 3 (the third). Each team is allowed to submit a maximum of two systems, possibly deriving from different configurations, parametrizations or resources. Results for each run must be sent to the organizers' address, EvalIta-FLaIT, as files in the same format as the Training Corpus (.sem files). The files must be named as follows:
<Team>_FLaIT_<Run>_<System>.sem
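As a quick sanity check of the naming convention, the pattern can be tested with a small helper of our own (it assumes runs and systems are identified by their index digits, i.e. runs 1-3 and systems 1-2, and alphanumeric team names):

```python
import re

# <Team>_FLaIT_<Run>_<System>.sem, with Run in 1..3 and System in 1..2
# (team names are assumed to be plain alphanumeric strings here).
NAME_RE = re.compile(r"^[A-Za-z0-9]+_FLaIT_[123]_[12]\.sem$")

def is_valid_submission(filename):
    """Return True if filename follows the submission naming scheme."""
    return NAME_RE.fullmatch(filename) is not None
```

For instance, a file named `TeamX_FLaIT_1_2.sem` would be accepted, while a run index outside 1-3 would not.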
Organizers
Roberto Basili (University of Roma, Tor Vergata),
Alessandro Lenci (University of Pisa)
Steering Committee:
Alessandro Moschitti (University of Trento),
Sara Tonelli (Fondazione Bruno Kessler),
Diego De Cao (University of Roma, Tor Vergata),
Giulia Venturi (ILC-CNR & Scuola Superiore "S. Anna", Pisa),
Giampaolo Mazzini (CELI, Torino)
Address: