Semantic Answer Validation in Question Answering Systems for Reading Comprehension Tests
7th Alberto Mendelzon International Workshop on Foundations of Data Management
May 21-23, 2013
Authors: Helena Gómez Adorno, David Pinto Avendaño, Darnes Vilariño Ayala
Outline
Introduction Proposed
System Architecture
Document
The problem
Annie Lennox: Why I am an HIV/AIDS activist. I'm going to share with you the story as to how I have become an HIV/AIDS campaigner. And this is the name of my campaign, SING Campaign. In November of 2003 I was invited to take part in the launch of Nelson Mandela's 46664 Foundation. That is his HIV/AIDS foundation. And 46664 is the number that Mandela had when he was imprisoned in Robben Island.
Who is the founder of the SING campaign? 1) Nelson Mandela 2) Youssou N'Dour 3) Michel Sidibe 4) Zackie Achmat 5) Annie Lennox
System Architecture
Document Processing
Perform anaphora resolution on the documents using the JavaRAP system.
Step 1
Identify the author of the document, which is usually the first name in the document (NNP tag). The Stanford POS tagger was used.
Step 2
Each first-person personal pronoun in the set PRP = {"I", "me", "my", "myself"} generally refers to the author.
Step 3
Replace each term of the document that is in the PRP set with the author name identified in Step 1.
Given the following text:
Emily Oster flips our thinking on AIDS in Africa. So I want to talk to you today about AIDS in sub-Saharan Africa. I imagine you all know something about AIDS
Step 1: Emily_NNP Oster_NNP flips_VBZ our_PRP$ thinking_NN on_IN AIDS_NNP in_IN Africa.
In this case, the first two terms that have the NNP label are selected to identify the author. Author = Emily Oster
Step 2: So_NNP I_PRP want_VBP to_TO talk_VB to_TO you_PRP today_NN about_IN AIDS_NNP in_IN sub-Saharan_NNP Africa_NNP ._. I_PRP imagine_VBP you_PRP all_DT know_VBP something_NN about_IN AIDS_NNP ._.
Two tokens that belong to the PRP set are identified here.
Step 3: So Emily Oster want to talk to you today about AIDS in sub-Saharan Africa. Emily Oster imagine you all know something about AIDS.
The words in the PRP set are replaced by the author of the document.
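The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the text has already been tagged in `word_TAG` form (the slides use the Stanford POS tagger for that), and it assumes the author is the leading run of NNP tokens in the title. The helper names `split_tagged`, `find_author` and `replace_first_person` are hypothetical.

```python
# First-person pronouns assumed to refer to the document author (the PRP set).
FIRST_PERSON = {"I", "me", "my", "myself"}


def split_tagged(tagged_text):
    """Split whitespace-separated 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in tagged_text.split():
        word, sep, tag = token.rpartition("_")
        if not sep:  # token without a tag: keep the word, leave the tag empty
            word, tag = token, ""
        pairs.append((word, tag))
    return pairs


def find_author(tagged_tokens):
    """Step 1 (assumed heuristic): the leading run of NNP tokens is the author."""
    name = []
    for word, tag in tagged_tokens:
        if tag != "NNP":
            break
        name.append(word)
    return " ".join(name)


def replace_first_person(tagged_tokens, author):
    """Steps 2 and 3: replace every token in FIRST_PERSON with the author name."""
    return " ".join(author if word in FIRST_PERSON else word
                    for word, _ in tagged_tokens)


title = split_tagged("Emily_NNP Oster_NNP flips_VBZ our_PRP$ thinking_NN")
body = split_tagged("So_NNP I_PRP want_VBP to_TO talk_VB to_TO you_PRP today_NN")
author = find_author(title)
rewritten = replace_first_person(body, author)
```

As in the slide example, the replacement is purely lexical, so verb agreement is not repaired ("Emily Oster want").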
Hypothesis Generation
A Part-Of-Speech (POS) tagger is applied to identify the question keywords (what, where, when, who, etc.). Those words are then replaced by each of the five possible answers, yielding five hypotheses for each question.
Question: Who is the founder of the SING campaign?
Answer 1: Nelson Mandela
. . .
Answer 5: Annie Lennox
From the previous question and its possible answers, the following hypotheses are obtained:
Hypothesis 1: Nelson Mandela is the founder of the SING campaign
. . .
Hypothesis 5: Annie Lennox is the founder of the SING campaign
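A minimal sketch of this substitution in Python follows. It is a simplification: the slides identify the question keyword with a POS tagger, whereas this sketch assumes a fixed keyword list and a keyword in first position. The function name `generate_hypotheses` is hypothetical.

```python
# Question keywords named on the slide; a fixed list stands in for the POS tagger.
QUESTION_WORDS = {"what", "where", "when", "who"}


def generate_hypotheses(question, answers):
    """Build one hypothesis per candidate answer by replacing the question keyword."""
    words = question.rstrip("?").strip().split()
    hypotheses = []
    for answer in answers:
        if words and words[0].lower() in QUESTION_WORDS:
            # Swap the leading keyword for the candidate answer.
            hypotheses.append(" ".join([answer] + words[1:]))
        else:
            # Fallback (an assumption of this sketch): append the answer.
            hypotheses.append(" ".join(words + [answer]))
    return hypotheses


answers = ["Nelson Mandela", "Youssou N'Dour", "Michel Sidibe",
           "Zackie Achmat", "Annie Lennox"]
hypotheses = generate_hypotheses("Who is the founder of the SING campaign?", answers)
```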
Information Retrieval
The information retrieval module was built using the Lucene IR library. It is responsible for indexing the document collection and for the subsequent passage retrieval, given a hypothesis as a query. It returns a relevant passage for each hypothesis. These passages are later used as support text to decide whether a hypothesis may be the right answer.
H1 query: Nelson Mandela founder SING campaign
H1 passage: Everyone reveres Nelson Mandela .
H2 query: Youssou N'Dour founder SING campaign
H2 passage: So this is Annie Lennox SING Campaign .
H3 query: Michel Sidibe founder SING campaign
H3 passage: Annie Lennox 'm sitting here in New York with Michel Sidibe .
H4 query: Zackie Achmat founder SING campaign
H4 passage: Annie Lennox met Zackie Achmat , the founder of Treatment Action Campaign , an incredible campaigner and activist at a 46664 event .
H5 query: Annie Lennox founder SING campaign
H5 passage: So this is Annie Lennox SING Campaign .
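The retrieval step can be illustrated with a small Python stand-in. This sketch does not use Lucene: plain cosine similarity over word counts replaces Lucene's index and scoring, and the helper names `tokenize`, `cosine` and `best_passage` are hypothetical. It only shows the idea of returning one support passage per hypothesis query.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens (apostrophes allowed)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(query_tokens, passage_tokens):
    """Cosine similarity between two bags of words."""
    q, p = Counter(query_tokens), Counter(passage_tokens)
    dot = sum(q[t] * p[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in p.values())))
    return dot / norm if norm else 0.0


def best_passage(query, passages):
    """Return the passage most similar to the hypothesis query."""
    q = tokenize(query)
    return max(passages, key=lambda passage: cosine(q, tokenize(passage)))


passages = [
    "Everyone reveres Nelson Mandela .",
    "So this is Annie Lennox SING Campaign .",
    "Annie Lennox 'm sitting here in New York with Michel Sidibe .",
]
support_h1 = best_passage("Nelson Mandela founder SING campaign", passages)
support_h5 = best_passage("Annie Lennox founder SING campaign", passages)
```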
Semantic Similarity
Given the pair (H, T), a semantic similarity score is calculated. The similarity measure proposed by Rada Mihalcea gives a weight to each word of the sentence in terms of the degree of specificity of the word.
Hypothesis (H): Then perhaps we could have avoided a catastrophe.
Support Text (T): Perhaps we should have been able to prevent a disaster.
Similarity: 4.500
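$$sim(H, T) = \frac{1}{2}\left(\frac{\sum_{w \in H} maxSim(w, T)\, idf(w)}{\sum_{w \in H} idf(w)} + \frac{\sum_{w \in T} maxSim(w, H)\, idf(w)}{\sum_{w \in T} idf(w)}\right)$$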
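The weighting idea (each word is matched with its most similar word in the other text, and that match is weighted by the word's specificity) can be sketched as follows. The sketch assumes the commonly cited form of Mihalcea's text-to-text measure with idf as the specificity weight, and it ignores part-of-speech restrictions. The word similarity values and the constant idf are made up for illustration only.

```python
def text_similarity(h_words, t_words, word_sim, idf):
    """Average of the two directed, idf-weighted best-match scores."""
    def directed(source, target):
        if not source or not target:
            return 0.0
        weighted = sum(max(word_sim(w, v) for v in target) * idf(w) for w in source)
        total = sum(idf(w) for w in source)
        return weighted / total if total else 0.0

    return 0.5 * (directed(h_words, t_words) + directed(t_words, h_words))


# Hypothetical word-to-word similarities, for illustration only.
TOY_SIM = {
    frozenset(["avoided", "prevent"]): 0.8,
    frozenset(["catastrophe", "disaster"]): 0.9,
}


def toy_word_sim(w1, w2):
    """Identical words score 1.0; other pairs fall back to the toy table."""
    if w1 == w2:
        return 1.0
    return TOY_SIM.get(frozenset([w1, w2]), 0.0)


def uniform_idf(word):
    """Placeholder weight: every word is treated as equally specific."""
    return 1.0


score = text_similarity(["avoided", "catastrophe"], ["prevent", "disaster"],
                        toy_word_sim, uniform_idf)
```

In the system described here, `word_sim` would be one of the word-level measures (PMI-IR or WordNet path similarity), and `idf` would come from corpus statistics.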
maxSim(w1,w2) Similarity
PMI-IR: It is based on statistical data collected by an information retrieval engine over a very large corpus (e.g. the web). Given two words w1 and w2, their PMI-IR is calculated as:
$$PMI\text{-}IR(w_1, w_2) = \log_2 \frac{p(w_1 \,\&\, w_2)}{p(w_1) \cdot p(w_2)}$$
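A small sketch of the computation, assuming the usual definition of pointwise mutual information (base-2 logarithm of the joint probability over the product of the marginals) and assuming the probabilities are estimated from search-engine hit counts. The parameter names and the example counts are hypothetical.

```python
import math


def pmi_ir(hits_w1, hits_w2, hits_both, total_docs):
    """PMI of two words, with probabilities estimated from document hit counts."""
    if total_docs <= 0 or min(hits_w1, hits_w2, hits_both) <= 0:
        # Convention chosen for this sketch: no evidence means no association.
        return 0.0
    p_w1 = hits_w1 / total_docs
    p_w2 = hits_w2 / total_docs
    p_both = hits_both / total_docs
    return math.log2(p_both / (p_w1 * p_w2))


# Made-up counts: the words co-occur ten times more often than chance predicts.
associated = pmi_ir(hits_w1=100, hits_w2=100, hits_both=10, total_docs=10000)
# Made-up counts: co-occurrence at chance level, so the PMI is close to zero.
independent = pmi_ir(hits_w1=100, hits_w2=100, hits_both=1, total_docs=10000)
```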
Path_similarity: Returns a score denoting how similar two word senses are, based on the shortest path that connects the senses in the is-a (hypernym/hyponym) taxonomy of WordNet.
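To make the path-based score concrete, here is a sketch on a toy taxonomy. It assumes the score is 1 / (shortest path length + 1), which is how NLTK's WordNet interface defines `path_similarity`. The taxonomy below is a hypothetical single-parent tree, not WordNet data.

```python
# Hypothetical is-a taxonomy: each sense maps to its single hypernym.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
}


def ancestors(sense):
    """Map every sense on the path to the root to its distance from `sense`."""
    distances, steps = {}, 0
    while sense is not None:
        distances[sense] = steps
        sense = HYPERNYM.get(sense)
        steps += 1
    return distances


def path_similarity(sense_a, sense_b):
    """Return 1 / (shortest is-a path length + 1), or None if unconnected."""
    dist_a, dist_b = ancestors(sense_a), ancestors(sense_b)
    common = set(dist_a) & set(dist_b)
    if not common:
        return None
    shortest = min(dist_a[s] + dist_b[s] for s in common)
    return 1.0 / (shortest + 1)


same = path_similarity("dog", "dog")      # path length 0
siblings = path_similarity("dog", "wolf")  # path length 2, via "canine"
cousins = path_similarity("dog", "cat")    # path length 4, via "mammal"
```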
Answer Selection
5.84
Hypothesis: Michel Sidibe founder SING campaign
Passage: Annie Lennox 'm sitting here in New York with Michel Sidibe .
Scores: 0.569 2.16 1.86 5.296

Hypothesis: Zackie Achmat founder SING campaign
Passage: Annie Lennox met Zackie Achmat , the founder of Treatment Action Campaign , an incredible campaigner and activist at a 46664 event .
Scores: 1.361 2.48 2.08 6.633

Hypothesis: Annie Lennox founder SING campaign
Passage: So this is Annie Lennox SING Campaign .
Scores: 1.530 2.54 2.54 7.359
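The selection step can be sketched as picking the candidate whose hypothesis scores highest. This is an assumption about how the scores are used: the slide does not label its columns, so the sketch treats the last number in each row as the overall score, and it pairs the first row with Michel Sidibe because that name appears in its passage.

```python
def select_answer(scored_candidates):
    """Return the candidate answer with the highest overall score."""
    best_candidate, _ = max(scored_candidates, key=lambda item: item[1])
    return best_candidate


# Last value of each row on the slide, assumed here to be the overall score.
scored = [
    ("Michel Sidibe", 5.296),
    ("Zackie Achmat", 6.633),
    ("Annie Lennox", 7.359),
]
selected = select_answer(scored)
```

Under these assumptions the selected candidate is Annie Lennox, the founder named in the source passage.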
Test Corpus
Provided
3 Topics
12 Reading tests
Documents and questions are available in English, German, Italian, Romanian and Spanish.
Background Corpus
Obtained Results
Similarity Score   PATH (WordNet)   PMI-IR   Lucene Score   PATH + PMI   PMI + Lucene Score
NCA                32               36       39             33           41
NoA                0                0        0              0            0
NIA                88               84       81             87           79
Precision          27 %             30 %     33 %           28 %         34 %

No. of correct answers by approach
Approach                  Quantity
Lucene Score only         15
PMI + Lucene Score only   17
Both approaches           24
None                      64
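The reported figures can be cross-checked with a short calculation. The sketch assumes that NCA, NoA and NIA stand for the numbers of correct, unanswered and incorrect answers (the slide does not expand the abbreviations), that precision is NCA divided by the total number of questions, and that percentages are rounded half up.

```python
import math

# (NCA, NoA, NIA) per similarity score, as reported in the results table.
RESULTS = {
    "PATH (WordNet)": (32, 0, 88),
    "PMI-IR": (36, 0, 84),
    "Lucene Score": (39, 0, 81),
    "PATH + PMI": (33, 0, 87),
    "PMI + Lucene Score": (41, 0, 79),
}


def precision_percent(nca, noa, nia):
    """Correct answers over all questions, as a percentage rounded half up."""
    total = nca + noa + nia
    return math.floor(100 * nca / total + 0.5) if total else 0


percentages = {name: precision_percent(*counts) for name, counts in RESULTS.items()}

# Overlap between the two best approaches, from the second table.
lucene_only, pmi_lucene_only, both, neither = 15, 17, 24, 64
lucene_correct = lucene_only + both          # correct answers found by Lucene Score
pmi_lucene_correct = pmi_lucene_only + both  # correct answers found by PMI + Lucene
total_questions = lucene_only + pmi_lucene_only + both + neither
```

The overlap counts agree with the first table: they add up to the NCA values of the Lucene Score and PMI + Lucene Score columns, and to the same question total as NCA + NoA + NIA.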
Conclusions
We have presented a system that uses two semantic similarity measures: PMI and PATH similarity. We have performed an experimental comparison of the methodology using semantic and lexical similarity measures. We have observed that the semantic similarity measures are able to discover answers that the lexical similarity measure could not.
Future Work
As future work we would like to determine which questions are better validated by a semantic measure and which by a lexical measure. Making this process automatic should improve the overall precision of the methodology. New methodologies are being implemented without the use of information retrieval systems, based on graph representations of the knowledge (syntactic trees, semantic networks).
Thank you!
Obtained Results
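$$c@1 = \frac{1}{n}\left(n_R + n_U \cdot \frac{n_R}{n}\right)$$
Where: n = number of questions, n_R = number of correctly answered questions, n_U = number of unanswered questions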
2011
0.57 0.47 0.37 0.34 0.20 0.02
Lucene combines the Boolean model with the vector space model. The vector space model uses cosine similarity to rank the documents. Lucene refines this score with the equation: