Detection of Answer
Strings
Soubbotin and Soubbotin
Essentials of Approach
A certain shift from deep text analysis and
NLP methods to surface techniques
Use of formulas describing the structure of
strings likely bearing certain semantic
information
Example
FBI Director Louis Freeh
A person represented by his/her first/last
names
A person occupies a post in an
organization
The formula
A word composed of capital letters
An item from a list of posts in an
organization
An item from a list of first names
A capitalized word
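The formula above can be written as a regular expression in which two of the elements are expanded from word lists. The following is a minimal sketch, not the authors' implementation: the list contents (`POSTS`, `FIRST_NAMES`), the helper `alternation`, and the group names are assumptions made for illustration.

```python
import re

# Hypothetical word lists; the real system's lists are not given in the slides.
POSTS = ["Director", "President", "Chairman"]
FIRST_NAMES = ["Louis", "John", "Mary"]


def alternation(items):
    """Build a non-capturing regex alternation from a list of literal words."""
    return "(?:" + "|".join(re.escape(word) for word in items) + ")"


# Element order follows the formula: all-caps word, post, first name, capitalized word.
PERSON_POST = re.compile(
    r"\b(?P<org>[A-Z]{2,})\s+"
    r"(?P<post>" + alternation(POSTS) + r")\s+"
    r"(?P<first>" + alternation(FIRST_NAMES) + r")\s+"
    r"(?P<last>[A-Z][a-z]+)\b"
)

match = PERSON_POST.search("Yesterday FBI Director Louis Freeh spoke to reporters.")
if match:
    print(match.groupdict())
```

Named groups keep the semantic roles (organization, post, person) attached to the matched substrings, which is what lets a surface pattern carry "generalized semantics" without any parsing.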
Patterns
Formulas of such kind were called
patterns
First used at TREC-10 QA track
Each pattern is characterized by a certain
generalized semantics
Steps (Overview)
Identify strings corresponding to a formula
Identify the question terms (types)
Check for expressions negating the
semantics of the found strings
Apply the set of formulas (for a particular
question type) to match the strings in
question-relevant passages
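The steps above can be sketched as a small pipeline. This is an illustrative sketch under stated assumptions, not the system described in the slides: the question-type label `who-post`, the single pattern, the negation word list, and the character window used for the negation check are all hypothetical.

```python
import re

# Hypothetical pattern set, keyed by question type.
PATTERNS = {
    "who-post": [
        re.compile(
            r"\b[A-Z]{2,} (?:Director|President) "
            r"(?P<answer>[A-Z][a-z]+ [A-Z][a-z]+)"
        )
    ],
}

# Hypothetical list of expressions that negate the semantics of a match.
NEGATIONS = re.compile(r"\b(?:not|never|no longer)\b", re.IGNORECASE)


def question_type(question):
    """Very rough question typing; a real system would use many more types."""
    if question.lower().startswith("who"):
        return "who-post"
    return None


def find_answers(question, passages, window=20):
    """Apply the patterns for the question type and drop negated matches."""
    qtype = question_type(question)
    answers = []
    for passage in passages:
        for pattern in PATTERNS.get(qtype, []):
            for m in pattern.finditer(passage):
                # Look at the match plus `window` characters on each side.
                context = passage[max(0, m.start() - window): m.end() + window]
                if NEGATIONS.search(context):
                    continue
                answers.append(m.group("answer"))
    return answers


passages = [
    "FBI Director Louis Freeh testified on Monday.",
    "He is no longer FBI Director John Smith's deputy.",
]
print(find_answers("Who is the director of the FBI?", passages))
```

The second passage matches the pattern but is discarded because a negating expression appears within the context window.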
A Surface Approach
No need to distinguish linguistic entities
Formulas for strings look like regular
expressions
But patterns include elements referring to
lists of predefined words/phrases
Where-question:
suggest
Examples
match)
Complex Patterns
Strings expressing relationship between
several semantic entities
The more complex a pattern is, the higher
its reliability
People Names
Items
Dates
Prepositions,
How question words are located relative to the pattern-matching string (distance, left/right, position relative to
other matching strings, etc.)
Simplicity of a pattern's structure is
compensated by the complexity of the rules
Without applying heuristic rules, sufficiently
reliable results cannot be ensured
Rank assigned to question words/phrases and
score assigned to candidate answers
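One way to picture such a heuristic is a score that rewards candidate strings lying close to highly ranked question words. The slides do not give the actual rules, so the function below is only a hypothetical sketch: the inverse-distance weighting, the function name `score_candidate`, and the example ranks are assumptions.

```python
def score_candidate(passage_tokens, span, question_terms):
    """Score a candidate answer by its proximity to ranked question words.

    passage_tokens: list of tokens in the passage
    span: (start, end) token indexes of the matched string, end exclusive
    question_terms: dict mapping a lowercased question word to its rank weight
    """
    start, end = span
    score = 0.0
    for i, token in enumerate(passage_tokens):
        weight = question_terms.get(token.lower())
        if weight is None or start <= i < end:
            continue  # not a question word, or part of the candidate itself
        # Distance in tokens to the nearest edge of the candidate (always >= 1).
        distance = start - i if i < start else i - end + 1
        score += weight / distance
    return score


tokens = "The FBI Director Louis Freeh visited Moscow".split()
terms = {"fbi": 2.0, "director": 1.0}  # hypothetical ranks
print(score_candidate(tokens, (3, 5), terms))  # candidate "Louis Freeh"
print(score_candidate(tokens, (6, 7), terms))  # candidate "Moscow"
```

A left/right preference could be added by weighting the two branches of the distance computation differently.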
QA Process
Analysis of Results
TREC 2002:
confidence-weighted
score = 0.691
271 right answers, 209 wrong answers, 148
no answer
First 29 correct answers belonged to question
types with highly reliable patterns
Incorrectly identified answer strings = 13.6%
(excluding NIL answers)
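The confidence-weighted score used in the TREC 2002 QA track is computed over answers ordered from most to least confident: for each rank i, take the fraction of correct answers among the first i, then average those fractions over all questions. The sketch below follows that definition; the toy judgement lists are invented for illustration and are not the system's actual results.

```python
def confidence_weighted_score(judgements):
    """Compute the confidence-weighted score.

    judgements: list of booleans, one per question, ordered from the
    answer the system is most confident in to the least confident one.
    """
    correct_so_far = 0
    total = 0.0
    for rank, is_correct in enumerate(judgements, start=1):
        if is_correct:
            correct_so_far += 1
        total += correct_so_far / rank  # precision within the first `rank` answers
    return total / len(judgements)


# Same number of correct answers, different ordering by confidence.
confident_first = confidence_weighted_score([True, True, False, False])
confident_last = confidence_weighted_score([False, False, True, True])
print(confident_first, confident_last)
```

Because early ranks contribute to every later term, placing correct answers first raises the score, which is why a run of correct answers at the top of the ranking matters for this measure.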