
Introduction to Information Retrieval
CS276: Information Retrieval and Web Search
Christopher Manning and Prabhakar Raghavan

Lecture 2: The term vocabulary and postings lists


Recap of the previous lecture (Ch. 1)

- Basic inverted indexes:
  - Structure: dictionary and postings
  - Key step in construction: sorting
- Boolean query processing:
  - Intersection by linear-time "merging" of postings lists
  - Simple optimizations
- Overview of course topics


Plan for this lecture

- Elaborate basic indexing
- Preprocessing to form the term vocabulary
  - Documents
  - Tokenization
  - What terms do we put in the index?
- Postings
  - Faster merges: skip lists
  - Positional postings and phrase queries


Recall the basic indexing pipeline

  Documents to be indexed ("Friends, Romans, countrymen.")
    → Tokenizer → token stream: Friends Romans Countrymen
    → Linguistic modules → modified tokens: friend roman countryman
    → Indexer → inverted index:
         friend     → 2 → 4
         roman      → 1 → 2
         countryman → 13 → 16
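To make the pipeline concrete, here is a minimal sketch in Python. This is an illustration, not the course's reference code: the tokenizer, the "linguistic module" (just case folding and a crude plural chop), and the index layout are toy stand-ins for the components discussed in the rest of this lecture.

```python
from collections import defaultdict

def tokenize(text):
    # Toy tokenizer: split on whitespace, strip surrounding punctuation.
    toks = (t.strip('.,;:!?"') for t in text.split())
    return [t for t in toks if t]

def normalize(token):
    # Toy "linguistic module": case-fold and chop a plural -s
    # (a crude stand-in for the stemming/lemmatization slides below).
    token = token.lower()
    return token[:-1] if token.endswith('s') and len(token) > 3 else token

def build_index(docs):
    # docs: {doc_id: text} -> {term: sorted list of doc_ids}
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for tok in tokenize(text):
            postings[normalize(tok)].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

print(build_index({1: "Friends, Romans, countrymen.", 2: "Roman friends!"}))
# {'friend': [1, 2], 'roman': [1, 2], 'countrymen': [1]}
```

Each stage is a separate function precisely so that the choices discussed on the following slides (hyphen handling, stop lists, stemming) can be swapped in independently.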

Parsing a document (Sec. 2.1)

- What format is it in?
  - pdf / word / excel / html?
- What language is it in?
- What character set is in use?

Each of these is a classification problem, which we will study later in
the course. But these tasks are often done heuristically ...


Complications: Format/language (Sec. 2.1)

- Documents being indexed can include docs from many different
  languages
  - A single index may have to contain terms of several languages.
- Sometimes a document or its components can contain multiple
  languages/formats
  - French email with a German pdf attachment.
- What is a unit document?
  - A file?
  - An email? (Perhaps one of many in an mbox.)
  - An email with 5 attachments?
  - A group of files (PPT or LaTeX as HTML pages)


TOKENS AND TERMS


Tokenization (Sec. 2.2.1)

- Input: "Friends, Romans and Countrymen"
- Output: tokens
  - Friends
  - Romans
  - Countrymen
- A token is an instance of a sequence of characters
- Each such token is now a candidate for an index entry, after further
  processing
  - Described below
- But what are valid tokens to emit?


Tokenization (Sec. 2.2.1)

- Issues in tokenization:
  - Finland's capital → Finland? Finlands? Finland's?
  - Hewlett-Packard → Hewlett and Packard as two tokens?
    - state-of-the-art: break up hyphenated sequence.
    - co-education
    - lowercase, lower-case, lower case?
    - It can be effective to get the user to put in possible hyphens
  - San Francisco: one token or two?
    - How do you decide it is one token?


Numbers (Sec. 2.2.1)

- 3/20/91     Mar. 12, 1991     20/3/91
- 55 B.C.
- B-52
- My PGP key is 324a3df234cb23e
- (800) 234-2333
  - Often have embedded spaces
- Older IR systems may not index numbers
  - But often very useful: think about things like looking up error
    codes/stacktraces on the web
  - (One answer is using n-grams: Lecture 3)
- Will often index "meta-data" separately
  - Creation date, format, etc.
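The policy questions above (split hyphens? index numbers?) can be made explicit parameters. A hedged sketch, with a hand-rolled regex that is only illustrative of the token classes just discussed:

```python
import re

# One alternative per token class; order matters (earlier wins).
TOKEN_RE = re.compile(r"""
    \d{1,2}/\d{1,2}/\d{2,4}        # dates like 3/20/91
  | \(\d{3}\)\s*\d{3}-\d{4}        # phone numbers like (800) 234-2333
  | \w+(?:[-']\w+)*                # words, incl. hyphens / apostrophes
""", re.VERBOSE)

def tokenize(text, split_hyphens=True, index_numbers=True):
    tokens = []
    for tok in TOKEN_RE.findall(text):
        if not index_numbers and any(c.isdigit() for c in tok):
            continue                 # older-IR-style policy: drop numbers
        if split_hyphens and '-' in tok and not any(c.isdigit() for c in tok):
            tokens.extend(tok.split('-'))   # Hewlett-Packard -> two tokens
        else:
            tokens.append(tok)               # but keep B-52 whole
    return tokens

print(tokenize("Hewlett-Packard built the B-52; call (800) 234-2333."))
# ['Hewlett', 'Packard', 'built', 'the', 'B-52', 'call', '(800) 234-2333']
```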


Tokenization: language issues (Sec. 2.2.1)

- French
  - L'ensemble → one token or two?
    - L ? L' ? Le ?
    - Want l'ensemble to match with un ensemble
      - Until at least 2003, it didn't on Google
      - Internationalization!
- German noun compounds are not segmented
  - Lebensversicherungsgesellschaftsangestellter
  - 'life insurance company employee'
  - German retrieval systems benefit greatly from a compound splitter
    module
    - Can give a 15% performance boost for German


Tokenization: language issues (Sec. 2.2.1)

- Chinese and Japanese have no spaces between words:
  - 莎拉波娃现在居住在美国东南部的佛罗里达。
  - Not always guaranteed a unique tokenization
- Further complicated in Japanese, with multiple alphabets
  intermingled
  - Dates/amounts in multiple formats:
    フォーチュン500社は情報不足のため時間あた$500K(約6,000万円)
    (Katakana, Hiragana, Kanji, Romaji)
  - End-user can express a query entirely in hiragana!

Tokenization: language issues (Sec. 2.2.1)

- Arabic (or Hebrew) is basically written right to left, but with
  certain items like numbers written left to right
- Words are separated, but letter forms within a word form complex
  ligatures

  [Arabic example sentence, lost in extraction; it reads right to
  left, with the numbers '1962' and '132' read left to right:]
  'Algeria achieved its independence in 1962 after 132 years of
  French occupation.'

- With Unicode, the surface presentation is complex, but the stored
  form is straightforward


Stop words (Sec. 2.2.2)

- With a stop list, you exclude from the dictionary entirely the
  commonest words. Intuition:
  - They have little semantic content: the, a, and, to, be
  - There are a lot of them: ~30% of postings for top 30 words
- But the trend is away from doing this:
  - Good compression techniques (lecture 5) mean the space for
    including stop words in a system is very small
  - Good query optimization techniques (lecture 7) mean you pay little
    at query time for including stop words.
- You need them for:
  - Phrase queries: "King of Denmark"
  - Various song titles, etc.: "Let it be", "To be or not to be"
  - "Relational" queries: "flights to London"
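A minimal sketch of the classical approach, assuming tokens arrive as a plain Python list: take the k commonest terms as the stop list and drop them before indexing.

```python
from collections import Counter

def build_stop_list(token_stream, k=30):
    # The k commonest terms; the top 30 words account for roughly 30%
    # of postings in typical English text.
    return {term for term, _ in Counter(token_stream).most_common(k)}

tokens = "to be or not to be that is the question".split()
stops = build_stop_list(tokens, k=2)
print(stops)                                    # {'to', 'be'}
print([t for t in tokens if t not in stops])    # terms that reach the index
```

Note the slide's caveat: a phrase query like "To be or not to be" is unanswerable from this filtered index, which is one reason the trend is away from stopping.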


Normalization to terms (Sec. 2.2.3)

- We need to "normalize" words in indexed text as well as query words
  into the same form
  - We want to match U.S.A. and USA
- Result is terms: a term is a (normalized) word type, which is an
  entry in our IR system dictionary
- We most commonly implicitly define equivalence classes of terms by,
  e.g.,
  - deleting periods to form a term
    - U.S.A., USA → USA
  - deleting hyphens to form a term
    - anti-discriminatory, antidiscriminatory → antidiscriminatory


Normalization: other languages (Sec. 2.2.3)

- Accents: e.g., French résumé vs. resume.
- Umlauts: e.g., German: Tuebingen vs. Tübingen
  - Should be equivalent
- Most important criterion:
  - How are your users likely to write their queries for these words?
- Even in languages that standardly have accents, users often may not
  type them
  - Often best to normalize to a de-accented term
    - Tuebingen, Tübingen, Tubingen → Tubingen
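A minimal sketch of such implicit equivalence classing (my illustration, not the deck's code): map every token to a canonical form by deleting periods and hyphens and stripping accents, and apply the same map to both documents and queries.

```python
import unicodedata

def normalize(token):
    # Delete periods and hyphens:
    #   U.S.A. -> USA,  anti-discriminatory -> antidiscriminatory
    token = token.replace('.', '').replace('-', '')
    # Strip accents: decompose (NFKD), then drop combining marks, so
    # Tübingen and Tubingen fall into the same equivalence class.
    # (Mapping German 'ue' -> 'u', as in Tuebingen, would need an
    # extra language-specific rule.)
    decomposed = unicodedata.normalize('NFKD', token)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

for t in ['U.S.A.', 'USA', 'anti-discriminatory', 'Tübingen', 'Tubingen']:
    print(t, '->', normalize(t))
```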


Normalization: other languages (Sec. 2.2.3)

- Normalization of things like date forms
  - 7月30日 vs. 7/30
  - Japanese use of kana vs. Chinese characters
- Tokenization and normalization may depend on the language and so is
  intertwined with language detection
  - Morgen will ich in MIT ... is this German "mit"?
- Crucial: need to "normalize" indexed text as well as query terms
  into the same form


Case folding (Sec. 2.2.3)

- Reduce all letters to lower case
  - Exception: upper case in mid-sentence?
    - e.g., General Motors
    - Fed vs. fed
    - SAIL vs. sail
  - Often best to lower case everything, since users will use
    lowercase regardless of 'correct' capitalization...
- Google example:
  - Query C.A.T.
  - #1 result is for "cat" (well, Lolcats), not Caterpillar Inc.
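A sketch of the mid-sentence heuristic hinted at above (names and token conventions are mine): lowercase only sentence-initial tokens, whose capitalization carries no information, and keep mid-sentence capitals.

```python
def fold_case(tokens):
    # Heuristic sketch: lowercase sentence-initial tokens; keep
    # mid-sentence capitals (General Motors, Fed, SAIL).
    # Many engines just lowercase everything instead.
    folded, sentence_start = [], True
    for tok in tokens:
        folded.append(tok.lower() if sentence_start else tok)
        sentence_start = tok in {'.', '!', '?'}
    return folded

print(fold_case("The Fed raises rates . General Motors fell .".split()))
# ['the', 'Fed', 'raises', 'rates', '.', 'general', 'Motors', 'fell', '.']
# Note: sentence-initial 'General' is wrongly folded -- one reason
# simply lowercasing everything is the common choice.
```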


3
Normalization to terms (Sec. 2.2.3)

- An alternative to equivalence classing is to do asymmetric expansion
- An example of where this may be useful
  - Enter: window    Search: window, windows
  - Enter: windows   Search: Windows, windows, window
  - Enter: Windows   Search: Windows
- Potentially more powerful, but less efficient


Thesauri and soundex

- Do we handle synonyms and homonyms?
  - E.g., by hand-constructed equivalence classes
    - car = automobile    color = colour
  - We can rewrite to form equivalence-class terms
    - When the document contains automobile, index it under
      car-automobile (and vice-versa)
  - Or we can expand a query
    - When the query contains automobile, look under car as well
- What about spelling mistakes?
  - One approach is Soundex, which forms equivalence classes of words
    based on phonetic heuristics
- More in lectures 3 and 9
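For reference, a compact sketch of classic American Soundex, the phonetic hashing scheme mentioned above (the textbook algorithm coded straightforwardly, not any production variant):

```python
def soundex(word):
    # Keep the first letter; encode the rest as digits by sound class;
    # collapse adjacent duplicates; vowels reset the duplicate check,
    # h/w do not; pad/truncate to 4 characters.
    codes = {**dict.fromkeys('bfpv', '1'), **dict.fromkeys('cgjkqsxz', '2'),
             **dict.fromkeys('dt', '3'), 'l': '4',
             **dict.fromkeys('mn', '5'), 'r': '6'}
    word = word.lower()
    first, digits = word[0].upper(), []
    prev = codes.get(word[0], '')
    for ch in word[1:]:
        code = codes.get(ch, '')
        if code and code != prev:
            digits.append(code)
        if ch not in 'hw':
            prev = code
    return (first + ''.join(digits) + '000')[:4]

print(soundex('Robert'), soundex('Rupert'))   # R163 R163 -- same class
print(soundex('Herman'))                      # H655
```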


Lemmatization (Sec. 2.2.4)

- Reduce inflectional/variant forms to base form
- E.g.,
  - am, are, is → be
  - car, cars, car's, cars' → car
  - the boy's cars are different colors → the boy car be different
    color
- Lemmatization implies doing "proper" reduction to dictionary
  headword form


Stemming (Sec. 2.2.4)

- Reduce terms to their "roots" before indexing
- "Stemming" suggests crude affix chopping
  - language dependent
  - e.g., automate(s), automatic, automation all reduced to automat.

    Original text:                    After stemming:
    for example compressed and        for exampl compress and
    compression are both accepted     compress ar both accept
    as equivalent to compress.        as equival to compress

Porter's algorithm (Sec. 2.2.4)

- Commonest algorithm for stemming English
  - Results suggest it's at least as good as other stemming options
- Conventions + 5 phases of reductions
  - phases applied sequentially
  - each phase consists of a set of commands
  - sample convention: of the rules in a compound command, select the
    one that applies to the longest suffix.


Typical rules in Porter (Sec. 2.2.4)

- sses → ss
- ies → i
- ational → ate
- tional → tion

- Weight-of-word sensitive rules
  - (m>1) EMENT →
    - replacement → replac
    - cement → cement
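A toy sketch of how such a compound command behaves (my simplification; the real algorithm implements the full measure-m machinery): rules compete for the longest matching suffix, and weight-sensitive rules fire only when enough of the word precedes the suffix.

```python
# (suffix, replacement, min_stem_len) -- min_stem_len crudely stands in
# for Porter's measure condition m > 1.
RULES = [('ational', 'ate', 0), ('tional', 'tion', 0),
         ('sses', 'ss', 0), ('ies', 'i', 0), ('ement', '', 3)]

def apply_rules(word):
    # Longest applicable suffix wins, per the sample convention above.
    for suffix, repl, min_stem in sorted(RULES, key=lambda r: -len(r[0])):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[:len(word) - len(suffix)] + repl
    return word

for w in ['relational', 'ponies', 'caresses', 'replacement', 'cement']:
    print(w, '->', apply_rules(w))
# relational -> relate, ponies -> poni, caresses -> caress,
# replacement -> replac, cement -> cement (stem 'c' is too short)
```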


Other stemmers (Sec. 2.2.4)

- Other stemmers exist, e.g., Lovins stemmer
  - http://www.comp.lancs.ac.uk/computing/research/stemming/general/lovins.htm
  - Single-pass, longest suffix removal (about 250 rules)
- Full morphological analysis – at most modest benefits for retrieval
- Do stemming and other normalizations help?
  - English: very mixed results. Helps recall for some queries but
    harms precision on others
    - E.g., operative (dentistry) ⇒ oper
  - Definitely useful for Spanish, German, Finnish, ...
    - 30% performance gains for Finnish!


Language-specificity (Sec. 2.2.4)

- Many of the above features embody transformations that are
  - Language-specific and
  - Often, application-specific
- These are "plug-in" addenda to the indexing process
- Both open source and commercial plug-ins are available for handling
  these

Dictionary entries – first cut (Sec. 2.2)

    ensemble.french
    時間.japanese
    MIT.english
    mit.german
    guaranteed.english
    entries.english
    sometimes.english
    tokenization.english

These may be grouped by language (or not...). More on this in
ranking/query processing.


FASTER POSTINGS MERGES:
SKIP POINTERS/SKIP LISTS

Recall basic merge (Sec. 2.3)

- Walk through the two postings lists simultaneously, in time linear
  in the total number of postings entries:

    Brutus:  2 → 4 → 8 → 41 → 48 → 64 → 128
    Caesar:  1 → 2 → 3 → 8 → 11 → 17 → 21 → 31

- If the list lengths are m and n, the merge takes O(m+n) operations.
- Can we do better? Yes (if index isn't changing too fast).


Augment postings with skip pointers (at indexing time) (Sec. 2.3)

    2 → 4 → 8 → 41 → 48 → 64 → 128      (skip pointers: 2 → 41 → 128)
    1 → 2 → 3 → 8 → 11 → 17 → 21 → 31   (skip pointers: 1 → 11 → 31)

- Why? To skip postings that will not figure in the search results.
- How?
- Where do we place skip pointers?
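For reference, a sketch of that linear merge (the standard two-pointer intersection; the function name is mine):

```python
def intersect(p1, p2):
    # Linear-time merge of two sorted postings lists: O(m+n).
    answer, i, j = [], 0, 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i, j = i + 1, j + 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return answer

print(intersect([2, 4, 8, 41, 48, 64, 128], [1, 2, 3, 8, 11, 17, 21, 31]))
# [2, 8]
```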


Query processing with skip pointers (Sec. 2.3)

    2 → 4 → 8 → 41 → 48 → 64 → 128      (skips: 2 → 41 → 128)
    1 → 2 → 3 → 8 → 11 → 17 → 21 → 31   (skips: 1 → 11 → 31)

Suppose we've stepped through the lists until we process 8 on each
list. We match it and advance. We then have 41 on the upper list and
11 on the lower; 11 is smaller. But the skip successor of 11 on the
lower list is 31, so we can skip ahead past the intervening postings.


Where do we place skips? (Sec. 2.3)

- Tradeoff:
  - More skips → shorter skip spans ⇒ more likely to skip. But lots
    of comparisons to skip pointers.
  - Fewer skips → few pointer comparisons, but then long skip spans
    ⇒ few successful skips.
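A sketch of the merge with skips, in the spirit of IIR Figure 2.10. The representation is assumed, not the deck's: each postings list is a sorted Python list, an entry at an index divisible by the skip length is treated as carrying a skip pointer, and skips are spaced using the √L heuristic from the "Placing skips" slide below.

```python
import math

def skip_len(postings):
    # Simple heuristic: about sqrt(L) evenly spaced skip pointers.
    return int(math.sqrt(len(postings))) or 1

def intersect_with_skips(p1, p2):
    s1, s2 = skip_len(p1), skip_len(p2)
    answer, i, j = [], 0, 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i, j = i + 1, j + 1
        elif p1[i] < p2[j]:
            moved = False
            while i % s1 == 0 and i + s1 < len(p1) and p1[i + s1] <= p2[j]:
                i += s1              # follow skip pointers while useful
                moved = True
            if not moved:
                i += 1               # no useful skip: advance one posting
        else:
            moved = False
            while j % s2 == 0 and j + s2 < len(p2) and p2[j + s2] <= p1[i]:
                j += s2
                moved = True
            if not moved:
                j += 1
    return answer

print(intersect_with_skips([2, 4, 8, 41, 48, 64, 128],
                           [1, 2, 3, 8, 11, 17, 21, 31]))   # [2, 8]
```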

Placing skips

- Simple heuristic: for postings of length L, use √L evenly-spaced
  skip pointers.
- This ignores the distribution of query terms.
- Easy if the index is relatively static; harder if L keeps changing
  because of updates.

- This definitely used to help; with modern hardware it may not
  (Bahle et al. 2002) unless you're memory-based
  - The I/O cost of loading a bigger postings list can outweigh the
    gains from quicker in-memory merging!


PHRASE QUERIES AND POSITIONAL INDEXES


Phrase queries (Sec. 2.4)

- Want to be able to answer queries such as "stanford university" –
  as a phrase
- Thus the sentence "I went to university at Stanford" is not a match.
  - The concept of phrase queries has proven easily understood by
    users; one of the few "advanced search" ideas that works
  - Many more queries are implicit phrase queries
- For this, it no longer suffices to store only <term : docs> entries


A first attempt: Biword indexes (Sec. 2.4.1)

- Index every consecutive pair of terms in the text as a phrase
- For example the text "Friends, Romans, Countrymen" would generate
  the biwords
  - friends romans
  - romans countrymen
- Each of these biwords is now a dictionary term
- Two-word phrase query-processing is now immediate.
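A minimal sketch of biword generation (illustrative; it builds on the toy tokenizer earlier in these notes):

```python
def biwords(tokens):
    # Every consecutive pair of terms becomes one dictionary term.
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]

print(biwords(["friends", "romans", "countrymen"]))
# ['friends romans', 'romans countrymen']

# A longer phrase query becomes an AND of its biwords -- with the
# false-positive risk discussed on the next slide:
print(biwords("stanford university palo alto".split()))
# ['stanford university', 'university palo', 'palo alto']
```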


Longer phrase queries (Sec. 2.4.1)

- Longer phrases are processed as we did with wild-cards:
- stanford university palo alto can be broken into the Boolean query
  on biwords:

    stanford university AND university palo AND palo alto

- Without the docs, we cannot verify that the docs matching the above
  Boolean query do contain the phrase.
- Can have false positives!


Extended biwords (Sec. 2.4.1)

- Parse the indexed text and perform part-of-speech-tagging (POST).
- Bucket the terms into (say) Nouns (N) and articles/prepositions (X).
- Call any string of terms of the form NX*N an extended biword.
- Each such extended biword is now made a term in the dictionary.
- Example: catcher in the rye
              N     X  X   N
- Query processing: parse it into N's and X's
  - Segment query into enhanced biwords
  - Look up in index: catcher rye

Issues for biword indexes (Sec. 2.4.1)

- False positives, as noted before
- Index blowup due to bigger dictionary
  - Infeasible for more than biwords, big even for them
- Biword indexes are not the standard solution (for all biwords) but
  can be part of a compound strategy


Solution 2: Positional indexes (Sec. 2.4.2)

- In the postings, store, for each term, the position(s) at which
  tokens of it appear:

    <term, number of docs containing term;
     doc1: position1, position2 ... ;
     doc2: position1, position2 ... ;
     etc.>


Positional index example (Sec. 2.4.2)

    <be: 993427;
     1: 7, 18, 33, 72, 86, 231;
     2: 3, 149;
     4: 17, 191, 291, 430, 434;
     5: 363, 367, ...>

  Which of docs 1, 2, 4, 5 could contain "to be or not to be"?

- For phrase queries, we use a merge algorithm recursively at the
  document level
- But we now need to deal with more than just equality


Processing a phrase query (Sec. 2.4.2)

- Extract inverted index entries for each distinct term: to, be, or,
  not.
- Merge their doc:position lists to enumerate all positions with
  "to be or not to be".
  - to:
    - 2:1,17,74,222,551; 4:8,16,190,429,433; 7:13,23,191; ...
  - be:
    - 1:17,19; 4:17,191,291,430,434; 5:14,19,101; ...
- Same general method for proximity searches
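A sketch of the document-level merge with a position-level check inside (an illustration of the idea, not IIR's exact pseudocode; positional postings are modeled here as {term: {doc_id: [positions]}}):

```python
def phrase_docs(index, w1, w2):
    # Docs where some occurrence of w2 immediately follows w1:
    # document-level intersection first, then the positional check.
    hits = []
    for doc in sorted(index[w1].keys() & index[w2].keys()):
        pos2 = set(index[w2][doc])
        if any(p + 1 in pos2 for p in index[w1][doc]):
            hits.append(doc)
    return hits

index = {
    'to': {2: [1, 17, 74, 222, 551], 4: [8, 16, 190, 429, 433],
           7: [13, 23, 191]},
    'be': {1: [17, 19], 4: [17, 191, 291, 430, 434], 5: [14, 19, 101]},
}
print(phrase_docs(index, 'to', 'be'))
# [4] -- "to be" at positions 16-17, 429-430, 433-434
```

For a longer phrase such as "to be or not to be" you would chain such checks with relative offsets (or AND successive word pairs and then verify).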


Proximity queries (Sec. 2.4.2)

- LIMIT! /3 STATUTE /3 FEDERAL /2 TORT
  - Again, here, /k means "within k words of".
- Clearly, positional indexes can be used for such queries; biword
  indexes cannot.
- Exercise: Adapt the linear merge of postings to handle proximity
  queries. Can you make it work for any value of k?
  - This is a little tricky to do correctly and efficiently
  - See Figure 2.12 of IIR
  - There's likely to be a problem on it!


Positional index size (Sec. 2.4.2)

- You can compress position values/offsets: we'll talk about that in
  lecture 5
- Nevertheless, a positional index expands postings storage
  substantially
- Nevertheless, a positional index is now standardly used because of
  the power and usefulness of phrase and proximity queries ...
  whether used explicitly or implicitly in a ranking retrieval system.
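A simple sketch of the proximity variant, using the same positional-index model as above. It checks all position pairs, which is quadratic per document in the number of occurrences; IIR Figure 2.12 achieves the same result in one linear pass with a sliding window over positions.

```python
def within_k(index, w1, w2, k):
    # Docs where some occurrence of w1 lies within k words of some w2.
    hits = []
    for doc in sorted(index[w1].keys() & index[w2].keys()):
        if any(abs(a - b) <= k
               for a in index[w1][doc] for b in index[w2][doc]):
            hits.append(doc)
    return hits

index = {'to': {2: [1, 17, 74], 4: [8, 16, 190]},
         'be': {1: [17, 19], 4: [17, 191, 291]}}
print(within_k(index, 'to', 'be', 1))   # [4]
```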


Positional index size (Sec. 2.4.2)

- Need an entry for each occurrence, not just once per document
- Index size depends on average document size (Why?)
  - Average web page has <1000 terms
  - SEC filings, books, even some epic poems ... easily 100,000 terms
- Consider a term with frequency 0.1%: it still contributes only one
  docID per document, but the number of positional postings grows
  with document length.

    Document size | Postings | Positional postings
    --------------|----------|--------------------
    1000          | 1        | 1
    100,000       | 1        | 100


Rules of thumb (Sec. 2.4.2)

- A positional index is 2–4 times as large as a non-positional index
- Positional index size is 35–50% of the volume of the original text
- Caveat: all of this holds for "English-like" languages

Combination schemes (Sec. 2.4.3)

- These two approaches can be profitably combined
  - For particular phrases ("Michael Jackson", "Britney Spears") it
    is inefficient to keep on merging positional postings lists
    - Even more so for phrases like "The Who"
- Williams et al. (2004) evaluate a more sophisticated mixed indexing
  scheme
  - A typical web query mixture was executed in ¼ of the time of
    using just a positional index
  - It required 26% more space than having a positional index alone


Resources for today's lecture

- IIR 2
- MG 3.6, 4.3; MIR 7.2
- Porter's stemmer: http://www.tartarus.org/~martin/PorterStemmer/
- Skip Lists theory: Pugh (1990)
  - Multilevel skip lists give the same O(log n) efficiency as trees
- H.E. Williams, J. Zobel, and D. Bahle. 2004. "Fast Phrase Querying
  with Combined Indexes." ACM Transactions on Information Systems.
  http://www.seg.rmit.edu.au/research/research.php?author=4
- D. Bahle, H. Williams, and J. Zobel. 2002. "Efficient Phrase
  Querying with an Auxiliary Index." SIGIR 2002, pp. 215-221.

