United States Patent [19]
Li

[11] Patent Number: 5,920,859
[45] Date of Patent: Jul. 6, 1999

[75] Inventor: Yanhong Li, Scotch Plains, N.J.
[73] Assignee: IDD Enterprises, L.P., New York, N.Y.
[21] Appl. No.: 08/794,428
[22] Filed: Feb. 8, 1997
[51] Int. Cl.: G06F 17/30
[52] U.S. Cl.: 707/5; 707/10; 707/501
[58] Field of Search: 707/2, 4, 5, 10, 501

Primary Examiner: Thomas G. Black
Assistant Examiner: John C. Loomis
Attorney, Agent, or Firm: Marshall, O'Toole, Gerstein, Murray & Borun

[57] ABSTRACT

A search engine for retrieving documents pertinent to a query indexes documents in accordance with hyperlinks pointing to those documents. The indexer traverses the hypertext database and finds hypertext information including the address of the document the hyperlinks point to and the anchor text of each hyperlink. The information is stored in an inverted index file, which may also be used to calculate document link vectors for each hyperlink pointing to a particular document. When a query is entered, the search engine finds all document vectors for documents having the query terms in their anchor text. A query vector is also calculated, and the dot product of the query vector and each document link vector is calculated. The dot products relating to a particular document are summed to determine the relevance ranking for each document.

25 Claims, 6 Drawing Sheets

[Sheet 1 of 6: flowchart of query processing for the example query "java tutorial". Steps: input user query; search inverted file; find documents linked to query q; find document link vectors; calculate relevance score; sum relevance scores; output score result.]

[Sheet 2 of 6, FIG. 2: block diagram. The user query and search result pass through a user interface to a retrieval engine; the retrieval engine reads the index files (link file, document vector file, inverted file); an index engine builds the index files from the document database (Web).]
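The ranking procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation. The function names (`build_index`, `rank`), the whitespace tokenizer, and the use of raw term counts as vector weights are assumptions made for illustration only, since the abstract does not specify a weighting scheme. The URLs and anchor texts are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization.
    return text.lower().split()


def build_index(links):
    """Index hyperlinks given as (target_address, anchor_text) pairs.

    Returns an inverted file (term -> ids of links whose anchor text
    contains the term) and one link vector per hyperlink.
    """
    inverted = defaultdict(set)
    link_vectors = []
    for link_id, (target, anchor) in enumerate(links):
        # Assumption: raw term counts stand in for the vector weights.
        vec = Counter(tokenize(anchor))
        link_vectors.append((target, vec))
        for term in vec:
            inverted[term].add(link_id)
    return inverted, link_vectors


def rank(query, inverted, link_vectors):
    """Score each target document by summing, over the links pointing
    to it, the dot product of the query vector and the link vector."""
    q_vec = Counter(tokenize(query))

    # Links whose anchor text contains at least one query term.
    candidates = set()
    for term in q_vec:
        candidates |= inverted.get(term, set())

    scores = defaultdict(float)
    for link_id in candidates:
        target, vec = link_vectors[link_id]
        scores[target] += sum(w * vec[t] for t, w in q_vec.items())

    # Highest score first; ties broken by address for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical hyperlinks: (address pointed to, anchor text).
    links = [
        ("http://example.com/B", "java tutorial"),
        ("http://example.com/B", "tutorial on java"),
        ("http://example.com/D", "java"),
        ("http://example.com/E", "coffee"),
    ]
    inverted, link_vectors = build_index(links)
    for address, score in rank("java tutorial", inverted, link_vectors):
        print(address, score)
```

In this sketch a document is scored only from the anchor text of links that point to it, so a page pointed to by several matching links accumulates a higher score than a page with a single matching link.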
