COMPARISON OF THREE COMMONLY USED SEARCH ENGINE TECHNIQUES

NADKAR ASHISH ANANT
ashishanadkar@gmail.com
Contact No: 9892073460

FULIA ISANKUMAR MANGALDAS
isanfulia@gmail.com
Contact No: 9987866545

FR. CONCEICAO RODRIGUES COLLEGE OF ENGINEERING


INTRODUCTION
• What is a Search Engine?
A program that searches documents for specified
keywords and returns a list of the documents where
the keywords were found.
SIMILARITIES…

The backbone of all major search engines is the same, involving three vital stages:
• Crawling
• Link map & Indexing
• Searching or Result matching
CRAWLER
The crawling process involves identifying, downloading and storing in a
database as many potentially useful web pages as possible, given
constraints of time, bandwidth and storage.

The crawled data is later used for indexing and searching.


Web crawlers are also called spiders or bots.

Some examples of specific search engine spiders are:

- Googlebot (Google)
- MSNbot (MSN)
- Slurp (Yahoo!)
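
The crawl loop described above can be sketched in a few lines of Python. This is only an illustrative breadth-first crawler, not how Googlebot, MSNbot or Slurp actually work. The `crawl` function, the `fetch` callback, the `max_pages` budget (standing in for the time, bandwidth and storage constraints) and the `FAKE_WEB` site are all assumptions made for the example, so that it runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, fetch, max_pages=100):
    """Breadth-first crawl starting at seed_url; returns {url: html}.

    fetch is a caller-supplied function mapping a URL to its HTML,
    or to None when the page cannot be downloaded.
    """
    store = {}                  # the "database" of downloaded pages
    queue = deque([seed_url])   # URLs identified but not yet downloaded
    seen = {seed_url}           # avoids queuing the same URL twice
    while queue and len(store) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        store[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return store


# Hypothetical three-page site used in place of real HTTP requests.
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/">home</a>',
    "http://example.test/b": '<a href="/a">A</a> <a href="/missing">x</a>',
}

pages = crawl("http://example.test/", FAKE_WEB.get, max_pages=10)
print(sorted(pages))
```

A real spider would replace `FAKE_WEB.get` with an HTTP download and would also need politeness delays and robots.txt handling, which are left out here.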
INDEXER

Indexing is the process of
• extracting text from web pages,
• tokenizing it, then
• creating an index structure (inverted index).
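
The three indexing steps can be sketched as follows. This is a minimal illustration under simple assumptions: the helper names (`extract_text`, `tokenize`, `build_inverted_index`) and the two sample pages are made up for the example, tokens are lowercase runs of letters and digits, and the index maps each token to the set of documents containing it. Production indexers also store positions and frequencies, and skip script and style content.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Gathers the text found between HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(html):
    """Step 1: strip the markup and keep the text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def tokenize(text):
    """Step 2: split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(pages):
    """Step 3: map every token to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, html in pages.items():
        for token in tokenize(extract_text(html)):
            index[token].add(doc_id)
    return index


# Hypothetical sample pages.
pages = {
    "doc1": "<html><body><h1>Web crawlers</h1><p>Spiders crawl the web.</p></body></html>",
    "doc2": "<p>An inverted index maps words to documents.</p>",
}
index = build_inverted_index(pages)
print(sorted(index["web"]))    # ['doc1']
print(sorted(index["index"]))  # ['doc2']
```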
SEARCHING
Searching is the process of looking up words in an index
to find documents where they appear.

Quality of a search is typically described using two metrics:

• Recall measures how well the search system finds relevant documents.
• Precision measures how well the system filters out the irrelevant documents.
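
A small sketch can make the lookup and both metrics concrete. Using the standard definitions, precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. The tiny index, the query, the relevance judgments and the function names below are hypothetical, and the lookup assumes every query word must appear in a document (AND semantics), which real engines relax.

```python
def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings)


def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical inverted index: word -> set of document ids.
index = {
    "search": {"d1", "d2", "d3"},
    "engine": {"d1", "d2", "d4"},
    "ranking": {"d2"},
}

retrieved = search(index, "search engine")  # documents d1 and d2
relevant = {"d2", "d3", "d4"}               # assumed human judgments

precision, recall = precision_recall(retrieved, relevant)
print(precision)  # 0.5: one of the two retrieved documents is relevant
print(recall)     # one of the three relevant documents was found
```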
FLOW
DISSIMILARITIES…

FACTOR 1: Ranking Algorithm

• Google uses the PageRank algorithm.

• Yahoo! takes its web directory as part of its ranking algorithm. It also shows selection-based search.

• MSN Live has a poor relevancy algorithm.
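
For readers unfamiliar with PageRank, the idea can be sketched with the simplified textbook formulation: a page's score is built from the scores of the pages linking to it, computed by repeated iteration. This is not Google's production ranking. The damping factor of 0.85, the iteration count, the handling of pages without outgoing links and the four-page link graph are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small base score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: C is linked to by A, B and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this graph C ends up with the highest score because it has the most incoming links, while D, which nothing links to, keeps only the base score.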
DISSIMILARITIES…

FACTOR 2: Hit Count Estimation

The Hit Count Estimate (HCE) is a number near the top of the results page
estimating the total number of results available to the search engine.
Multiple HCEs are used in research for comparisons.

• Google and Live Search have particularly high HCE values, which are highly correlated with each other.
• Yahoo! correlates less with both Google and Live Search and hence has a low relative HCE.
DISSIMILARITIES…

FACTOR 3: Page Content

Google heavily biases search results toward informational resources.

Yahoo! offers a paid inclusion program, hence its page content is commerce driven.

MSN places too much weight on page content and shows a heavy bias towards commercial results.
DISSIMILARITIES…

FACTOR 4: Crawling Ability

Google is more efficient at crawling than competing engines.

Yahoo! is pretty good at crawling sites deeply so long as they have sufficient link popularity to get all their pages indexed.

MSN is nowhere near as comprehensive as Yahoo! or Google at crawling deeply through large sites.
DISSIMILARITIES…

FACTOR 5: Query Processing

Google is much better than Yahoo! or MSN at determining the true intent of a query.
• It does concept matching.
• Search results are biased toward informational websites.

Yahoo! does more text matching when compared to Google.

MSN might be a bit better than Yahoo! at processing queries for meaning instead of taking them quite so literally.
DISSIMILARITIES…

FACTOR 6: Link Reputation & Site Age

Google is much better at differentiating between real editorial citations and low-quality, spammed, bought, or artificial links. Older sites are trusted more.

Yahoo! rankings can be manipulated using low- to mid-quality links and somewhat to aggressively focused anchor text. It places some weight on older sites.

MSN is as good as other major search engines at telling the difference between real organic citations and low-quality links. MSN search ranks new sites higher due to link bursts.
SEARCH ENGINE RATINGS
CONCLUSION

• Google is much better than Yahoo! or MSN at determining the true intent of a query and trying to match that instead of doing direct text matching.

• Yahoo! has a bit of a bias toward commercial search results. It is also worth noting that Google's organic search results are heavily biased toward informational websites and web pages.

• Google's major data structures make efficient use of available storage space. Furthermore, the crawling, indexing, and sorting operations are efficient enough to be able to build an index of a substantial portion of the web, 24 million pages, in less than one week.
OUR VIEWPOINTS

- Domain-specific search options
This will lead to domain-specific (and so more accurate) results and truncate the others; the search will even be quicker.
- Search providing results for image queries
- Enhanced map searches, which may make use of features like geotagging
- Integration of different types of searches into a single search engine
This will enhance the overall browsing experience.
THANKS
