2.1 Web Documents.
2.2 Resource Identifiers: URI, URL, and URN.
2.3 Protocols.
2.4 Log Files.
2.5 Search Engines.
Resources:
• Conceptual mappings to concrete or abstract entities, which do not
change in the short term
• e.g., the DTU website (web pages and other kinds of files)
Resource identifiers (used in hyperlinks):
• Strings of characters that represent generalized addresses and may
contain instructions for accessing the identified resource
• e.g., http://www.ics.uci.edu identifies the ICS homepage
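As a rough illustration of how an identifier carries access instructions, the sketch below splits the example address into its parts using Python's standard urllib.parse module. The helper name describe_uri and the choice to show an empty path as "/" are assumptions made for this example, not part of the slides.

```python
from urllib.parse import urlsplit


def describe_uri(uri: str) -> dict:
    """Split a URI into the parts a client needs to locate the resource."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,     # access method, e.g. "http"
        "host": parts.hostname,     # server that holds the resource
        "path": parts.path or "/",  # assumption: show an empty path as "/"
    }


# The ICS homepage address from the bullet above.
print(describe_uri("http://www.ics.uci.edu"))
```

The scheme tells the client which transfer protocol to use, and the host and path say where to ask for the resource.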
Transfer protocols:
• Conventions that regulate the communication between a browser
(web user agent) and a server
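To make the idea of a communication convention concrete, the following sketch builds the text of a request that a user agent could send to a server. It assumes HTTP/1.1 as the transfer protocol (the slides do not name one here); the function name build_get_request and the User-Agent value are hypothetical. Nothing is sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request for `url`."""
    parts = urlsplit(url)
    path = parts.path or "/"          # request target defaults to the root
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",            # request line: method, target, version
        f"Host: {parts.netloc}",           # Host header is mandatory in HTTP/1.1
        "User-Agent: example-agent/0.1",   # hypothetical agent name
        "Connection: close",
        "",                                # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("http://www.ics.uci.edu").decode("ascii"))
```

The server replies in the same line-oriented style: a status line, headers, a blank line, and then the resource body.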
<a href="relations/alumni">alumni</a>
• A link is a connection from one Web resource to another
• It has two ends, called anchors, and a direction
• It starts at the "source" anchor and points to the "destination"
anchor, which may be any Web resource (e.g., an image, a video clip,
a sound bite, a program, an HTML document)
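The source and destination of a link can be recovered from markup such as the anchor shown above. The sketch below uses Python's standard html.parser and urljoin to do that; the class name AnchorCollector is hypothetical, and the page address http://www.ics.uci.edu/ is only an assumed location for the source document, since the slides do not say which page contains the anchor.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collect (source, destination) pairs for every <a href=...> tag."""

    def __init__(self, source_url: str):
        super().__init__()
        self.source_url = source_url   # page that contains the source anchors
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve a relative reference against the source page address.
            destination = urljoin(self.source_url, href)
            self.links.append((self.source_url, destination))


collector = AnchorCollector("http://www.ics.uci.edu/")  # assumed source page
collector.feed('<a href="relations/alumni">alumni</a>')
print(collector.links)
```

Each pair keeps the direction of the link: the first element is where it starts, the second is where it points.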
• Business intelligence
• Monitoring Web sites and pages of interest
• Universal crawlers
• Preferential crawlers
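One common way to picture the two crawler types is by how they order the frontier of unvisited pages: a universal crawler treats every discovered page alike, while a preferential crawler visits the most promising page first. The sketch below is only an illustration under that assumption. It works on a small in-memory link graph, not the real Web, and the function names, the graph, and the scoring rule are all hypothetical.

```python
import heapq
from collections import deque


def universal_order(seeds, out_links):
    """Breadth-first visit order: every discovered URL is treated alike."""
    frontier, seen, order = deque(seeds), set(seeds), []
    while frontier:
        url = frontier.popleft()
        order.append(url)
        for nxt in out_links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return order


def preferential_order(seeds, out_links, score):
    """Best-first visit order: the highest-scoring URL is visited next."""
    frontier = [(-score(u), u) for u in seeds]   # negate for a max-heap
    heapq.heapify(frontier)
    seen, order = set(seeds), []
    while frontier:
        _, url = heapq.heappop(frontier)
        order.append(url)
        for nxt in out_links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-score(nxt), nxt))
    return order


# Hypothetical link graph: page -> pages it links to.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["e"]}
on_topic = {"c", "e"}   # hypothetical pages of interest

print(universal_order(["a"], graph))
print(preferential_order(["a"], graph, lambda u: 1 if u in on_topic else 0))
```

Both functions visit the same pages; only the order differs, which matters when a crawl must stop before the frontier is empty.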