
Web and HTTP

First, a review…
• web page consists of objects
• object can be HTML file, JPEG image, Java applet,
audio file,…
• web page consists of base HTML-file which includes
several referenced objects
• each object is addressable by a URL, e.g.,
  www.someschool.edu/someDept/pic.gif
  (host name: www.someschool.edu, path name: /someDept/pic.gif)
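The host name / path name split can be illustrated with Python's standard urllib.parse module. This is only a sketch: the helper name split_url is hypothetical, and it assumes an http:// scheme when the URL is written without one, as on the slide.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (host name, path name) for a URL.

    Assumption: a URL written without a scheme, as on the slide,
    is treated as an http:// URL.
    """
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    return parts.hostname, parts.path


host, path = split_url("www.someschool.edu/someDept/pic.gif")
print("host name:", host)
print("path name:", path)
```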

Application Layer 2-1


HTTP Overview
HTTP: hypertext transfer protocol
• Web's application-layer protocol
• client/server model
  • client: browser that requests, receives (using HTTP protocol), and displays Web objects
  • server: Web server sends (using HTTP protocol) objects in response to requests

[Figure: a PC running Firefox browser and an iPhone running Safari browser each exchange HTTP requests and HTTP responses with a server running Apache Web server]


Beyond the Simple Web Server
• Simple case: Web servers have static content
• Advanced scenario: Web servers create content on the fly
  • Data reside on a database server
    • Oracle, MySQL, SQL Server
  • A server-side scripting language retrieves the data and builds HTML pages that were not stored on the web server
    • PHP, ASP.NET, Python, Node.js

[Figure: PC running Firefox browser, server running Apache Web server, server running MySQL]

Beyond the Simple Web Server (contd.)
• Simple case: one server, one web site
  • Does not scale
• Advanced scenario: one web site, multiple servers
  • A load balancer distributes the load equally among the servers

[Figure: a load balancer in front of server 1, server 2, and server 3]

HTTP Overview (continued)
uses TCP:
• client initiates TCP connection (creates socket) to server, port 80
• server accepts TCP connection from client
• HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
• TCP connection closed

HTTP is "stateless"
• server maintains no information about past client requests

aside: protocols that maintain "state" are complex!
• past history (state) must be maintained
• if server/client crashes, their views of "state" may be inconsistent, must be reconciled


HTTP Connections
non-persistent HTTP
• at most one object sent over TCP connection
  • connection then closed
• downloading multiple objects requires multiple connections

persistent HTTP
• multiple objects can be sent over single TCP connection between client, server


Non-persistent HTTP
suppose user enters URL: www.someSchool.edu/someDepartment/home.index
(contains text, references to 10 jpeg images)

1a. HTTP client initiates TCP connection to HTTP server (process) at www.someSchool.edu on port 80
1b. HTTP server at host www.someSchool.edu waiting for TCP connection at port 80 "accepts" connection, notifying client
2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object someDepartment/home.index
3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket

Non-persistent HTTP (cont.)
4. HTTP server closes TCP connection.
5. HTTP client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects


Non-persistent HTTP: Response Time
RTT (definition): time for a small packet to travel from client to server and back
HTTP response time:
• one RTT to initiate TCP connection
• one RTT for HTTP request and first few bytes of HTTP response to return
• file transmission time
• non-persistent HTTP response time = 2RTT + file transmission time

[Figure: client/server timeline: initiate TCP connection (one RTT), request file (one RTT), time to transmit file, file received]
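The response-time formula can be checked with a few lines of Python. This is a sketch under stated assumptions: the RTT and transmission times are made-up values, and the page total assumes the 11 objects (base file plus 10 images) are fetched strictly one after another, each over its own connection, with no parallel connections.

```python
def nonpersistent_response_time(rtt, transmit_time):
    """Time to fetch one object over its own TCP connection: 2 RTT + transmission."""
    return 2 * rtt + transmit_time


def nonpersistent_page_time(rtt, transmit_times):
    """Total time when objects are fetched one at a time, each over a new connection."""
    return sum(nonpersistent_response_time(rtt, t) for t in transmit_times)


RTT = 0.100            # seconds (assumed value)
BASE_HTML = 0.050      # transmission time of the base HTML file (assumed value)
JPEGS = [0.020] * 10   # transmission times of the 10 referenced images (assumed values)

single = nonpersistent_response_time(RTT, BASE_HTML)
page = nonpersistent_page_time(RTT, [BASE_HTML] + JPEGS)
print(f"base file: {single:.3f} s, whole page: {page:.3f} s")
```

With these numbers most of the total is connection setup and request round trips rather than transmission, which is the cost that persistent connections are meant to reduce.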


HTTP Request Message

• two types of HTTP messages: request, response
• HTTP request message:
  • ASCII (human-readable format)

request line (GET, POST, HEAD commands):
GET /index.html HTTP/1.1\r\n
header lines:
Host: www-net.cs.umass.edu\r\n
User-Agent: Firefox/3.6.10\r\n
Accept: text/html,application/xhtml+xml\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7\r\n
Keep-Alive: 115\r\n
Connection: keep-alive\r\n
\r\n

• \r is the carriage return character, \n is the line-feed character
• carriage return, line feed at start of line indicates end of header lines

HTTP Request Message: General Format
request line:  method sp URL sp version cr lf
header lines:  header field name: value cr lf
               ...
               header field name: value cr lf
blank line:    cr lf
body:          entity body

(sp = space, cr = carriage return, lf = line feed)
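The general format maps directly onto a small parser. The following is a minimal sketch, not a full HTTP implementation: the function name parse_request is hypothetical, repeated header names overwrite each other, and no validation is done.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, url, version, headers, body)."""
    # The empty line (cr lf cr lf) separates the header lines from the entity body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")

    # Request line: method sp URL sp version
    method, url, version = lines[0].split(" ")

    # Header lines: header field name, colon, value
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, url, version, headers, body


raw_request = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: www-net.cs.umass.edu\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)
method, url, version, headers, body = parse_request(raw_request)
print(method, url, version, headers["Host"])
```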


Uploading Form Input
POST method:
• web page often includes
form input
• input is uploaded to server
in entity body
URL method:
• uses GET method
• input is uploaded in URL
field of request line:

www.somesite.com/animalsearch?username=usr&pass=user123

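The two upload styles can be compared side by side by building both request messages from the same form fields. This is a sketch: the helper names are hypothetical, and the Content-Type value application/x-www-form-urlencoded is the usual encoding for HTML forms rather than something stated on the slide.

```python
from urllib.parse import urlencode


def form_as_get(host, path, fields):
    """URL method: the form input travels in the URL field of the request line."""
    return (
        f"GET {path}?{urlencode(fields)} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    )


def form_as_post(host, path, fields):
    """POST method: the form input travels in the entity body."""
    body = urlencode(fields)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


fields = {"username": "usr", "pass": "user123"}
get_msg = form_as_get("www.somesite.com", "/animalsearch", fields)
post_msg = form_as_post("www.somesite.com", "/animalsearch", fields)
print(get_msg)
print(post_msg)
```

Note that the GET form exposes the input, including the password in this example, in the URL itself.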


HTTP Response Message
status line (protocol, status code, status phrase):
HTTP/1.1 200 OK\r\n
header lines:
Date: Sun, 26 Sep 2010 20:09:20 GMT\r\n
Server: Apache/2.0.52 (CentOS)\r\n
Last-Modified: Tue, 30 Oct 2007 17:00:02 GMT\r\n
ETag: "17dc6-a5c-bf716880"\r\n
Accept-Ranges: bytes\r\n
Content-Length: 2652\r\n
Keep-Alive: timeout=10, max=100\r\n
Connection: Keep-Alive\r\n
Content-Type: text/html; charset=ISO-8859-1\r\n
\r\n
data, e.g., requested HTML file:
data data data data data ...

HTTP Response Status Codes
• status code appears in 1st line in server-to-client response message.
• some sample codes:
  200 OK
    • request succeeded, requested object later in this msg
  301 Moved Permanently
    • requested object moved, new location specified later in this msg (Location:)
  400 Bad Request
    • request msg not understood by server
  404 Not Found
    • requested document not found on this server
  505 HTTP Version Not Supported

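As a small illustration, the status line can be split into its three parts and the sample codes looked up in Python's standard http.HTTPStatus enumeration. The helper name parse_status_line is hypothetical; the sketch assumes a well-formed status line.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split 'HTTP/1.1 200 OK' into (protocol version, status code, status phrase)."""
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), phrase


print(parse_status_line("HTTP/1.1 200 OK\r\n"))

# The sample codes from the slide, with the standard phrases known to Python.
for code in (200, 301, 400, 404, 505):
    print(code, HTTPStatus(code).phrase)
```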
User-server State: Cookies
many Web sites use cookies
four components:
1) cookie header line of HTTP response message
2) cookie header line in next HTTP request message
3) cookie file kept on user's host, managed by user's browser
4) back-end database at Web site

example:
• Susan always accesses Internet from PC
• visits specific e-commerce site for first time
• when initial HTTP request arrives at site, site creates:
  • unique ID
  • entry in backend database for ID

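Cookies: Keeping "State" (cont.)
[Figure: message exchange between client and Amazon server]
1. client (cookie file: ebay 8734) sends usual http request msg
2. Amazon server creates ID 1678 for user and creates an entry in the backend database
3. server sends usual http response with set-cookie: 1678; client cookie file now holds ebay 8734 and amazon 1678
4. client sends usual http request msg with cookie: 1678; server takes cookie-specific action, accessing the backend database
5. server sends usual http response msg

one week later:
6. client (cookie file: ebay 8734, amazon 1678) sends usual http request msg with cookie: 1678; server takes cookie-specific action, accessing the backend database
7. server sends usual http response msg
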
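The exchange can be simulated in a few lines. This is a toy sketch, not a real server: the class name CookieSite is hypothetical, the backend database is a dictionary that only counts visits, and the cookie is a bare ID as on the slide (real cookies are name=value pairs with attributes).

```python
import itertools


class CookieSite:
    """Toy Web site that identifies returning users by a cookie ID."""

    def __init__(self, first_id=1678):
        self._next_id = itertools.count(first_id)
        self.backend = {}  # backend database: user ID -> number of requests seen

    def handle(self, request_headers):
        """Process one request and return the cookie-related response headers."""
        cookie = request_headers.get("Cookie")
        response_headers = {}
        if cookie is None or cookie not in self.backend:
            # First visit: create a unique ID and an entry in the backend database.
            cookie = str(next(self._next_id))
            self.backend[cookie] = 0
            response_headers["Set-Cookie"] = cookie
        # Cookie-specific action: here, just count the visit.
        self.backend[cookie] += 1
        return response_headers


site = CookieSite()
cookie_file = {"ebay": "8734"}            # browser-managed cookie file

first = site.handle({})                   # usual http request msg, no cookie yet
cookie_file["amazon"] = first["Set-Cookie"]

second = site.handle({"Cookie": cookie_file["amazon"]})  # request carries the cookie
print(cookie_file, site.backend)
```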
Cookies (continued)
what cookies can be used for:
• authorization
• shopping carts
• recommendations
• user session state (Web e-mail)

aside: cookies and privacy:
• cookies permit sites to learn a lot about you
• you may supply name and e-mail to sites

how to keep "state":
• protocol endpoints: maintain state at sender/receiver over multiple transactions
• cookies: http messages carry state


Web Caches (Proxy Server)
goal: satisfy client request without involving origin server
• user sets browser: Web accesses via cache
• browser sends all HTTP requests to cache
  • object in cache: cache returns object
  • else cache requests object from origin server, then returns object to client

[Figure: two clients exchange HTTP requests and responses with a proxy server, which in turn exchanges HTTP requests and responses with two origin servers]
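The cache's decision rule is short enough to sketch directly. The class name WebCache and the fetch_from_origin callback are hypothetical; the sketch ignores freshness and eviction, so a cached object is returned forever.

```python
class WebCache:
    """Toy proxy cache: answer from the local store, else ask the origin server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._store = {}
        self.origin_requests = 0  # how many requests actually reached the origin

    def get(self, url):
        if url not in self._store:
            # Miss: cache acts as a client to the origin server, then keeps a copy.
            self.origin_requests += 1
            self._store[url] = self._fetch(url)
        # Hit (or freshly filled): cache acts as a server to the requesting client.
        return self._store[url]


def origin_server(url):
    """Stand-in for the origin server (assumption: returns a fixed body per URL)."""
    return f"<contents of {url}>"


cache = WebCache(origin_server)
a = cache.get("www.someschool.edu/someDept/pic.gif")
b = cache.get("www.someschool.edu/someDept/pic.gif")
print(a == b, cache.origin_requests)
```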


More about Web Caching
• cache acts as both client and server
  • server for original requesting client
  • client to origin server
• typically cache is installed by ISP (university, company, residential ISP)

why Web caching?
• reduce response time for client request
• reduce traffic on an institution's access link
• Internet dense with caches: enables "poor" content providers to effectively deliver content (so too does P2P file sharing)


Conditional GET
• Goal: don't send object if cache has up-to-date cached version
  • no object transmission delay
  • lower link utilization
• cache: specify date of cached copy in HTTP request
  If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
  HTTP/1.0 304 Not Modified

[Figure: two client/server exchanges]
• object not modified before <date>: client sends HTTP request msg with If-modified-since: <date>; server sends HTTP response HTTP/1.0 304 Not Modified
• object modified after <date>: client sends HTTP request msg with If-modified-since: <date>; server sends HTTP response HTTP/1.0 200 OK <data>

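The server side of a conditional GET reduces to one date comparison. This is a sketch under assumptions: the function name conditional_get is hypothetical, headers are passed as a plain dictionary with an exact-case key (real header names are case-insensitive), and dates use the standard HTTP date format handled by Python's email.utils.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers, last_modified, body):
    """Return (status line, entity body) for a possibly conditional GET."""
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        # Cached copy is up to date: send no object.
        return "HTTP/1.0 304 Not Modified", b""
    return "HTTP/1.0 200 OK", body


LAST_MODIFIED = datetime(2007, 10, 30, 17, 0, 2, tzinfo=timezone.utc)
BODY = b"data data data"

# Cache holds a copy fetched after the last modification: expect 304.
fresh = conditional_get(
    {"If-Modified-Since": "Sun, 26 Sep 2010 20:09:20 GMT"}, LAST_MODIFIED, BODY
)

# Cache holds a copy older than the last modification: expect 200 plus the object.
old_date = format_datetime(datetime(2007, 1, 1, tzinfo=timezone.utc), usegmt=True)
stale = conditional_get({"If-Modified-Since": old_date}, LAST_MODIFIED, BODY)

print(fresh[0])
print(stale[0])
```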
DNS: Domain Name System
people: many identifiers:
• SSN, name, passport #
Internet hosts, routers:
• IP address (32 bit) - used for addressing datagrams
• "name", e.g., www.yahoo.com - used by humans
Q: how to map between IP address and name, and vice versa?

Domain Name System:
• distributed database implemented in hierarchy of many name servers
• application-layer protocol: hosts, name servers communicate to resolve names (address/name translation)
  • note: core Internet function, implemented as application-layer protocol
  • complexity at network's "edge"

DNS: Services, Structure
DNS services
• hostname to IP address translation
• host aliasing
  • canonical, alias names
• mail server aliasing
• load distribution
  • replicated Web servers: many IP addresses correspond to one name

why not centralize DNS?
• single point of failure
• traffic volume
• distant centralized database
• maintenance
A: doesn't scale!


DNS: a Distributed, Hierarchical Database
[Figure: DNS hierarchy tree]
Root DNS servers
• com DNS servers
  • yahoo.com DNS servers
  • amazon.com DNS servers
• org DNS servers
  • pbs.org DNS servers
• edu DNS servers
  • poly.edu DNS servers
  • umass.edu DNS servers

client wants IP for www.amazon.com; 1st approx:


• client queries root server to find com DNS server
• client queries .com DNS server to get amazon.com DNS server
• client queries amazon.com DNS server to get IP address for
www.amazon.com

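The three-step lookup can be mimicked with nested dictionaries standing in for the servers. This is a toy sketch: the table names and server labels are hypothetical, and the address 192.0.2.10 is a placeholder from the documentation address range, not a real address for www.amazon.com.

```python
# Each table stands in for what one level of servers knows (hypothetical data).
ROOT = {"com": "com DNS server", "org": "org DNS server", "edu": "edu DNS server"}
TLD = {
    "com DNS server": {
        "yahoo.com": "yahoo.com DNS server",
        "amazon.com": "amazon.com DNS server",
    },
}
AUTHORITATIVE = {
    "amazon.com DNS server": {"www.amazon.com": "192.0.2.10"},  # placeholder address
}


def resolve(hostname):
    """Walk root -> TLD -> authoritative; return (address, servers queried)."""
    labels = hostname.split(".")
    tld = labels[-1]                     # e.g. "com"
    domain = ".".join(labels[-2:])       # e.g. "amazon.com"

    tld_server = ROOT[tld]                       # 1. root names the com DNS server
    auth_server = TLD[tld_server][domain]        # 2. com names the amazon.com DNS server
    address = AUTHORITATIVE[auth_server][hostname]  # 3. authoritative server answers
    return address, ["root DNS server", tld_server, auth_server]


address, path = resolve("www.amazon.com")
print(address, path)
```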


DNS: Root Name Servers
• contacted by local name server that can not resolve name
• root name server:
• contacts authoritative name server if name mapping not known
• gets mapping
• returns mapping to local name server

[Figure: map of the 13 root name servers worldwide]
a. Verisign, Los Angeles, CA (5 other sites)
b. USC-ISI, Marina del Rey, CA
c. Cogent, Herndon, VA (5 other sites)
d. U Maryland, College Park, MD
e. NASA, Mt View, CA
f. Internet Software C., Palo Alto, CA (and 48 other sites)
g. US DoD, Columbus, OH (5 other sites)
h. ARL, Aberdeen, MD
i. Netnod, Stockholm (37 other sites)
j. Verisign, Dulles, VA (69 other sites)
k. RIPE, London (17 other sites)
l. ICANN, Los Angeles, CA (41 other sites)
m. WIDE, Tokyo (5 other sites)


TLD, Authoritative Servers
top-level domain (TLD) servers:
• responsible for com, org, net, edu, aero, jobs, museums, and
all top-level country domains, e.g.: uk, fr, ca, jp
• Network Solutions maintains servers for .com TLD
• Educause for .edu TLD
authoritative DNS servers:
• organization's own DNS server(s), providing authoritative hostname to IP mappings for organization's named hosts
• can be maintained by organization or service provider


Local DNS Name Server

• does not strictly belong to hierarchy
• each ISP (residential ISP, company, university) has one
  • also called "default name server"
• when host makes DNS query, query is sent to its local DNS server
  • has local cache of recent name-to-address translation pairs (but may be out of date!)
  • acts as proxy, forwards query into hierarchy


DNS Name Resolution Example
• host at cis.poly.edu wants IP address for gaia.cs.umass.edu

iterated query:
• contacted server replies with name of server to contact
• "I don't know this name, but ask this server"

[Figure: iterated query. Messages 1 and 8 pass between requesting host cis.poly.edu and local DNS server dns.poly.edu; 2 and 3 between the local DNS server and the root DNS server; 4 and 5 between the local DNS server and the TLD DNS server; 6 and 7 between the local DNS server and authoritative DNS server dns.cs.umass.edu, which is responsible for gaia.cs.umass.edu]


DNS Name Resolution Example
recursive query:
• puts burden of name resolution on contacted name server
• heavy load at upper levels of hierarchy?

[Figure: recursive query. Message 1 goes from requesting host cis.poly.edu to local DNS server dns.poly.edu, 2 from the local DNS server to the root DNS server, 3 from the root to the TLD DNS server, 4 from the TLD server to authoritative DNS server dns.cs.umass.edu; replies 5, 6, 7, and 8 travel back along the same path to the requesting host]


DNS: Caching, Updating Records
• once (any) name server learns mapping, it caches mapping
  • cache entries timeout (disappear) after some time (TTL)
  • TLD servers typically cached in local name servers
    • thus root name servers not often visited
• cached entries may be out-of-date (best effort name-to-address translation!)
  • if name host changes IP address, may not be known Internet-wide until all TTLs expire
• update/notify mechanisms proposed IETF standard
  • RFC 2136
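TTL-based expiry can be sketched as a small cache class. The class name DnsCache is hypothetical, the clock is injectable so the behaviour can be shown without waiting, and the address is a placeholder from the documentation address range.

```python
import time


class DnsCache:
    """Toy name-to-address cache whose entries disappear after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry timed out: forget it so the next lookup goes to the hierarchy.
            del self._entries[name]
            return None
        return value


now = [0.0]                                   # fake clock for the demonstration
cache = DnsCache(clock=lambda: now[0])
cache.put("gaia.cs.umass.edu", "192.0.2.7", ttl=60)   # placeholder address

before_expiry = cache.get("gaia.cs.umass.edu")
now[0] = 61.0                                 # more than one TTL later
after_expiry = cache.get("gaia.cs.umass.edu")
print(before_expiry, after_expiry)
```

Until the TTL runs out the cache keeps answering with the stored value even if the host has changed its address, which is the out-of-date risk noted in the bullets.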


DNS Records
DNS: distributed db storing resource records (RR)
RR format: (name, value, type, ttl)

type=A
• name is hostname
• value is IP address

type=NS
• name is domain (e.g., foo.com)
• value is hostname of authoritative name server for this domain

type=CNAME
• name is alias name for some "canonical" (the real) name
• www.ibm.com is really servereast.backup2.ibm.com
• value is canonical name

type=MX
• value is name of mailserver associated with name
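The RR format can be modelled as a four-field tuple. This is a sketch with made-up data: only the www.ibm.com alias and foo.com come from the slide, while dns.foo.com, mail.foo.com, the address, and the TTL values are hypothetical placeholders.

```python
from collections import namedtuple

RR = namedtuple("RR", ["name", "value", "type", "ttl"])

RECORDS = [
    RR("www.ibm.com", "servereast.backup2.ibm.com", "CNAME", 3600),
    RR("servereast.backup2.ibm.com", "192.0.2.20", "A", 3600),  # placeholder address
    RR("foo.com", "dns.foo.com", "NS", 86400),                  # hypothetical server
    RR("foo.com", "mail.foo.com", "MX", 3600),                  # hypothetical server
]


def lookup(name, rtype):
    """Return the values of all records matching a name and type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rtype]


def resolve_address(name, max_hops=8):
    """Follow CNAME aliases until an A record is found (or give up)."""
    for _ in range(max_hops):
        addresses = lookup(name, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(name, "CNAME")
        if not aliases:
            return None
        name = aliases[0]  # continue with the canonical name
    return None


print(resolve_address("www.ibm.com"))
print(lookup("foo.com", "NS"), lookup("foo.com", "MX"))
```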


DNS Protocol, Messages
• query and reply messages, both with same message format

msg header
• identification: 16 bit # for query, reply to query uses same #
• flags:
  • query or reply
  • recursion desired
  • recursion available
  • reply is authoritative

[Figure: message format, each header row is 2 bytes + 2 bytes]
identification | flags
# questions | # answer RRs
# authority RRs | # additional RRs
questions (variable # of questions)
answers (variable # of RRs)
authority (variable # of RRs)
additional info (variable # of RRs)
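The six 2-byte header fields can be packed and unpacked with Python's struct module. This is a sketch: the function names are hypothetical, and the flag bit positions are taken from RFC 1035 rather than from the slide, which only names the flags.

```python
import struct

# Flag bits within the 16-bit flags field (positions per RFC 1035).
QR_REPLY = 0x8000   # query (0) or reply (1)
AA = 0x0400         # reply is authoritative
RD = 0x0100         # recursion desired
RA = 0x0080         # recursion available

# Six unsigned 16-bit fields in network byte order: 12 bytes in total.
HEADER = struct.Struct("!HHHHHH")


def pack_header(ident, flags, questions=0, answers=0, authority=0, additional=0):
    return HEADER.pack(ident, flags, questions, answers, authority, additional)


def unpack_header(data):
    return HEADER.unpack(data[:HEADER.size])


query = pack_header(0x1234, RD, questions=1)
# A reply reuses the identification number of the query it answers.
reply = pack_header(0x1234, QR_REPLY | AA | RD | RA, questions=1, answers=1)

print(query.hex())
print(unpack_header(reply))
```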


DNS Protocol, Messages

[Figure: DNS message format, annotated]
• header: identification, flags, # questions, # answer RRs, # authority RRs, # additional RRs (2 bytes each)
• questions (variable # of questions): name, type fields for a query
• answers (variable # of RRs): RRs in response to query
• authority (variable # of RRs): records for authoritative servers
• additional info (variable # of RRs): additional "helpful" info that may be used
