Table of Contents
Introduction
Part 1: Web Server Caching
Part 2: Application Caching
Part 3: Data Caching
Summary
Introduction:
Web site optimization directly affects a
company's bottom line. A sudden traffic spike
that swamps a website's capacity can cost a
company thousands or even tens of
thousands of dollars per hour. Web servers
and Web applications should be built and
deployed from day one with performance at
the forefront of everyone's mind.
Web site administrators and web application developers have a host of tricks
and techniques they can employ to deliver Web pages more quickly. Caching is the
#1 tuning trick in the Web developer's kit. Customers ask me weekly what I recommend
for speeding up their app. I always start with "caching, caching, and more caching". It's
like magic for a site.
What is Caching?
Caching refers to any mechanism that stores previously retrieved content for future
use. As I learned it in college back in the VAX/VMS operating systems class, it is
temporarily putting something into memory that you will use again in order to avoid
hitting the hard drive.
Computer scientists today are less concerned about saving every byte than we were
back then. Still, Web applications constantly re-use data and files, so why in the
world would we want to make an expensive hit to the database every time? Hard drives can be
10,000 times slower than memory because they are mechanical: the head must move to the
correct position and the platter must spin to the exact spot where the data exists. Memory moves at the
speed of electricity.
The goal of caching is to increase the speed of content delivery by reducing the
amount of redundant work a server needs to perform. Putting a file in memory to
re-use it can save millions of drive accesses; thus, the browser gets
what the user needs faster, often by orders of magnitude. Caching and performance go hand-in-hand. It's a no-brainer.
The Expires header tells a Web browser that a specific piece of content expires at a
specific day and time. The Cache-control header uses a combination of caching
directives and age modifiers to instruct a Web client on whether a specific piece of
content can or cannot be cached, and for how long. The Cache-control header is
documented in section 14.9 of RFC 2616. [1]
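To make the two headers concrete, here is a minimal Python sketch of how a server might stamp a response so that clients can cache it for one hour. The function name cache_headers and its parameters are hypothetical and chosen for illustration only; a real Web server normally sets these headers through its own configuration.

```python
from email.utils import formatdate


def cache_headers(max_age_seconds, now_epoch, public=True):
    """Build Expires and Cache-Control headers for a cacheable response.

    now_epoch is the current time in seconds since the Unix epoch.
    """
    scope = "public" if public else "private"
    return {
        # Relative lifetime: how many seconds the content may be reused.
        "Cache-Control": f"{scope}, max-age={max_age_seconds}",
        # Absolute lifetime: the day and time at which the content expires.
        "Expires": formatdate(now_epoch + max_age_seconds, usegmt=True),
    }


# Hypothetical usage: cache for one hour, starting at the Unix epoch.
headers = cache_headers(3600, now_epoch=0)
```

Both headers describe the same lifetime here: Expires gives an absolute time, while max-age gives a duration measured from when the response was generated.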
Web servers can also use the ETag header, which assigns a unique ID to each version
of a file, image, or other component. When the Web server delivers content, it stamps
it with an ETag value. Later, if the client believes the content might have expired, it
makes an HTTP request including an If-None-Match header, with the value of that
header set to the last ETag value for that component. If the content has changed since
that ETag value was issued, the server will respond with new content. Otherwise, if the
content has not changed, the server will respond with a 304 HTTP status and an
empty HTTP response body. ETags prevent a Web server from re-delivering content
that is still fresh, thereby conserving server resources and reducing page load time.
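The exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names make_etag and respond are hypothetical, and deriving the ETag from a hash of the content is just one possible choice, since servers are free to generate ETag values however they like.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a version ID from the content (an assumption of this sketch)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, body) for a request for this content."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's copy is still current: send 304 and an empty body.
        return 304, {"ETag": etag}, b""
    # First request, or the content changed: send the full content.
    return 200, {"ETag": etag}, body


# Hypothetical usage: first fetch, then a revalidation with the saved ETag.
status1, headers1, body1 = respond(b"<h1>Catalog</h1>")
status2, headers2, body2 = respond(b"<h1>Catalog</h1>", headers1["ETag"])
status3, headers3, body3 = respond(b"<h1>New catalog</h1>", headers1["ETag"])
```

The second call is the interesting one: the server still has to answer the request, but it skips sending the body, which is where the bandwidth and load-time savings come from.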
Cisco describes a proxy cache this way: When a browser wishes to retrieve a URL, it
takes the host name component and translates that name to an IP address.[2] An HTTP
session is opened against that address, and the client requests the URL from the
server.
When using a proxy cache, not much is altered in the transaction. The client opens an
HTTP session with the proxy cache, and directs the URL request to the proxy cache
instead.
Cache proxies work the same way that a Web browser's cache works: they use the
information included in HTTP headers such as Expires and Cache-control to
determine if a given component is fresh or stale. Setting accurate cache control
headers on content is critical for the success of a Web cache proxy.
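As a rough illustration of that fresh-or-stale decision, the Python sketch below checks a stored response against its Cache-control and Expires headers. It is a deliberately simplified assumption of how a cache might decide; the function name is_fresh is hypothetical, and real proxies such as Squid apply many more rules than are shown here.

```python
import re
from email.utils import parsedate_to_datetime


def is_fresh(headers, stored_at, now):
    """Decide whether a stored response can be served without revalidation.

    stored_at and now are times in seconds since the Unix epoch.
    """
    cache_control = headers.get("Cache-Control", "")
    # Content marked no-store or no-cache is never served as-is.
    if "no-store" in cache_control or "no-cache" in cache_control:
        return False
    # max-age gives the lifetime relative to when the copy was stored.
    match = re.search(r"max-age=(\d+)", cache_control)
    if match:
        return now - stored_at < int(match.group(1))
    # Otherwise fall back to the absolute time in the Expires header.
    expires = headers.get("Expires")
    if expires:
        return now < parsedate_to_datetime(expires).timestamp()
    # With no caching information, this sketch treats the copy as stale.
    return False
```

A component with no usable headers is always treated as stale in this sketch, which is why accurate cache control headers matter so much to a proxy.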
Most high-volume Web projects use some form of Web cache proxy. For example,
Wikimedia deploys 50 instances of the Squid web cache proxy in three locations
worldwide to speed delivery of content to users.[3]
UI Layer: The browsing and shopping cart interface shown to the users.
Data Layer: The catalog of books, record of orders, and order workflow storage.
Obviously, this bookstore is highly dynamic: the server must generate a slightly
different interface for each user based on the user's account preferences and current
shopping activity. Exchanging data between these layers and generating every
component required for every request increases site response times. Not every
component, however, needs to be generated fresh every time. For example, each Web
page may have its header and footer broken out into static components that can be
cached by the application.
The shopping site may also elect to cache dynamically generated data that change
infrequently, such as selection lists of current shipping rates.
Even the user's current shopping cart status can be cached, as there is no need for
the server to regenerate that data until the user invalidates it by changing the contents
of her cart.
Storing components in memory for later use can represent a huge performance
savings for the application, as it reduces database retrieval requests, numeric
calculations, and string concatenation operations.
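The idea of building a component once and reusing it can be sketched in Python as follows. The class ComponentCache, the function build_shipping_rates, and the rate values are all hypothetical stand-ins for illustration; they are not taken from any particular framework.

```python
class ComponentCache:
    """Holds generated components in memory so they are built only once."""

    def __init__(self):
        self._items = {}

    def get_or_build(self, key, build):
        """Return the cached component, calling build() only on a miss."""
        if key not in self._items:
            self._items[key] = build()
        return self._items[key]


build_calls = []  # records each time the expensive work actually runs


def build_shipping_rates():
    """Stand-in for a database query plus string formatting."""
    build_calls.append("shipping_rates")
    return ["Ground: 4.99", "Two-day: 9.99", "Overnight: 19.99"]


cache = ComponentCache()
first = cache.get_or_build("shipping_rates", build_shipping_rates)
second = cache.get_or_build("shipping_rates", build_shipping_rates)
```

The second request is served straight from memory, so the expensive build step runs once no matter how many pages display the list.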
Considerations in Caching:
Most application caching takes place on the server side. When caching data on the
server, applications must balance a number of factors:
Scope:
Who can reuse the cached component? Some components, such as shopping
cart content, may be specific to the current user. Other components, such as a
list of shipping rates, may be global, and can be placed in a memory pool
shared by all server connections. For global components, applications that run
in distributed server environments must implement a caching mechanism that
can be accessed across multiple Web servers. This can be accomplished either
through some centralized cache (such as a Web Service that accesses a
common data set), or by synchronizing the cache across all servers.
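One way to picture the scope question is a cache that keeps global entries apart from per-user entries. The Python sketch below is an assumption for illustration only: ScopedCache and its methods are hypothetical names, and it covers a single server process, not the centralized or synchronized cache that a distributed environment would need.

```python
class ScopedCache:
    """Keeps components shared by everyone apart from per-user components."""

    def __init__(self):
        self._global = {}    # reusable by every connection
        self._per_user = {}  # user id -> that user's own components

    def put_global(self, key, value):
        self._global[key] = value

    def put_user(self, user_id, key, value):
        self._per_user.setdefault(user_id, {})[key] = value

    def get(self, user_id, key):
        """Check the user's own entries first, then the global ones."""
        user_items = self._per_user.get(user_id, {})
        if key in user_items:
            return user_items[key]
        return self._global.get(key)


# Hypothetical usage with made-up users and data.
scoped = ScopedCache()
scoped.put_global("shipping_rates", ["Ground: 4.99"])
scoped.put_user("alice", "cart", ["Book A"])
```

Here every user can read the shipping rates, but only alice's requests see alice's cart.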
Server Resources:
Cached data reduces both processing and disk access time at the expense of
system memory. Application caches require a mechanism to scavenge items out
of the cache should memory become scarce. Applications must also take care
not to exhaust system memory themselves through aggressive caching.
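A simple way to keep a cache from exhausting memory is to cap the number of items and scavenge the least recently used one when the cap is reached. The Python sketch below assumes that policy purely for illustration; the text above does not prescribe a particular scavenging strategy, and the class name LRUCache and the max_items limit are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that discards the least recently used item when full."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        # Reading an item marks it as recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        # Scavenge from the old end until the cache fits its budget again.
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)


# Hypothetical usage: room for two components only.
lru = LRUCache(max_items=2)
lru.put("header", "<header/>")
lru.put("footer", "<footer/>")
lru.get("header")             # header is now the most recently used
lru.put("rates", "<rates/>")  # footer is scavenged to make room
```

Counting items is a crude proxy for memory use; a production cache would more likely track the size of each entry or react to memory pressure from the runtime.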
Invalidation and Expiration:
Just as with server and client resource caching, application caching requires a
mechanism to invalidate a cached item. Applications can assign expiration times
to these components, just as Web servers do with files served over HTTP.
Alternatively, components can be expired in response to an event, such as a
database update or user action. In our shopping cart example, the user's
shopping cart component can be expired whenever the user modifies her cart.
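Both styles of invalidation can be sketched together in Python: each item carries an expiration time, and the application can also expire an item explicitly when an event occurs. The class ExpiringCache, its method names, and the injected clock function are hypothetical and exist only to keep the example self-contained and testable.

```python
class ExpiringCache:
    """Cache whose items expire after a time limit or on an explicit event."""

    def __init__(self, clock):
        self._clock = clock  # function returning the current time in seconds
        self._items = {}     # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Time-based expiration: drop the stale item.
            del self._items[key]
            return None
        return value

    def invalidate(self, key):
        """Event-based expiration, e.g. the user changed her cart."""
        self._items.pop(key, None)


# Hypothetical usage with a fake clock so that time can be advanced by hand.
now = [0]
expiring = ExpiringCache(clock=lambda: now[0])
expiring.put("cart:alice", ["Book A"], ttl_seconds=300)
expiring.put("shipping_rates", ["Ground: 4.99"], ttl_seconds=60)
expiring.invalidate("cart:alice")  # the user modified her cart
now[0] = 61                        # the shipping rates have timed out
```

In a real application the clock would be something like time.time, and the invalidate call would sit in the code path that handles the cart update.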
Summary:
There are many ways to improve the speed of your site. This eBook is designed to
give you practical suggestions to try in your system environment. Web performance
optimization is an engineering discipline which can have a tremendous impact on
revenue and customer satisfaction. Caching is probably the biggest weapon in the
arsenal of performance engineers, and it is usually relatively easy to utilize.
References
1. http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
2. http://www.cisco.com/web/about/ac123/ac147/ac174/ac199/about_cisco_ipj_archive_article09186a00800c8903.html
3. http://meta.wikimedia.org/wiki/Wikimedia_servers
4. http://jakarta.apache.org/jcs
5. http://php.net/manual/en/book.apc.php
6. http://msdn.microsoft.com/en-us/magazine/cc188799.aspx
7. http://www.ibm.com/developerworks/java/library/j-sdo/
8. http://www.oracle.com/technetwork/database/timesten/overview/timesten-imdb086887.html
9. http://glinden.blogspot.com/2009/11/put-that-database-in-memory.html
10. http://loadstorm.com/2009/web-performance-tuning