
The web application I'm working on serves up images with URLs like /image?name=a.gif.
Pages are loading slowly partly because browsers are not caching the images.

Is there any combination of HTTP headers that will persuade IE to cache the images even
though the URL has a query string in it? I'm trying to avoid the browser making any
unnecessary requests (including If-Modified-Since type requests).

I'm happy with a solution that works with IE (6+) only. Also, I know that I can avoid the
problem by URL rewriting - I'm just interested in understanding browser caching better.

Thanks

Answer
Ordinarily, requests with query string parameters are cached separately for each
unique query string. According to RFC 2616, section 13.9, this is done only if an
expiration time is specified. Apache's CacheIgnoreQueryString directive tells the cache
to cache requests even if no expiration time is specified, and to reply with a cached
reply even if the query string differs.
Set the Expires header in your web server (for example, Apache) to a date far in the
future.
It is important to specify one of Expires or Cache-Control: max-age, and one of
Last-Modified or ETag, for all cacheable resources. It is redundant to specify both Expires
and Cache-Control: max-age, or to specify both Last-Modified and ETag.

When Does Browser Automatically Clear JavaScript Cache?
Question:

I have a JavaScript resource that may be edited at any time. Once it is edited, I want
the change to reach the user's browser relatively quickly (within maybe 15 minutes or
so). However, the resource is edited only rarely (maybe twice a month).

I'd rather the resource be cached in the browser, since it will be retrieved frequently,
but I'd also like the browser's cache to be reset at a semi-regular interval.

I know I can pass a no-cache header when I request the resource, but I was wondering
when the cache would automatically reset itself on the browser if I did not pass no-cache.

I imagine this would be independent for each browser, but I'm not sure.
I tried to Google this, but most of the hits I found were about clearing the browser's
cache... which isn't what I'm looking for.

Answers
You can pass a version string as a GET parameter in the URL of your script tag. The
static JavaScript file does not evaluate the parameter, but a changed value forces the
browser to fetch the new version.

If you do not want to assign the version string by hand every time you edit the source,
you can compute it from the file system timestamp or your Subversion commit number:

<script src="/script.js?time_stamp=1224147832156" type="text/javascript"></script>
<script src="/script.js?svn_version=678" type="text/javascript"></script>
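
A minimal sketch of the timestamp variant, assuming the page is rendered by Python code that can read the script file from disk. The helper name versioned_script_tag and its arguments are hypothetical and not part of the original answer.

```python
import os
from html import escape


def versioned_script_tag(url_path: str, file_path: str) -> str:
    """Build a <script> tag whose URL carries the file's modification time.

    Hypothetical helper: url_path is what the browser requests,
    file_path is where the same script lives on the server's disk.
    """
    # Millisecond timestamp, matching the shape of the example above.
    stamp = int(os.path.getmtime(file_path) * 1000)
    src = f"{url_path}?time_stamp={stamp}"
    return f'<script src="{escape(src)}" type="text/javascript"></script>'
```

Because the timestamp changes only when the file is saved, the URL stays stable (and cacheable) between edits.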

Optimize caching
Most web pages include resources that change infrequently, such as CSS files, image
files, JavaScript files, and so on. These resources take time to download over the
network, which increases the time it takes to load a web page. HTTP caching allows
these resources to be saved, or cached, by a browser or proxy. Once a resource is cached,
a browser or proxy can refer to the locally cached copy instead of having to download it
again on subsequent visits to the web page. Thus caching is a double win: you reduce
round-trip time by eliminating numerous HTTP requests for the required resources, and
you substantially reduce the total payload size of the responses. Besides leading to a
dramatic reduction in page load time for subsequent user visits, enabling caching can also
significantly reduce the bandwidth and hosting costs for your site.

1. Leverage browser caching
2. Leverage proxy caching

Leverage browser caching


Overview

Setting an expiry date or a maximum age in the HTTP headers for static resources
instructs the browser to load previously downloaded resources from local disk rather than
over the network.
Details

HTTP/S supports local caching of static resources by the browser. Some of the newest
browsers (e.g. IE 7, Chrome) use a heuristic to decide how long to cache all resources
that don't have explicit caching headers. Other older browsers may require that caching
headers be set before they will fetch a resource from the cache; and some may never
cache any resources sent over SSL.

To take advantage of the full benefits of caching consistently across all browsers, we
recommend that you configure your web server to explicitly set caching headers and
apply them to all cacheable static resources, not just a small subset (such as
images). Cacheable resources include JS and CSS files, image files, and other binary
object files (media files, PDFs, Flash files, etc.). In general, HTML is not static, and
shouldn't be considered cacheable.

HTTP/1.1 provides the following caching response headers:

• Expires and Cache-Control: max-age. These specify the “freshness lifetime”
of a resource, that is, the time period during which the browser can use the cached
resource without checking to see if a new version is available from the web
server. They are "strong caching headers" that apply unconditionally; that is, once
they're set and the resource is downloaded, the browser will not issue any GET
requests for the resource until the expiry date or maximum age is reached.
• Last-Modified and ETag. These specify some characteristic about the resource
that the browser checks to determine if the files are the same. In the Last-Modified
header, this is always a date. In the ETag header, this can be any value
that uniquely identifies a resource (file versions or content hashes are typical).
Last-Modified is a "weak" caching header in that the browser applies a heuristic
to determine whether to fetch the item from cache or not. (The heuristics are
different among different browsers.) However, these headers allow the browser to
efficiently update its cached resources by issuing conditional GET requests when
the user explicitly reloads the page. Conditional GETs don't return the full
response unless the resource has changed at the server, and thus have lower
latency than full GETs.
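
The conditional GET described above can be sketched roughly as follows. This is a simplified illustration, not a complete HTTP implementation: the function name respond is hypothetical, and it assumes the client sends a single ETag in If-None-Match (real requests may list several values or use "*").

```python
def respond(request_headers: dict, current_etag: str, body: bytes):
    """Return (status, headers, body) for a GET that may be conditional."""
    headers = {"ETag": current_etag}
    if request_headers.get("If-None-Match") == current_etag:
        # The browser's cached copy is still valid: send headers only.
        return 304, headers, b""
    # First request, or the resource changed: send the full response.
    return 200, headers, body
```

The 304 path is what makes a conditional GET cheaper than a full GET: the round trip still happens, but the payload is empty.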

It is important to specify one of Expires or Cache-Control: max-age, and one of
Last-Modified or ETag, for all cacheable resources. It is redundant to specify both Expires
and Cache-Control: max-age, or to specify both Last-Modified and ETag.

Recommendations
Set caching headers aggressively for all static resources.
For all cacheable resources, we recommend the following settings:

• Set Expires to a minimum of one month, and preferably up to one
year, in the future. (We prefer Expires over Cache-Control: max-age
because it is more widely supported.) Do not set it to more than one year
in the future, as that violates the RFC guidelines.

If you know exactly when a resource is going to change, setting a shorter
expiration is okay. But if you think it "might change soon" but don't know
when, you should set a long expiration and use URL fingerprinting
(described below). Setting caching aggressively does not "pollute"
browser caches: as far as we know, all browsers clear their caches
according to a Least Recently Used algorithm; we are not aware of any
browsers that wait until resources expire before purging them.

• Set the Last-Modified date to the last time the resource was
changed. If the Last-Modified date is sufficiently far in the past,
chances are the browser won't refetch it.
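
As an illustration of the Expires bounds recommended above, the sketch below clamps a requested lifetime to the range of one month to one year before formatting the header value. The function name expires_value is hypothetical, "one month" is approximated as 30 days, and "one year" as 365 days; these are assumptions of the sketch, not figures from the text.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

MIN_LIFETIME = timedelta(days=30)   # "one month", approximated as 30 days
MAX_LIFETIME = timedelta(days=365)  # "one year"; never exceed this


def expires_value(now: datetime, requested: timedelta) -> str:
    """Return an Expires header value with the lifetime clamped to the bounds.

    `now` must be a timezone-aware UTC datetime.
    """
    lifetime = min(max(requested, MIN_LIFETIME), MAX_LIFETIME)
    return format_datetime(now + lifetime, usegmt=True)


now = datetime(2010, 1, 1, tzinfo=timezone.utc)
too_long = expires_value(now, timedelta(days=3650))  # clamped to one year
```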

Use fingerprinting to dynamically enable caching.
For resources that change occasionally, you can have the browser cache the
resource until it changes on the server, at which point the server tells the browser
that a new version is available. You accomplish this by embedding a fingerprint
of the resource in its URL (i.e. the file path). When the resource changes, so does
its fingerprint, and in turn, so does its URL. As soon as the URL changes, the
browser is forced to re-fetch the resource. Fingerprinting allows you to set expiry
dates long into the future even for resources that change more frequently than
that. Of course, this technique requires that all of the pages that reference the
resource know about the fingerprinted URL, which may or may not be feasible,
depending on how your pages are coded.
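
One possible way to produce such a fingerprint is to hash the file's content and splice the digest into the file name. The sketch below assumes an MD5 content hash (used only as a change detector, not for security) and a hypothetical helper name fingerprinted_path; the text does not say how any particular site computes its fingerprints.

```python
import hashlib
import posixpath


def fingerprinted_path(url_path: str, content: bytes) -> str:
    """Insert a content hash before the file extension of a URL path."""
    digest = hashlib.md5(content).hexdigest()  # 128 bits, 32 hex characters
    directory, filename = posixpath.split(url_path)
    stem, ext = posixpath.splitext(filename)
    return posixpath.join(directory, f"{stem}.{digest}{ext}")


old_url = fingerprinted_path("/static/app.css", b"body { color: red }")
new_url = fingerprinted_path("/static/app.css", b"body { color: blue }")
```

Because old_url and new_url differ, a browser holding the old file under a far-future Expires date still fetches the new one as soon as pages reference the new URL.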
Set the Vary header correctly for Internet Explorer.
Internet Explorer does not cache any resources that are served with a Vary
header containing any fields other than Accept-Encoding and User-Agent. To ensure
these resources are cached by IE, make sure to strip out any other fields from the Vary
header, or remove the Vary header altogether if possible.
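
A small sketch of the stripping step, assuming the application can rewrite the Vary value before the response is sent. The helper name ie_safe_vary is hypothetical; it returns None when no field survives, meaning the header should be dropped entirely.

```python
from typing import Optional

# The only Vary fields the text says Internet Explorer tolerates.
IE_SAFE_VARY_FIELDS = {"accept-encoding", "user-agent"}


def ie_safe_vary(vary_value: str) -> Optional[str]:
    """Keep only the Vary fields that do not prevent caching in IE."""
    fields = [f.strip() for f in vary_value.split(",") if f.strip()]
    kept = [f for f in fields if f.lower() in IE_SAFE_VARY_FIELDS]
    return ", ".join(kept) or None
```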
Avoid URLs that cause cache collisions in Firefox.
The Firefox disk cache hash functions can generate collisions for URLs that differ
only slightly, namely only on 8-character boundaries. When resources hash to the
same key, only one of the resources is persisted to disk cache; the remaining
resources with the same key have to be re-fetched across browser restarts. Thus, if
you are using fingerprinting or are otherwise programmatically generating file
URLs, to maximize cache hit rate, avoid the Firefox hash collision issue by
ensuring that your application generates URLs that differ on more than 8-character
boundaries.
Use the Cache-Control: public directive to enable HTTPS caching for Firefox.
Some versions of Firefox require the Cache-Control: public header to be
set in order for resources sent over SSL to be cached on disk, even if the other
caching headers are explicitly set. Although this header is normally used to enable
caching by proxy servers (as described below), proxies cannot cache any content
sent over HTTPS, so it is always safe to set this header for HTTPS resources.
Example

For the stylesheet used to display the user's calendar after login, Google Calendar embeds
a fingerprint in its filename: calendar/static/fingerprint_keydoozercompiled.css, where
the fingerprint key is a 128-bit hexadecimal number. At the time of the screen shot below
(taken from Page Speed's Show Resources panel), the fingerprint was set to
82b6bc440914c01297b99b4bca641a5d:

The fingerprinting mechanism allows the server to set the Expires header to exactly one
year ahead of the request date; the Last-Modified header to the date the file was last
modified; and the Cache-Control: max-age header to 3153600. To cause the client to
re-download the file in case it changes before its expiry date or maximum age, the
fingerprint (and therefore the URL) changes whenever the file's content does.

Additional resources

• For an in-depth explanation of HTTP caching, see the HTTP/1.1 RFC, sections
13.2, 14.21, and 14.9.3.
• For details on enabling caching in Apache, consult the Apache Caching Guide.


Leverage proxy caching


Overview

Enabling public caching in the HTTP headers for static resources allows the browser to
download resources from a nearby proxy server rather than from a more remote origin server.

Details

In addition to browser caching, HTTP provides for proxy caching, which enables static
resources to be cached on public web proxy servers, most notably those used by ISPs.
This means that even first-time users to your site can benefit from caching: once a static
resource has been requested by one user through the proxy, that resource is available for
all other users whose requests go through that same proxy. Since those locations are
likely to be in closer network proximity to your users than your servers, proxy caching
can result in a significant reduction in network latency. Also, if enabled, proxy caching
effectively gives you free web site hosting, since responses served from proxy caches
don't draw on your servers' bandwidth at all.

You use the Cache-Control: public header to indicate that a resource can be cached
by public web proxies in addition to the browser that issued the request. With some
exceptions (described below), you should configure your web server to set this header to
public for cacheable resources.

Recommendations
Don't include a query string in the URL for static resources.
Most proxies, most notably Squid up through version 3.0, do not cache resources
with a "?" in their URL even if a Cache-Control: public header is present in
the response. To enable proxy caching for these resources, remove query strings
from references to static resources, and instead encode the parameters into the file
names themselves.
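
One way to encode the parameters into the file name is sketched below. The naming scheme (name-value pairs joined with dots before the extension) and the helper name query_to_filename are assumptions for illustration; the sketch also assumes the parameter values contain only characters that are safe in a file name.

```python
import posixpath
from urllib.parse import parse_qsl, urlsplit


def query_to_filename(url: str) -> str:
    """Fold query parameters into the file name so the URL has no "?"."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query)
    if not params:
        return parts.path  # nothing to encode
    stem, ext = posixpath.splitext(parts.path)
    suffix = ".".join(f"{key}-{value}" for key, value in params)
    return f"{stem}.{suffix}{ext}"
```

The server then needs a matching rule that maps the rewritten path back to the underlying file.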
Don't enable proxy caching for resources that set cookies.
Setting the header to public effectively shares resources among multiple users,
which means that any cookies set for those resources are shared as well. While
many proxies won't actually cache any resources with cookie headers set, it's
better to avoid the risk altogether. Either set the Cache-Control header to
private or serve these resources from a cookieless domain.
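
The rule above can be expressed as a small check on the outgoing response headers. This is a sketch under the assumption that headers are available as a plain dictionary; the helper name cache_control_for is hypothetical.

```python
def cache_control_for(response_headers: dict) -> str:
    """Pick "private" for responses that set cookies, "public" otherwise."""
    names = {name.lower() for name in response_headers}
    return "private" if "set-cookie" in names else "public"
```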
Be aware of issues with proxy caching of JS and CSS files.
Some public proxies have bugs that prevent them from detecting the presence of the
Content-Encoding response header. This can result in compressed versions being
delivered to client browsers that cannot properly decompress the files. Since these
files should always be gzipped by your server, to ensure that the client can
correctly read the files, do either of the following:

• Set the Cache-Control header to private. This disables
proxy caching altogether for these resources. If your application is
multi-homed around the globe and relies less on proxy caches for user locality,
this might be an appropriate setting.
• Set the Vary: Accept-Encoding response header. This instructs
the proxies to cache two versions of the resource: one compressed, and
one uncompressed. The correct version of the resource is delivered based
on the client request header. This is a good choice for applications that are
singly homed and depend on public proxies for user locality.
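
To make the effect of Vary: Accept-Encoding concrete, the toy cache below stores one variant per combination of URL and "client accepts gzip". It is a deliberately simplified sketch: the class name TinyProxyCache is hypothetical, real proxies compare the Accept-Encoding value itself rather than a yes/no flag, and the request headers are assumed to use the exact name "Accept-Encoding".

```python
class TinyProxyCache:
    """Toy shared cache that stores one variant per (URL, gzip?) pair."""

    def __init__(self) -> None:
        self._store: dict = {}

    @staticmethod
    def _key(url: str, request_headers: dict) -> tuple:
        # Simplification: only ask whether the client accepts gzip.
        accepts_gzip = "gzip" in request_headers.get("Accept-Encoding", "").lower()
        return (url, accepts_gzip)

    def put(self, url: str, request_headers: dict, body: bytes) -> None:
        self._store[self._key(url, request_headers)] = body

    def get(self, url: str, request_headers: dict):
        return self._store.get(self._key(url, request_headers))


cache = TinyProxyCache()
cache.put("/app.js", {"Accept-Encoding": "gzip, deflate"}, b"<gzipped bytes>")
cache.put("/app.js", {}, b"plain bytes")
```

A client that does not advertise gzip support is never handed the compressed variant, which is the failure the Vary header is meant to prevent.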
