

CS 455 – Spring 2002


A cache memory, often simply called a cache, is a fast local memory used as a buffer for a more distant, larger
and slower memory in order to improve the average memory access speed. Although relatively small, cache
memories rely on the principle of locality that indicates that if they retain information recently referenced, they
will tend to contain much of the information needed in the near future.
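The effect on average access speed can be estimated with a short calculation. The sketch below is illustrative only: the function name, the timing values, and the assumption that a miss costs one cache probe plus one memory access are not taken from these notes.

```python
def average_access_time(hit_ratio, cache_time, memory_time):
    """Estimate the mean time per access for a single cache in front of memory.

    Assumes every access probes the cache first, and that a miss then pays
    the full memory access time on top of the probe.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return cache_time + miss_ratio * memory_time


# Hypothetical timings in arbitrary units: cache = 1, main memory = 50.
for ratio in (0.0, 0.80, 0.95, 1.0):
    print(ratio, average_access_time(ratio, 1.0, 50.0))
```

With these assumed numbers, a 95% hit ratio brings the average close to 3.5 units, compared with 51 units when every access misses.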

There are two principal types of cache memory in a conventional computer architecture.

A data cache buffers transfers of data and instructions between the main memory and the processor.

A paging cache, also known by such names as the TLB or Translation Lookaside Buffer, is used in a virtual
memory architecture to buffer the recently referenced entries from page and segment tables, thus avoiding an
extra memory reference when these items are needed again.

It can also be said that a virtual memory itself is a type of cache system in which the main memory is a buffer
for the secondary memory. A comparison among these three types of caches is given at the end of this chapter.

Structure of a cache memory

A typical cache memory consists of an array of fast registers, each of which can hold a unit of information.
Since a cache holds only a small subset of the possible data items, each item must also be tagged with a field
that identifies it. For a data cache, each item is a line of data, which is a small sequence of words, typically
16 or fewer. Each line is tagged with its line number. For a paging cache, the items are page or segment
descriptors, and the tags are the corresponding page or segment numbers.
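As a minimal sketch of the data cache case, the code below splits a word address into a line number (the tag) and an offset within the line, and builds a tagged register from a list standing in for main memory. The line size of 16 words and the names CacheRegister, split_address and load_register are assumptions made for illustration.

```python
from typing import NamedTuple, Sequence, Tuple

WORDS_PER_LINE = 16  # assumed line size, for illustration only


class CacheRegister(NamedTuple):
    tag: int                 # line number identifying the cached line
    words: Tuple[int, ...]   # the words that make up the line


def split_address(word_address: int) -> Tuple[int, int]:
    """Return (line_number, offset_within_line) for a word address."""
    return divmod(word_address, WORDS_PER_LINE)


def load_register(memory: Sequence[int], line_number: int) -> CacheRegister:
    """Copy one whole line out of memory and tag it with its line number."""
    start = line_number * WORDS_PER_LINE
    return CacheRegister(line_number, tuple(memory[start:start + WORDS_PER_LINE]))


memory = list(range(2000))            # toy memory: each word holds its own address
line_number, offset = split_address(1000)
register = load_register(memory, line_number)
print(line_number, offset, register.words[offset])
```

Word 1000 falls in line 62 at offset 8, so a cache holding the register tagged 62 can supply that word without a memory reference.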

Accessing an Item

To locate a word of memory for access using a data cache, for example, we must first determine the line number
for that word, and see if the line is in the cache. To do this a search must be made of the tags. To support a fast
search, the cache is structured as an associative memory, with circuitry that allows an immediate determination
of whether a particular tag is present, and a readout of its contents if present. If this test fails, the item is fetched
in the normal way from memory, and its line is added to the cache (see below).
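The access sequence just described can be sketched in software. Here a Python dict stands in for the associative search hardware, which is an assumption made for illustration: real caches compare all tags in parallel. The class name, the 4-word line size, and the placeholder eviction rule (drop the oldest-inserted line) are also hypothetical.

```python
WORDS_PER_LINE = 4  # assumed small line size to keep the example readable


class AssociativeCache:
    def __init__(self, memory, capacity):
        self.memory = memory      # list of words standing in for main memory
        self.capacity = capacity  # number of line registers in the cache
        self.lines = {}           # tag (line number) -> list of words
        self.hits = 0
        self.misses = 0

    def read(self, address):
        line_number, offset = divmod(address, WORDS_PER_LINE)
        line = self.lines.get(line_number)   # the "associative search" on the tag
        if line is None:
            # Miss: fetch the whole line from memory and add it to the cache.
            self.misses += 1
            start = line_number * WORDS_PER_LINE
            line = self.memory[start:start + WORDS_PER_LINE]
            if len(self.lines) >= self.capacity:
                # Placeholder policy: dicts keep insertion order, so the
                # first key is the oldest-inserted line.
                del self.lines[next(iter(self.lines))]
            self.lines[line_number] = line
        else:
            self.hits += 1
        return line[offset]


cache = AssociativeCache(memory=list(range(100, 132)), capacity=2)
print(cache.read(5), cache.read(6), cache.hits, cache.misses)
```

The second read falls in the same line as the first, so it is served from the cache without touching memory.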

Since a true associative search of a large cache may be infeasible, the cache often is organized as a set
associative memory in which the cache is broken into a number of smaller caches called sets. Each set is an
independent cache for a portion of the address space (or page/segment numbers). The sets, in turn, with only a
few registers each, are organized as fully associative memories.
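A minimal sketch of a set associative organization follows. It assumes the set is chosen by taking the line number modulo the number of sets, which is one common mapping but is not specified in these notes. Replacement within a set is plain FIFO here, and all names are hypothetical.

```python
class SetAssociativeCache:
    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways                           # registers per set
        self.sets = [[] for _ in range(num_sets)]  # tags per set, oldest first

    def set_index(self, line_number):
        # Assumed mapping: low-order part of the line number selects the set.
        return line_number % self.num_sets

    def access(self, line_number):
        """Return True on a hit; on a miss, install the line and return False."""
        tags = self.sets[self.set_index(line_number)]
        if line_number in tags:    # search only this small set
            return True
        if len(tags) >= self.ways:
            tags.pop(0)            # set is full: discard its oldest entry
        tags.append(line_number)
        return False


cache = SetAssociativeCache(num_sets=4, ways=2)
for line in (0, 4, 0, 8, 0):
    print(line, cache.access(line))
```

Lines 0, 4 and 8 all map to set 0 under this mapping, so they compete for its two registers while the other three sets stay empty.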

Cache Memories – CS 455 JDM 3/20/02 1.

Updating the Cache

Since a cache memory relies on the principle of locality, an accessed item that was not previously in the cache
is normally added. To make room, an existing item must be removed. A simple FIFO replacement algorithm can be
implemented by maintaining an index that cycles through all cache positions. Each new item goes into the next
position in order and overwrites the oldest entry. A bit more hardware will allow a true LRU algorithm,
especially with small sets. In this approach, a square array of bits can record the relative order of access to each
register in a set. The bits are shifted appropriately when an access occurs.
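Both replacement schemes can be sketched briefly. The FIFO class is the cycling index described above. The LRU class uses one well-known bit-matrix scheme (on an access to register i, set row i to ones and then clear column i), which may differ in detail from the hardware these notes have in mind. Class and method names are hypothetical.

```python
class FifoIndex:
    """Cycling index: each call names the next position to overwrite."""

    def __init__(self, size):
        self.size = size
        self.next_slot = 0

    def victim(self):
        slot = self.next_slot
        self.next_slot = (self.next_slot + 1) % self.size
        return slot


class LruMatrix:
    """Square bit array recording the relative order of access to registers."""

    def __init__(self, size):
        self.size = size
        self.bits = [[0] * size for _ in range(size)]

    def touch(self, i):
        for j in range(self.size):
            self.bits[i][j] = 1   # row i: register i is newer than every j
        for j in range(self.size):
            self.bits[j][i] = 0   # column i: nobody is newer than register i

    def victim(self):
        # The least recently used register has the fewest one bits in its row.
        return min(range(self.size), key=lambda i: sum(self.bits[i]))


lru = LruMatrix(4)
for register in (0, 1, 2, 3, 0):
    lru.touch(register)
print(lru.victim())
```

After the accesses 0, 1, 2, 3, 0 the least recently used register is 1, whereas a FIFO index would simply continue its cycle regardless of the repeated access to register 0.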

Updating Data Items

A different problem is how to update a data item once accessed, if a copy is in the cache. Since there are two
copies, they must be kept consistent. There are two approaches.

The write-through method updates both the cache and the main memory immediately. As a result, consistency
is always maintained, but the speed benefit of the cache is lost for updates.

The write-back method updates only the cache and uses a flag bit to mark the register as modified or "dirty."
The main memory is updated when the item is replaced. With this approach many memory references are
avoided and the speed is greater. However, the hardware is more complex and there is a risk of having
inconsistent copies.
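The trade-off between the two methods can be seen in a small sketch that counts memory writes. It is simplified to one word per cache register, and the class names and the explicit evict method are assumptions made for illustration.

```python
class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory        # dict: address -> value
        self.cache = {}
        self.memory_writes = 0

    def write(self, address, value):
        self.cache[address] = value
        self.memory[address] = value    # memory is updated immediately
        self.memory_writes += 1


class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory
        self.cache = {}
        self.dirty = set()          # addresses whose cached copy is modified
        self.memory_writes = 0

    def write(self, address, value):
        self.cache[address] = value
        self.dirty.add(address)     # mark as dirty; memory is not touched

    def evict(self, address):
        value = self.cache.pop(address)
        if address in self.dirty:   # only modified items are written back
            self.dirty.discard(address)
            self.memory[address] = value
            self.memory_writes += 1


through = WriteThroughCache({7: 0})
back = WriteBackCache({7: 0})
for value in (1, 2, 3):
    through.write(7, value)
    back.write(7, value)
print(through.memory_writes, back.memory_writes, back.memory[7])
back.evict(7)
print(back.memory_writes, back.memory[7])
```

Three updates cost three memory writes with write-through, but only one with write-back. Until the eviction, however, the write-back memory still holds the old value.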

Multilevel Caches

Memory today is very inexpensive, and becoming increasingly large. Despite the principle of locality, a cache
will not function effectively if its size is many orders of magnitude smaller than the memory it is buffering.

A natural solution to this problem is to make caches larger as well, perhaps on the order of megabytes instead of
kilobytes. Such a cache may be able to hold a sufficient range of information, but is too large to be managed
effectively and accessed quickly. We now need a cache for the cache.

This trend leads to the use of multilevel caches: a small cache on the processor chip, and a larger cache on a
separate chip nearby. These are often called the Level 1 (L1) cache and the Level 2 (L2) cache, respectively.
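A two-level arrangement can be sketched as a chain in which each level asks the next one on a miss. The capacities, the LRU replacement at both levels, and the class names below are assumptions chosen only to make the behavior visible.

```python
from collections import OrderedDict


class MainMemory:
    def __init__(self):
        self.served = 0

    def read(self, line_number):
        self.served += 1
        return f"line {line_number}"


class Level:
    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing      # the next, larger and slower level
        self.lines = OrderedDict()  # least recently used entry first
        self.served = 0             # requests satisfied at this level

    def read(self, line_number):
        if line_number in self.lines:
            self.lines.move_to_end(line_number)
            self.served += 1
            return self.lines[line_number]
        value = self.backing.read(line_number)  # miss: ask the next level
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)      # drop least recently used
        self.lines[line_number] = value
        return value


memory = MainMemory()
l2 = Level(capacity=8, backing=memory)
l1 = Level(capacity=2, backing=l2)
for line in (0, 1, 2, 0, 1, 2, 2):
    l1.read(line)
print(l1.served, l2.served, memory.served)
```

In this access pattern the working set of three lines does not fit in the two-register L1, so most repeat accesses are caught by the L2 rather than going all the way to memory.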


Problems with Cache Systems

Several problems can arise with cache systems if they are not managed carefully, especially if a write-back
update method is used. These problems arise especially in connection with input and output, virtual memory,
and in support of a multiuser environment.

In a multiuser environment, frequent process switching occurs. Locality does not apply across process changes.
Typically none of the items already in the cache will be useful to a new process. As a result, the new process
must gradually build up a set of useful cache entries from scratch. If, by the time this happens, another switch
occurs, the normal speed benefits of caching will not be attained.

In the presence of virtual memory, this problem is more acute. Since each process attaches different meaning to
the same addresses (and page numbers), old data in the cache is no longer even correct for a new process and
the cache must be completely flushed.
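The need to flush can be illustrated with a toy paging cache whose entries are tagged only by virtual page number. The page tables, frame numbers, and names below are hypothetical.

```python
class PagingCache:
    def __init__(self):
        self.entries = {}            # virtual page number -> frame number
        self.current_process = None

    def switch_to(self, process_id):
        if process_id != self.current_process:
            # The same page number means something else to the new process,
            # so every cached entry is now wrong and must be discarded.
            self.entries.clear()
            self.current_process = process_id

    def translate(self, page_number, page_table):
        """Return (frame_number, was_hit)."""
        if page_number in self.entries:
            return self.entries[page_number], True
        frame = page_table[page_number]  # extra memory reference on a miss
        self.entries[page_number] = frame
        return frame, False


table_a = {0: 10}   # process A: page 0 lives in frame 10
table_b = {0: 20}   # process B: page 0 lives in frame 20
tlb = PagingCache()
tlb.switch_to("A")
print(tlb.translate(0, table_a), tlb.translate(0, table_a))
tlb.switch_to("B")
print(tlb.translate(0, table_b))
```

Without the flush in switch_to, process B would be handed frame 10, which belongs to process A. With it, B gets the correct frame but starts with an empty cache.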

Since the most frequent switching occurs between user processes and the operating system itself, a partial
solution is to have two separate caches, one selected during privileged mode and used by the OS, the other to be
shared by user processes.

Input and output can also cause difficulties with a write-back system. A data cache is an element of the CPU,
but some I/O channels access the memory independently via DMA techniques. Such channels do not go through
the cache, and so are not aware of locations that have been updated in the cache only. The remedy is either to
avoid write-back or to provide the I/O DMA channels with their own cache that is cross-connected to that of
the CPU.


Comparison of Cache Types

This table summarizes the characteristics of the three types of "cache" memories used in a typical storage
system: data cache, paging cache, and the virtual memory itself.


                          VIRTUAL MEMORY              DATA CACHE                PAGING CACHE
BUFFER                    Main memory                 Cache                     Cache
BULK STORAGE              Secondary memory            Main memory               Main memory
UNITS OF STORAGE          Pages of data               Lines of data             Page and segment
                                                                                descriptors
METHOD OF IDENTIFICATION  Extract page and            Extract line number       Extract page and
                          segment number              from address              segment number
                          from virtual address                                  from virtual address
METHOD OF ACCESS          Fetch descriptor from       Associative cache         Associative cache
                          paging cache or             search                    search
                          index into page table;
                          use descriptor to
                          access page
IF NOT FOUND              Issue page fault,           Fetch from main,          Fetch from main,
                          update and retry            update cache              update cache
HOW UPDATED               By operating system         By hardware               By hardware
WHEN UPDATED              On reference or before      On reference              On reference
WRITE STRATEGY            Write-back                  Write-through or          Write-through or
                                                      write-back                write-back
REPLACEMENT STRATEGY      By operating system;        By hardware;              By hardware;
                          may be complex              FIFO or LRU               FIFO or LRU
MULTIPLE LEVELS           Not feasible                Often                     Rarely
