
Miss Penalty Reduction

ECE 463/521, Profs Conte/Rotenberg/Sair
Dept. of ECE, NC State University

CACHE6-1

Early restart/Critical word first


Early Restart:
- As soon as the requested word arrives, forward it to the processor
- Miss penalty is now just the time to fetch the requested word

Critical Word First:
- Problem still with Early Restart: what if the requested word is in the middle of the block?
- Start fetching the block with the required (critical) word, and fill in the rest of the block afterwards (see the sketch below)
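
A minimal sketch of the fill order under critical word first (the 8-word block size and word 5 as the requested word are assumptions for illustration, not from the slides): the requested word is returned first and the rest of the block wraps around.

/* Sketch: fill order for an 8-word block under critical word first,
   assuming word 5 is the requested (critical) word.
   Output: 5 6 7 0 1 2 3 4 */
#include <stdio.h>

#define WORDS_PER_BLOCK 8

int main(void) {
    int critical = 5;  /* assumed offset of the requested word in the block */
    printf("fill order:");
    for (int i = 0; i < WORDS_PER_BLOCK; i++)
        printf(" %d", (critical + i) % WORDS_PER_BLOCK);
    printf("\n");
    return 0;
}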


Miss penalty reduction via L2 caches


If it takes 100 cycles to go to main memory, why not put a cache between the L1 cache and main memory?

Average access time = Hit time L1 + Miss rate L1 × Miss penalty L1
Miss penalty L1 = Hit time L2 + Miss rate L2 × Miss penalty L2

[Figure: L1 cache --(4 cycles)--> L2 cache --(100 cycles)--> main memory]

Hit time L1 = 1 cycle
Miss rate L1 = 0.05
Hit time L2 = 4 cycles
Miss rate L2 = 0.01
Miss penalty L2 = 100 cycles

Miss penalty L1 = 4 + 0.01 × 100 = 5 cycles
Average access time = 1 + 0.05 × 5 = 1.25 cycles
(w/o L2: 1 + 0.05 × 100 = 6 cycles)
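
To make the arithmetic concrete, here is a minimal sketch that plugs the numbers above into the two formulas; the variable names are assumptions chosen to mirror the slide's terms, not from the slides.

/* Minimal sketch: two-level AMAT using the slide's numbers.
   Variable names are illustrative, not from the slides. */
#include <stdio.h>

int main(void) {
    double hit_time_L1 = 1.0, miss_rate_L1 = 0.05;
    double hit_time_L2 = 4.0, miss_rate_L2 = 0.01, miss_penalty_L2 = 100.0;

    double miss_penalty_L1 = hit_time_L2 + miss_rate_L2 * miss_penalty_L2;  /* 5 cycles */
    double amat_with_L2    = hit_time_L1 + miss_rate_L1 * miss_penalty_L1;  /* 1.25 cycles */
    double amat_without_L2 = hit_time_L1 + miss_rate_L1 * miss_penalty_L2;  /* 6 cycles */

    printf("Miss penalty L1 = %.2f cycles\n", miss_penalty_L1);
    printf("AMAT with L2    = %.2f cycles\n", amat_with_L2);
    printf("AMAT without L2 = %.2f cycles\n", amat_without_L2);
    return 0;
}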


Sub-blocking
Problem: Tags are overhead; they take up extra space

Partial Solution: Large blocks reduce the amount of tag storage
- Say we keep the cache size fixed, but double the block size
- Then the number of blocks is halved
- The number of tags is also halved (# tags == # blocks)
- So we've reduced the amount of tag overhead, conserving space on the chip
- BUT large blocks increase the cache miss penalty

Complete Solution:
- Use large blocks, but also divide blocks into smaller sub-blocks
- Fetch only 1 sub-block on a miss
- Keep valid bits for the sub-blocks (see the sketch below)
- Better if the other sub-blocks are prefetched in the background (that is, combine sub-blocking with early restart and critical word first)

[Figure: one cache block holds a single tag, one valid bit per sub-block, and the sub-blocks themselves]
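
A minimal sketch of this organization (the struct layout and field names are illustrative assumptions, not from the slides): one tag covers the whole block, and each sub-block carries its own valid bit, so a miss can fill just a single sub-block.

/* Sketch of a sub-blocked cache line: one tag, per-sub-block valid bits.
   Field names and sizes are illustrative assumptions. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SUB_BLOCKS_PER_BLOCK 4

struct cache_block {
    uint32_t tag;                      /* single tag for the whole block */
    bool valid[SUB_BLOCKS_PER_BLOCK];  /* one valid bit per sub-block */
    /* sub-block data storage omitted */
};

int main(void) {
    struct cache_block b = { .tag = 0x1A2B, .valid = { false, false, false, false } };
    b.valid[2] = true;  /* a miss on sub-block 2 fetches only that sub-block */
    for (int i = 0; i < SUB_BLOCKS_PER_BLOCK; i++)
        printf("sub-block %d: %s\n", i, b.valid[i] ? "valid" : "invalid");
    return 0;
}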


Write buffer in WTNA


WTNA (write-through, no-allocate):
1. Write to the L1 cache, if the block is there
2. Send the write to the next level in the memory hierarchy (it may percolate further down until either a WBWA cache or main memory is reached)

Don't stall the processor
- Do the write in the background
- Need a write buffer (see the sketch below)
- The processor delegates to the write buffer the responsibility for propagating the write down the memory hierarchy
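
A minimal sketch of such a write buffer, assuming a small FIFO (the names and sizes are illustrative, not from the slides): the processor enqueues the write and moves on, and the buffer is drained to the next level in the background.

/* Sketch of a FIFO write buffer for a WTNA cache.
   Structure and names are illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

#define WB_ENTRIES 4

struct wb_entry { uint32_t addr; uint32_t data; };

struct write_buffer {
    struct wb_entry entry[WB_ENTRIES];
    int head, tail, count;
};

/* Processor side: enqueue the write and continue (stall only if full). */
int wb_push(struct write_buffer *wb, uint32_t addr, uint32_t data) {
    if (wb->count == WB_ENTRIES) return 0;  /* full: processor must stall */
    wb->entry[wb->tail] = (struct wb_entry){ addr, data };
    wb->tail = (wb->tail + 1) % WB_ENTRIES;
    wb->count++;
    return 1;
}

/* Background side: send the oldest write to the next level. */
int wb_drain_one(struct write_buffer *wb, struct wb_entry *out) {
    if (wb->count == 0) return 0;
    *out = wb->entry[wb->head];
    wb->head = (wb->head + 1) % WB_ENTRIES;
    wb->count--;
    return 1;
}

int main(void) {
    struct write_buffer wb = { .head = 0, .tail = 0, .count = 0 };
    wb_push(&wb, 0x1000, 42);  /* write is done from the processor's point of view */
    struct wb_entry e;
    while (wb_drain_one(&wb, &e))
        printf("write 0x%x (data %u) -> next level\n", (unsigned)e.addr, (unsigned)e.data);
    return 0;
}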


Write buffer in WTNA (cont.)


Consider a read miss
- Before the L1 cache requests the block from the next level, it must check the write buffer for any pending writes to the requested block (see the sketch below)
  - If there are any pending writes to the block, either:
    - Drain the write buffer before requesting the block from the next level (i.e., wait until the writes are performed), OR
    - Request the block and merge in the write data when the block arrives
- Cheaper solution:
  - Do not check the write buffer
  - Must always drain the write buffer before requesting a block from the next level (whether or not there are pending writes to the requested block)
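
A minimal sketch of the address check (names and sizes are illustrative assumptions): the missing block's address is compared against every pending write-buffer entry, which corresponds to the "=?" comparators in the figure on the next slide.

/* Sketch: on a read miss, compare the missing block address against all
   pending write-buffer entries. Names and sizes are illustrative. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define WB_ENTRIES 4

struct wb_entry { uint32_t block_addr; bool pending; };

bool wb_has_pending_write(const struct wb_entry wb[], uint32_t block_addr) {
    for (int i = 0; i < WB_ENTRIES; i++)
        if (wb[i].pending && wb[i].block_addr == block_addr)
            return true;
    return false;
}

int main(void) {
    struct wb_entry wb[WB_ENTRIES] = {
        { 0x40, true }, { 0x80, true }, { 0, false }, { 0, false }
    };
    uint32_t miss_block = 0x80;
    if (wb_has_pending_write(wb, miss_block))
        printf("drain the buffer (or merge the data) before using block 0x%x\n",
               (unsigned)miss_block);
    else
        printf("no pending writes; fetch block 0x%x immediately\n",
               (unsigned)miss_block);
    return 0;
}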


Write buffer in WTNA (cont.)


[Figure: a write request (address + data) from the processor goes to the L1 cache (WTNA policy) and into the write buffer; on a read miss, the block address is extracted and compared (=?) against every write-buffer entry; buffered writes drain to the next level in the memory hierarchy (L2 cache or main memory)]


Write buffer in WBWA


WBWA (write-back, write-allocate):
- A write miss, like a read miss, causes an allocation
- Stall the processor until the requested block is received, then perform the write to the block

Don't stall the processor
- Do the allocation and write in the background
- Need a write buffer (and more; see the next slide)
- The processor delegates to the write buffer the responsibility for performing the write when the requested block makes it into the cache


Write buffer in WBWA (cont.)


But there's more to it
- Need special hardware to track the miss
  - Miss Handling Status Register (MHSR), sketched below
- This takes us into the realm of a more sophisticated class of caches, non-blocking (or lockup-free) caches, that support:
  - Hit-under-miss
    - Free up the cache port for subsequent cache hits, even though there is an outstanding miss
  - Miss-under-miss
    - Support multiple outstanding misses (multiple MHSRs)
- The previous slide requires a limited non-blocking cache
  - Hit-under-write-miss
    - Subsequent read and write hits may proceed despite prior write misses
  - Write-miss-under-write-miss
    - One or more MHSRs for tracking write misses
- A read miss stalls our simple in-order processor, so nothing fancy is needed with respect to read misses
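
A minimal sketch of one MHSR entry for a write miss (the fields are illustrative assumptions, not from the slides): while the entry is valid, later hits can still be serviced; when the block arrives, the buffered write is performed and the entry is freed.

/* Sketch of an MHSR entry tracking one outstanding write miss.
   Field names are illustrative assumptions. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct mhsr {
    bool     valid;        /* an outstanding miss is being tracked */
    uint32_t block_addr;   /* block being fetched from the next level */
    uint32_t write_offset; /* where in the block the pending write lands */
    uint32_t write_data;   /* data to write once the block arrives */
};

int main(void) {
    /* write miss: allocate an MHSR and let the processor keep going */
    struct mhsr m = { .valid = true, .block_addr = 0x2000,
                      .write_offset = 8, .write_data = 7 };

    /* ... later read/write hits proceed (hit-under-write-miss) ... */

    /* block arrives: perform the buffered write, free the MHSR */
    printf("block 0x%x arrived; write %u at offset %u; MHSR freed\n",
           (unsigned)m.block_addr, (unsigned)m.write_data,
           (unsigned)m.write_offset);
    m.valid = false;
    return 0;
}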


Writeback buffer in WBWA


Recall what causes a writeback
- A read or write miss in the cache (requested block X not found)
- Identify a victim block V for replacement
- If block V is dirty, write back block V to the next level
- Fetch block X
- Finally, perform the read or write

The writeback need not stall this process
- The processor wants block X
- Block V has nothing to do with block X
  - Except that V needs to leave to make room for X
  - Don't need to wait for the writeback
  - Transfer V from the cache to a temporary holding area (a writeback buffer, sketched below)
  - Deal with the writeback after X has been taken care of
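
A minimal sketch of a one-entry writeback buffer (the names and single-entry size are illustrative assumptions): the dirty victim V is parked here so the fetch of block X can start immediately, and V is written to the next level afterwards.

/* Sketch of a one-entry writeback buffer holding a dirty victim block.
   Names and the single-entry size are illustrative assumptions. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct writeback_buffer {
    bool     valid;       /* a victim is waiting to be written back */
    uint32_t victim_addr; /* address of dirty victim block V */
};

int main(void) {
    struct writeback_buffer wbb = { .valid = false, .victim_addr = 0 };

    /* miss on block X: park the dirty victim V in the writeback buffer ... */
    wbb.valid = true;
    wbb.victim_addr = 0x3000;
    printf("fetching block X now; victim 0x%x parked in writeback buffer\n",
           (unsigned)wbb.victim_addr);

    /* ... and only after X has been handled, write V back to the next level */
    printf("writing back victim 0x%x to the next level\n", (unsigned)wbb.victim_addr);
    wbb.valid = false;
    return 0;
}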

