Conf    Hit rate   Cost                     Speed (cache, memory)   Time for 100 accesses
16K     90%        $320 + H  = $360         3, 15                   67 + 33*90%*3 + 33*10%*15 ≈ 206
1/2M    96%        $320 + $20 + H' = $400   3, 15                   67 + 33*96%*3 + 33*4%*15  ≈ 182
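The "Time for 100 accesses" column can be checked with a short sketch, assuming the slide's model: 67 one-cycle accesses plus 33 memory accesses per 100 instructions, with a cache hit costing 3 cycles and a miss 15.

```python
# Check of the "time for 100 accesses" column, under the assumed model:
# 67 one-cycle accesses + 33 memory accesses, hit = 3 cycles, miss = 15.

def time_for_100(hit_rate, hit_cycles=3, miss_cycles=15, mem_accesses=33):
    return (100 - mem_accesses) + mem_accesses * (
        hit_rate * hit_cycles + (1 - hit_rate) * miss_cycles
    )

print(round(time_for_100(0.90)))  # 16K cache  -> 206
print(round(time_for_100(0.96)))  # 1/2M cache -> 182
```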
[Figure: performance (time) vs. cost curve; the knee of the curve marks the "Sweet Spot"]
Can get most of the performance for a fraction of the cost!
Based on Yaron Sheffer slides, p-6 Micro Computers Class
In Practice...
[Diagram: CPU → L1 Cache → L2 Cache → Memory]
Conf          Hit rate    Cost                     Speed       Time for 100 accesses
16K           90%         $320 + H  = $360         3, 15       67 + 33*90%*3 + 33*10%*15 ≈ 206
1/2M          96%         $320 + $20 + H' = $400   3, 15       67 + 33*96%*3 + 33*4%*15  ≈ 182
16K on chip   90%         —                        1, 15       67 + 33*90%*1 + 33*10%*15 ≈ 146
16K + 1/2M    90% / 96%   +$40                     1, 3, 15    67 + 33*90%*1 + 33*6%*3 + 33*4%*15 ≈ 122
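The two-level row can be verified the same way, assuming an L1 hit costs 1 cycle, an L2 hit 3 cycles, and memory 15 cycles, with the slide's split of 67 one-cycle accesses and 33 memory accesses per 100:

```python
# Check of the two-level (16K + 1/2M) row, under the assumed timings:
# L1 hit = 1 cycle, L2 hit = 3 cycles, memory = 15 cycles.

def two_level_time(l1_hit, l2_hit, l1=1, l2=3, mem=15, accesses=33):
    hit2 = l2_hit - l1_hit    # fraction served only by L2 (96% - 90% = 6%)
    miss2 = 1 - l2_hit        # fraction that goes all the way to memory
    return (100 - accesses) + accesses * (l1_hit * l1 + hit2 * l2 + miss2 * mem)

print(round(two_level_time(0.90, 0.96)))  # -> 122
```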
● Cache “organization”
  Determines which memory portion can be mapped to each cache line
  ○ A portion of memory can be mapped to only a subset of the lines in the
    cache. This subset is called a “set”.
● Options:
  ○ Direct Map
  ○ Set Associative
  ○ Fully Associative
  ○ Sectored
Direct Map Line mapping / Hit/Miss decision
[Diagram: address 0x12345678 = 0001 0010 0011 0100 0101 0110 0111 1000;
 the Set# bits select exactly one cache line, and the stored tag from the
 Tag Array is compared against the address tag to produce Hit/Miss]
● Advantage: lowest power (only one tag compare per lookup)
● Disadvantage: addresses with the same Set# conflict for a single line
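A minimal sketch of direct-map line selection and the hit/miss decision, assuming a 16 KB cache with 32-byte lines (512 lines) — the sizes used in the later example:

```python
# Toy direct-mapped lookup, assuming 16 KB cache / 32-byte lines.
LINE_SIZE = 32         # bytes -> 5 offset bits
NUM_LINES = 512        # 16 KB / 32 B -> 9 index bits

tags = [None] * NUM_LINES  # tag array; None = invalid line

def lookup(addr):
    line = (addr // LINE_SIZE) % NUM_LINES   # line# comes straight from the address
    tag = addr // (LINE_SIZE * NUM_LINES)    # remaining high-order bits
    if tags[line] == tag:
        return "hit"
    tags[line] = tag                         # fill the (single) candidate line on a miss
    return "miss"

print(lookup(0x12345678))  # first access: miss
print(lookup(0x12345678))  # same line, same tag: hit
```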
Based on Yaron Sheffer slides, p-12 Micro Computers Class
2-Way Set Associative Cache
● Memory is “conceptually” divided into slices
  whose size is 1/2 the cache size
● Offset from slice start indicates set#
  But each set now contains two potential lines!
● Addresses w/ same offset map into the same set
● Two tags per set (one tag per line) are needed
Address Fields
  31          13 12        5 4        0
  [    Tag     ][    Set    ][ Offset ]

Example:
  Line size:   32 bytes
  Cache size:  16 KB
  # of lines:  512 lines
  # of sets:   256 (2 lines per set)
  Offset bits: 5
  Set bits:    8
  Tag bits:    19
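The field widths in the example follow directly from the cache parameters; a quick recomputation, assuming 32-bit addresses:

```python
# Recomputing the address-field widths for the 2-way example:
# 16 KB cache, 32-byte lines, 2 ways, 32-bit addresses.
from math import log2

cache_size = 16 * 1024
line_size = 32
ways = 2

num_lines = cache_size // line_size       # 512
num_sets = num_lines // ways              # 256
offset_bits = int(log2(line_size))        # 5
set_bits = int(log2(num_sets))            # 8
tag_bits = 32 - set_bits - offset_bits    # 19

print(num_lines, num_sets, offset_bits, set_bits, tag_bits)
```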
[Diagram: 2-way lookup for address 0x12345678
 (0001 0010 0011 0100 0101 0110 0111 1000); the Set# bits index both ways
 in parallel, each way's stored tag is compared against the address tag,
 and the matching way drives Data Out and Hit/Miss]
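The lookup pictured above can be sketched in a few lines, assuming the 16 KB / 32-byte-line parameters from the example (the victim choice on a miss is simplified; real hardware would use a replacement policy such as LRU):

```python
# Toy 2-way set-associative lookup: set bits select a set, both ways'
# tags are compared "in parallel", the matching way supplies the data.
OFFSET_BITS, SET_BITS, WAYS = 5, 8, 2
NUM_SETS = 1 << SET_BITS

# each set holds WAYS tag entries; None = invalid way
sets = [[None] * WAYS for _ in range(NUM_SETS)]

def lookup(addr):
    set_idx = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)
    for way in range(WAYS):
        if sets[set_idx][way] == tag:
            return ("hit", way)
    sets[set_idx][0] = tag   # naive fill; real HW picks a victim (e.g. LRU)
    return ("miss", 0)

print(lookup(0x12345678))  # ('miss', 0)
print(lookup(0x12345678))  # ('hit', 0)
```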
More:
● Memory update policies (see later)
  ○ Write-Through: memory is always updated
  ○ Write-Back: smarter update, only when needed
    Faster, better, but more complex to implement
● Unified/Split Caches
  ○ Split instruction and data (common)
  ○ Split kernel/user (rare)
● Indexing: Virtual/Physical caches
  ○ Virtual is faster (no TLB access), but consistency is hard
Based on Yaron Sheffer slides, p-17 Micro Computers Class
Write Hit Policy
● WT – Write Through
  ○ All memory writes are written to main memory, even if the line is in
    the cache
  ○ Do nothing on replacement
  ○ Line states: Valid / Invalid
  ○ Simple, slow, lots of bus traffic (immediate writes, small chunks)
  ○ Write buffers are used to avoid write latency issues
● WB – Write Back
  ○ Hold a line in the cache, even if modified
  ○ Write the line back to memory on replacement
  ○ Line states: Invalid, Non-modified, Modified (dirty bit)
  ○ Smarter update, only when needed
  ○ Faster, better, but more complex to implement
Write Miss Policy
● Write Allocate
  ○ Write to cache
  ○ The line is loaded, followed by the write-hit actions above
  ○ Similar to a read miss
● No Write Allocate
  ○ Bypass cache
  ○ Write directly to main memory
● INVD: invalidate entire cache (15 cycles)
● WBINVD: write back and invalidate entire cache (>2000 cycles)