Instructors: Randy Bryant and Dave O'Hallaron
Carnegie Mellon
Today
Key features
  RAM is traditionally packaged as a chip.
  Basic storage unit is normally a cell (one bit per cell).
  Multiple RAM chips form a memory.
Dynamic RAM (DRAM)
  Each cell stores a bit with a capacitor. One transistor is used for access.
  Value must be refreshed every 10-100 ms.
  More sensitive to disturbances (EMI, radiation, ...) than SRAM.
  Slower and cheaper than SRAM.
        Trans.    Access  Needs     Needs
        per bit   time    refresh?  EDC?    Cost   Applications
SRAM    4 or 6    1X      No        Maybe   100x   Cache memories
DRAM    1         10X     Yes       Yes     1X     Main memories, frame buffers
d x w DRAM: dw total bits organized as d supercells of size w bits
[Figure: 16 x 8 DRAM chip. The memory controller (to/from CPU) connects to the chip through a 2-bit addr bus and an 8-bit data bus; supercells are laid out in rows and cols, with supercell (2,1) highlighted.]
[Figure: 16 x 8 DRAM chip, step 1. The memory controller sends RAS = 2 on the 2-bit addr lines to select row 2.]
[Figure: 16 x 8 DRAM chip, step 2. The memory controller sends CAS = 1 on the addr lines; supercell (2,1) is returned on the 8-bit data lines and passed on to the CPU.]
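The two-step read of supercell (2,1) can be sketched in C as follows. This is a minimal model, not the chip's actual circuitry: the type `dram16x8`, the `row_buffer` field, and the function names are hypothetical, and the 4 x 4 layout is assumed from the 16 supercells and the 2-bit addr lines.

```c
#include <assert.h>
#include <stdint.h>

enum { ROWS = 4, COLS = 4 };           /* 16 supercells, assumed 4 x 4 */

/* One 16 x 8 DRAM chip: 16 supercells of 8 bits each. */
typedef struct {
    uint8_t cells[ROWS][COLS];
    uint8_t row_buffer[COLS];          /* assumed internal buffer filled by RAS */
} dram16x8;

/* Step 1: RAS copies the whole selected row into the row buffer. */
static void dram_ras(dram16x8 *d, unsigned row)
{
    assert(row < ROWS);
    for (unsigned c = 0; c < COLS; c++)
        d->row_buffer[c] = d->cells[row][c];
}

/* Step 2: CAS selects one supercell from the row buffer. */
static uint8_t dram_cas(const dram16x8 *d, unsigned col)
{
    assert(col < COLS);
    return d->row_buffer[col];
}

/* Read supercell (row, col) as RAS followed by CAS. */
static uint8_t dram_read(dram16x8 *d, unsigned row, unsigned col)
{
    dram_ras(d, row);
    return dram_cas(d, col);
}
```

With this model, `dram_read(&d, 2, 1)` corresponds to RAS = 2 followed by CAS = 1. Sending the row and column separately is what lets the chip get by with 2 address pins instead of 4.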
Memory Modules
[Figure: 64 MB memory module consisting of eight 8Mx8 DRAMs (DRAM 0 to DRAM 7). The memory controller sends addr (row = i, col = j) to the module; each DRAM supplies its supercell (i,j), one byte of the 64-bit doubleword: bits 0-7, 8-15, 16-23, 24-31, 32-39, 40-47, 48-55, and 56-63.]
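As a rough illustration of how the module builds a 64-bit doubleword, the sketch below combines one byte per chip. It assumes, reading the figure's bit labels, that DRAM k supplies bits 8k to 8k+7 (DRAM 0 gives bits 0-7, DRAM 7 gives bits 56-63); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_CHIPS = 8 };

/* Combine the eight 8-bit supercells into one 64-bit doubleword.
   chip_byte[k] is the byte returned by DRAM k, assumed to supply
   bits 8k .. 8k+7 of the result. */
static uint64_t assemble_doubleword(const uint8_t *chip_byte, int n)
{
    assert(n == NUM_CHIPS);
    uint64_t word = 0;
    for (int k = 0; k < n; k++)
        word |= (uint64_t)chip_byte[k] << (8 * k);
    return word;
}
```

All eight chips receive the same (row, col) address, so a single request yields the full doubleword in one access.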
Enhanced DRAMs
  Basic DRAM cell has not changed since its invention in 1966.
    Commercialized by Intel in 1970.
  Double data-rate synchronous DRAM (DDR SDRAM)
    Double-edge clocking sends two bits per cycle per pin.
    Different types distinguished by size of small prefetch buffer: DDR (2 bits), DDR2 (4 bits), DDR3 (8 bits).
    By 2010, standard for most server and desktop systems.
    Intel Core i7 supports only DDR3 SDRAM.
Nonvolatile Memories
[Figure: bus interface, I/O bridge, memory bus, and main memory.]
[Figure (two slides): bus interface, I/O bridge, and main memory holding word x at address A.]
CPU reads word x from the bus and copies it into register %eax.
[Figure: register file (%eax), bus interface, I/O bridge, and main memory holding word x.]
[Figure (two slides): bus interface, I/O bridge, and main memory, with address A marked.]
Main memory reads data word y from the bus and stores it at address A.
[Figure: register file (%eax), bus interface, and I/O bridge.]
[Figure: inside a disk drive: spindle, platters, actuator, SCSI connector, and electronics (including a processor and memory!). Image courtesy of Seagate Technology.]
Disk Geometry
[Figure: a disk surface with the spindle at its center, track k, and sectors separated by gaps.]
[Figure: Platter 0, Platter 1, and Platter 2 stacked on a common spindle, with their surfaces numbered up to Surface 5.]
Disk Capacity
[Figure sequence: disk spindle and arm.]
Disk Access
Rotation is counter-clockwise
[Figure sequence: components of a disk access: seek, rotational latency, data transfer.]
Given:
  Rotational rate = 7,200 RPM
  Average seek time = 9 ms
  Avg # sectors/track = 400
Derived:
  Tavg rotation = 1/2 x (60 secs / 7,200 RPM) x 1000 ms/sec = 4 ms
  Tavg transfer = (60 secs / 7,200 RPM) x (1 / 400 sectors/track) x 1000 ms/sec = 0.02 ms
  Taccess = 9 ms + 4 ms + 0.02 ms = 13.02 ms
Important points:
  Access time is dominated by seek time and rotational latency.
  First bit in a sector is the most expensive; the rest are free.
  SRAM access time is about 4 ns/doubleword, DRAM about 60 ns.
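The arithmetic above can be reproduced with a short C sketch. The function names are hypothetical and the formulas are the ones given on the slide; the only difference is that the code does not round intermediate results, so it reports about 4.17 ms of rotational latency and about 13.19 ms in total where the slide rounds to 4 ms and 0.02 ms.

```c
#include <assert.h>

/* Average rotational latency in ms: half of one full rotation. */
static double avg_rotation_ms(double rpm)
{
    assert(rpm > 0);
    return 0.5 * (60.0 / rpm) * 1000.0;
}

/* Average time in ms to transfer one sector:
   one full rotation divided by the number of sectors per track. */
static double avg_transfer_ms(double rpm, double sectors_per_track)
{
    assert(rpm > 0 && sectors_per_track > 0);
    return (60.0 / rpm) / sectors_per_track * 1000.0;
}

/* Total access time in ms: seek + rotational latency + transfer. */
static double access_ms(double seek_ms, double rpm, double sectors_per_track)
{
    return seek_ms + avg_rotation_ms(rpm)
                   + avg_transfer_ms(rpm, sectors_per_track);
}
```

Calling `access_ms(9, 7200, 400)` makes the main point visible: the transfer term is roughly three orders of magnitude smaller than the seek and rotation terms.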
I/O Bus
[Figure: CPU chip (register file, ALU, bus interface) connected by the system bus to the I/O bridge, which connects by the memory bus to main memory and by the I/O bus to the USB controller (mouse, keyboard), graphics adapter (monitor), and disk controller (disk).]
[Figure sequence (three slides): bus interface and I/O bus with USB controller (mouse, keyboard), graphics adapter (monitor), and disk controller (disk).]
[Figure: flash translation layer in front of flash memory organized as Block 0 through Block B-1, each containing Page 0 through Page P-1.]
250 MB/s   140 MB/s   30 us
170 MB/s   14 MB/s    300 us
Advantages
  No moving parts: faster, less power, more rugged
Disadvantages
  Have the potential to wear out
    Mitigated by wear leveling logic in flash translation layer
    E.g. Intel X25 guarantees 1 petabyte (10^15 bytes) of random writes before they wear out
  In 2010, about 100 times more expensive per byte
Applications
  MP3 players, smart phones, laptops
  Beginning to appear in desktops and servers
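To get a feel for what a 1 petabyte write guarantee means in practice, the sketch below converts a write budget into days of continuous writing. The function name is hypothetical, and the sustained write rate is an assumption supplied by the caller, not a figure stated for any particular drive.

```c
#include <assert.h>

/* Days until a write budget is exhausted at a constant write rate.
   budget_bytes:       total guaranteed writes (1e15 for 1 petabyte).
   rate_bytes_per_sec: hypothetical sustained write rate (an assumption). */
static double days_to_wear_out(double budget_bytes, double rate_bytes_per_sec)
{
    assert(budget_bytes >= 0 && rate_bytes_per_sec > 0);
    double seconds = budget_bytes / rate_bytes_per_sec;
    return seconds / (24.0 * 60.0 * 60.0);
}
```

For example, under an assumed nonstop rate of 14 MB/s, `days_to_wear_out(1e15, 14e6)` comes to roughly 827 days; real workloads write far less than continuously, so the budget would last correspondingly longer.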
Storage Trends

SRAM
Metric        1980     1985    1990   1995   2000   2005   2010   2010:1980
$/MB          19,200   2,900   320    256    100    75     60     320
access (ns)   300      150     35     15     3      2      1.5    200

DRAM
Metric              1980    1985    1990   1995   2000   2005    2010    2010:1980
$/MB                8,000   880     100    30     1      0.1     0.06    130,000
access (ns)         375     200     100    70     60     50      40      9
typical size (MB)   0.064   0.256   4      16     64     2,000   8,000   125,000

Disk
Metric              1980   1985   1990   1995    2000     2005      2010        2010:1980
$/MB                500    100    8      0.30    0.01     0.005     0.0003      1,600,000
access (ms)         87     75     28     10      8        4         3           29
typical size (MB)   1      10     160    1,000   20,000   160,000   1,500,000   1,500,000
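The 2010:1980 column is a simple ratio of the first and last entries in each row. A minimal sketch, with hypothetical function names, shows the two directions: cost and access time improve by shrinking, typical size by growing.

```c
#include <assert.h>

/* Improvement factor for a "smaller is better" metric
   such as $/MB or access time: old value over new value. */
static double improvement(double value_1980, double value_2010)
{
    assert(value_2010 > 0);
    return value_1980 / value_2010;
}

/* Growth factor for a "larger is better" metric
   such as typical size: new value over old value. */
static double growth(double value_1980, double value_2010)
{
    assert(value_1980 > 0);
    return value_2010 / value_1980;
}
```

For SRAM, `improvement(19200, 60)` gives 320 and `improvement(300, 1.5)` gives 200, matching the table. For DRAM cost, `improvement(8000, 0.06)` gives about 133,000, which the table rounds to 130,000. The contrast worth noticing is between cost ratios in the hundreds of thousands and access-time ratios of 9 (DRAM) and 29 (disk).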
                            1980   1990   1995      2000    2003   2005     2010      2010:1980
CPU                         8080   386    Pentium   P-III   P-4    Core 2   Core i7   ---
Clock rate (MHz)            1      20     150       600     3300   2000     2500      2500
Cycle time (ns)             1000   50     6         1.6     0.3    0.50     0.4       2500
Cores                       1      1      1         1       1      2        4         4
Effective cycle time (ns)   1000   50     6         1.6     0.3    0.25     0.1       10,000
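The cycle-time rows follow directly from the clock rate. The sketch below assumes that effective cycle time means cycle time divided by the number of cores, which is consistent with the Core 2 and Core i7 entries (0.50 to 0.25 ns and 0.4 to 0.1 ns); the function names are hypothetical.

```c
#include <assert.h>

/* Cycle time in ns for a clock rate given in MHz: 1000 / MHz. */
static double cycle_time_ns(double clock_mhz)
{
    assert(clock_mhz > 0);
    return 1000.0 / clock_mhz;
}

/* Effective cycle time in ns, assuming it is the cycle time
   divided by the number of cores. */
static double effective_cycle_ns(double clock_mhz, int cores)
{
    assert(cores > 0);
    return cycle_time_ns(clock_mhz) / cores;
}
```

For instance, `cycle_time_ns(20)` gives 50 ns for the 20 MHz entry, and `effective_cycle_ns(2000, 2)` gives 0.25 ns. After 2003 the clock rate stops rising, and the remaining gains in effective cycle time come from adding cores.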
[Figure: time in ns (log scale from 0.1 to 10,000,000) versus year (1980 to 2010) for Disk, SSD, DRAM, and CPU.]
Today
Locality
  Spatial locality: items with nearby addresses tend to be referenced close together in time.
Locality Example
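/* Sum the n elements of a (function name is illustrative). */
int sum_array(const int a[], int n)
{
    int i, sum = 0;
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
}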
Data references
  Reference array elements in succession (stride-1 reference pattern): spatial locality
  Reference variable sum each iteration: temporal locality
Instruction references
  Reference instructions in sequence: spatial locality
  Cycle through loop repeatedly: temporal locality
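The same idea extends to two-dimensional arrays, where loop order decides the stride. The sketch below is an illustration under assumed names and dimensions (`M`, `N`, and the three functions are hypothetical): C stores arrays row by row, so the row-wise loop is stride-1 and the column-wise loop is stride-N.

```c
#include <assert.h>

enum { M = 4, N = 3 };   /* small, arbitrary dimensions for illustration */

/* Row-wise traversal: visits a[0][0], a[0][1], ... in memory order,
   a stride-1 reference pattern with good spatial locality. */
static int sum_array_rows(int a[M][N])
{
    int sum = 0;
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

/* Column-wise traversal: consecutive accesses are N ints apart,
   a stride-N reference pattern with poorer spatial locality. */
static int sum_array_cols(int a[M][N])
{
    int sum = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < M; i++)
            sum += a[i][j];
    return sum;
}

/* Both orders add the same elements, so the sums must agree. */
static void check_orders_agree(int a[M][N])
{
    assert(sum_array_rows(a) == sum_array_cols(a));
}
```

Both functions compute the same result; only the order of memory references differs, which is why the difference matters for large arrays and not for one this small.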
Memory Hierarchies
Today
[Figure: memory hierarchy pyramid. Going down the hierarchy, devices become larger, slower, and cheaper per byte. Levels shown: L1 cache (SRAM), L2 cache (SRAM), main memory (DRAM), local secondary storage (local disks); level labels run to L5.]
Caches
[Figure: a cache holding copies of a few of the numbered blocks of a larger memory.]
[Figure: block 14 is present in both the cache and memory.]
[Figure: Request: 12. Block 12 is copied from memory into the cache.]
Conflict miss
  Most caches limit blocks at level k+1 to a small subset (sometimes a singleton) of the block positions at level k.
Capacity miss
  Occurs when the set of active cache blocks (working set) is larger than the cache.
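A conflict miss can be demonstrated with a toy cache. The sketch below is a simplified model under an assumed placement rule: block i at level k+1 may only occupy slot i mod 4 at level k. The type and function names, and the choice of 4 slots, are hypothetical.

```c
#include <assert.h>

enum { NUM_SLOTS = 4, EMPTY = -1 };

/* Toy level-k cache with NUM_SLOTS block positions.
   Assumed placement rule: block i goes only in slot i % NUM_SLOTS. */
typedef struct {
    int slot[NUM_SLOTS];
} toy_cache;

static void cache_init(toy_cache *c)
{
    for (int s = 0; s < NUM_SLOTS; s++)
        c->slot[s] = EMPTY;
}

/* Access one block: returns 1 on a hit, 0 on a miss.
   On a miss the block replaces whatever occupied its slot. */
static int cache_access(toy_cache *c, int block)
{
    assert(block >= 0);
    int s = block % NUM_SLOTS;
    if (c->slot[s] == block)
        return 1;
    c->slot[s] = block;
    return 0;
}

/* Count the misses over a reference trace of n block numbers. */
static int count_misses(toy_cache *c, const int *trace, int n)
{
    int misses = 0;
    for (int i = 0; i < n; i++)
        misses += !cache_access(c, trace[i]);
    return misses;
}
```

With the trace 0, 8, 0, 8, 0, 8 every access misses, because blocks 0 and 8 both map to slot 0 and keep evicting each other even though three slots stay empty. The trace 0, 1, 0, 1, 0, 1 misses only twice, since the two blocks land in different slots.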
Cache type             What is Cached?   Where is it cached?   Latency (cycles)   Managed by
Registers              -                 CPU core              0                  Compiler
TLB                    -                 -                     0                  Hardware
L1 cache               64-byte block     On-Chip L1            1                  Hardware
L2 cache               64-byte block     On/Off-Chip L2        10                 Hardware
Virtual Memory         4-KB page         Main memory           100                Hardware + OS
Buffer cache           Parts of files    Main memory           100                OS
Disk cache             Disk sectors      Disk controller       -                  -
Network buffer cache   Parts of files    Local disk            -                  -
Browser cache          Web pages         Local disk            -                  -
Web cache              Web pages         -                     -                  -
Summary