
COE 381 MICROPROCESSORS

UNIT 2 MEMORIES

Chapter Objectives
Stored Program Concept
Addressing
Address Decoding Using Memory Chips
Commodity Memories
Timing
Reality of Memory Decoding
Filling the Memory Map
Memory Map Details
Endianness
Separate I/O Address Space
Memory Hierarchy
Caches
Read Only Memory (ROM)
Punch Cards
Harvard Architecture
Current Memory Technology
ROM / PROM
More Electrostatic Memories
More Magnetic Memories
Full Address Decoding
Magnetic Memories
Optical Memories
Partial Address Decoding
Designing Address Decoders
Address Decoding with Random Logic
Address Decoding with m-line-to-n-line Decoders
Address Decoding with PROM
Address Decoding with FPGA, PLA and PAL

SECTION 1:

MEMORY

The Stored Program Concept


The concept of a stored program is attributed to John von Neumann.
Put simply it says: Instructions can be represented by numbers and stored in the same way as data.

Thus a bit pattern 01000001 could either represent the number 65 or a JUMP instruction.

The Stored Program Concept cont


Whilst it is rare that the same memory locations are used as both instructions and data, it does happen, e.g. when a program is loaded (as data) and then executed (as instructions).
Question: what is the difference between the von Neumann architecture and the stored program concept?

Determining Addressable Memory


A processor with a 12-bit address bus can address up to 4 Kwords of memory. ARM produces byte addresses and has a 32-bit address space, which allows the addressing of 2^32 separate bytes.
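The relationship between address width and addressable locations can be checked with a short Python sketch. The function name is illustrative only; the widths are the ones quoted above.

```python
def addressable_locations(address_bits: int) -> int:
    """Number of distinct locations an address bus of this width can select."""
    return 2 ** address_bits


# 12-bit address bus: 4K locations
print(addressable_locations(12))   # 4096
# ARM: 32-bit byte address
print(addressable_locations(32))   # 4294967296
```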

Commodity Memories
All von Neumann computers need memory (some small, others large)
Small memories (a few Kbytes) are often on-chip with processors, etc.
Large memories could be in one or more modules
Several types of memory exist (cost trade-offs vary according to the system requirement)

Memory Chips

The memory device shown is a 628512. This is a 4Mbit SRAM chip organized as 512 Kwords of 8 bits each. It therefore requires nineteen address lines and eight data lines.

Memory Chips (contd)


The following table defines the memory chip's behaviour.

Points to note:
All the control signals are active low
If the chip is not selected (/CS = H), nothing happens
Write enable overrides read operations
The data bus is bidirectional (either read or write; this saves pins)
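The points above can be captured in a small behavioural model. This is only a sketch: the class name, the method name and the output-enable input (oe_n) are assumptions made for illustration, and the real device's timing is not modelled.

```python
from typing import Optional


class Sram:
    """Behavioural sketch of a byte-wide SRAM with active-low controls."""

    def __init__(self, address_bits: int = 19):
        # 2^19 locations of 8 bits, as for the 512K x 8 device described above
        self.cells = bytearray(1 << address_bits)

    def cycle(self, cs_n: int, we_n: int, oe_n: int,
              address: int, data_in: int = 0) -> Optional[int]:
        """Return the byte driven onto the data bus, or None if the chip is not driving it."""
        if cs_n:                      # chip not selected: nothing happens
            return None
        if not we_n:                  # write enable overrides any read
            self.cells[address] = data_in & 0xFF
            return None
        if not oe_n:                  # read: the chip drives the bidirectional bus
            return self.cells[address]
        return None                   # selected, but neither reading nor writing


ram = Sram()
ram.cycle(cs_n=0, we_n=0, oe_n=1, address=0x100, data_in=0x41)   # write
print(ram.cycle(cs_n=0, we_n=1, oe_n=0, address=0x100))          # read back
```

The write branch is taken only when both /CS and write enable are low, which matches the idea that the write strobe is the combination of the two signals.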

Timing Issues
When writing to memory it is important that the correct data is written to the correct location; it is also important to ensure that no other memory locations are corrupted. It is important that the address is stable during the write operation; if it is not, other locations may also be affected. (see timing diagram)

Timing Issues (2)


The actual write strobe is a logical AND of the write enable and chip select signals; both must be active for data to be written. The timing diagram shown is therefore only one possible approach to strobing the memory.

Timing (contd)

The timing diagram

Timing cont(2)
Different processors (& different implementations) encode timing differently. This is okay, as long as timing is included somewhere.

Addressing
Some definitions:
Byte: now standardized as eight bits.
Word: the natural size of operands, which varies from processor to processor (16 bits in MU0, 32 bits in ARM). Usually the width of the data bus.
Nibble: four bits, or half a byte.

Definitions cont
Width: the number of bits in a bus or a register.
Address range: the number of elements which can be addressed.

Type: what the data represents. This is really a software concept in that the hardware (usually) does not care whether a word is to be interpreted as an instruction, an integer, a float, an address, etc. This may, however, influence the size of the transfer.

Addressing
Within the CPU it is common for several things to happen in parallel; the memory performs only one operation at a time. This operation requires the answers to the questions:
Do what? Control (read or write)
With what? Data
Where? Address

Addressing cont
Because only one operation is happening at a time the control signals and the data bus can be shared over the whole memory. The address bus provides a code to specify which location is being used (addressed).

Commodity Memories
D-type flip-flops
Convenient for synchronous logic (e.g. FSMs)
Very large area per bit

Transparent latches
Okay for logic but not as convenient
Smaller than D-types, but still large

Commodity Memories (2)


SRAM
Small area per bit
Need (shareable) interface logic
Simple to use

DRAM
Very small area per bit
Need considerable interface logic
Many awkward timing constraints

Memory Decoding
In reality a memory address may not always refer to one memory location. For example, an ARM processor can address memory in 32-bit words or 8-bit bytes (or 16-bit halfwords), and the memory system must be able to support all access sizes.

Memory Decoding (2)


Addresses are decoded to the minimum addressable size (in this case bytes).

Thus the least significant bit used by the address decoder is A[2];
A[1] and A[0] act as byte selects, which will be ignored when performing word-wide operations.

Memory Decoding (3)


Bus addressing is normally written in the format N[X:Y].
Notice that when the processor reads word 00000000 it receives data on all its data lines (D[31:0]).

Memory Decoding (4)


When the processor reads byte 00000000 it receives data on only one quarter of the data bus (D[7:0]). Evidently, the two least significant address bits determine the subset of the data bus which should be used.
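A minimal sketch of this byte-lane selection follows. The function names are illustrative. The text above fixes only that byte 0 arrives on D[7:0]; placing bytes 1 to 3 on successively higher lanes is an assumption (a little-endian bus arrangement).

```python
def word_index(address: int) -> int:
    """Word selected by the address decoder: address bits A[31:2]."""
    return address >> 2


def byte_lane(address: int) -> tuple[int, int]:
    """Data-bus bits (high, low) used for a byte access, chosen by A[1:0].

    Assumes byte 0 of a word sits on D[7:0], byte 1 on D[15:8], and so on.
    """
    lane = address & 0b11
    low = 8 * lane
    return (low + 7, low)


print(word_index(0x00000000), byte_lane(0x00000000))   # word 0, D[7:0]
print(word_index(0x00000003), byte_lane(0x00000003))   # word 0, D[31:24]
```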

Filling the Memory Map


The ARM processor
has a 32-bit word length.
produces a 32-bit byte address.
can perform read and write operations with 32-, 16- and 8-bit data.

Filling the Memory Map cont


The normal design for the memory system would therefore be a space of 2^30 words (byte addressing, remember) of 32 bits each. This could be populated using 128K x 8-bit RAM chips. Four RAMs side by side (i.e. 512 Kbytes) provide 128 Kwords. The full space needs 2^30 / 2^17 = 2^13 = 8192 such banks of four, i.e. 32768 RAM chips in total.
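The arithmetic can be worked through in a few lines of Python. The variable names are illustrative; the sketch assumes one 8-bit chip per byte lane, so four chips make one 32-bit-wide bank of 128 Kwords, and it counts banks and individual chips separately.

```python
ADDRESS_BITS = 32            # ARM byte address width
BYTES_PER_WORD = 4           # 32-bit word
CHIP_LOCATIONS = 128 * 1024  # one 128K x 8-bit RAM chip holds 2^17 bytes

words_in_map = 2 ** ADDRESS_BITS // BYTES_PER_WORD   # 2^30 words
chips_per_bank = BYTES_PER_WORD                      # one chip per byte lane
banks = words_in_map // CHIP_LOCATIONS               # 2^30 / 2^17 = 2^13
total_chips = banks * chips_per_bank

print(banks)         # 8192 banks of four chips
print(total_chips)   # 32768 individual chips
```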

Separate I/O Space


I/O access patterns different from memory accesses
I/O access being rarer

Separate address space for I/O (e.g. x86 architecture)

Cleaner address space left just for true memory
I/O space referenced with different instructions (e.g. IN and OUT)
limited addressing modes and, possibly, a smaller address range

Same bus (with an added address line IO/mem)

Endianness
Refers to the way sub-elements are numbered within an element, for example the way bytes are numbered in a word.
Two types: little endian and big endian.
By convention the bytes-in-a-word definition tends to dominate; thus a big-endian processor will typically still number its bits in a little-endian fashion.

Little Endian Addressing


The least significant byte (LSB) is at the lowest address. Pick a word address, say 00001000 (hexadecimal), in a 32-bit byte-addressable address space. Let's store a word (say, 12345678) at this address.
Address 1000 contains byte 78
Address 1001 contains byte 56
Address 1002 contains byte 34
Address 1003 contains byte 12

Little Endian Addressing cont


This has the effect that, if displayed as bytes, a memory dump would look like:
00001000 78 56 34 12
If a byte load was performed on the same address the result would be:
00000078

Big Endian Addressing


The most significant byte (MSB) is at the lowest address.
Using the same word address (00001000) for the same word (12345678):
Address 1000 contains byte 12
Address 1001 contains byte 34
Address 1002 contains byte 56
Address 1003 contains byte 78

Big Endian Addressing cont


This has the effect that, if displayed as bytes, a memory dump would look like:
00001000 12 34 56 78

If a byte load was performed on the same address the result would be:
00000012
NB: Choice of endianness in a given processor is arbitrary.
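Both byte orders can be reproduced with Python's built-in int.to_bytes. This is a sketch using the word and address from the slides; the dump helper is a hypothetical name introduced here.

```python
WORD = 0x12345678
BASE = 0x00001000


def dump(base: int, data: bytes) -> str:
    """Format a memory dump line: address followed by the bytes in address order."""
    return f"{base:08X} " + " ".join(f"{b:02X}" for b in data)


little = WORD.to_bytes(4, "little")   # least significant byte at lowest address
big = WORD.to_bytes(4, "big")         # most significant byte at lowest address

print(dump(BASE, little))   # 00001000 78 56 34 12
print(dump(BASE, big))      # 00001000 12 34 56 78

# A byte load from the word address returns the byte stored at the lowest address.
print(f"{little[0]:08X}")   # 00000078
print(f"{big[0]:08X}")      # 00000012
```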

Memory Hierarchy
For a given price
big memory = slow memory
small memory = fast memory

If a programme has to run from main memory it will only run at the speed at which its instructions can be read.

Memory Hierarchy cont


In reality, typical programmes show a great deal of locality. If the critical 10% of the code is placed in a small, fast memory, then the performance of the overall programme can be significantly increased.

Depending on the implementation it may be known as caching or virtual memory.

Caches
Two observations:
Large memories (at an economical price) tend to be slower than small ones.
A program spends 90% of its time using 10% of the available address space.

Memory need not be homogeneous

Caches cont
If you can organize things so that the most used address space is in fast memory, then you can get startling improvements at relatively small cost. This is sometimes possible manually, e.g. in embedded controllers where the software is fixed.

Caches cont(2)
In general-purpose machines (e.g. PCs) the code is dynamic. A cache memory adapts itself to prevailing conditions by allowing the addresses it occupies to change as the program runs. It relies on:
Spatial locality: guessing that if an address is used, others nearby are likely to be wanted.
Temporal locality: guessing that if an address has been used, it is likely to be used again in the near future.
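The effect of both kinds of locality can be seen in a toy simulation. The slides do not specify a cache organization, so the direct-mapped structure, the 16-byte line and the 64-line size below are assumptions chosen only to keep the sketch small.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that only tracks hits and misses (no data stored)."""

    def __init__(self, lines: int = 64, line_bytes: int = 16):
        self.lines = lines
        self.line_bytes = line_bytes
        self.tags = [None] * lines
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        block = address // self.line_bytes   # which memory block the address is in
        index = block % self.lines           # the one cache line that block may occupy
        tag = block // self.lines            # identifies which block is resident there
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag               # miss: fetch the whole line
        self.misses += 1
        return False


cache = DirectMappedCache()
for _ in range(10):                  # temporal locality: the same loop runs repeatedly
    for addr in range(0, 256, 4):    # spatial locality: consecutive word addresses
        cache.access(addr)

print(cache.hits, cache.misses)
```

Only the first touch of each 16-byte line misses; neighbouring words and later iterations are served from the cache.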

Cache Hierarchies
Caches work so well that it is now common practice to have a cache of the cache. This introduces several levels of cache, or a cache hierarchy. The first level (or L1) cache will be integrated with the processor silicon (on-chip).

Cache Hierarchies cont


There will be a second level of cache (L2); this may be on the PCB or on the CPU.
Further cache levels are also possible; L3 is increasingly common in high-performance systems.

Harvard Architecture
Normally refers to stored program computers with separate instruction and data buses. This separation may apply to the entire memory architecture or may be limited to the cache architecture (see next two slides).

The Harvard architecture logically separates the fetching of instructions from data reads and writes.

Harvard Architecture cont


However, its real purpose is to increase memory bandwidth.
The disadvantages of Harvard architecture are:
the available memory is pre-divided into code and data areas; in a von Neumann machine the memory can be allocated differently according to the needs of a particular program
it is hard/impossible for the code to modify itself (not often a problem, but can make loading programs difficult!)
more wiring (pins, etc.)

You might like to identify and label the buses here. Where should the I/O be in each?

Memories
RAM: Random Access Memory (by convention used for memory which is readable and writeable)
RAM forgets when the power is turned off
The address space of a computer will normally contain RAM

ROM: Read Only Memory (cannot be written to)
Used to hold fixed programs

Memory Technology- Static RAM (SRAM)


Fast
Truly random access
Relatively expensive per bit
Simple interface
Reasonable storage density
Consumes little power when not in use
Typical application: cache (speed), memory on a mobile phone (low power)

Memory Technology- Dynamic RAM (DRAM)


Significantly slower than a fast processor
Faster if addressed in bursts of addresses
Medium cost per bit
Complex interface
Forgets over time (in the order of milliseconds)
needs to be constantly read and rewritten (refresh)
consumes power even when idle

Its big advantage is that it gives very dense storage
Typical application: main memory of a PC (large, cost-effective)

Memory Technology- ROM


Mask-programmed ROMs are programmed during chip manufacture
Cheap for large quantities
Used in ASIC applications

Memory Technology- ROM


PROMs are Programmable after manufacture, using programming equipment.
Each individual IC is separately programmed (a manual operation).
Contents cannot be changed after programming.

Memory Technology- ROM


EPROMs are Erasable and Programmable (erasure is usually by exposure to strong ultra-violet light) (1970s-1990s)
A technology in decline

Memory Technology- ROM


EEPROMs are Electrically Erasable (1990s to date)
One of the most popular ROM technologies
Can be altered in-circuit

Require much more time than RAM to alter a location (writes take >100x the read time)

Memory Technology- ROM (2)


Some require bulk erasure
Widely used for non-volatile storage in consumer applications such as telephones, TV remote controls, digital cameras, etc.

Flash Memory falls into this category

Memory Technology- ROM (3)


Reprogrammable devices suffer a small but cumulative amount of damage each time they are erased/reprogrammed
They are only guaranteed for a limited number (say, 100 000) of write operations

Memory Technology- Magnetic Storage


Very slow (compared to processor speeds)
Variable in their access times (think of the mechanics involved)
Read/writeable only in blocks
Cheap per bit

Memory Technology- Optical storage


Very slow (compared to processor speeds)
Variable in their access times (think of the mechanics involved)
Primarily (but not exclusively) read only
Extremely cheap per bit
Cheap to make as ROMs (think CDs, DVDs)

SECTION 2

ADDRESS DECODING STRATEGIES

Address Decoding
Although memory space is said to be flat, this does not mean the physical implementation is homogeneous

Different portions of memory are used for different purposes: RAM, ROM, I/O
Even if all the memory were of one type, we would still have to implement it using multiple ICs

Address Decoding
This means that for a given valid address, one and only one memory-mapped component must be accessed
Address decoding is the process of generating chip select (CS*) signals from the address bus for each device in the system

Arrangement of 2KB Memory Blocks

The address bus lines are split into two sections:
the N most significant bits are used to generate the CS* signals for the different devices
the M least significant bits are passed to the devices as addresses of the different memory cells or internal registers

Decoding Logic for M1 and M2


[Figure: decoding logic for M1 and M2. Address boundaries shown: 000000-0007FF, 000800-000FFF, 001000]

Address decoding methods


There are two types of address decoding:
Full address decoding
Partial address decoding

Full address decoding


All the address lines are used to specify a memory location
Each physical memory location is identified by a unique address

Recall

An Example using Binary Decoder


Let's assume a very simple microprocessor with 10 address lines (1KB memory)
Let's assume we wish to implement all its memory space and we use 128x8 memory chips

Solution
We will need 8 memory chips (8x128=1024)
We will need 3 address lines to select each one of the 8 chips
Each chip will need 7 address lines to address its internal memory cells
With this example, all the address space was implemented. However, this might not always be the case.
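The split described in this solution can be sketched in Python. It assumes the three most significant lines (A9 to A7) feed a 3-to-8 decoder with active-low outputs and the seven least significant lines (A6 to A0) go to every chip; the function names and chip numbering are illustrative.

```python
def decode(address: int) -> tuple[int, int]:
    """Split a 10-bit address into (chip number, offset inside the chip)."""
    if not 0 <= address < 1024:
        raise ValueError("address does not fit in 10 address lines")
    chip = (address >> 7) & 0b111   # A9..A7 select one of 8 chips
    offset = address & 0x7F         # A6..A0 select one of 128 cells
    return chip, offset


def chip_selects(address: int) -> list[int]:
    """Active-low chip select levels, as a 3-to-8 decoder would produce them."""
    chip, _ = decode(address)
    return [0 if i == chip else 1 for i in range(8)]


print(decode(300))         # (2, 44)
print(chip_selects(300))   # exactly one output is low
```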

An Example using Random Logic


Let's assume the same microprocessor with 10 address lines (1KB memory)
However, this time we wish to implement only 512 bytes of memory
We still must use 128-byte memory chips
Physical memory must be placed in the upper half of the memory map

Solution using Decoding Table
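One decoding that satisfies these constraints can be sketched as gate-level logic in Python. The chip numbering (chip 0 at the bottom of the upper half) is an assumption: A9 must be 1 to reach the upper half, and A8 and A7 choose one of the four 128-byte chips.

```python
def chip_selects_upper_half(address: int) -> tuple[int, int, int, int]:
    """Active-low selects for four 128-byte chips occupying addresses 512 to 1023."""
    a9 = bool(address & 0x200)
    a8 = bool(address & 0x100)
    a7 = bool(address & 0x080)
    cs_n = (
        not (a9 and not a8 and not a7),   # chip 0: 512-639
        not (a9 and not a8 and a7),       # chip 1: 640-767
        not (a9 and a8 and not a7),       # chip 2: 768-895
        not (a9 and a8 and a7),           # chip 3: 896-1023
    )
    return tuple(int(level) for level in cs_n)


print(chip_selects_upper_half(511))    # lower half: no chip selected
print(chip_selects_upper_half(512))    # chip 0 selected
print(chip_selects_upper_half(1023))   # chip 3 selected
```

Because A9, A8 and A7 all take part in every select and A6 to A0 go to the chips, every address line is used and each location keeps a unique address.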

A More Difficult Example


Device Description   Device Name   Amount of Memory to Address
ROM chip             ROM1          4KB
RAM chip             RAM           4KB
ROM chip             ROM2          8KB
Peripheral 1         PERI1         2 bytes
Peripheral 2         PERI2         2 bytes

Device Memory Mapping


Device Name   Start Address   End Address
ROM1          $000000         $000FFF
RAM           $001000         $001FFF
ROM2          $002000         $003FFF
PERI1         $004000         $004001
PERI2         $004002         $004003
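The mapping can be expressed as a lookup that returns the single device a given address selects. This is a sketch only: the constant and function names are illustrative, and the ranges are copied from the mapping above.

```python
from typing import Optional

# (device name, start address, end address), taken from the device memory mapping
MEMORY_MAP = [
    ("ROM1", 0x000000, 0x000FFF),
    ("RAM", 0x001000, 0x001FFF),
    ("ROM2", 0x002000, 0x003FFF),
    ("PERI1", 0x004000, 0x004001),
    ("PERI2", 0x004002, 0x004003),
]


def select_device(address: int) -> Optional[str]:
    """Return the one device whose range contains the address, or None if unmapped."""
    for name, start, end in MEMORY_MAP:
        if start <= address <= end:
            return name
    return None


print(select_device(0x001800))   # RAM
print(select_device(0x004002))   # PERI2
print(select_device(0x004004))   # None: nothing is mapped here
```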

Address Decoding Table


DEVICE  ADDRESS LINE
        23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
ROM1     0  0  0  0  0  0  0  0  0  0  0  0  X  X  X  X  X  X  X  X  X  X  X  X
RAM      0  0  0  0  0  0  0  0  0  0  0  1  X  X  X  X  X  X  X  X  X  X  X  X
ROM2     0  0  0  0  0  0  0  0  0  0  1  X  X  X  X  X  X  X  X  X  X  X  X  X
PERI1    0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  X
PERI2    0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  1  X

Full Address Decoding Schematic Diagram Corresponding to Table

Partial address decoding


Since not all the address space is implemented, only a subset of the address lines is needed to point to the physical memory locations
Each physical memory location is identified by several possible addresses (using all combinations of the address lines that were not used)
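The aliasing this produces can be demonstrated with a small sketch. It assumes a processor with 10 address lines and four 128-byte chips where A9 is simply left out of the decoder; that choice of ignored line, and the function names, are illustrative assumptions.

```python
def partial_decode(address: int) -> tuple[int, int]:
    """Decode using only A8..A0; A9 is not connected to the decoder."""
    chip = (address >> 7) & 0b11   # A8..A7 select one of 4 chips
    offset = address & 0x7F        # A6..A0 select the cell inside the chip
    return chip, offset


def aliases(chip: int, offset: int) -> list[int]:
    """All 10-bit addresses that reach the same physical location."""
    return [a for a in range(1024) if partial_decode(a) == (chip, offset)]


print(partial_decode(0x080), partial_decode(0x280))   # same chip, same cell
print(aliases(1, 0))                                  # [128, 640]
```

With one address line unused, each physical location answers to two addresses; every further unused line doubles that count.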

Example
Let's assume the same microprocessor with 10 address lines (1KB memory)
However, this time we wish to implement only 512 bytes of memory
We still must use 128-byte memory chips
Physical memory must be placed in the upper half of the memory map

Solution

Partial Address Decoding

Memory Map

Address Decoding Table for Partial Address Decoding


[Table: DEVICE against ADDRESS LINE 23 to 00 for ROM1, RAM, ROM2, PERI1 and PERI2. Each device's select pattern occupies a few high-order address lines; all remaining lines are don't-cares (X).]

Improved Partial Address Decoding Scheme


DEVICE   A23   A22   A21   A20
ROM1     0     0     0     0
RAM      0     0     0     1
ROM2     0     0     1     X
PERI1    0     1     0     X
PERI2    0     1     1     X
SPACE    1     X     X     X

Partial Address Decoding Schematic Diagram Corresponding to Table

Designing Address Decoders


Address Decoding with Random Logic
Address Decoding with m-line-to-n-line Decoders
Address Decoding with PROM
Address Decoding with FPGA, PLA and PAL

m-line-to-n-line Decoders (e.g. 74LS138 Decoder)

Applications of the three-to-eight Decoder

Also, recall

Address Decoding with PROM

DEVICE   MEMORY SPACE (BYTES)   ADDRESS RANGE
ROM1     4K                     000000-000FFF
ROM2     4K                     001000-001FFF
ROM3     4K                     002000-002FFF
RAM      2K                     00C000-00C7FF
PERI1    256                    00E000-00E0FF
PERI2    256                    00E100-00E1FF
PERI3    256                    00E200-00E2FF

PROM-based Address Decoder Implementation

Programming of the Address Decoding PROM

Address range    System Address Lines     System Device Enables
of CPU           A15 A14 A13 A12 A11      PROM1 PROM2 PROM3 RAM1 PERIs
                 PROM Address Input       PROM Data Output
                 A4  A3  A2  A1  A0       D7    D6    D5    D4   D3    D2 D1 D0

000000-0007FF    0   0   0   0   0        0     1     1     1    1     1  1  1
000800-000FFF    0   0   0   0   1        0     1     1     1    1     1  1  1
001000-0017FF    0   0   0   1   0        1     0     1     1    1     1  1  1
001800-001FFF    0   0   0   1   1        1     0     1     1    1     1  1  1
002000-0027FF    0   0   1   0   0        1     1     0     1    1     1  1  1
002800-002FFF    0   0   1   0   1        1     1     0     1    1     1  1  1
003000-0037FF    0   0   1   1   0        1     1     1     1    1     1  1  1
003800-003FFF    0   0   1   1   1        1     1     1     1    1     1  1  1
----------       -   -   -   -   -        -do-  -do-  -do-  -do- -do-  -do- -do- -do-
00B800-00BFFF    1   0   1   1   1        1     1     1     1    1     1  1  1
00C000-00C7FF    1   1   0   0   0        1     1     1     0    1     1  1  1
00C800-00CFFF    1   1   0   0   1        1     1     1     1    1     1  1  1
00D000-00D7FF    1   1   0   1   0        1     1     1     1    1     1  1  1
00D800-00DFFF    1   1   0   1   1        1     1     1     1    1     1  1  1
00E000-00E7FF    1   1   1   0   0        1     1     1     1    0     1  1  1
00E800-00EFFF    1   1   1   0   1        1     1     1     1    1     1  1  1
00F000-00F7FF    1   1   1   1   0        1     1     1     1    1     1  1  1
00F800-00FFFF    1   1   1   1   1        1     1     1     1    1     1  1  1
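The PROM contents can be generated from the device ranges rather than written out by hand. The sketch below assumes a 32-entry PROM addressed by A15 to A11 (one entry per 2 Kbyte block) with active-low enables; assigning PROM1 to D7, PROM2 to D6, PROM3 to D5, RAM1 to D4 and the peripherals to D3 is an assumption about the column order, and the names are illustrative.

```python
BLOCK_BYTES = 0x800   # each PROM entry covers one 2 Kbyte block (A10..A0 ignored)

# (enable name, start address, end address, PROM data bit driving the enable)
DEVICES = [
    ("PROM1", 0x0000, 0x0FFF, 7),
    ("PROM2", 0x1000, 0x1FFF, 6),
    ("PROM3", 0x2000, 0x2FFF, 5),
    ("RAM1", 0xC000, 0xC7FF, 4),
    ("PERIs", 0xE000, 0xE7FF, 3),   # the peripherals share one 2 Kbyte block
]


def build_prom() -> list[int]:
    """Return the 32 data bytes; a 0 bit enables its device, unused bits stay 1."""
    contents = []
    for prom_address in range(32):            # PROM inputs A4..A0 = system A15..A11
        block_start = prom_address * BLOCK_BYTES
        word = 0xFF                           # default: no device enabled
        for _name, start, end, bit in DEVICES:
            if start <= block_start <= end:
                word &= ~(1 << bit) & 0xFF    # pull this device's enable low
        contents.append(word)
    return contents


prom = build_prom()
for prom_address, word in enumerate(prom):
    if word != 0xFF:
        print(f"{prom_address * BLOCK_BYTES:06X} {prom_address:05b} {word:08b}")
```

Only blocks that enable a device are printed; every other entry holds all ones. The peripheral block still needs further decoding of the lower address lines to separate the individual peripherals.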

Address Decoding with FPGA, PLA and PAL


Address decoding using general purpose programmable logic elements
the speed of random logic, and the flexibility of the PROM
