
Chapter 6

Input-Output Organization
INTRODUCTION

In computing, input/output, or I/O, refers to the communication between an information processing system (such as a computer) and the outside world, possibly a human or another information processing system. Inputs are the signals or data received by the system, and outputs are the signals or data sent from it. The term can also be used as part of an action; to "perform I/O" is to perform an input or output operation. I/O devices are used by a person (or other system) to communicate with a computer. For instance, keyboards and mice are considered input devices of a computer, while monitors and printers are considered output devices. Devices for communication between computers, such as modems and network cards, typically serve for both input and output.

Note that the designation of a device as either input or output depends on the perspective. Mice and keyboards take as input the physical movement that the human user outputs and convert it into signals that a computer can understand. The output from these devices is the computer's input. Similarly, printers and monitors take as input the signals that a computer outputs. They then convert these signals into representations that human users can see or read. (For a human user, the process of reading or seeing these representations is receiving input.)

In computer architecture, the combination of the CPU and main memory (i.e. memory that the CPU
can read and write to directly, with individual instructions) is considered the brain of a computer, and
from that point of view any transfer of information from or to that combination, for example to or from
a disk drive, is considered I/O. The CPU and its supporting circuitry provide memory-mapped I/O that
is used in low-level computer programming in the implementation of device drivers.

INPUT-OUTPUT ORGANIZATION

• Peripheral Devices
• Input-Output Interface
• Asynchronous Data Transfer
• Modes of Transfer
• Priority Interrupt
• Direct Memory Access
• Input-Output Processor
• Serial Communication

6.1 PERIPHERAL DEVICES

6.1.1 Keyboard

A keyboard is a human interface device which is represented as a layout of buttons. Each button, or
key, can be used to either input a linguistic character to a computer, or to call upon a particular
function of the computer. Traditional keyboards use spring-based buttons, though newer variations
employ virtual keys, or even projected keyboards.

Examples of types of keyboards include:

Computer keyboard
Keyer
Chorded keyboard
LPFK

6.1.2. Pointing devices

A pointing device is any human interface device that allows a user to input spatial data to a
computer. In the case of mice and touch screens, this is usually achieved by detecting movement across
a physical surface. Analog devices, such as 3D mice, joysticks, or pointing sticks, function by
reporting their angle of deflection. Movements of the pointing device are echoed on the screen by
movements of the cursor, creating a simple, intuitive way to navigate a computer's GUI.

6.1.3. High-degree of freedom input devices

Some devices allow many continuous degrees of freedom as input. These can be used as pointing devices, but are generally used in ways that do not involve pointing to a location in space, such as controlling a camera angle in 3D applications. These kinds of devices are typically used in CAVEs, where input that registers all six degrees of freedom (6DOF) is required.

6.1.4. Imaging and Video input devices

Video input devices are used to digitize images or video from the outside world into the computer. The
information can be stored in a multitude of formats depending on the user's requirement.

Webcam
Image scanner
Fingerprint scanner
Barcode reader
3D scanner
Laser rangefinder

6.1.5. Medical Imaging

o Computed tomography
o Magnetic resonance imaging
o Positron emission tomography
o Medical ultrasonography

6.2 Output Devices

An output device is any piece of computer hardware used to communicate the results of data processing carried out by an information processing system (such as a computer) to the outside world.

The most common input devices used by a computer are the keyboard and mouse. The keyboard allows the entry of textual information, while the mouse allows the selection of a point on the screen by moving a screen cursor to the point and pressing a mouse button. The most common output devices are monitors and speakers.

o Card Puncher, Paper Tape Puncher
o CRT
o Printer (Impact, Ink Jet, Laser, Dot Matrix)
o Plotter
o Analog
o Voice

6.3 INPUT/OUTPUT INTERFACE

• Provides a method for transferring information between internal storage (such as memory and
CPU registers) and external I/O devices
• Resolves the differences between the computer and peripheral devices
– Peripherals - Electromechanical Devices
– CPU or Memory - Electronic Device
– Data Transfer Rate
» Peripherals - Usually slower
» CPU or Memory - Usually faster than peripherals
• Some kinds of Synchronization mechanism may be needed
– Unit of Information
» Peripherals – Byte, Block, …
» CPU or Memory – Word
– Data representations may differ

6.3.1 BUS AND INTERFACE MODULES

Each peripheral has an interface module associated with it. The interface:

- Decodes the device address (device code)

- Decodes the commands (operation)

- Provides signals for the peripheral controller

- Synchronizes the data flow and supervises the transfer rate between the peripheral and the CPU or memory

Typical I/O instruction format:

Op. code | Device address | Function code

The function code field carries the command for the interface.

6.3.2. CONNECTION OF I/O BUS

6.3.2.1 Connection of I/O Bus to CPU

6.3.2.2. Connection of I/O Devices to One Interface

6.3.3. I/O BUS AND MEMORY BUS

Functions of Buses

* The MEMORY BUS is for information transfers between the CPU and main memory (MM)

* The I/O BUS is for information transfers between the CPU and I/O devices through their I/O interfaces

* Many computers use a common single bus system for both memory and I/O interface units

- Use one common bus but separate control lines for each function

- Use one common bus with common control lines for both functions

* Some computer systems use two separate buses, one to communicate with memory and the other
with I/O interfaces

- Communication between CPU and all interface units is via a common I/O Bus

- An interface connected to a peripheral device may have a number of data registers, a control register, and a status register

- A command is passed to the peripheral by sending it to the appropriate interface register

- Function code and sense lines are not needed (Transfer of data, control, and status information is
always via the common I/O Bus)

6.3.4. ISOLATED vs MEMORY MAPPED I/O

Isolated I/O

Separate I/O read/write control lines in addition to memory read/write control lines.

- Separate (isolated) memory and I/O address spaces.

- Distinct input and output instructions.

Memory-mapped I/O

A single set of read/write control lines (no distinction between memory and I/O transfers)

- Memory and I/O addresses share the common address space

-> reduces memory address range available

- No specific input or output instruction

-> The same memory reference instructions can be used for I/O transfers

- Considerable flexibility in handling I/O operations
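The idea behind memory-mapped I/O can be sketched in a few lines: a single address space is shared by RAM and device registers, so an ordinary store reaches the device and no special input or output instruction is needed. The Python model below is purely illustrative; the address layout and the UART register name are assumptions, not from the text.

```python
# Sketch of memory-mapped I/O (hypothetical addresses): RAM and a device
# register share one address space, so the same write operation serves
# both memory references and I/O transfers.

RAM_SIZE = 1024          # addresses 0..1023 are ordinary memory
UART_DATA_REG = 1024     # assumption: device register mapped just above RAM

class Bus:
    def __init__(self):
        self.ram = [0] * RAM_SIZE
        self.uart_output = []          # bytes "sent" by the simulated device

    def write(self, addr, value):
        if addr < RAM_SIZE:            # ordinary memory reference
            self.ram[addr] = value
        elif addr == UART_DATA_REG:    # same "instruction", device target
            self.uart_output.append(value)
        else:
            raise ValueError("unmapped address")

    def read(self, addr):
        if addr < RAM_SIZE:
            return self.ram[addr]
        raise ValueError("unmapped address")

bus = Bus()
bus.write(10, 42)             # memory write
bus.write(UART_DATA_REG, 65)  # I/O write with the same write operation
```

Note the trade-off named above: address 1024 is no longer usable as memory, which is the reduction of the available memory address range.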

6.3.5. I/O INTERFACE

6.3.6. Programmable Interface

- Information in each port can be assigned a meaning depending on the mode of operation of the I/O
device. → Port A = Data; Port B = Command; Port C = Status

- The CPU initializes (loads) each port by transferring a byte to the control register.

→ Allows the CPU to define the mode of operation of each port.

→ Programmable Port: By changing the bits in the control register, it is possible to change
the interface characteristics.

6.4 ASYNCHRONOUS DATA TRANSFER

6.4.1. Synchronous and Asynchronous Operations

Synchronous - All devices derive the timing information from common clock line.

Asynchronous - No common clock.

6.4.2 Asynchronous Data Transfer

Asynchronous data transfer between two independent units requires that control signals be transmitted
between the communicating units to indicate the time at which data is being transmitted.

Two Asynchronous Data Transfer Methods

Strobe pulse: A strobe pulse is supplied by one unit to indicate to the other unit when the transfer is to occur.

Handshaking: Each data item being transmitted is accompanied by a control signal that indicates the presence of data. The receiving unit responds with another control signal to acknowledge receipt of the data.

6.4.2.1 STROBE CONTROL

* Employs a single control line to time each transfer

* The strobe may be activated by either the source or the destination unit

Source-Initiated Strobe for Data Transfer

Destination-Initiated Strobe for Data Transfer

6.4.2.2. HANDSHAKING

Strobe Methods

1. Source-Initiated

The source unit that initiates the transfer has no way of knowing whether the destination unit has actually received the data.

2. Destination-Initiated

The destination unit that initiates the transfer has no way of knowing whether the source has actually placed the data on the bus.

To solve this problem, the HANDSHAKE method introduces a second control signal that provides a reply to the unit that initiates the transfer.

SOURCE-INITIATED TRANSFER USING HANDSHAKE

* Allows arbitrary delays from one state to the next

* Permits each unit to respond at its own data transfer rate

* The rate of transfer is determined by the slower unit
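The exchange described above can be traced step by step. The sketch below is a minimal, single-threaded model of the source-initiated handshake; the signal names data_valid and data_accepted are illustrative stand-ins for the two control lines.

```python
# Minimal sketch of the source-initiated two-wire handshake. Instead of real
# hardware signals, each step appends the signal transition to a trace so
# the four-phase sequence is visible.

def source_initiated_handshake(data):
    trace = []
    bus = {}
    # 1. Source places data on the bus and raises data_valid
    bus["data"] = data
    trace.append("data_valid=1")
    # 2. Destination accepts the data and raises data_accepted (the reply)
    received = bus["data"]
    trace.append("data_accepted=1")
    # 3. Source sees the reply and invalidates the data lines
    trace.append("data_valid=0")
    # 4. Destination drops data_accepted; both units are ready for the next item
    trace.append("data_accepted=0")
    return received, trace

value, trace = source_initiated_handshake(0x5A)
```

Because each step waits for the previous transition, arbitrary delays are tolerated and the slower unit sets the pace, as stated above.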

DESTINATION-INITIATED TRANSFER USING HANDSHAKE

6.4.3. ASYNCHRONOUS SERIAL TRANSFER

Four Different Types of Transfer

Asynchronous serial transfer

Synchronous serial transfer

Asynchronous parallel transfer

Synchronous parallel transfer

Asynchronous Serial Transfer

- Employs special bits that are inserted at both ends of the character code.

- Each character consists of three parts: a start bit, data bits, and stop bits.

A character can be detected by the receiver from knowledge of four rules:

- When data are not being sent, the line is kept in the 1-state (idle state).

- The initiation of a character transmission is detected by a start bit, which is always 0.

- The character bits always follow the start bit.

- After the last character bit, a stop bit is detected when the line returns to the 1-state for at least one bit time.

The receiver knows in advance the transfer rate of the bits and the number of information bits to
expect.
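These framing rules can be sketched directly. The functions below build and decode one character frame under the rules just listed (idle 1, start bit 0, data bits, stop bit 1); the 8-data-bit, LSB-first, 1-stop-bit configuration is an assumption made for the example.

```python
# Sketch of asynchronous serial framing: a start bit (0) marks the beginning
# of a character, data bits follow (LSB first here), and at least one stop
# bit (1) returns the line to the idle state.

def frame_char(byte, data_bits=8, stop_bits=1):
    """Return the list of line levels for one character."""
    bits = [0]                                           # start bit
    bits += [(byte >> i) & 1 for i in range(data_bits)]  # data, LSB first
    bits += [1] * stop_bits                              # stop bit(s)
    return bits

def deframe_char(bits, data_bits=8):
    """Recover the byte; the receiver knows the bit rate and bit count."""
    assert bits[0] == 0, "missing start bit"
    assert bits[1 + data_bits] == 1, "missing stop bit"
    byte = 0
    for i in range(data_bits):
        byte |= bits[1 + i] << i
    return byte

frame = frame_char(ord("A"))   # 0x41
```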

UNIVERSAL ASYNCHRONOUS RECEIVER-TRANSMITTER (UART)

Transmitter Register

- Accepts a data byte (from the CPU) through the data bus

- The byte is transferred to a shift register for serial transmission

Receiver

- Receives serial information into another shift register

- Complete data byte is sent to the receiver register

Status Register Bits

- Used for I/O flags and for recording errors

Control Register Bits

- Define the baud rate, the number of bits in each character, whether to generate and check parity, and the number of stop bits.

FIRST-IN, FIRST-OUT (FIFO) BUFFER

* Data can be input and output at two different rates

* Output data are always in the same order in which the data entered the buffer.

* Useful in some applications when data is transferred asynchronously

4 × 4 FIFO Buffer: four 4-bit data registers R1-R4, and four control flip-flops F1-F4, one associated with each register, marking whether it holds valid data.
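A minimal software model of such a buffer, assuming the usual ripple-through behavior (a word advances whenever the next register is empty), might look like this; the class and method names are illustrative.

```python
# Sketch of a 4 x 4 FIFO buffer: four 4-bit registers R1..R4, each with a
# control flip-flop Fi marking it full. Words ripple toward the output and
# leave in the order they entered, decoupling two different transfer rates.

class Fifo4x4:
    def __init__(self):
        self.regs = [0, 0, 0, 0]     # R1..R4, 4-bit data registers
        self.full = [False] * 4      # control flip-flops F1..F4

    def _ripple(self):
        # Words move toward the output whenever the next register is empty.
        for _ in range(4):
            for i in range(3):
                if self.full[i] and not self.full[i + 1]:
                    self.regs[i + 1] = self.regs[i]
                    self.full[i + 1], self.full[i] = True, False

    def insert(self, word):
        if self.full[0]:
            return False             # input-ready flag is 0: no room yet
        self.regs[0], self.full[0] = word & 0xF, True
        self._ripple()
        return True

    def delete(self):
        if not self.full[3]:
            return None              # output-ready flag is 0: buffer empty
        word, self.full[3] = self.regs[3], False
        self._ripple()
        return word

fifo = Fifo4x4()
for w in (3, 7, 9):
    fifo.insert(w)
first_out = fifo.delete()
```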

MODES OF TRANSFER - PROGRAM-CONTROLLED I/O -

Three different data transfer modes exist between the central computer (CPU or memory) and peripherals:

1. Program-Controlled I/O

2. Interrupt-Initiated I/O

3. Direct Memory Access (DMA)

Program-Controlled I/O (Input Device to CPU)
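A sketch of the program-controlled input loop: the CPU repeatedly tests the interface flag and reads the data register only when the flag is set. The device model and names below are assumptions made for the example.

```python
# Sketch of program-controlled input: the CPU polls the interface flag in a
# loop and reads the data register only when the flag says a byte is ready.

class DeviceInterface:
    def __init__(self, incoming):
        self.incoming = list(incoming)
        self.flag = False          # status bit: data register full
        self.data_register = 0

    def tick(self):
        # Device places the next byte once the previous one was consumed.
        if not self.flag and self.incoming:
            self.data_register = self.incoming.pop(0)
            self.flag = True

def programmed_input(dev, count):
    received = []
    while len(received) < count:
        dev.tick()                                  # device works in background
        if dev.flag:                                # CPU: check flag (poll)
            received.append(dev.data_register)      # CPU: read data register
            dev.flag = False                        # reading clears the flag
    return received

data = programmed_input(DeviceInterface([10, 20, 30]), 3)
```

The busy-wait loop is exactly the "polling takes valuable CPU time" problem that motivates interrupt-initiated I/O in the next subsection.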

MODES OF TRANSFER - INTERRUPT INITIATED I/O & DMA

- Polling takes valuable CPU time.

- Open communication only when some data has to be passed -> Interrupt.

- I/O interface, instead of the CPU, monitors the I/O device.

- When the interface determines that the I/O device is ready for data transfer, it generates an Interrupt
Request to the CPU.

- Upon detecting an interrupt, the CPU momentarily stops the task it is performing, branches to the service routine to process the data transfer, and then returns to the task it was performing.

DMA (Direct Memory Access)

- Large blocks of data are transferred at high speed to or from high-speed devices such as magnetic drums, disks, and tapes.

- DMA controller : Interface that provides I/O transfer of data directly to and from the memory and
the I/O device.

- CPU initializes the DMA controller by sending a memory address and the number of words to be
transferred.

- Actual transfer of data is done directly between the device and memory through DMA controller.

-> Freeing CPU for other tasks

PRIORITY INTERRUPT

Priority

- Determines which interrupt is to be served first when two or more requests are made simultaneously, and which interrupts are permitted to interrupt the computer while another is being serviced.

- Higher priority interrupts can make requests while a lower priority interrupt is being serviced.

Priority Interrupt by Software(Polling)

- Priority is established by the order of polling the devices (interrupt sources); flexible, since it is established by software.

- Low cost, since it needs very little hardware.

- Very slow.

Priority Interrupt by Hardware

Requires a priority interrupt manager that accepts all the interrupt requests and determines the highest priority request. Fast, since the highest priority interrupt request is identified by hardware and each interrupt source has its own interrupt vector that gives direct access to its own service routine.

HARDWARE PRIORITY INTERRUPT - DAISY-CHAIN -

- An interrupt request from any device (one or more) causes the CPU to respond with INTACK = 1.

- A requesting device that receives INTACK = 1 at its PI input places its VAD (vector address) on the bus.

- Among the requesting devices, only the one physically closest to the CPU receives INTACK = 1; it blocks INTACK from propagating to the next device.

One stage of the daisy chain priority arrangement

PARALLEL PRIORITY INTERRUPT

IEN: set or cleared by the instructions ION or IOF.

IST: indicates that an unmasked interrupt has occurred. INTACK enables the tri-state bus buffer to load the VAD generated by the priority logic.

Interrupt Register:

- Each bit is associated with an Interrupt Request from different Interrupt Source - different priority
level.

- Each bit can be cleared by a program instruction.

Mask Register:

- Mask Register is associated with Interrupt Register.

- Each bit can be set or cleared by an Instruction.

INTERRUPT PRIORITY ENCODER

Determines the highest priority interrupt when more than one interrupt takes place.

Priority Encoder Truth table
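The encoder's behavior can be sketched as a small function: given the interrupt register bits (index 0 = highest priority), it returns the encoded level and an interrupt-present bit. This Python version is an illustrative stand-in for the hardware truth table.

```python
# Sketch of a 4-input priority encoder: given the interrupt register bits
# I0..I3 (I0 = highest priority), output the binary index of the highest
# priority active request plus a "valid" bit (the IST flag in the text).

def priority_encoder(inputs):
    """inputs: list of 4 bits, index 0 = highest priority."""
    for i, bit in enumerate(inputs):
        if bit:
            return i, 1     # (encoded level, interrupt-present bit)
    return 0, 0             # no request pending

level, ist = priority_encoder([0, 1, 1, 0])
```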

INTERRUPT CYCLE

At the end of each Instruction cycle :

- The CPU checks IEN and IST.

- If IEN · IST = 1, the CPU enters the interrupt cycle:

SP ← SP - 1    Decrement stack pointer.

M[SP] ← PC    Push PC onto the stack.

INTACK ← 1    Enable interrupt acknowledge.

PC ← VAD    Transfer vector address to PC.

IEN ← 0    Disable further interrupts.

Go To Fetch to execute the first instruction in the interrupt service routine.
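The micro-operations above can be mirrored in a few lines of Python. The CPU state is modeled as a dictionary; the register and memory names follow the text, but the concrete addresses and values are illustrative.

```python
# Sketch of the interrupt cycle micro-operations: push PC on the stack,
# acknowledge, load the vector address into PC, and disable further
# interrupts.

def interrupt_cycle(cpu, vad):
    if not (cpu["IEN"] and cpu["IST"]):
        return False                    # no enabled, pending interrupt
    cpu["SP"] -= 1                      # SP <- SP - 1
    cpu["M"][cpu["SP"]] = cpu["PC"]     # M[SP] <- PC  (push return address)
    cpu["INTACK"] = 1                   # INTACK <- 1
    cpu["PC"] = vad                     # PC <- VAD   (enter service routine)
    cpu["IEN"] = 0                      # IEN <- 0    (disable interrupts)
    return True

cpu = {"PC": 0x200, "SP": 8, "M": [0] * 8, "IEN": 1, "IST": 1, "INTACK": 0}
taken = interrupt_cycle(cpu, vad=0x40)
```

The service routine's final operations would undo this: restore the pushed PC and set IEN back to 1 before returning.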

INTERRUPT SERVICE ROUTINE

Initial and Final Operations

Each interrupt service routine must have an initial and final set of operations for controlling the
registers in the hardware interrupt system.


6.5 DIRECT MEMORY ACCESS

* Used for block transfers of data from high-speed devices such as drums, disks, and tapes.

* DMA controller: an interface that allows I/O transfer of data directly between memory and the device, freeing the CPU for other tasks.

* The CPU initializes the DMA controller by sending the memory address and the block size (number of words).

CPU bus signals for DMA transfer

Block diagram of DMA controller

6.5.1. DMA I/O OPERATION

Starting an I/O

- CPU executes instruction to

a. Load Memory Address Register

b. Load Word Counter

c. Load Function (Read or Write) to be performed

d. Issue a GO command

Upon receiving a GO command, the DMA performs the I/O operation as follows, independently of the CPU.

Input

[1] Input Device <- R (Read control signal).

[2] Buffer (DMA Controller) <- Input Byte; the controller assembles the bytes into a word until the word is full.

[3] M <- memory address, W (Write control signal): the word is written to memory.

[4] Address Reg <- Address Reg + 1; WC (Word Counter) <- WC - 1.

[5] If WC = 0, then interrupt to acknowledge done, else go to [1].

Output

[1] M <- M Address, R; M Address Reg <- M Address Reg + 1; WC <- WC - 1.

[2] Disassemble the word.

[3] Buffer <- one byte; Output Device <- W, for all disassembled bytes.

[4] If WC = 0, then interrupt to acknowledge done, else go to [1].
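The input sequence can be sketched as a small model: the CPU loads the address register and word counter, then the DMA controller moves each word into memory and raises a completion flag (standing in for the interrupt) when the word counter reaches zero. Names and values below are illustrative.

```python
# Sketch of a DMA input transfer: the CPU loads the address register and
# word counter, then the controller moves data directly into memory,
# interrupting only when the word count reaches zero.

class DmaController:
    def __init__(self, memory):
        self.memory = memory
        self.addr = 0
        self.wc = 0
        self.done = False   # stands in for the completion interrupt

    def init_transfer(self, addr, count):   # CPU does this, then continues
        self.addr, self.wc, self.done = addr, count, False

    def device_word_ready(self, word):      # called once per incoming word
        self.memory[self.addr] = word       # M <- word (a stolen cycle)
        self.addr += 1                      # Address Reg <- Address Reg + 1
        self.wc -= 1                        # WC <- WC - 1
        if self.wc == 0:
            self.done = True                # interrupt CPU: block finished

memory = [0] * 16
dma = DmaController(memory)
dma.init_transfer(addr=4, count=3)
for word in (111, 222, 333):
    dma.device_word_ready(word)
```

Note that the CPU appears only in init_transfer; every per-word step is handled by the controller, which is exactly what frees the CPU for other tasks.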

6.5.2. CYCLE STEALING

While DMA I/O takes place, the CPU is also executing instructions, so the DMA controller and the CPU both access memory -> memory access conflict.

Memory Bus Controller

- Coordinates the activities of all devices requesting memory access

- Provides a priority system

Memory accesses by CPU and DMA Controller are interwoven, with the top priority given to DMA
Controller

-> Cycle Stealing

Cycle Steal

- The CPU is usually much faster than I/O (DMA); thus the CPU uses most of the memory cycles.

- The DMA controller steals memory cycles from the CPU.

- For those stolen cycles, the CPU remains idle.

- For a slow CPU, the DMA controller may steal most of the memory cycles, which may cause the CPU to remain idle for a long time.

DMA TRANSFER

6.6 INPUT/OUTPUT PROCESSOR

Channel

- Processor with direct memory access capability that communicates with I/O devices.

- Channel accesses memory by cycle stealing.

- Channel can execute a Channel Program.

i. Stored in the main memory.

ii. Consists of Channel Command Word(CCW).

iii. Each CCW specifies the parameters needed by the channel to control the I/O
devices and perform data transfer operations.

- The CPU initiates the channel by executing a channel I/O class instruction; once initiated, the channel operates independently of the CPU.

CHANNEL / CPU COMMUNICATION

Multiple choice Questions

1. In a programmable interface, each port contains

a) Port A = Data; Port B = Command; Port C = Status

b) Port A = Data; Port B = Status; Port C = Command

c) Port A = Command; Port B = Data; Port C = Status

2. The asynchronous data transfer methods are

a) 3: strobe, stroke, and handshaking

b) 2: strobe and handshaking

c) 1: handshaking

3. The four different types of transfer are

a) asynchronous serial and parallel transfer

b) synchronous serial and asynchronous parallel transfer

c) asynchronous serial, synchronous serial, asynchronous parallel, and synchronous parallel

4. The status register is used for

a) I/O flags and to receive information

b) I/O flags and for recording errors

c) recording errors and to receive information

5. Which one is incorrect?

a) The CPU is usually much faster than I/O (DMA).

b) The DMA controller steals memory cycles from the CPU.

c) For those stolen cycles, the CPU remains idle.

d) For a slow CPU, the DMA controller may steal most of the memory cycles, which may cause the CPU to remain idle for a few seconds.
6. The transmitter register

a) accepts a data byte through the data bus.

b) accepts a data byte through the memory bus.

c) accepts a data byte through the address bus.

7. The larger the RAM of a computer, the faster its processing speed, since it eliminates

a) the need for external memory b) the need for ROM

c) frequent disk I/Os d) the need for a wider data path

8. A group of signal lines used to transmit data in parallel from one element of a computer to another is

a) Control bus b) Address bus

c) Data bus d) Network

9. The basic unit within a computer store capable of holding a single unit of data is

a) register b) ALU

c) Control unit d) store location

10. A device used to bring information into a computer is

a) ALU b) Input device

c) Control unit d) Output device

Answers

1. a, 2. b, 3. c, 4. b, 5. d, 6. a, 7. c, 8. c, 9. d, 10. b

Chapter 7

Memory Organization

7.1 Memory Hierarchy

The memory unit is an essential component of any digital computer, since it is needed for storing programs and data. Most general-purpose computers would run more efficiently if they were equipped with additional storage beyond the capacity of main memory. The memory unit that communicates directly with the CPU is called the MAIN MEMORY. Devices that provide backup storage are called AUXILIARY MEMORY. The most common auxiliary devices are magnetic disks and tapes; they are used for storing system programs, large data files, and other backup information. Only programs and data currently needed by the processor reside in main memory. All other information is stored in auxiliary memory and transferred to main memory when needed.

The memory hierarchy system consists of all storage devices employed in a computer system, from the slow but high-capacity auxiliary memory, to a relatively faster main memory, to an even smaller and faster cache memory accessible to the high-speed processing logic. The goal of the memory hierarchy is to obtain the highest possible access speed while minimizing the total cost of the memory system.

Memory Hierarchy in computer system

A very high speed memory called cache memory is used to increase the speed of processing by making current programs and data available to the CPU at a rapid rate. The cache memory is employed in the system to compensate for the speed differential between main memory access time and processor logic.

7.2 Main Memory

The main memory is the central storage unit in a computer system. It is a relatively large and fast memory used to store programs and data during computer operation. The principal technology used for main memory is based on semiconductor integrated circuits. Integrated-circuit RAM chips are available in two possible operating modes: static and dynamic. Static RAM is easier to use and has shorter read and write cycles.

The dynamic RAM offers reduced power consumption and larger storage capacity in a single memory
chip compared to static RAM.

7.2.1 RAM and ROM Chips

Most of the main memory in a general-purpose computer is made up of RAM integrated-circuit chips, but a portion of the memory may be constructed with ROM chips. Originally, RAM was used to refer to random-access memory, but now it is used to designate read/write memory, to distinguish it from read-only memory (ROM), although ROM is also random access. RAM is used for storing the bulk of the programs and data that are subject to change. ROMs are used for storing programs that are permanently resident in the computer and for tables of constants that do not change in value once production of the computer is completed. Among other things, the ROM portion is used to store the initial program called a bootstrap loader, whose function is to start the computer software operating when power is turned on. Since RAM is volatile, its contents are destroyed when power is turned off; the contents of ROM, on the other hand, remain unchanged after power is turned off and on again.

7.2.2 Memory Address maps

The designer of a computer system must calculate the amount of memory required for the particular application and assign it to either RAM or ROM. The interconnection between memory and processor is then established from knowledge of the size of memory needed and the type of RAM and ROM chips available. The addressing of memory can be established by means of a table that specifies the memory address assigned to each chip. The table, called a memory address map, is a pictorial representation of the assigned address space for each chip in the system.

Example: 512 bytes RAM and 512 bytes ROM
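The decoding implied by such a map can be sketched as a function. Assuming RAM occupies addresses 0-511 and ROM occupies 512-1023 (an assumed assignment for this example), address line 9 selects the chip and lines 0-8 select the byte within it.

```python
# Sketch of address decoding for the 512-byte RAM + 512-byte ROM example:
# a 10-bit address selects the chip (line 9) and the byte within it
# (lines 0-8), mirroring one row per chip in a memory address map.

def decode(addr):
    """Return (chip, offset) for a 10-bit address."""
    if not 0 <= addr < 1024:
        raise ValueError("address out of range")
    if addr & 0x200:                # address line 9 set -> ROM selected
        return "ROM", addr & 0x1FF  # lines 0-8 select the byte
    return "RAM", addr & 0x1FF
```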

Memory Connection to CPU

1. RAM and ROM chips are connected to a CPU through the data and address buses.
2. The low-order lines in the address bus select the byte within the chips and other lines in the
address bus select a particular chip through its chip select inputs.

7.3. Auxiliary Memory

The most common auxiliary memory devices used in computer systems are magnetic disks and tapes. Other components used, but not as frequently, are magnetic drums, magnetic bubble memories, and optical disks. To understand fully the physical mechanism of auxiliary memory devices, one must have knowledge of magnetic, electronic, and electromechanical systems.

7.3.1 Magnetic Tapes

A magnetic tape transport consists of the electrical, mechanical, and electronic components that provide the parts and control mechanism for a magnetic-tape unit. The tape itself is a strip of plastic coated with a magnetic recording medium. Bits are recorded as magnetic spots on the tape along tracks. Usually, seven or nine bits are recorded simultaneously to form a character together with a parity bit. Read/write heads are mounted one on each track so that data can be recorded and read as a sequence of characters.

7.3.2 Magnetic Disks

A magnetic disk is a circular plate constructed of metal or plastic coated with magnetizable material. Often both sides of the disk are used, and several disks may be stacked on one spindle, with read/write heads available on each surface. All disks rotate together at high speed and are not stopped or started for access purposes. Bits are stored on the magnetized surface in spots along concentric circles called tracks. The tracks are commonly divided into sections called sectors. In most systems, the minimum quantity of information that can be transferred is a sector.

7.3.3 RAID

RAID is an acronym first defined by David A. Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987 to describe a Redundant Array of Inexpensive Disks, a technology that allowed computer users to achieve high levels of storage reliability from low-cost and less reliable PC-class disk-drive components, via the technique of arranging the devices into arrays for redundancy. More recently, marketers representing industry RAID manufacturers reinvented the term to describe a Redundant Array of Independent Disks as a means of disassociating a "low cost" expectation from RAID technology.

"RAID" is now used as an umbrella term for computer data storage schemes that can divide and replicate
data among multiple hard disk drives. The different Schemes/architectures are named by the word RAID
followed by a number, as in RAID 0, RAID 1, etc. RAID's various designs all involve two key design
goals: increased data reliability or increased input/output performance. When multiple physical disks are
set up to use RAID technology, they are said to be in a RAID array. This array distributes data across
multiple disks, but the array is seen by the computer user and operating system as one single disk. RAID
can be set up to serve several different purposes.

Purpose and basics: Redundancy is achieved by either writing the same data to multiple drives (known
as mirroring), or writing extra data (known as parity data) across the array, calculated such that the
failure of one (or possibly more, depending on the type of RAID) disks in the array will not result in loss
of data. A failed disk may be replaced by a new one, and the lost data reconstructed from the remaining
data and the parity data. Organizing disks into a redundant array decreases the usable storage capacity.
For instance, a 2-disk RAID 1 array loses half of the total capacity that would have otherwise been
available using both disks independently, and a RAID 5 array with several disks loses the capacity of
one disk. Other types of RAID arrays are arranged so that they are faster to write to and read from than a
single disk. There are various combinations of these approaches giving different trade-offs of protection
against data loss, capacity, and speed. RAID levels 0, 1, and 5 are the most commonly found, and cover
most requirements.
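The parity idea behind levels such as RAID 5 can be sketched with XOR: the parity block is the bitwise XOR of the data blocks, so one missing block can be rebuilt as the XOR of the survivors and the parity. The block contents below are illustrative.

```python
# Sketch of RAID parity-based recovery: parity = XOR of all data blocks,
# so any single lost block equals the XOR of the surviving blocks with
# the parity block.

from functools import reduce

def parity(blocks):
    """XOR corresponding bytes of all blocks (RAID 5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(surviving_blocks, parity_block):
    """Reconstruct the one missing block from survivors plus parity."""
    return parity(surviving_blocks + [parity_block])

disks = [b"\x01\x02", b"\x10\x20", b"\xAA\x55"]
p = parity(disks)                                  # stored on the parity disk
recovered = rebuild([disks[0], disks[2]], p)       # disk 1 has "failed"
```

This also shows why a RAID 5 array loses the capacity of exactly one disk: one block's worth of parity is stored for each stripe of data blocks.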

RAID can involve significant computation when reading and writing information. With traditional
"real" RAID hardware, a separate controller does this computation. In other cases the operating system
or simpler and less expensive controllers require the host computer's processor to do the computing,
which reduces the computer's performance on processor-intensive tasks (see "Software RAID" and
"Fake RAID" below). Simpler RAID controllers may provide only levels 0 and 1, which require less
processing.

RAID systems with redundancy continue working without interruption when one (or possibly more,
depending on the type of RAID) disks of the array fail, although they are then vulnerable to further
failures. When the bad disk is replaced by a new one the array is rebuilt while the system continues to
operate normally. Some systems have to be powered down when removing or adding a drive; others
support hot swapping, allowing drives to be replaced without powering down. RAID with hot-
swapping is often used in high availability systems, where it is important that the system remains
running as much of the time as possible.

Principles: RAID combines two or more physical hard disks into a single logical unit by using either
special hardware or software. Hardware solutions often are designed to present themselves to the
attached system as a single hard drive, so that the operating system would be unaware of the technical
workings. For example, you might configure a 1 TB RAID 5 array using three 500 GB hard drives in hardware RAID; the operating system would simply be presented with a "single" 1 TB disk. Software
solutions are typically implemented in the operating system and would present the RAID drive as a
single drive to applications running upon the operating system.

There are three key concepts in RAID: mirroring, the copying of data to more than one disk; striping,
the splitting of data across more than one disk; and error correction, where redundant data is stored to
allow problems to be detected and possibly fixed (known as fault tolerance). Different RAID levels use
one or more of these techniques, depending on the system requirements. RAID's main aim can be
either to improve reliability and availability of data, ensuring that important data is available more
often than not (e.g. a database of customer orders), or merely to improve the access speed to files (e.g.
for a system that delivers video on demand TV programs to many viewers).

7.4 Associative memory


Many data-processing applications require searching for items in a table stored in memory. The established way to search a table is to store all items where they can be addressed in sequence. The search procedure is a strategy for choosing a sequence of addresses, reading the content of memory at each address, and comparing the information read with the item being searched for, until a match occurs.

The number of accesses to memory depends on the location of the item and the efficiency of the search algorithm.

The time required to find an item stored in memory can be reduced considerably if stored data can be identified for access by the content of the data itself rather than by an address. A memory unit accessed by content is called an associative memory or content-addressable memory (CAM).

Compare each word in the CAM in parallel with the content of A (the argument register):

- If CAM Word[i] = A, then M[i] = 1.

- Read the CAM sequentially, accessing CAM Word[i] for each i with M[i] = 1.

- K (the key register) provides a mask for choosing a particular field or key in the argument A (only those bits of the argument that have 1's in the corresponding positions of K are compared).
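The masked comparison can be sketched as follows: every stored word is compared against the argument A under the mask K, producing the match bits M[i]. The loop below stands in for what the hardware does in parallel; the word width and values are illustrative.

```python
# Sketch of an associative (CAM) match: every word is compared with the
# argument register A, and the key register K masks which bit positions
# take part in the comparison.

def cam_match(words, a, k, width=8):
    """Return the match-register bits M[i] for each stored word."""
    mask = k & ((1 << width) - 1)
    return [1 if (w & mask) == (a & mask) else 0 for w in words]

words = [0b10110010, 0b10100100, 0b01110010]
# Compare only the high 4 bits (the "key" field) against A's high 4 bits.
m = cam_match(words, a=0b10110000, k=0b11110000)
```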

Organization of CAM

7.5 Cache memory

The cache is a small amount of high-speed memory, usually with a memory cycle time comparable to
the time required by the CPU to fetch one instruction. The cache is usually filled from main memory
when instructions or data are fetched into the CPU. Often the main memory will supply a wider data
word to the cache than the CPU requires, to fill the cache more rapidly. The amount of information
which is replaced at one time in the cache is called the line size for the cache. This is normally the
width of the data bus between the cache memory and the main memory. A wide line size for the cache
means that several instruction or data words are loaded into the cache at one time, providing a kind of
prefetching for instructions or data. Since the cache is small, the effectiveness of the cache relies on the
following properties of most programs:

Spatial locality -- most programs are highly sequential; the next instruction usually comes from
the next memory location.

Data is usually structured, and data in these structures normally are stored in contiguous
memory locations.

Temporal locality -- short loops are a common program structure, especially for the innermost sets
of nested loops. This means that the same small set of instructions is used over and over.

Generally, several operations are performed on the same data values, or variables.

When a cache is used, there must be some way in which the memory controller determines whether the
value currently being addressed in memory is available from the cache. There are several ways that this
can be accomplished. One possibility is to store both the address and the value from main memory in
the cache, with the address stored in a type of memory called associative memory or, more
descriptively, content addressable memory.

An associative memory, or content addressable memory, has the property that when a value is
presented to the memory, the address of the value is returned if the value is stored in the memory,
otherwise an indication that the value is not in the associative memory is returned. All of the
comparisons are done simultaneously, so the search is performed very quickly. This type of memory is
very expensive, because each memory location must have both a comparator and a storage element. A
cache memory can be implemented with a block of associative memory, together with a block of
``ordinary'' memory. The associative memory would hold the address of the data stored in the cache,
and the ordinary memory would contain the data at that address. Such a cache memory might be
configured as shown in Figure.

Figure: A cache implemented with associative memory

If the address is not found in the associative memory, then the value is obtained from main memory.
Associative memory is very expensive, because a comparator is required for every word in the
memory, to perform all the comparisons in parallel. A cheaper way to implement a cache memory,
without using expensive associative memory, is to use direct mapping. Here, part of the memory
address (usually the low order digits of the address) is used to address a word in the cache. This part of
the address is called the index. The remaining high-order bits in the address, called the tag, are stored
in the cache memory along with the data. For example, if a processor has an 18-bit address for
memory, a cache of 1K words of 2 bytes (16 bits) length, and the processor can address single
bytes or 2-byte words, we might have the memory address field and cache organized as in Figure.

Figure: A direct mapped cache configuration
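The 18-bit example can be checked with a short sketch. The field widths (7-bit tag, 10-bit index, 1 byte-offset bit) follow from the 1K-word, 2-byte-word cache just described; the helper name is illustrative.

```python
# Splitting an 18-bit byte address into tag, index, and byte-offset fields
# for a direct-mapped cache of 1K two-byte words:
# 7 tag bits + 10 index bits + 1 byte-offset bit = 18 bits.

def split_address(addr):
    byte = addr & 0x1            # selects one byte within the 16-bit word
    index = (addr >> 1) & 0x3FF  # selects one of the 1024 cache words
    tag = addr >> 11             # remaining 7 high-order bits
    return tag, index, byte

# Build an address from known fields, then recover them.
addr = (42 << 11) | (819 << 1) | 1   # tag = 42, index = 819, byte = 1
print(split_address(addr))           # (42, 819, 1)
```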

This was, in fact, the way the cache was organized in the PDP-11/60. In the 11/60, however, there are 4
other bits used to ensure that the data in the cache is valid. 3 of these are parity bits: one for each byte
and one for the tag. The parity bits are used to detect a single-bit error in the data
while it is in the cache. A fourth bit, called the valid bit, is used to indicate whether or not a given location
in cache is valid. In the PDP-11/60 and in many other processors, the cache is not updated if memory
is altered by a device other than the CPU (for example when a disk stores new data in memory). When
such a memory operation occurs to a location which has its value stored in cache, the valid bit is reset
to show that the data is ``stale'' and does not correspond to the data in main memory. As well, the valid
bit is reset when power is first applied to the processor or when the processor recovers from a power
failure, because the data found in the cache at that time will be invalid. In the PDP-11/60, the data path
from memory to cache was the same size (16 bits) as from cache to the CPU. (In the PDP-11/70, a
faster machine, the data path from the CPU to cache was 16 bits, while from memory to cache was 32
bits which means that the cache had effectively prefetched the next instruction, approximately half of
the time). The amount of information (instructions or data) stored with each tag in the cache is called
the line size of the cache. (It is usually the same size as the data path from main memory to the cache.)
A large line size allows the prefetching of a number of instructions or data words. All items in a line of
the cache are replaced in the cache simultaneously, however, resulting in a larger block of data being
replaced for each cache miss.

The MIPS R2000/R3000 had a built-in cache controller which could control a cache of up to 64K bytes.
For a similar 2K word (or 8K byte) cache, the MIPS processor would typically have a cache
configuration as shown in Figure. Generally, the MIPS cache would be larger (64K bytes, with line
sizes of 1, 2 or 4 words, would be typical).

Figure: One possible MIPS cache organization

A characteristic of the direct mapped cache is that a particular memory address can be mapped into
only one cache location. Many memory addresses are mapped to the same cache location (in fact, all
addresses with the same index field are mapped to the same cache location.) Whenever a ``cache miss''
occurs, the cache line will be replaced by a new line of information from main memory at an address
with the same index but with a different tag.

Note that if the program ``jumps around'' in memory, this cache organization will likely not be
effective because the index range is limited. Also, if both instructions and data are stored in cache, it
may well happen that both map into the same area of cache, and may cause each other to be replaced
very often. This could happen, for example, if the code for a matrix operation and the matrix data itself
happened to have the same index values.

A more interesting configuration for a cache is the set associative cache, which uses a set associative
mapping. In this cache organization, a given memory location can be mapped to more than one cache
location. Here, each index corresponds to two or more data words, each with a corresponding tag. A
set associative cache with n tag and data fields is called an ``n-way set associative cache''. Usually
n = 2^k, for k = 1, 2, 3, is chosen for a set associative cache (k = 0 corresponds to direct mapping).
Such n-way set associative caches allow interesting tradeoff possibilities; cache performance can be
improved by increasing the number of ``ways'', or by increasing the line size, for a given total amount
of memory. An example of a 2-way set associative cache is shown in Figure, which shows a cache
containing a total of 2K lines, or 1 K sets, each set being 2-way associative. (The sets correspond to the
rows in the figure.)

Figure: A set-associative cache organization

In a 2-way set associative cache, if one data word is empty for a read operation corresponding to a
particular index, then it is filled. If both data words are filled, then one must be overwritten by the new
data. Similarly, in an n-way set associative cache, if all n data and tag fields in a set are filled, then one
value in the set must be overwritten, or replaced, in the cache by the new tag and data values. Note that
an entire line must be replaced each time. The most common replacement algorithms are:

Random -- the location for the value to be replaced is chosen at random from all n of the cache
locations at that index position. In a 2-way set associative cache, this can be accomplished with
a single modulo 2 random variable obtained, say, from an internal clock.
First in, first out (FIFO) -- here the first value stored in the cache, at each index position, is the
value to be replaced. For a 2-way set associative cache, this replacement strategy can be
implemented by setting a pointer to the previously loaded word each time a new word is stored
in the cache; this pointer need only be a single bit. (For set sizes > 2, this algorithm can be
implemented with a counter value stored for each ``line'', or index in the cache, and the cache
can be filled in a ``round robin'' fashion).
Least recently used (LRU) -- here the value which was actually used least recently is replaced.
In general, it is more likely that the most recently used value will be the one required in the
near future. For a 2-way set associative cache, this strategy is readily implemented by adding a
single ``USED'' bit to each cache location: when a value is accessed, the USED bit of the other
word in the set is set, while the USED bit of the word which was accessed is reset. The value to
be replaced is then the value with the USED bit set. For an n-way set associative cache, this
strategy can be implemented by storing a modulo n counter with each data word. (It is an
interesting exercise to determine exactly what must be done in this case. The required circuitry
may become somewhat complex, for large n.)
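The USED-bit scheme for one set of a 2-way set associative cache can be sketched as follows. This is a behavioural model, not hardware; the class and method names are invented for illustration.

```python
# One set of a 2-way set-associative cache with USED-bit LRU replacement.
# Accessing a way sets the USED bit of the *other* way and clears its own;
# on a miss with both ways full, the victim is the way whose USED bit is set.

class TwoWaySet:
    def __init__(self):
        self.tags = [None, None]
        self.used = [0, 0]      # used[i] = 1 marks way i as the LRU victim

    def _touch(self, way):
        self.used[way] = 0
        self.used[1 - way] = 1

    def access(self, tag):
        """Return True on a hit; on a miss, fill an empty or LRU way."""
        if tag in self.tags:
            way = self.tags.index(tag)
            self._touch(way)
            return True
        # Miss: prefer an empty way, otherwise the way with USED bit set.
        way = self.tags.index(None) if None in self.tags else self.used.index(1)
        self.tags[way] = tag
        self._touch(way)
        return False

s = TwoWaySet()
print(s.access('A'))  # False (cold miss)
print(s.access('B'))  # False (fills the second way)
print(s.access('A'))  # True  (hit; B becomes the LRU victim)
print(s.access('C'))  # False (replaces B, the least recently used)
print(s.access('A'))  # True  (A survived the replacement)
```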

Cache memories normally allow one of two things to happen when data is written into a memory
location for which there is a value stored in cache:

Write through cache -- both the cache and main memory are updated at the same time. This
may slow down the execution of instructions which write data to memory, because of the
relatively longer write time to main memory. Buffering memory writes can help speed up
memory writes if they are relatively infrequent, however.
Write back cache -- here only the cache is updated directly by the CPU; the cache memory
controller marks the value so that it can be written back into memory when the word is
removed from the cache. This method is used because a memory location may often be altered
several times while it is still in cache, without each alteration requiring a write to main memory. This
method is often implemented using an ``ALTERED'' bit in the cache. The ALTERED bit is set
whenever a cache value is written into by the processor. Only if the ALTERED bit is set is it
necessary to write the value back into main memory (i.e., only values which have been altered
must be written back into main memory). The value should be written back immediately before
the value is replaced in the cache.
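The ALTERED-bit behaviour can be sketched for a single cache line. This is a simplified model under the assumptions above; the class and names are illustrative.

```python
# A single write-back cache line: the ALTERED (dirty) bit defers the
# main-memory write until the line is evicted from the cache.

class WriteBackLine:
    def __init__(self, value):
        self.value = value
        self.altered = False    # the ALTERED bit

    def cpu_write(self, value):
        self.value = value
        self.altered = True     # cache updated; memory write deferred

    def evict(self, memory, addr):
        """Write back only if the line was altered while cached."""
        if self.altered:
            memory[addr] = self.value
            self.altered = False

memory = {0x100: 1}
line = WriteBackLine(memory[0x100])
line.cpu_write(2)
line.cpu_write(3)        # several CPU writes, still no memory traffic
print(memory[0x100])     # 1 -- main memory not yet updated
line.evict(memory, 0x100)
print(memory[0x100])     # 3 -- a single write-back on eviction
```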

7.6 Virtual Memory Concept

In a memory hierarchy system, programs and data are first stored in auxiliary memory. Portions of a
program or data are brought into main memory as they are needed by the CPU. Virtual memory is a
concept used in some large computer systems that permits the user to construct programs as though a
large memory space were available, equal to the totality of auxiliary memory. Each address that is
referenced by the CPU goes through an address mapping from the so-called virtual address to a physical
address in memory. Virtual memory is used to give the programmer the illusion that the system has a
very large memory, even though the computer actually has a relatively small main memory.

7.6.1 Address Mapping

A memory mapping table translates each virtual address into a physical address.

The address space and the memory space are each divided into fixed-size groups of words called
blocks or pages (for example, groups of 1K words).

Figure: Organization of the memory mapping table in a paged system
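The paged mapping can be sketched as follows, assuming the 1K-word pages mentioned above. The table contents are hypothetical, and the function name is invented for illustration.

```python
# Virtual-to-physical translation with 1K-word pages: the page number
# indexes the mapping table, and the line (offset) field is kept unchanged.

PAGE_SIZE = 1024  # 1K words per page/block

page_table = {5: 2, 6: 0, 1: 3}   # page number -> block number (example)

def translate(virtual_addr):
    page, line = divmod(virtual_addr, PAGE_SIZE)
    if page not in page_table:
        raise LookupError("page fault")      # page not in main memory
    return page_table[page] * PAGE_SIZE + line

print(translate(5 * 1024 + 7))   # page 5 -> block 2: 2*1024 + 7 = 2055
```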

7.6.2 Associative memory page table

Assume that the number of blocks in memory = m, and the number of pages in the virtual address space = n.

Page Table

A straightforward design is an n-entry table in memory; this makes inefficient use of storage
space, since n - m entries of the table are empty.
A more efficient method is an m-entry page table, made of an associative memory of m words,
each holding a (page number : block number) pair.

Figure: An associative-memory page table. The argument register holds the page number of the
virtual address (the line number is kept unchanged), the key register masks off the page-number
field, and each associative-memory word stores a page number together with its block number.
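The m-entry associative page table can be modelled in software. A sequential scan stands in for the parallel hardware comparison, and the entries are hypothetical.

```python
# An m-entry associative page table: each word stores a (page number,
# block number) pair, and a lookup compares the page-number field of every
# word "in parallel" (modelled here as a list scan).

entries = [(0b00111, 0), (0b01000, 1), (0b10101, 2), (0b11010, 3)]

def lookup(page_number):
    """Return the block number of the matching entry,
    or None (a page fault) if no entry matches."""
    matches = [block for page, block in entries if page == page_number]
    return matches[0] if matches else None

print(lookup(0b10101))  # 2    -- page is resident in block 2
print(lookup(0b11111))  # None -- no match: page fault
```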

Page Fault

1. Trap to the OS

2. Save the user registers and program state

3. Determine that the interrupt was a page fault

4. Check that the page reference was legal and determine the location of the page on the backing
store (disk)

5. Issue a read from the backing store to a free frame

a. Wait in a queue for this device until serviced

b. Wait for the device seek and/or latency time

c. Begin the transfer of the page to a free frame

6. While waiting, the CPU may be allocated to some other process

7. Interrupt from the backing store (I/O completed)

8. Save the registers and program state for the other user

9. Determine that the interrupt was from the backing store

10. Correct the page tables (the desired page is now in memory)

11. Wait for the CPU to be allocated to this process again

12. Restore the user registers, program state, and new page table, then resume the interrupted
instruction.

Processor architecture should provide the ability to restart any instruction after a page fault.
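Steps 4, 5, and 10 of the sequence above can be sketched as follows. This is a toy model: the structures and names are invented, and queuing, process switching, and frame replacement are omitted.

```python
# Toy page-fault handler: find the page on the backing store (step 4),
# read it into a free frame (step 5), and correct the page table (step 10).

backing_store = {7: "page-7 contents"}   # page number -> data on disk
page_table = {}                          # page number -> frame number
frames = [None, None, None]              # physical frames, all free

def handle_page_fault(page):
    if page not in backing_store:        # step 4: legality check
        raise MemoryError("illegal reference")
    frame = frames.index(None)           # step 5: pick a free frame
    frames[frame] = backing_store[page]  # transfer from the backing store
    page_table[page] = frame             # step 10: correct the page table
    return frame

frame = handle_page_fault(7)
print(frame, page_table[7])   # the page now resides in frame 0
```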

Multiple choice questions

1. The contents of these chips are lost when the computer is switched off:

a) ROM Chips
b) RAM Chips
c) DRAM Chips

2. Which chips are responsible for storing permanent data and instructions?

a) ROM Chips
b) RAM Chips
c) DRAM Chips

3. Which memory is used to speed up processing?

a) Associative memory
b) Main memory
c) Cache memory

4. How many bits of information can each memory cell in a computer chip hold?

a) 0 bits
b) 1 bit
c) 8 bits

5. What type of computer chips are said to be volatile?

a) ROM Chips
b) RAM Chips
c) DRAM Chips

6. The interface between level-2 (operating system) and level-1 (microprogram) of a computer
design is called:

a) Computer architecture
b) Virtual machine interface
c) User interface
d) All of the above

7. Which memory is the actual working memory?

a) SAM b) ROM c) RAM d) TAB memory

8. Which of the following is a secondary memory device?

a) ALU b) Keyboard c) Disk d) All of the above

9. A microcomputer has a primary memory of 640K. What is the exact number of bytes contained in
this memory?

a) 640 x 1000 b) 640 x 100 c) 640 x 1024 d) either b or c

10. How many bits can be stored in an 8K RAM?

a) 8000 b) 8192 c) 4000 d) 4096

Answers

1. b, 2. a, 3. c, 4. b, 5. b, 6. a, 7. c, 8. c, 9. c, 10. b
