INTRODUCTION TO MICROPROCESSORS
Third Edition

Aditya P. Mathur
Purdue University
W. Lafayette, USA

Tata McGraw-Hill Publishing Company Limited, New Delhi

McGraw-Hill Offices: New Delhi, New York, St Louis, San Francisco, Auckland, Bogota, Guatemala, Hamburg, Lisbon, London, Madrid, Mexico, Milan, Montreal, Panama, Paris, San Juan, São Paulo, Singapore, Sydney, Tokyo, Toronto

Copyright © 1989, by Tata McGraw-Hill Publishing Company Limited.
24th reprint 2006
RCLDRRAKRZYQQ

No part of this publication may be reproduced or distributed in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise or stored in a database or retrieval system without the prior written permission of the publishers. The program listings (if any) may be entered, stored and executed in a computer system, but they may not be reproduced for publication.

This edition can be exported from India only by the publishers, Tata McGraw-Hill Publishing Company Limited.

ISBN 0-07-460222-5

Published by the Tata McGraw-Hill Publishing Company Limited, 7 West Patel Nagar, New Delhi 110 008 and printed at AP Offset Pvt. Ltd., Zulfe Bengal, Dilshad Garden, Delhi 110 095

CONTENTS

Preface

[...]
1.6 ORGANIZATION OF MICROCOMPUTERS
1.7 MICROPROCESSOR PROGRAMMING
    1.7.1 Instructions
    1.7.2 Machine and Mnemonic Codes
    1.7.3 Machine and Assembly Language Programming
    1.7.4 High Level Language Programming
1.8 DIGITAL LOGIC
    1.8.1 Digital and Analog Signals
    1.8.2 Digital Building Blocks
    1.8.3 Signal Levels
    1.8.4 Device Loading
    1.8.5 Open-Collector and Totem-Pole Devices
1.9 TIMING DIAGRAM CONVENTIONS
1.10 SUMMARY

2 DATA REPRESENTATION
2.1 INTRODUCTION
2.2 POSITIONAL NUMBER SYSTEMS
2.3 THE BINARY NUMBER SYSTEM
    2.3.1 Concepts
    2.3.2 Binary to Decimal Conversion
    2.3.3 Decimal to Binary Conversion
2.4 REPRESENTATION OF INTEGERS
    2.4.1 Positive Integers
    2.4.2 Maximum Integer
    2.4.3 Negative Number Representation
    2.4.4 Minimum Integer
    2.4.5 BCD Representation
2.5 REPRESENTATION OF REAL NUMBERS
    2.5.1 Conversion of Real Numbers
    2.5.2 Floating Point Notation
    2.5.3 Representation of Floating Point Numbers
    2.5.4 Accuracy and Range in Floating Point Representation
2.6 BINARY ARITHMETIC
    2.6.1 Addition and Subtraction of Binary Integers
    2.6.2 Overflow and Underflow
    2.6.3 Addition of Floating Point Numbers
2.7 OTHER NUMBER SYSTEMS
    2.7.1 Some Conventions
2.8 CHARACTER REPRESENTATION
2.9 SUMMARY

[...]
    3.2.1 Data and Address Busses
    3.2.2 Addressing the I/O Devices
    3.2.3 Registers in the 8085
3.3 INSTRUCTION SET OF THE 8085
    3.3.1 Instruction Types
    [...]
    3.3.3 Addressing Modes
    3.3.4 Space and Time Requirements
3.4 PROGRAMMING THE 8085
    3.4.1 The Programming Process
    3.4.2 Machine Language Programming
3.5 ASSEMBLER PROGRAMMING
    3.5.1 The Instruction Format
    3.5.2 Assembler Directives
    3.5.3 Constants in an Assembly Program
[...]
3.8 THE ZILOG Z80
    3.8.1 Organization of the Z80
    3.8.2 Z80 Addressing Modes
    3.8.3 Input and Output Instructions
    [...] in Z80
3.9 SUMMARY

[...]
4.1 INTRODUCTION
    4.1.1 Memory Types
[...]
    4.2.1 Memory Chip Capacity and Organization
    4.2.2 Electrical Signals
[...]
    4.4.1 Organization of 51100x
    4.4.2 Timings of 51100x
    [...]
    4.4.4 Page Mode Operation of Dynamic RAMs
    4.4.5 Nibble Mode Operation
    4.4.6 [...]
    4.4.7 Power Requirements of DRAMs
    4.4.8 Soft Errors in Dynamic RAMs
4.5 REPROGRAMMABLE ROMs
    4.5.1 Organization of EPROMs
    4.5.2 Timing of EPROMs
    4.5.3 Programming the EPROM
    4.5.4 Electrically Erasable EPROMs
[...]
    4.6.1 MTBF Computation
    4.6.2 Error Detection Using Parity
4.7 SUMMARY

[...]
    5.2.1 [...]
    5.2.2 The Fetch Operation
    5.2.3 The Execute Cycle
    5.2.4 Machine Cycle and State
[...]
    5.3.2 Opcode Fetch Cycle
    5.3.3 Memory and I/O Read Cycles
    5.3.4 Memory and I/O Write Cycles
    5.3.5 Interrupt Timings
    5.3.6 Interrupt Acknowledge Machine Cycle
    5.3.7 Bus Idle Machine Cycle
    [...]
    5.3.9 Initiating System Operation
    5.3.10 State Transition Sequence
5.4 TIMINGS OF ZILOG Z80
    5.4.1 Z80 Signals
    5.4.2 Overall Timing Structure
    5.4.3 Machine Cycle Timing
[...]
    5.5.1 General Purpose (Scratch Pad) Registers (GPRs)
    5.5.2 Number of General Purpose Registers

[...]
    6.2.2 Address Decoding
    6.2.3 Using the Binary 1-of-n Decoder
6.3 MEMORY INTERFACING
    6.3.1 Bus Contention and 2-line Control
    6.3.2 Access Time Computations
[...]
    6.5.1 Synchronous Transfer
    6.5.2 Asynchronous Transfer
    6.5.3 Interrupt Driven Data Transfer
    6.5.4 Multiple Interrupts
    6.5.5 Enabling, Disabling, and Masking of Interrupts
6.6 DIRECT MEMORY ACCESS DATA TRANSFER
    6.6.1 Multiple DMA Devices
    6.6.2 DMA Transfer in an 8085 Based System
    6.6.3 DMA Transfer in a Z80 Based System
6.7 SERIAL DATA TRANSFER
6.8 SUMMARY

7 INTERFACING DEVICES
7.1 INTRODUCTION
7.2 TYPES OF INTERFACING DEVICES
7.3 ADDRESS DECODING FOR I/O
7.4 INPUT/OUTPUT PORTS
    7.4.1 Programmable I/O Ports
    7.4.2 Programmable Peripheral Interface
    7.4.3 Programming the 8255A
    7.4.4 Timings of the 8255A Operations
    7.4.5 Applications of the 8255A
7.5 PROGRAMMABLE INTERRUPT CONTROLLER
    7.5.1 Programming the 8259
    7.5.2 Cascading of 8259s
7.6 PROGRAMMABLE DMA CONTROLLER
    7.6.1 8257 Programmable DMA Controller
    7.6.2 Dynamic RAM Refresh Using DMA
    7.6.3 Dynamic RAM Refresh in a Z80 Based System
7.7 COMMUNICATIONS INTERFACE
    7.7.1 The 8251 USART
7.8 ANALOG INPUT DEVICES
    7.8.1 The AD 7820 Analog-to-Digital Converter
    7.8.2 Interfacing the AD 7820
7.9 ANALOG OUTPUT DEVICES
    7.9.1 The AD557 Digital-to-Analog Converter
    7.9.2 Interfacing the AD557
7.10 ANALOG INPUT SUBSYSTEMS
7.11 ANALOG OUTPUT SUBSYSTEMS
7.12 SUMMARY

8 APPLICATIONS OF MICROPROCESSORS
8.1 INTRODUCTION
8.2 A TEMPERATURE MONITORING SYSTEM
    8.2.1 System Requirements
    8.2.2 Overall System Design
    8.2.3 The Input Subsystem
    8.2.4 The Output Subsystem
    8.2.5 Hardware Design
    8.2.6 Software Design
    8.2.7 Discussion
8.3 AUTOMOTIVE APPLICATIONS
    8.3.1 μPs or μCs inside an automobile?
    8.3.2 Special purpose or off-the-shelf?
    8.3.3 Why use μCs in automobiles?
    8.3.4 Engine Control
    8.3.5 Suspension System Control
    8.3.6 Driver Information Systems
8.4 CLOSED LOOP PROCESS CONTROL
    8.4.1 The Process of Growing Synthetic Quartz
    8.4.2 Microprocessor Based Control System
8.5 SUMMARY

9 MICROPROCESSOR-BASED DEVELOPMENT AIDS
9.1 INTRODUCTION
9.2 SOFTWARE DEVELOPMENT AIDS
    9.2.1 System Monitor
    9.2.2 Text Editor
    9.2.3 Operating System
    9.2.4 Multitasking and Multi-user
    9.2.5 ROM and Disk Resident OS
    9.2.6 Constituents of an Operating System
    9.2.7 Assembly Techniques
    9.2.8 The Macro-Assembler
    9.2.9 Compilers
9.3 HARDWARE AIDS
    9.3.1 Single-Board Computers
    9.3.2 System Design Kits
    9.3.3 Other Miscellaneous Hardware Development Aids
9.4 MICROCONTROLLERS
    9.4.1 Architecture of NEC μPD7810
9.5 SUMMARY

10 ARCHITECTURE AND PROGRAMMING OF 8086 AND 8088
10.1 INTRODUCTION
10.2 ORGANIZATION OF THE 8086
    10.2.1 Memory Organization
    10.2.2 Addressing Bytes and Words
    10.2.3 Register Structure
    10.2.4 Addressing Modes in 8086
    10.2.5 Organization of the 8088
10.3 PROGRAMMING THE 8086/8088
10.4 BUS STRUCTURE AND TIMING OF 8086
    10.4.1 Bus Interface and Execution Units
    10.4.2 Bus Cycles
    10.4.3 Generating Control Signals in Maximum Mode
    10.4.4 Indivisible Instruction Cycle
    10.4.5 Bus Arbitration Logic
    10.4.6 Status Signals
10.5 BUS STRUCTURE AND TIMING OF 8088
10.6 EXCEPTION HANDLING
    10.6.1 Privilege States
    10.6.2 Exception Types
    10.6.3 Exception Processing
10.7 EXCEPTION HANDLING IN THE 8086/8088
    10.7.1 External Interrupts
    10.7.2 Internal Interrupts
    10.7.3 Divide by Zero
    10.7.4 Single Stepping or Tracing
    10.7.5 Reset Exception
10.8 SUMMARY

11 ARCHITECTURE AND PROGRAMMING OF 68000
11.1 INTRODUCTION
11.2 ORGANIZATION OF THE 68000
    11.2.1 Memory Organization
    11.2.2 Register Structure
    11.2.3 Addressing Modes
11.3 PROGRAMMING THE 68000
11.4 BUS STRUCTURE AND TIMING OF 68000
    11.4.1 Read and Write Cycles
    11.4.2 Read-Modify-Write Cycle
    11.4.3 Bus Arbitration Timing
11.5 EXCEPTION HANDLING
    11.5.1 Privilege States
    11.5.2 Exceptions in 68000
    11.5.3 External Interrupts
    11.5.4 Exceptions During Normal Instruction Execution
    11.5.5 Privilege Violations, Illegal, and Unimplemented Instructions
    11.5.6 Single Stepping
    11.5.7 Address Error
    11.5.8 Bus Error
    11.5.9 Reset Exception
    11.5.10 Multiple Exceptions
11.6 MULTIPROGRAMMING
    11.6.1 The Concept
    11.6.2 Program Relocation in Multiprogramming
    11.6.3 Relocation in 8086
    11.6.4 Relocation in 68000
    11.6.5 Resource Sharing
    11.6.6 Semaphores in 68000
    11.6.7 Protection
    11.6.8 Protection in 68000
11.7 MULTIPROCESSING
11.8 SUMMARY

A 8085 INSTRUCTION SET
B Z80 INSTRUCTION SET
C 8086/8088 INSTRUCTION SET
D 68000 INSTRUCTION SET
E OPEN LOOP TEMPERATURE CONTROLLER
Index

Chapter 1
MICROPROCESSORS: BASIC CONCEPTS

1.1 OBJECTIVES OF THIS BOOK

The purpose of this book is to introduce you to the world of microprocessors. Often in this text, we shall use the abbreviation μP for the word microprocessor. μP may be read as mu P. Similarly, μC will be used as an abbreviation for the word microcomputer or microcontroller, depending on the context. μC may be read as mu C.

A microprocessor is an electronic device which is of little use unless interfaced with memories and several other input and output devices. Thus, a study of microprocessors implies a study of a variety of memory chips, input/output devices, and techniques for interfacing them to the microprocessor.
This book concentrates on both these aspects of the study of microprocessors. An electronic system which is centered around a microprocessor will often be referred to in this book as a microprocessor-based system.

Like any other digital computer, a system designed around a microprocessor needs to be programmed. Thus, a sequence of instructions needs to be formulated and input to the microprocessor-based system for effective operation. A sequence of instructions designed to perform a particular task is known as a program. A set of programs written for a microprocessor-based system is known as the software for that system. This book also aims at teaching the programming of microprocessors. Those already familiar with the programming of digital computers (both machine language and high level language programming) will not find much new material except, perhaps, for the instruction sets that we shall present for a few microprocessors. In general, both hardware and software design are of paramount importance in microprocessor-based system design, though, in particular cases, one may be more difficult or significant than the other.

There are many distinct approaches by which the subject of microprocessors can be introduced to a novice. One approach is the general-to-specific, according to which general concepts regarding the architecture and programming of microprocessors are introduced without reference to any particular microprocessor. Specific microprocessors are considered only as examples. According to another approach, a particular microprocessor may be introduced first, followed by a generalization of the concepts and their illustration using other microprocessors. In this book, the second approach has been adopted.

The specific microprocessor that we have chosen for illustration of architectural and programming concepts is Intel's 8085, which continues to be a popular 8-bit chip.
Our choice has fallen on the 8085 because of its popularity and simple architecture. Thus, it is easy to explain and understand basic microprocessor related concepts using the 8085. The Z80 has been used as another chip to provide variety in our exposition and to introduce concepts not available in the 8085. The 8086/8088, 68000, and 7810 μPs have been used in later chapters to introduce additional concepts and techniques.

In order that you gain a thorough knowledge of the techniques for interfacing microprocessors with the real world, detailed design examples have been presented. A microprocessor laboratory available for experimentation would be an asset.

We have attempted to introduce the latest developments in the field of microprocessors. However, as this is a rapidly developing field, you may find several chips of choice not in this book. It may be noted that this book is not a compendium on microprocessors. Its prime purpose is to illustrate basic concepts at the level of a first course in microprocessors. Thus, several advanced microprocessors and other chips have not appeared in this book. Examples of such chips include the Intel 80386, Motorola 88000, and Inmos T800 Transputer.

1.2 WHEEL AND THE MICROPROCESSOR

The wheel is considered to be the precursor of the industrial revolution. It has been responsible for improving the mobility of objects. It helped man move objects of several kinds faster.

The μP has created another revolution. It has helped improve the control and monitoring of objects. We see wheels all around us in objects that move, from huge trains to smaller automobiles to even smaller toys. The μP has now made its way into all these objects. Certainly, we can still find automobiles that do not have at least one μP. But such automobiles are considered to be based on an obsolete design. We can also find toys without a μP in them. But these toys are extremely low cost items.
In addition to being a prime control processor in an automobile and many other moving objects, the μP has made its way into a variety of other items. A sample list is given below.

• Measuring instruments, such as the oscilloscope, multimeter, and the spectrum analyzer.
• Music related equipment, such as synthesizers.
• Household items, such as the microwave oven, door bell, washing machine, and television.
• Defense equipment, such as fighter planes, missiles, and radar.
• Computers, such as the IBM PS/2 and other personal computers, and parallel computers, such as the NCUBE/10 and Symmetry.
• Medical equipment, such as blood pressure monitors, blood analyzers, and monitoring systems.

The μP has helped create new fields, desktop publishing being one example. It has helped the growth of several existing fields, medical electronics being one example. Though the μP may not yet be recognized to be as important to the progress of mankind as the wheel, it plays a significant and an ever-expanding, though indirect, role in making life more comfortable. The μP in itself has become a very powerful tool for scientists and engineers. A knowledge of the design of μP based systems is therefore of great importance and benefit to students in many fields.

1.3 WHAT IS A MICROPROCESSOR?

This book will provide a detailed answer to this question. Here, we take a quick look at a μP. A μP is just like any other electronic chip. However, it is more powerful than just any other chip. For the present, we can consider a μP to be a device that:

• has a limited set of on-chip memory locations, known as registers, to hold information,
• can understand a fixed set of basic commands, and
• can generate signals to control external devices.

Inside the chip, there is an arithmetic logic unit (ALU). The ALU executes all arithmetic and logic instructions. For example, the arithmetic addition and logical AND operations will be performed by the ALU.
The registers inside the μP hold data on which the operations are performed. The control unit generates the external control signals and also controls the operation of the internal on-chip circuitry. The on-chip memory, in the form of registers, is generally very limited. Thus, almost every μP based system has an off-chip memory also. The set of basic commands that a μP can understand is known as the instruction set of the μP.

Fig. 1.1 shows the signals available in a typical μP chip. The chip itself has several pins, like any other chip. The μP sends or receives information over these pins. Sometimes we refer to the connections between the pins, or the pins themselves, as lines. Each pin transmits or receives a boolean signal which is either at logical 0 or at logical 1. Some pins may be in neither of these two states at certain points in time. Such pins, or lines, are said to be tri-state outputs.

Figure 1.1: Inside and outside a microprocessor. (Inside: arithmetic unit, registers, control unit. Outside: power supply, address lines, data lines, input and output control lines, other control lines.)

As shown in Fig. 1.1, a typical μP has several lines over which it transmits an address to the off-chip memory or to the I/O devices. These are referred to as address lines. More often, the address lines are known as the address bus. A typical 8-bit μP will have an address bus consisting of 16 lines for transmitting the address. Thus, such a μP can transmit an address that is 16 bits wide. How many different addresses can be transmitted? A knowledge of the binary number system tells us that in all $2^{16} = 65536$ different addresses can be transmitted. (See Chapter 2 for more details about number systems.)

Every μP has a set of lines for transmitting and receiving data. These lines are referred to as the data bus. An 8-bit μP will generally have eight lines in its data bus.
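The relation between the number of lines in a bus and the number of distinct patterns it can carry may be checked with a short calculation. The following is a minimal Python sketch; the function name `distinct_patterns` is an illustrative assumption and does not come from the text.

```python
def distinct_patterns(lines: int) -> int:
    """Number of different bit patterns that `lines` two-state lines can carry."""
    if lines < 0:
        raise ValueError("a bus cannot have a negative number of lines")
    # Each line is at logical 0 or logical 1, so every added line doubles the count.
    return 2 ** lines


# A 16-line address bus, as in the typical 8-bit microprocessor described above.
address_count = distinct_patterns(16)

# An 8-line data bus carries one of 256 patterns in a single transfer.
data_patterns = distinct_patterns(8)

print(address_count, data_patterns)
```

Under these assumptions the sketch gives 65536 addresses for the 16-line address bus and 256 patterns for the 8-line data bus.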
A 16-bit μP may have either 16 lines in its data bus, or only eight lines. For example, the 8085 μP from Intel is an 8-bit μP and has eight lines in its data bus. The 8086 μP, also from Intel, has 16 lines in its data bus and is a 16-bit μP. However, the 8088, again from Intel, is a 16-bit μP but has only eight lines in its data bus. Fewer data bus lines are provided to reduce the space required on a printed circuit board for the data bus lines. In a later section we shall learn what we mean by saying that a μP is an 8-bit or a 16-bit μP.

A μP also has lines for controlling the input and output devices. These devices could be an electric motor, a display lamp, or one of a variety of other devices that we shall mention in the remainder of this book.

[...] this, and a few other reasons, the 32-bit μPs are much faster than the 16-bit or 8-bit μPs.

How fast is a 32-bit μP as compared to an 8-bit μP? This question is difficult to answer unless two specific μPs are being compared. Just to get an idea, let us examine how fast the Intel 80386, a 32-bit μP, will add two 32-bit numbers as compared to the 8-bit Intel 8085. Assuming that the 8085 is operating at 5 MHz, it will perform this addition in about 30 μs. Note that 1 μs is one millionth of a second. An 80386 operating at 25 MHz can perform the same addition in as little as 50 ns. Note that one nanosecond (ns) is one billionth of a second.

1.5 EVOLUTION OF MICROPROCESSORS

The history of μP development is very interesting. The first microprocessor was introduced in 1971 by Intel Corporation. This was the Intel 4004, a processor on a single chip. It had the capability of performing simple arithmetic and logical operations, e.g. addition, subtraction, comparison, AND, and OR.
It also had a control unit which could perform various control functions, such as fetching an instruction from the memory, decoding it, and generating control pulses to execute it. It was a 4-bit microprocessor, operating on 4 bits of data at a time.

The first microprocessor was quite a success in industry. It found many applications, and attracted much attention from both application engineers and the semiconductor industry. Soon, other microprocessors were also introduced. Intel itself followed by introducing an enhanced version of the 4004, the 4040. Since then, many other 4-bit microprocessors have been introduced. Rockwell International's PPS4 and Toshiba's T3472 are two examples. 4-bit μPs are extinct now. 4-bit microcontrollers, however, are still used in large quantities in applications like toys, appliances, and instruments. One of the most successful 4-bit microcontrollers is the TMS 1000 from Texas Instruments.

The first 8-bit microprocessor, which could perform arithmetic and logic operations on 8-bit words, was introduced in 1972, again by Intel. This was the 8008, which was followed by an improved version, the 8080, from the same company. Today, there are a variety of 8-bit processors, some examples being Motorola's M6800, National Semiconductor's SC/MP, Zilog Corporation's Z80, Fairchild's F8, Intel's 8085, and Hitachi's 6809.

The 8-bit microprocessor was followed by microprocessors operating on 12-bit and 16-bit data words. Intersil's IM6100 and Toshiba's T3190 are examples of 12-bit processors. Examples of 16-bit microprocessors are Fairchild's 9440, Data General's mN601, and Texas Instruments' TMS 9900. Intel's 8086 and 80286, Motorola's M68000, and Zilog's Z8000 are some of the most powerful 16-bit microprocessors available today. One of the most popular 16-bit μPs has been the 8088 from Intel. The 8088 has the same instruction set as the 8086.
However, it has only an 8-bit [...]

Figure 1.3: Typical instruction formats (a) Direct addressing (b) Immediate addressing (c) Implicit addressing.

Each of these three instructions can be referred to by abbreviated codes known as mnemonics. We assign the mnemonics ADD, LDI, and SLA, respectively, to the above mentioned instructions. Now, suppose that we desire to load the accumulator with the constant 7, and then add to it the value of A, where A designates some memory location. Then the following sequence of instructions can be given to the microprocessor:

    LDI 7
    ADD A

The first of these two instructions will cause the value 7 to be loaded into the accumulator. The second one will cause the contents of the memory location designated by A to be added to the accumulator. Note that these two instructions are of the type shown in Fig. 1.3(b) and (a), respectively. If the contents of the accumulator are to be shifted left by 2 bits, then the following instruction could be used:

    SLA 2

This instruction is of the type shown in Fig. 1.3(c), where neither the operand nor the address of the operand appears directly in the instruction.

1.7.2 Machine and Mnemonic Codes

An instruction can be written in a variety of forms. In Example 1.1, we wrote instructions using their mnemonic codes and symbolic addresses. However, a microprocessor can decode and execute only binary coded instructions. Therefore, for each operation that can be executed by a microprocessor, there is also a binary code.
We may write instructions using [...]

Figure 1.6: Tri-state inverter, with its truth table.

[...] input is at logic 0, which implies that it is asserted, the device behaves as an ordinary inverter. When the control signal is not asserted, the output goes into the hi-Z state.

Logic Gates

A gate is a device that transforms two or more digital inputs to a single digital output. Fig. 1.7 exhibits the AND, OR, XOR, NAND, NOR, and XNOR gates, and their truth tables.

The gates shown in Fig. 1.7 are two-input devices. One can have three-input, four-input, or wider gates. Fig. 1.8 shows a three-input AND gate and its truth table.

Multiplexer and Demultiplexer

A multiplexer is a device that outputs one out of several input signals. Fig. 1.9 shows a 4-to-1 multiplexer and its circuit diagram using NAND and OR gates. A multiplexer has $2^n$ data inputs and $n$ control inputs. Depending on the status of the control inputs, exactly one of the data inputs is routed to the output. In Fig. 1.9, $D_0$ to $D_3$ denote the four data inputs. A and B are the two control inputs. The truth table shown in the same figure indicates that when A and B are both at logic 0, the output is the same as $D_0$. When A and B are both at logic 1, the output is the same as $D_3$. Fig. 1.10 exhibits the signals of the 74151, an 8-to-1 multiplexer chip.

A demultiplexer has one input signal, $n$ control signals, and $2^n$ outputs. Depending on the status of the control signals, the input signal is routed to one of the outputs. Fig. 1.11 exhibits the signals of a 1-to-8 demultiplexer. Exercise 1.13 requests a truth table for a demultiplexer.
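The routing behaviour of the multiplexer and the demultiplexer can be modelled in a few lines of Python. This is only a behavioural sketch, not a gate-level design. The function names `mux` and `demux` are illustrative, the first control input is assumed to be the most significant bit of the selection, and unselected demultiplexer outputs are assumed to rest at logic 0. None of these choices is fixed by the text.

```python
from typing import List, Sequence


def _select_index(controls: Sequence[int]) -> int:
    """Combine the control bits into an index (first control = most significant, by assumption)."""
    index = 0
    for bit in controls:
        if bit not in (0, 1):
            raise ValueError("control inputs must be 0 or 1")
        index = (index << 1) | bit
    return index


def mux(data_inputs: Sequence[int], controls: Sequence[int]) -> int:
    """Route exactly one of the 2**n data inputs to the single output."""
    if len(data_inputs) != 2 ** len(controls):
        raise ValueError("n control inputs require 2**n data inputs")
    return data_inputs[_select_index(controls)]


def demux(signal: int, controls: Sequence[int]) -> List[int]:
    """Route the single input to one of 2**n outputs; the others stay at 0 (assumption)."""
    outputs = [0] * (2 ** len(controls))
    outputs[_select_index(controls)] = signal
    return outputs


# 4-to-1 multiplexer with D0..D3 = 1, 0, 0, 1 and control inputs (A, B).
d = [1, 0, 0, 1]
print(mux(d, (0, 0)))    # A = B = 0 selects D0
print(mux(d, (1, 1)))    # A = B = 1 selects D3
print(demux(1, (1, 0)))  # the input appears on output 2 of a 1-to-4 demultiplexer
```

The two cases named in the text, A = B = 0 selecting $D_0$ and A = B = 1 selecting $D_3$, do not depend on which control input is taken as the most significant; only the two mixed cases do.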
Figure 1.10: 74151 8-to-1 multiplexer.

[...] with an irregular clock.

Flip-flops, latches, and registers

A flip-flop is an active device, as opposed to gates and inverters, which are passive devices. This implies that a flip-flop has some way of remembering its state. Thus, when an input is applied to a flip-flop, its output depends on the input and its current state. We shall use FF as an abbreviation for flip-flop.

There are several types of flip-flops. A commonly used flip-flop is the clocked D flip-flop, also known as the D-type flip-flop, shown in Fig. 1.14(a). As shown, it has one data input, a clock input, and two control inputs denoted by PRESET and CLEAR. It generates two outputs denoted by $Q$ and $\overline{Q}$. The D-type FF shown in the figure is an edge triggered flip-flop, as indicated by a V sign at the clock input of the flip-flop.

The function of clocked circuits can be described using timing diagrams. Fig. 1.14(b) shows the timing diagram of the D flip-flop. In the timing diagram, notice that the output of the flip-flop changes only at the rising edge of the clock input. Thus, when the clock goes high and the data input is also high, the Q output goes high. Similarly, when the data input is low and the clock goes high, the Q output changes to low. Thus, the output is not affected by the data input except at the rising edge of the clock.
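The edge triggered behaviour described for the D flip-flop can be imitated with a small Python model. This is a simplified sketch: the class name `DFlipFlop` and the method name `apply` are illustrative assumptions, the PRESET and CLEAR inputs are left out, and the flip-flop is assumed to start with Q at logic 0.

```python
class DFlipFlop:
    """Behavioural model of a rising-edge triggered D flip-flop (PRESET/CLEAR omitted)."""

    def __init__(self) -> None:
        self.q = 0            # assumed initial state
        self._last_clock = 0  # previous clock level, used to detect a rising edge

    @property
    def q_bar(self) -> int:
        """Complement of the Q output."""
        return 1 - self.q

    def apply(self, clock: int, data: int) -> int:
        """Apply the current clock and data levels and return the Q output."""
        rising_edge = self._last_clock == 0 and clock == 1
        if rising_edge:
            self.q = data     # the data input is captured only at the rising edge
        self._last_clock = clock
        return self.q


ff = DFlipFlop()
# Each pair is (clock, data); Q follows data only where the clock goes from 0 to 1.
waveform = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
trace = [ff.apply(clock, data) for clock, data in waveform]
print(trace)
```

In this waveform the data input falls while the clock is still high, and Q holds its value until the next rising edge, which is the property the timing diagram illustrates.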
Figure 1.14: D flip-flop (a) schematic (b) timing diagram.

Figure 1.15: Intel 8282 octal latch.

[...]

Figure 1.19: Current loadings of TTL devices.

The voltage level specifications for an open-collector device are the same as those for a totem-pole device. However, the output current directions are different. In the case of an open-collector device, $I_{OH}$, known as the leakage current, has a different value than that for a totem-pole device. For example, for the 7405, $I_{OH}$ is specified as a maximum of 0.25 mA.

How does one find the value of the pull-up resistor? As shown in Fig. 1.20, both the circuits can be examined to compute the value of R. In Fig. 1.20(a), R can be obtained as:

$R = (V_{CC} - V_{OL}) / (I_{OL} - I_{IL})$

Similarly, for Fig. 1.20(b), it can be obtained as:

$R = (V_{CC} - V_{OH}) / (I_{OH} + I_{IH})$

Using the standard values for the different parameters, we can easily obtain R to be approximately 320 Ω for case (a) and 8970 Ω for case (b). Which value of R should be used? A low value of R gives a better speed of operation but a larger current when the output of the 7405 is at logic 0. One can strike a compromise value of 2.2 kΩ.
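The two pull-up resistor computations can be reproduced numerically. The Python sketch below assumes commonly quoted standard TTL figures for a single driven input: $V_{CC}$ = 5 V, $V_{OL}$ = 0.4 V, $V_{OH}$ = 2.4 V, $I_{OL}$ = 16 mA, $I_{IL}$ = 1.6 mA, and $I_{IH}$ = 40 μA. Only the 0.25 mA leakage current is quoted in the text; the other values, and the function names, are assumptions chosen to be consistent with the results stated above.

```python
def pull_up_case_a(v_cc: float, v_ol: float, i_ol: float, i_il: float) -> float:
    """R for Fig. 1.20(a): open-collector output at logic 0, sinking current."""
    return (v_cc - v_ol) / (i_ol - i_il)


def pull_up_case_b(v_cc: float, v_oh: float, i_oh: float, i_ih: float) -> float:
    """R for Fig. 1.20(b): open-collector output at logic 1, leakage plus input current."""
    return (v_cc - v_oh) / (i_oh + i_ih)


# Assumed standard TTL values (volts and amperes); only I_OH = 0.25 mA is from the text.
V_CC, V_OL, V_OH = 5.0, 0.4, 2.4
I_OL, I_IL = 16e-3, 1.6e-3
I_OH, I_IH = 0.25e-3, 40e-6

r_a = pull_up_case_a(V_CC, V_OL, I_OL, I_IL)
r_b = pull_up_case_b(V_CC, V_OH, I_OH, I_IH)
print(f"case (a): {r_a:.0f} ohm, case (b): {r_b:.0f} ohm")
```

With these assumed values the two results come out close to 320 Ω and 8970 Ω, and the compromise value of 2.2 kΩ lies between them.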
1.11 Why is it necessary to examine current requirements when interfacing one device to another?
1.12 What could be the advantage of using a tri-state inverter over an ordinary inverter?
1.13 Develop the truth table for a 1-to-4 demultiplexer.
1.14 Using NAND and OR gates, design a 1-to-4 demultiplexer.
1.15 When would a buffer be used in a digital circuit?
1.16 Using 74LS138 decoders, show how a 4-to-16 decoder can be designed.

[...]

Table 2.2: Sample Conversion

Number      Binary equivalent
Zero        0
One         1
Two         10
Three       11
Four        100
Five        101
Six         110
Seven       111
Eight       1000
Nine        1001
Ten         1010
Eleven      1011
Twelve      1100
Thirteen    1101
Fourteen    1110
Fifteen     1111

[...] which is a short form for stating that the binary number 1101 is equivalent to the decimal number 13. Using the above expansion method, we can convert any positive binary integer to its equivalent decimal form.

Another simple method for converting a binary number to its decimal equivalent is known as the double-dabble method. This can be described as follows. Begin the conversion process by doubling the leftmost bit of the given number and adding to it the bit at its right. Then, double the sum and add to it the third bit from the left. Proceed in this manner till all the bits have been considered. The final sum, obtained by repeated doubling and adding, is the desired decimal equivalent. Let us illustrate this method with an example.

Example 2.2

To convert 1101 to its decimal equivalent using the double-dabble method, we proceed as follows:

1. Doubling the leftmost bit, we get 2.
2. Adding to it the bit on its right, we get 2 + 1 = 3.
3. Doubling the sum obtained above, we get 3 x 2 = 6.
4. Adding to it the next bit, we get 6 + 0 = 6.
5. Again doubling, we get 6 x 2 = 12.
6. Finally, adding the last bit, we get 12 + 1 = 13.

Thus we have the relation [...]

Table 2.5: Maximum Integers for Various Word Lengths

Word length   Maximum             Decimal      Expressed in
(in bits)     binary number       equivalent   $2^n$ form
2             01                  1            $2^1 - 1$
3             011                 3            $2^2 - 1$
4             0111                7            $2^3 - 1$
8             01111111            127          $2^7 - 1$
16            0111111111111111    32767        $2^{15} - 1$

Figure 2.3: Representation of -44 using signed magnitude.

[...] 1). Here, we have assumed that one out of n bits is used for denoting the sign of the number. Note that the leftmost bit remains zero.

It may be worth mentioning at this point that though most μPs are designed to perform arithmetic operations in a fixed number of bytes, the user can always write programs to represent and manipulate larger integers.

2.4.3 Negative Number Representation

There are three commonly employed schemes for representing negative integers in binary form. These are

1. the signed magnitude scheme,
2. the one's complement scheme, and
3. the two's complement scheme.

We shall now explain each of these schemes and bring out their characteristics.

The Signed Magnitude Scheme

In this kind of representation, the sign bit is treated as separate from the magnitude bits. Thus the representation for, say, +44 and -44 will be identical except for the sign bit. In Fig. 2.2 we have already shown how +44 can be represented. Fig. 2.3 shows the representation of -44 using the signed magnitude scheme. As another example, Fig.
2.4 shows the representation of 127 (minimum number possible in 8 bits) using the signed magnitude scheme. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. 42 DATA REPRESENTATION Table 2.6: Minimum Integers Minimum number in n-bits (including one for sign) Signed magnitude _—(@"-T—1) One’s complement —(2"~! — 1) Two's complement _—(2"-!) Scheme Table 2.7: Word Length and Integer Range in Selected Microprocessors uP Word length Minimum faximum (in bits) integer integer COP 400 4 8 7 Tntel 8085, Zilog 280, 8 “128 a7 280, Motorola 6809, Mostek 6802 Intel 8086, 8088, 16 “32,768 32,767 80286 Intel 80386, 32 ~2,347,483,648 — 2,347,483,647 Motorola 68030 scheme is suitable but has certain disadvantages while the two's complement scheme is very convenient. The concept of maximum and minimum integers, as applied to pPs, is flexible. Using complex software, one may always handle larger or smaller numbers, and use any arbitrary number representation scheme. The maxi- mum and minimum integers mentioned in Table 2.7 are the ones on which the microprocessor can perform arithmetic by its hardware. 2.4.5 BCD Representation A simple way of representing integers is by using the BCD code. BCD is an abbreviation for Binary Coded Decimal. When representing a decimal integer using the BCD code, each digit of the integer is represented sepa- rately using a 4-bit binary code. Table 2.8 lists the BCD codes fot all the nine decimal digits. ‘As an example, if the BCD representation is used, the decimal number 345 will be represented in binary, as: 0011 01000101. The primary disadvantage of BCD representation over the one’s or two’s complement representation is that it requires too much space. 
For example, in 16 bits, the largest unsigned decimal number that can be represented using BCD is 9999. In contrast, if a pure binary representation is used, the largest possible unsigned integer that can be represented is 65,535. The largest possible signed number using two's complement would be 32,767.

Figure 2.8: A storage format for floating point real numbers in standard notation. (The number occupies bytes M to M+3, where M is a memory address. Sign bit of mantissa: 0 for a positive number, 1 for a negative number.)

... may be expressed in the normalized form as:

$13.2 \times 10^{8} = 0.132 \times 10^{10}$
$0.0152 \times 10^{0} = 0.152 \times 10^{-1}$
$1.91 \times 10^{-9} = 0.191 \times 10^{-8}$

In general, normalized floating point numbers are of the form $m \times b^n$, where m, the mantissa, satisfies the relation $1/b \le m < 1$, b denoting the base of the number system, and n the exponent. Two binary floating point numbers in normalized form are $0.1101 \times 2^{1}$ and $0.100 \times 2^{-10}$. However, $101.0 \times 2$ is not in the normalized form and neither is $0.001 \times 2^{7}$. To be in the normalized form, a 1 must follow the binary point. The only exception to this rule is the representation of 0.0.

2.5.3 Representation of Floating Point Numbers

We shall assume for our discussion a particular fast arithmetic device. We shall treat the format requirements of that device as the format requirements of our microcomputer, and shall represent real numbers in the memory according to this format.

We assume a device that accepts 32-bit floating point numbers. The way the floating point number is distributed among these 32 bits is shown in Fig. 2.8.
Note that 32 bits is 4 bytes (8 bits per byte) of memory, and thus floating point numbers for a microcomputer which includes this device are stored in 4 consecutive bytes of memory. Observe that the leftmost bit is used to indicate the sign of the mantissa, the next 23 bits are used for the mantissa and the last 8 bits for the biased exponent.

Let us understand what we mean by a biased exponent. In some of the previous examples on floating point numbers, we have seen that exponents are negative when the numbers are less than 1. One way of taking care of negative and positive values in the 8 bits reserved for the exponent would be to reserve one of the 8 bits as a sign bit and thus, in 8 bits, allow the exponents to vary between -128 and +127. This method is not used very often. Instead, we use an equivalent system whereby the true exponent is ...

... take the log of, say, 583.422 and then the antilog, we get 583.400 as the answer. The answer will not be wrong, but it will be accurate only to 4 places. Similarly, given a decimal number of many digits, we shall be able to accurately represent in binary as many places as can be accommodated by the number of bits we have kept for our mantissa, 23 in our case. Therefore, we wish to know how many decimal digits 23 bits are equivalent to. Again we use:

$2^{23} = 10^{x}$

and obtain:

$x = 6.9$

Thus, in 23 bits, we can be certain of about 6.9 digits of accuracy. The figure of 6.9 digits only gives us an upper limit on the number of digits that may at all be accurately represented. It is possible that some numbers of six or fewer digits may not be accurately represented.
For example, 0.4, as we have already noted, cannot be accurately represented, but this is an inherent problem of conversion which the user can do nothing about, and so we prefer not to concern ourselves with it. An analogy may be given in terms of logarithms again. When we say that the antilog of the log of 583.422 will be correct to four significant figures (and will yield 583.400), we are assuming that the logarithmic tables are correct. If the tables are slightly erroneous, we shall still get a four-place accurate answer, which may be 583.5, but this is an error whose existence we shall never be aware of.

Example 2.15
It is required to design a fast arithmetic chip that performs arithmetic on large floating point numbers. The chip should operate on 6-byte floating point numbers and permit a range of at least $10^{\pm 100}$. What accuracy can be obtained assuming floating point representation of real numbers?

6 bytes implies a total of 48 bits. $10^{100}$ implies that the decimal equivalent of 2 to the power of the largest value representable in the bits reserved for the exponent is $10^{100}$. Using $2^{x} = 10^{100}$, the largest exponent value is $x = 100 / \log_{10} 2 \approx 330$. Therefore, ±330 must be representable in the exponent bits for the range to be $10^{-100}$ to $10^{100}$.

To represent the range of values -330 to 330, 10 bits are needed. Of course, 10 bits represent -512 to 511, but the minimum allowable ±330 cannot be expressed in fewer bits.

Therefore, out of 48 bits, 10 bits are needed for the exponent, and of the remaining, 1 is for the sign bit, leaving 37 bits for the mantissa. A 37-bit mantissa means an accuracy of 11 decimal places as ...
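The bit-budget reasoning of Example 2.15 can be sketched in a few lines. This is a minimal illustration, not the book's procedure: the function names are hypothetical, and the exact largest exponent comes out slightly above the rounded figure of 330 used in the text, which does not change the 10-bit result.

```python
import math


def decimal_digits(mantissa_bits):
    """Decimal digits carried by a binary mantissa: solve 2**bits == 10**x for x."""
    return mantissa_bits * math.log10(2)


def exponent_bits(decimal_range):
    """Return (largest binary exponent, bits needed including one sign bit)."""
    largest = math.ceil(decimal_range / math.log10(2))
    magnitude_bits = math.ceil(math.log2(largest + 1))
    return largest, magnitude_bits + 1


def split_word(total_bits, decimal_range):
    """Split a word into exponent bits, mantissa bits and decimal digits of accuracy."""
    _, exp_bits = exponent_bits(decimal_range)
    mantissa_bits = total_bits - exp_bits - 1  # one bit for the sign of the mantissa
    return exp_bits, mantissa_bits, decimal_digits(mantissa_bits)


if __name__ == "__main__":
    print(split_word(48, 100))   # 6-byte format of Example 2.15
    print(decimal_digits(23))    # 23-bit mantissa of the 32-bit format
```

Calling `split_word(48, 100)` reproduces the split of 10 exponent bits and 37 mantissa bits, and `decimal_digits(23)` gives the figure of about 6.9 digits quoted for the 32-bit format.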
    1111 0100   (-12)
  + 1111 1111   (-1)
  ------------
  1 1111 0011   (-13)
  ^ lost carry

We have again obtained the correct answer.

Let us now elaborate the reason why the above mentioned method for subtraction by addition of the two's complement works. We first recall the fact that $b + c = 100\ldots0$, where c is the two's complement of b, and the number of zeros is equal to the number of bits in b (or c). For example,

01000 + 11000 = 100000

Now, we can write a - b as:

$a - b = a + \underbrace{(100\ldots0 - b)}_{\text{two's complement of } b} - 100\ldots0$

where, as usual, the number of zeros in 100...0 is equal to the number of bits in b. Thus, the only operation we do not perform is the subtraction of 100...0 after adding a and the two's complement of b. However, due to the finite length of the register, the high order carry automatically drops out, resulting in the subtraction of 100...0.

For example, if from 0000 1000 we have to subtract 0000 0101, we first add the two's complement of 0000 0101 to 0000 1000. This gives us,

    0000 1000   (8)
  + 1111 1011   (-5)
  ------------
  1 0000 0011   (3)

Now we may subtract 1 0000 0000 from 1 0000 0011 and obtain 0000 0011 (= 3). However, a µP need not perform the extra subtraction; the leftmost 1 will automatically drop out owing to the 8-bit length of the registers in which the arithmetic is being done.

One significant advantage of using two's complement for number representation is that in arithmetic operations the signs take care of themselves; no special care is necessary.
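The argument above, that adding the two's complement and letting the carry drop out of a fixed-length register is the same as subtracting, can be tried out with a small model. This is a sketch with hypothetical helper names and an assumed 8-bit register width, not code for any particular µP.

```python
def twos_complement(value, bits=8):
    """Two's complement of `value` in a register `bits` wide, i.e. 100...0 - value."""
    return ((1 << bits) - value) & ((1 << bits) - 1)


def subtract_by_addition(a, b, bits=8):
    """Compute a - b by adding the two's complement of b; return (result, lost carry)."""
    mask = (1 << bits) - 1
    total = (a & mask) + twos_complement(b & mask, bits)
    carry = total >> bits  # the bit that falls off the end of the register
    return total & mask, carry


def as_signed(pattern, bits=8):
    """Interpret a bit pattern as a signed two's complement integer."""
    return pattern - (1 << bits) if pattern & (1 << (bits - 1)) else pattern


if __name__ == "__main__":
    result, carry = subtract_by_addition(0b00001000, 0b00000101)
    print(f"{result:08b} carry={carry}")
```

Masking the sum with `(1 << bits) - 1` plays the role of the finite register: it discards the high order carry, which is exactly the subtraction of 100...0 described in the text.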
Example 2.20
Let us now add three numbers denoted by p, q, and r, that are given as:

$p = 0.2541 \times 10^{-3}$
$q = 0.1284 \times 10^{-3}$
$r = 0.1212 \times 10^{0}$

Forming the sum (p + q) + r, we obtain:

$(p + q) = 0.3825 \times 10^{-3}$
$(p + q) + r = 0.1216 \times 10^{0}$

Instead, if we form the sum p + (q + r), we obtain:

$(q + r) = 0.1213 \times 10^{0}$
$p + (q + r) = 0.1215 \times 10^{0}$

This implies that the associativity of addition, i.e. a + (b + c) = (a + b) + c, for real numbers is not valid for floating point arithmetic when using fixed and finite length registers. With respect to r (order of magnitude $10^{0}$), q is so small ($10^{-3}$) that in the addition-cum-rounding operation of q and r only the first digit of the mantissa of q is able to affect (q + r). (q + r) is again of magnitude $10^{0}$, so that in the addition-cum-rounding operation of p + (q + r), p (of order $10^{-3}$) is so small that only the first digit of its mantissa is able to affect the final result. But when we add two almost equal quantities p and q, there is no loss of significant digits in addition, and thus significant digits are lost only once, in the addition of (p + q) + r, and not twice, as in the other case.

Example 2.21
Let us add the two numbers $x = 0.5211 \times 10^{3}$ and $y = 0.1122 \times 10^{-2}$. First, we shift the smaller of the two numbers to match the two exponents. This gives us $0.00000112 \times 10^{3}$. Now, adding the two numbers, we get:

x     = 0.52110000 x 10^3
y     = 0.00000112 x 10^3
-------------------------
x + y = 0.52110112 x 10^3

Character:   I    N    T    E    L    space  8    0    8    5
Code:        49   4E   54   45   4C   20     38   30   38   35

Figure 2.11: Representation of the string "INTEL 8085" using 7-bit ASCII code. All codes are in hexadecimal.
... is the EBCDIC code (Extended Binary Coded Decimal Interchange Code) and a 7-bit code is the ASCII (American Standard Code for Information Interchange). ASCII codes are available in the 8-bit and 6-bit modes also. All microcomputers which have provisions for character manipulation (i.e. where the user can somehow specify whether a word in the memory is a character and not an integer) use one of these codes. The 7-bit ASCII code is more often used. Table 2.12 lists the ASCII codes. Let us illustrate by an example how a character string can be represented in a typical µC memory.

Example 2.25
It is desired to represent the character string INTEL 8085 in the memory. Using the 7-bit ASCII code, we require 10 bytes (including one byte for the space character between INTEL and 8085). Thus, this character string may be represented as shown in Fig. 2.11 (we have written down the contents of each byte using hexadecimal notation for convenience).

2.9 SUMMARY

In this chapter we have introduced various number systems and shown how numbers may be represented in a µC memory. We have also learned how arithmetic operations are performed on binary numbers. Many µC applications require handling of strings of characters and therefore we have also shown how characters may be stored in the memory. You may find the material presented in this chapter useful while reading the remainder of the book, especially the next chapter.

EXERCISES

2.1 Convert the following numbers to their decimal equivalents:
(a) (11010) (b) (AB60)16 (c) (1011)s (d) (777)a
Chapter 3
PROGRAMMING A MICROPROCESSOR

3.1 INTRODUCTION

In this chapter we present the organization and programming of two popular 8-bit microprocessors, the Intel 8085 and the Zilog Z80. Only those aspects of the organization are presented that are of significant use to the software designer. Other organizational and timing details can be found in Chapter 5.

Though we have selected the 8085 and Z80 for illustrating the essentials of machine and assembly language programming, the concepts introduced are general in nature and are found in several other µPs. Thus, after reading this chapter carefully, you should have no difficulty in programming any other µP after a quick look at the latter's assembly language mnemonics or binary opcodes.

In Chapters 10 and 11, we dwell once again on concepts of machine organization and programming while discussing the iAPX86 series and 680x0 series microprocessors.

3.2 ORGANIZATION OF THE 8085

3.2.1 Data and Address Busses

The 8085 is an 8-bit microprocessor available as a 40-lead plastic or ceramic package. As shown in Fig. 3.1, the data bus is 8 bits wide. This implies that 8 bits (1 byte) of data can be transferred to or from the 8085 in parallel. There are eight pins dedicated to transmitting the most significant 8 bits of the memory address. The least significant 8 bits of the address are transmitted on the eight lines on which data is transmitted. Thus, data and part of the address are transmitted over a set of shared lines. This is known as address multiplexing. Details of this technique are given in Chapter 5. It is obvious that the data and address (least significant 8 bits) are transmitted at different points in time. Due to this multiplexing, the 8085 bus is also referred to as a multiplexed bus.

Thus, the 8085 has a 16-bit address transmission capability.
This implies that a total of $2^{16}$ (= 65536) memory locations can be addressed directly.

Figure 3.3: Single-byte instruction. (First byte: operation code.)

Figure 3.4: 2-byte instruction. The two bytes are stored at consecutive addresses. (First byte: operation code; second byte: data/address.)

Figure 3.5: 3-byte instruction (all the three bytes are stored consecutively in the memory). (First byte: operation code; second byte: low order byte of data/address; third byte: high order byte of data/address.)

3.3.1 Instruction Types

An instruction may be 1, 2, or 3 bytes in length. In any of these three types of instructions, the first byte indicates the operation to be performed. The second and third bytes, if present, contain either the operand or the address of the operand on which the operation is to be performed. Figs. 3.3, 3.4, and 3.5 show all the three types of instructions.

3.3.2 Classification of Instructions

For the convenience of programmers, the 8085 instructions have been classified into the following five groups:
1. Data transfer group.
2. Arithmetic group.
3. Logical group.
4. Branch group.
5. Stack, I/O and Machine control group.

The binary operation code and the corresponding mnemonic for each instruction are listed in Appendix A. The explanation of the operation of most of these instructions is given in the next section and, for a few, in the exercises. Some general aspects of these instructions are explained below.
Example 3.4
It is desired to add the number 5 to the contents of memory location 0AB12H and store the result in location 0FA0FH. Assuming that the symbolic address for 0AB12H is Z, we may use the following instruction sequence to perform this task.

        LXI  H, 0FA0FH   ; Load register pair H-L with 0FA0FH.
        LDA  Z           ; Get value of Z in ACC.
        ADI  5           ; Add 5 to it.                          (3.7)
        MOV  M, A        ; Store ACC in the memory location
                         ; pointed to by register pair H-L (i.e. in
                         ; location 0FA0FH).

In (3.7), first we use an LXI instruction to put the address 0FA0FH in register pair H-L. The next three instructions in (3.7) fetch the desired contents, add 5 to them, and store the sum in location 0FA0FH using the MOV instruction. The MOV instruction in (3.7) uses register indirect addressing. We may write (3.7) in machine language as:

        00100001 00001111 11111010   ; LXI H, 0FA0FH
        00111010 00010010 10101011   ; LDA Z                     (3.8)
        11000110 00000101            ; ADI 5
        01110111                     ; MOV M, A

Note that instructions using immediate addressing may be 2 or 3 bytes long.

Implicit Addressing

There are certain instructions that operate only on one operand. Such instructions assume that the operand is in the ACC and therefore require no address specification. Many instructions in the logical group like RLC, RRC, and CMA fall into this category. All these are 1-byte instructions. You may observe that those instructions that specify the address of one of the operands using one mode or the other use implicit addressing for the other operand.

Example 3.5
It is desired to complement the contents of memory location 5992H. This may be done by the instruction sequence given below.
3.4.2 Machine Language Programming

We shall now introduce machine language programming for the 8085. Because of its tedious nature, only very simple programs will be presented. We shall also explain how a program is executed by a microprocessor, instruction by instruction. We begin with a simple example.

Example 3.6
The memory locations 0800H, 0801H, and 0802H contain one 8-bit integer each in the two's complement form. Let us denote the contents of these locations by X, Y, and Z, respectively. We shall develop a machine language program that adds X and Y, subtracts Z from the sum, stores the result in register B in the 8085 and sends it for display.

We first develop an algorithm depicting the sequence of operations to perform the desired computation. Algorithm 3.1, written in plain English, exhibits this sequence. Observe the style in which this algorithm has been presented.

Algorithm 3.1
Input: Three single-byte two's complement integers X, Y and Z in locations 0800, 0801 and 0802 respectively.
Output: A single-byte integer equal to (X + Y - Z) stored in register B and displayed on the output device (eight lamps in our case).
Method:
1. Load the address of Y in register pair H-L.
2. Load ACC with X (i.e. the contents of 0800).
3. Add the contents of the location pointed to by H-L to ACC.
4. Load the address of Z in H-L.
5. Subtract the contents of the location pointed to by H-L from the ACC.
6. Copy ACC contents into register B.
7. Output ACC contents to the output device.
8. Halt.

Using the opcodes for the 8085 we can now code Algorithm 3.1 into a machine language program. Program 3.1 is the desired coding. Note the two headings LOC and INSTRUCTION in this program. Under the heading INSTRUCTION, instructions are written. Whatever is written under LOC is explained later in this example.

We have written a trivial algorithm.
However, as we proceed further in this text, we shall encounter more complicated problems resulting in nontrivial algorithms. Furthermore, Algorithm 3.1 is heavily dependent on the ...

Labels

The label field contains a label. A label could be any character string consisting of lower and upper case letters (A-Z or a-z), digits (0-9) and the $. The first character of the label must be a letter or a $. Some assemblers place an upper limit on the number of characters in a label. HERE, $NEXT, and P0124 are the labels used in the above three instructions.

A label is separated from the mnemonic by a colon. Similarly, any operand is separated from the mnemonic by at least one space character. The comment is separated from any operand by a semicolon.

Operands

An explicit operand required by an instruction could be specified in one of several ways. In the following instruction, there is no operand:

        CMC              ; This instruction complements the
                         ; carry flag. It requires no explicit operand.

The next instruction specifies two register operands using a letter to denote each register.

        MOV  A, B        ; Move contents of register B to register A.

A symbolic address can be used as an operand to specify a memory location as in the next example.

        LDA  TEMP        ; Get the contents of memory
                         ; location TEMP.

Note that the precise memory address of the label TEMP is determined by the assembler. When using a symbolic address, one may also use simple arithmetic expressions in the operand. For example, the next instruction fetches the contents of memory location TEMP+1 into the accumulator.
If TEMP corresponds to the memory location 3990, then the contents of location 3991 are fetched.

        LDA  TEMP+1      ; Get the contents of memory
                         ; location TEMP+1.

Note that only expressions of the type X + c are allowed as operands, where X denotes a label and c is a constant. Thus, if P and Q are two labels, then P+Q is an invalid operand.

In instructions requiring immediate operands, the immediate operand can be specified using a constant, as in the following examples.

Note that 1100 may look like a binary constant but, for the assembler, it will be a decimal constant unless it has B as the suffix.

We have to be careful while writing hexadecimal constants. A hexadecimal constant must be preceded by a 0 if it starts with any letter A-F. For example, AFH is not a hexadecimal constant. The assembler will treat AFH as a symbol, not as a constant. To force the assembler to treat AFH as a constant, it should be written as 0AFH.

Reversing the constants

Constants can fit in one or two bytes. When a constant is 2 bytes long, the least significant byte goes in the numerically lower address. The most significant byte goes in the numerically higher address. Thus, for example, suppose that the hexadecimal constant 45AFH needs to be stored in two bytes in the memory starting at address 5000. The least significant byte, which is AF, will be stored in the byte at address 5000. The most significant byte, namely 45, will be stored in the byte at address 5001.

We assume that the assembler automatically places the constants in the correct order in the memory. However, not all assemblers do this reversal.
You must therefore check the manual that describes the assembler that you will be using. As an example, in the directive:

        ORG  8000
T:      DW   0FA0FH

the least significant byte 0F will be stored at location 8000 and the most significant byte FA will be stored at location 8001. Note that this convention of storing a constant in memory is used in the 8085, the 8086, and the Z80. However, in the 68000 and several other µPs, the opposite convention is used.

3.6 LANGUAGE FOR WRITING ALGORITHMS

In order to write the algorithms in an expressive manner, we shall use a language similar to English. Those who are familiar with languages like Pascal or C will find it easy to understand our language. However, this language is intended for those who are not familiar with any such high level programming language.

Each algorithm that we write will consist of a sequence of steps. Often, these steps would be numbered 1, 2, 3, ... and so on. We shall also call each step a statement. Each step (or statement) can be of one of the types described below.

... that Algorithm 3.1 was developed using statements written in plain English. It should also be noted that steps in an algorithm are executed in a sequence beginning from step 1, followed by step 2 and so on.

We shall now present a series of examples to illustrate different features of the 8085 microprocessor. Most of these features are characteristic of other microprocessors as well. In cases where the program is trivial, we directly write the assembly program.
However, in nontrivial cases, such as programs involving loops, we shall first develop a suitable algorithm and then code it into an assembly program.

3.7 PROGRAMMING EXAMPLES

We now present a sequence of examples designed to illustrate the basic concepts of assembly language programming. Through these examples, we introduce a variety of 8085 instructions.

Example 3.8
In many applications, we are given one byte of data and it is required to mask specific bits within this byte. To be more precise, consider that a byte of data is available in the ACC. It is desired to mask out bits 0-3 and retain the remaining bits as they are.

This problem can be solved using the ANI instruction. ANI is the mnemonic for the AND-immediate instruction. It logically ANDs a byte of immediate data with the contents of the accumulator. The result is left in the accumulator. The question now is: with what should the ACC be ANDed so as to mask out bits 0-3? We need to construct a mask. A mask is a bit pattern that is used for zeroing out certain bits of data selectively. For our purpose, the mask can be:

        1111 0000

The rightmost four bits of the mask are 0. Hence, when the mask is ANDed with any other byte, it will force bits 0-3 of the data byte to be zeroed out. Our one line program now becomes:

        ; This instruction can be used to mask out the rightmost
        ; 4 bits of the data in the accumulator.
        ANI  11110000B

If the contents of the ACC before the ANI instruction is executed are 10101010, then after the ANI masks out the least significant four bits, the ACC will contain the bit pattern 10100000.

Example 3.10
Through this example we illustrate how the rotate instructions can be used to perform simple arithmetic operations.

Suppose we want to double the contents of register B.
From our knowledge of the binary number system, we know that shifting a number left by 1 bit doubles it. Thus, we can use the RAL instruction for doubling the number. However, the RAL instruction operates on the contents of ACC. To take care of this requirement, we can first move the contents of register B to ACC, double the ACC contents, and move the new value back to register B. Here is the desired program. Note how we have used the CY flag to get the correct answer.

        ; This instruction sequence doubles the number in register B.
        STC              ; Set CY to 1.
        CMC              ; This makes CY 0.
        MOV  A, B        ; Get contents of register B into the ACC.
        RAL              ; Rotate ACC left by 1 bit.
                         ; This doubles the ACC contents.
        MOV  B, A        ; Register B has now been doubled.

We know that the RAL instruction rotates the ACC left through the carry flag. Thus, the CY bit moves to the least significant bit of the ACC. We do not know what the CY bit is before we shift the ACC. Therefore, we set it to 0, using the STC and CMC instruction sequence. Now, when the RAL instruction is used, a 0 gets into the least significant bit of the ACC. This is precisely what we want!

Suppose that the contents of register B, just before the above instruction sequence is executed, are 00001001 (= 9 in decimal). The RAL instruction will change it to 00010010 (= 18 in decimal).

What if we do not set CY to 0 before the RAL instruction? Will the contents of register B be doubled? The answer is: it may or may not be doubled. It will be doubled if the CY flag is 0 before RAL is executed. But if CY is 1 then the result will be one more than the doubled value. You may verify this statement.

Using an instruction sequence similar to the one described in the above example, it is easy to halve a value. Recall that shifting a bit pattern to the right by 1 bit reduces its value to one-half of the original value.
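The effect of the CY flag on doubling and halving can be verified with a small model of the two rotate-through-carry operations. This is a sketch in Python rather than 8085 code; the function names `ral` and `rar` simply mirror the mnemonics, and the accumulator is modelled as an 8-bit integer.

```python
def ral(acc, cy):
    """Rotate an 8-bit accumulator left through carry; return (new ACC, new CY)."""
    new_cy = (acc >> 7) & 1                    # bit 7 moves into CY
    new_acc = ((acc << 1) & 0xFF) | (cy & 1)   # old CY moves into bit 0
    return new_acc, new_cy


def rar(acc, cy):
    """Rotate an 8-bit accumulator right through carry; return (new ACC, new CY)."""
    new_cy = acc & 1                           # bit 0 moves into CY
    new_acc = (acc >> 1) | ((cy & 1) << 7)     # old CY moves into bit 7
    return new_acc, new_cy


if __name__ == "__main__":
    print(ral(0b00001001, 0))  # CY cleared first: the value is doubled
    print(ral(0b00001001, 1))  # CY left at 1: one more than the doubled value
    print(rar(0b00010010, 0))  # CY cleared first: the value is halved
```

Running the three calls shows why the program clears CY before rotating: with CY at 0 the value 9 becomes 18, with CY at 1 it becomes 19, and a right rotate with CY at 0 turns 18 back into 9.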
Example 3.11

Through this example, we shall learn how the rotate instructions of the 8085 can be used to simplify a more complex multiplication problem. The problem is to multiply a number in the ACC by 10. We assume that the result of multiplication is less than 128 and hence can be represented in the 8 bits of the ACC. We can solve this problem in at least two ways, listed below.

1. Write a general purpose multiplication program and use it to multiply the ACC by 10.
2. Use the fact that 10 is the multiplier and simplify the problem.

We shall use the second of the above two methods. The first method is illustrated later in Example 3.13. Let X denote the number in the ACC. Our program needs to compute X * 10. We know that:

        X * 10 = X * (8 + 2) = X * 8 + X * 2

What is special about the above equality? We have replaced one multiplication by 10 by two less expensive multiplications, by 8 and by 2, and one addition. Both 8 and 2 are powers of 2 (8 = 2^3 and 2 = 2^1) and hence multiplication by these numbers is less expensive in terms of the time taken to execute the program (see Exercise 3.24).

Given a binary number, we have seen earlier that shifting it left by 1 bit multiplies it by 2. Thus, to multiply X by 8, we need to shift it left by 3 bits. Similarly, to multiply X by 2, we need to shift it left by 1 bit. This observation leads to the following program to multiply X by 10.

        ; This program multiplies an 8-bit number in the ACC
        ; by 10 using the rotate instructions of the 8085.
        STC             ; Set CY flag to 1.
        CMC             ; Complement CY so that it becomes 0.
        RAL             ; Rotate ACC left by 1 bit through CY.
                        ; We now have X*2 in the ACC.
        MOV B,A         ; Let us save it in register B.
        STC             ; Once again reset CY to 0 as the bit
        CMC             ; shifted out by RAL might have been a 1.
        RAL             ; We now have X*4 in the ACC.
        STC             ; Once again reset CY to 0 as the bit
        CMC             ; shifted out by the second RAL
                        ; might have been a 1.
        RAL             ; Now we have X*8 in the ACC.
        ADD B           ; Add X*2 to X*8.
                        ; We now have the result in the ACC.

In the above program, we used each of the STC and CMC instructions three times. We can replace each combination of STC and CMC by the instruction ANI 0FFH. This will reduce the above program by three instructions. Note that this ANI instruction will not affect the contents of the ACC (ANDing with 0FFH leaves every bit unchanged), but it will reset the CY flag to 0. The above program can be rewritten as shown below.

        ; This program multiplies an 8-bit number in the ACC
        ; by 10 using the rotate instructions of the 8085.
        ANI 0FFH        ; Clear the CY flag to 0.
        RAL             ; Rotate ACC left by 1 bit through CY.
                        ; We now have X*2 in the ACC.
        MOV B,A         ; Let us save it in register B.
        ANI 0FFH        ; Clear the CY flag to 0.
        RAL             ; We now have X*4 in the ACC.
        ANI 0FFH        ; Clear the CY flag to 0.
        RAL             ; We now have X*8 in the ACC.
        ADD B           ; Add X*2 to X*8. We now have the result
                        ; in the ACC.

The following examples introduce more complex programs than the ones introduced above. Through these examples, we also illustrate how different branch instructions of the 8085 can be used.

Example 3.12

Let us write an assembly program that finds and displays the maximum of three 1-byte integers. We denote these integers by P, Q, and R. These symbols can be thought of as symbolic addresses of the integers in the memory. Note that while writing an assembly program, we need not worry about the actual memory addresses.

Algorithm 3.2 exhibits the sequence of steps to be executed for solving this problem. Observe that this algorithm is not written using machine level operations (e.g. load the ACC with a value). Instead, we have written it using simple conditional expressions and English-like statements (e.g. the if-then-else statement). You are urged to carefully examine step 2 in Algorithm 3.2.

Algorithm 3.2

Input:  Three 1-byte integers P, Q and R.
Output: The maximum of these three displayed on the output device.
Method:
1. Define the values of P, Q and R.
2.
   if P > Q then
   begin
      if P > R then
      begin
         output (P)
      end
      else
      begin
         output (R)
      end
   end
   else
   begin
      if Q > R then
      begin
         output (Q)
      end
      else
      begin
         output (R)
      end
   end
3. Halt.

Algorithm 3.2 is general in nature as it is independent of any microprocessor. It can be easily coded into the assembly language of any microprocessor. We shall code it into the 8085 assembly language. Program 3.2 is the desired assembly program for the 8085. We have inserted comments at suitable points in the program to improve its clarity. You should examine each instruction of this program carefully as several new instructions have been introduced (e.g. CMP, JC, and JMP).

Program 3.2

        ; Program for finding the maximum out of three
        ; single byte integers P, Q, and R and displaying it.
        ; Define P, Q, and R
P:      DB 7
Q:      DB 8
R:      DB 3
        ; if P > Q
        LDA Q           ; Get Q into ACC
        MOV B,A         ; Move it to register B
        LDA R           ; Get R into ACC
        MOV C,A         ; Move it to register C
        LDA P           ; Get P into ACC
        CMP B           ; Compare (ACC) with (B)
        JC CQR          ; Jump to CQR if (ACC) < (B)
        ; then if P > R then output (P)
        CMP C           ; Compare (ACC) with (C)
        JC OTR          ; Jump to OTR if (ACC) < (C)
        OUT 2           ; Output (ACC) to display
        JMP FINIS       ; Go to the end of program
        ; else output (R)
OTR:    MOV A,C         ; Get R into ACC
        OUT 2           ; Output R to display
        JMP FINIS       ; Go to the end of program
        ; else if Q > R
CQR:    MOV A,B         ; Get Q into ACC
        CMP C           ; Compare it with (C), i.e. with R
        JC OUTR         ; Jump to OUTR if (ACC) < (C)
        ; then OUTPUT (Q)
        OUT 2           ; Output Q to display
        JMP FINIS
        ; else OUTPUT (R)
OUTR:   MOV A,C         ; Get R into ACC
        OUT 2           ; Output R to display
FINIS:  HLT             ; Halt the 8085
        END             ; End of assembly program

Before this program can be executed by the 8085, it has to be translated into a machine language program either by hand or by an assembler. In either case, suitable addresses shall be assigned to each symbol introduced in the program and to the instructions. The first three instructions in Program 3.2 are not meant for the 8085 (verify that they do not appear in Appendix A).
These are known as pseudo instructions or assembler directives. Using such directives, the programmer requests the assembler to perform certain tasks. The DB directive causes the assembler to reserve one byte and initialize it to a value. For example, P: DB 7 will cause the assembler to reserve 1 byte in the memory, with symbolic address P, and initialize it to the value 7 (its binary representation). Note that a pseudo instruction does not cause any machine instruction to be generated. Another pseudo instruction that appears in Program 3.2 is the END instruction. It is always the last instruction of an assembly program. It serves to identify the physical end of the program.

Example 3.13

The 8085 does not have a multiply instruction. Therefore, a programmer who wants to multiply two integers has to devise a method for doing so. In this example, we shall develop a program that takes two natural numbers X and Y as input (X, Y > 0) and produces their product, denoted by PROD, as its output. The output is displayed.

A simple method for multiplication adds X to PROD, Y times, with the initial value of PROD being 0. This method is exhibited in Algorithm 3.3. The algorithm first initializes variables PROD and P to 0 and Y, respectively. Then it repeatedly adds X to PROD and reduces P by 1 until the value of P becomes 0. At the end of this process, the product is displayed and the algorithm halts.

We can now code Algorithm 3.3 into an assembly program for the 8085. To do so we assume that we shall accumulate the sum in the accumulator, which implies that the ACC acts as PROD. Also, we shall use the B and C registers to hold the values of P and X, respectively. Program 3.3 is the desired coding of Algorithm 3.3.

Algorithm 3.3

Input:  Two natural numbers X and Y.
Output: The product of X and Y displayed.
Method:
1. Define the values of X and Y;
2. PROD := 0;
3. P := Y;
4. repeat
   begin
      PROD := PROD + X;
      P := P - 1;
   end
   until (P = 0);
5.
   Output (PROD);
6. Halt.

In Program 3.3, note that the DCR B instruction decrements the contents of register B by 1 and sets flag Z (the zero flag) to 1 if the result is 0. The jump instruction, which follows, tests this flag and causes the microprocessor to resume execution from the instruction labeled LOOP if the flag is not set to 1.

Program 3.3 will not produce the correct result if the product of X and Y exceeds a number that can be represented in two's complement form in 8 bits. See Exercise 3.23 for an algorithm which accumulates the sum in 16 bits.

Program 3.3

Input:  Two natural numbers X and Y.
Output: Product of X and Y displayed.

        ; Define X and Y
X:      DB rb
Y:      DB 4
        ; Initialize PROD, B-reg. and C-reg. to 0, Y, and X, respectively.
        LDA Y           ; Get Y into ACC
        MOV B,A         ; Move it to register B
        LDA X           ; Get X into ACC
        MOV C,A         ; Move it to register C
        MVI A, 00H      ; Move 0 to ACC ('H' denotes
                        ; hexadecimal constant)
        ; repeat
        ;   add X to PROD, i.e. reg. C to ACC, and
        ;   decrement P, i.e. reg. B
LOOP:   ADD C           ; Add (C) to ACC
        DCR B           ; Decrement (B) by 1
        ; until P = 0
        JNZ LOOP        ; If (B) ≠ 0 then repeat the process.
        ; output (PROD) and halt
        OUT 2           ; Output ACC to display
        HLT             ; Halt
        END

Program 3.3 can be modified to detect overflow and display a suitable message. Exercise 3.15 requests such a modification.

Example 3.14

We shall now solve a problem that requires the use of instructions from the logic group. The problem is to write a program to count the number of 1's in a given byte. The program should, as an output, display this count (its binary representation is to be displayed).

A rather simple method for doing so appears in Algorithm 3.4. The algorithm uses a variable COUNT to keep a count of how many 1's it has already encountered while examining each bit of the given byte. Variable NBIT is used to keep count of the number of bits that remain to be examined. Evidently, the initial values of COUNT and NBIT are 0 and 8, respectively.
Using the RLC (Rotate accumulator Left) and the JNC (Jump if No Carry) instructions, we have coded Algorithm 3.4 into Program 3.4. The program uses registers B and C for COUNT and NBIT, respectively.

Algorithm 3.4

Input:  A byte of data.
Output: Number of 1's in the given byte displayed.
Method:
1. Define the given byte, and denote it by X
2. COUNT := 0; NBIT := 8
3. repeat
   • if leftmost bit of X = 1 then COUNT := COUNT + 1;
   • Modify X such that it is effectively shifted left by 1 bit;
   • NBIT := NBIT - 1;
   until (NBIT = 0);
4. Output (COUNT);
5. Halt.

Program 3.4

Input:  A byte of data denoted by X.
Output: Number of 1's in X displayed.
Method:

        ; Define X
X:      DB 00111101B    ; Define X as an 8-bit binary constant
        ; Initialize COUNT (register B) and NBIT (register C)
        MVI B, 00H      ; Move 0 to reg. B
        MVI C, 08H      ; Move 8 to reg. C
        LDA X           ; Bring X to ACC for
                        ; scanning each bit
        ; repeat
        ;   if leftmost bit of X = 1
LOOP:   RLC             ; Rotate ACC left (bring leftmost
                        ; bit into CY)
        JNC SKIP        ; Do not increment COUNT if CY = 0
        ;   then COUNT := COUNT + 1
        INR B           ; Add 1 to COUNT (reg. B)
SKIP:   DCR C           ; Decrement NBIT (reg. C) by 1
        JNZ LOOP        ; Repeat the process again
        MOV A,B         ; Get COUNT (reg. B contents)
                        ; into ACC.
        OUT 2           ; Send it to display
        HLT             ; Halt
        END

Observe how X has been defined in Program 3.4. A 'B' at the end of a bit pattern indicates that a binary constant is being defined. An 'H' at the end of a number denotes that the number is to be treated as a hexadecimal number. A decimal number is implied if no such character is appended to a constant. Whatever be the type of constant (binary, hexadecimal, or decimal), the assembler always translates it into binary.

3.7.1 The Stack

The stack is a sequence of memory locations set aside by a programmer for use in a particular fashion. Data has to be stored in the stack on a last-in-first-out (LIFO) basis. Thus, only two operations are defined on the stack: the push and pop operations.
The stack always has a unique location known as the stack top. A special 16-bit register in the microprocessor, designated as SP (Stack Pointer), holds the address of this location. Fig. 3.7 illustrates a typical stack configuration. The next two short examples illustrate the push and pop operations as carried out by the 8085 microprocessor.

Figure 3.7: A typical stack; the stack pointer, SP, points to an address that is one more than the next available location on the stack (500 and 499, respectively, in this figure).

Example 3.15

Assume that the stack begins at memory location 2000. The bottom four locations are already full (i.e. contain some data). This situation appears in Fig. 3.8(a). Now, if the instruction

        PUSH B

is executed by the 8085, the contents of register pair B-C will be pushed on top of the stack. In Fig. 3.8(b) note that the contents of register B have been copied to location (SP-1) and those of register C to location (SP-2). The SP now points to a new stack top that is two locations above the previous stack top. Using the PUSH instruction, the contents of any register pair may be pushed into the stack.

Example 3.16

In order to illustrate the POP operation, we assume the stack and SP contents to be as shown in Fig. 3.9(a). Now, if the instruction

        POP B

is executed by the 8085, then the top 2 bytes of the stack would be popped out into register pair B-C as shown in Fig. 3.9(b). The SP now points to the new stack top.

The stack pointer is normally initialized by the programmer at the beginning of a program which uses a stack. This can be done by either the LXI or the SPHL instruction. The LXI instruction loads an immediate 16-bit value into the SP. The SPHL instruction causes the contents of register pair H-L to be transferred to the SP, thereby replacing the old SP contents. A related instruction, XTHL, causes the contents of register pair H-L to be exchanged with the stack top contents; it leaves the SP itself unchanged.
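The PUSH and POP operations of Examples 3.15 and 3.16 can be traced with a small model. The Python sketch below is only an illustration, not 8085 code; the class name Stack8085 and its method names are our own, and we assume that SP initially holds 1996 (treated here as a decimal address), in line with a stack that begins at location 2000 with four locations already in use.

```python
class Stack8085:
    """Illustrative model of an 8085-style stack held in memory."""

    def __init__(self, sp, memory_size=0x10000):
        self.memory = bytearray(memory_size)
        self.sp = sp & 0xFFFF      # as set up by LXI SP or SPHL

    def push_pair(self, high, low):
        """Model PUSH rp: high register to (SP-1), low register to (SP-2)."""
        self.sp = (self.sp - 1) & 0xFFFF
        self.memory[self.sp] = high & 0xFF
        self.sp = (self.sp - 1) & 0xFFFF
        self.memory[self.sp] = low & 0xFF

    def pop_pair(self):
        """Model POP rp: low register from (SP), high register from (SP+1)."""
        low = self.memory[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        high = self.memory[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        return high, low


stack = Stack8085(sp=1996)
stack.push_pair(high=0x12, low=0x34)   # PUSH B with B = 12H, C = 34H
sp_after_push = stack.sp               # two locations above the old stack top
b, c = stack.pop_pair()                # POP B restores the pair
```

After the push, SP has moved from 1996 to 1994; after the pop, it is back at 1996 and the pair (B, C) holds its original contents, which is the last-in-first-out behaviour described above.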
Any area of memory can be used as a stack. The microprocessor does not put any restriction on the location of the stack in the memory. The maximum size of the stack is obviously limited by the amount of memory available in the microcomputer.

Most microprocessors provide a stack pointer for the maintenance of a stack in the memory. Such a stack is also known as an external stack. In some microprocessors, for example Motorola 6800, the microprocessor itself contains a few registers that are used as stack. Such an internal stack is mostly limited to a few locations (typically less than ten). The advantages of having an external stack over the internal stack, and vice-versa, shall be brought out in Chapter 5.

The stack is an invaluable tool at the disposal of a programmer. One use of the stack is mentioned in the next section on subroutines. For another use, see Exercise 3.29.

Figure 3.8: (a) Stack before the PUSH operation (b) Stack after the PUSH operation.

3.7.2 Subroutines

While designing a nontrivial program, it may so happen that one operation is to be used at different places within the program, operating on different parameters. If quite a few instructions, say more than 5, are required to realize this operation, then these instructions would have to be written at each of those points in the program where the operation is used. A simple example of such an operation is multiplication, which may have to be used in an assembly program for the 8085 at many places. As the 8085 does not have any instruction for multiplication, the sequence of instructions required to perform this operation has to be inserted at each of these places. This would certainly be a waste of memory space (as instructions occupy memory). The concept of a subroutine can be used effectively to avoid repeating the same code over and over in a program.
A subroutine is a program that defines an operation, e.g. multiplication of scalars, inversion of a matrix, printing a line of 120 characters on a printer, etc. It is written in such a way that wherever needed, it may be called with suitable parameters. For example, a subroutine for the multiplication of two integers will have two input parameters, the numbers (or their addresses) that are to be multiplied, and an output parameter which is the result of multiplication. Thus, if an operation is to be performed at several places within a program, just one instruction may be used to call the subroutine. On execution of the call instruction, the subroutine execution will begin. At the end of subroutine execution, the execution of the program which had called this subroutine would resume from the instruction immediately following the call. Fig. 3.10 illustrates this call-return structure.

Figure 3.9: (a) Stack before the POP operation (b) Stack after the POP operation.

There are two instructions provided in the 8085 that are useful for writing and using subroutines. These are the CALL and RET instructions. Almost all microprocessors provide similar instructions for subroutine implementation. In the two examples that follow, we shall explain the use of these instructions in writing subroutines.

Figure 3.10: Illustration of call-return structure of subroutines.

Example 3.17

Assume that a subroutine named MULT has been written and is stored from the memory location 08AFH onwards. Thus, the address of the first instruction to be executed, known as the entry point address, is 08AFH. This subroutine may be called, from any other program, by the following assembly language instruction.
        CALL MULT

The binary equivalent of this CALL instruction is

        11001101 10101111 00001000

This is a 3-byte instruction using the last 2 bytes for the address of the subroutine to be called (low-order byte first).

We assume that the above CALL instruction is stored in the memory in the locations 091AH, 091BH, and 091CH. When this instruction is executed, the contents of the program counter (PC), i.e. 091DH, that constitute the address of the instruction immediately following the call, will be pushed on to the stack. The microprocessor will pick up the next instruction for execution from address 08AFH, implying thereby the beginning of the execution of the subroutine.

Within the subroutine, when the RET instruction is executed, the stack top contents will be popped out and assigned to the PC. As the microprocessor takes its next instruction from the address specified by the PC, the execution of the calling program would now resume from the instruction immediately following the CALL instruction. The call-return sequence described above is also illustrated by Fig. 3.11.

Example 3.18

We shall write a subroutine named MULT that multiplies two 1-byte natural numbers and outputs the product. We shall also write a calling program that uses MULT to evaluate the following expression:

        P * Q + R * S

and display the result (assume that P, Q, R, and S denote natural numbers).

We assume that before MULT is called, the addresses of the two input numbers are stored in register pairs H-L and D-E. MULT leaves its output in the ACC before it returns to the calling program. We shall use the method of repetitive addition for multiplication. Recall that we have already described this method in Example 3.13. As the algorithm embodying this method has already been presented, we directly write the program. Program 3.5 is the desired assembly program that consists of two parts: the subroutine MULT and a main program for computing the value of the expression given above.
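Before turning to the listing, the CALL and RET mechanics of Example 3.17 can be traced with a small model. The Python sketch below is an illustration only, not 8085 code. The function names (encode_call, do_call, do_ret) are our own, and the initial SP value of 0E000H is an assumption made for the trace; Example 3.17 does not state one.

```python
CALL_OPCODE = 0xCD  # 11001101, the first byte shown in Example 3.17


def encode_call(target):
    """Return the 3 bytes of CALL: opcode, low address byte, high address byte."""
    return bytes([CALL_OPCODE, target & 0xFF, (target >> 8) & 0xFF])


def do_call(call_address, target, sp, memory):
    """Model CALL: push the return address on the stack, then jump to target."""
    return_address = (call_address + 3) & 0xFFFF   # CALL occupies 3 bytes
    sp = (sp - 1) & 0xFFFF
    memory[sp] = return_address >> 8               # high byte of return address
    sp = (sp - 1) & 0xFFFF
    memory[sp] = return_address & 0xFF             # low byte of return address
    return target, sp                              # new PC and new SP


def do_ret(sp, memory):
    """Model RET: pop the return address from the stack into the PC."""
    low = memory[sp]
    high = memory[(sp + 1) & 0xFFFF]
    return (high << 8) | low, (sp + 2) & 0xFFFF    # new PC and new SP


memory = bytearray(0x10000)
pc, sp = do_call(0x091A, 0x08AF, 0xE000, memory)   # CALL MULT stored at 091AH
pc_in_subroutine = pc                              # execution continues at 08AFH
pc, sp = do_ret(sp, memory)                        # RET inside MULT
```

In this trace, encode_call(0x08AF) yields the bytes CDH, AFH, 08H, which match the bit patterns given in Example 3.17, and after the RET the PC holds 091DH, the address of the instruction immediately following the CALL.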
Program 3.5

        ; This program evaluates the expression P*Q+R*S for the given values
        ; of P, Q, R and S, each being a natural number.
        ; It is assumed that there shall be no overflow during
        ; expression evaluation.

        ; Subroutine to multiply two natural numbers.
        ; It is assumed that before this subroutine is invoked, the
        ; addresses of the natural numbers to be multiplied are in
        ; register pairs H-L and D-E.

        ; Define P, Q, R, and S.
P:      DB 3
Q:      DB 9
R:      DB 2
S:      DB 4
        ; Reserve 1 byte of memory for temporary storage.
TEMP:   DS 1
        ; Get data into C and B registers.
MULT:   MOV C,M         ; Bring value pointed at by H-L (say X)
                        ; to reg. C.
        XCHG            ; Exchange (H-L) and (D-E).
        MOV B,M         ; Bring value pointed at by D-E to reg. B.
        XCHG            ; Restore original values of H-L and D-E.
        ; Initialize ACC before accumulating the sum.
        MVI A, 00H
        ; repeat
        ;   add X to ACC, i.e. reg. C to ACC, and decrement (B)
LOOP:   ADD C           ; Add (C) to ACC
        DCR B           ; Decrement (B) by 1.
        ; until (B) = 0
        JNZ LOOP        ; Repeat addition if (B) is not 0.
        RET             ; Return to calling program; product
                        ; remains in ACC.

        ; Program to evaluate the given expression.
        ; Initialize H-L and D-E with addresses of P and Q
        ; respectively and multiply.
START:  LXI SP, 0E000H  ; Initialize stack pointer.
        LXI H, P        ; Get address of P into H-L.
        LXI D, Q        ; Get address of Q into D-E.
        CALL MULT       ; Multiply P and Q.
        STA TEMP        ; Store result temporarily in TEMP.
        ; Initialize H-L and D-E with addresses of R and S
        ; respectively and multiply.
        LXI H, R        ; Get address of R into H-L.
        LXI D, S        ; Get address of S into D-E.
        CALL MULT       ; Multiply R and S.
        MOV B,A         ; Move product (R*S) to reg. B.
        ; add (P*Q)+(R*S).
        LDA TEMP        ; Get (P*Q) into ACC.
        ADD B           ; Add (R*S).
        HLT
        END START

Figure 3.11: Sample CALL-RETURN sequence.

Example 3.19

We shall write a subroutine named SERCH which takes the following inputs:

1.
   a sequence of N bytes each containing an integer,
2. value of N, and
3. an integer denoted by X.

The subroutine will search for X in the sequence of N bytes, and if found, it will set FOUND equal to 1. If not found, then FOUND will be set to 0.

We shall first write an algorithm exhibiting the procedure to perform the task mentioned above. Then, we shall code the algorithm into an assembly program written for the 8085. For the purpose of our algorithm, we shall use symbol A to denote a sequence of N bytes, also called an array of N bytes. Thus, A(i) denotes the i-th element in the array, A(1) the first element, A(K+1) the (K+1)-th element, and so on.

The search procedure is quite simple and is exhibited in Algorithm 3.5. The algorithm compares the value of X with each element of A starting with A(1). If the two compared values are the same then FOUND is assigned the value 1. Step 4 in Algorithm 3.5 performs this comparison. Note that all the elements of A are compared with X according to this step. The triviality of this algorithm is easy to notice.

Algorithm 3.5

Input:  1. A, an array of N bytes.
        2. N, the number of elements in A.
        3. X, a byte-long integer which is to be searched for in A.
Output: 1. FOUND, the value of FOUND is set to 1 if X is in A, otherwise it is set to 0.
Method:
1. Define N, A, and X.
2. Set I = 0. We use I to indicate the number of elements of A with which X has been compared. Thus, (I+1) points to the next element of A with which X is compared.
3. Set FOUND = 0. We initialize FOUND to 0 to indicate that X has not yet been found in array A.
4. while I < N do
   begin
      I := I + 1;
      if X = A(I) then FOUND := 1;
   end
5. Output (FOUND);
6. Halt.

The comparison need not continue over all the elements of A. Instead, it should stop as soon as X matches an element in A. Thus, we may rewrite step 4 of Algorithm 3.5 as shown below:

4.
   while (I < N) and (FOUND ≠ 1) do
   begin
      I := I + 1;
      if X = A(I) then FOUND := 1;
   end

As we have the algorithm ready, let us embark upon the task of coding it into a subroutine named SERCH. It should be obvious by now that this subroutine will be used later for searching for an element in an array. It would do so only when it is called. Thus, we should first decide how the parameters (input and output) would be transferred to and from this subroutine. SERCH is called by the statement:

        CALL SERCH

The calling program would perform the following tasks:

1. place the address of the first element of the array in the H-L pair,
2. place the value of N, the number of elements in the array, in register B, and
3. place the value of X, which is to be searched for, in register C.

The subroutine will return the output value in register D. Thus, FOUND in Algorithm 3.5 will be represented by register D in SERCH. With the parameter transfer conventions established, we can now begin coding Algorithm 3.5.

An examination of Algorithm 3.5 reveals that the only non-trivial step in terms of complexity is step 4. In order to translate this step, we need to determine

1. the representation of I, and
2. how to access the I-th element of A, i.e. how to evaluate A(I).

We may solve these two problems in several ways. One of these is described here and the others are left for you to solve.

We have already decided to use the H-L pair to point to the first element of the array just before the subroutine execution begins. Now, as soon as the first element has been used by the subroutine, that is, compared with X, the value in the H-L pair can be incremented by 1 so that it automatically points to the next element. Again, after the next element has been used, the H-L pair can be incremented. However, a count has to be kept so that we can stop the comparison process after N elements have been compared. In Algorithm 3.5, I denotes this count.
Instead of initializing a register (representing I) to 0 and incrementing and comparing its value with N (as in step 4), we can also initialize a register with N and decrement it by 1 each time a comparison is made. When the value in this register reaches 0, the comparison stops. The comparison could also stop if FOUND (that is, the contents of register D) is 1.

Note that in step 4 of our algorithm, I is being used for counting the number of comparisons and for indexing A to determine the next element to be compared. However, as we are using the H-L pair to point to the next element in A, indexing is automatic. Further, we use a register for counting (down). The program that results from all these considerations is given as Program 3.6. Examine this program carefully as this is the first one in which an array is used. In all the previous programs we had used only scalar data types, e.g. integer, whereas an array is a non-scalar data type.

Program 3.6

        ; This subroutine searches for an integer argument X (placed
        ; in register C) in an array of integers stored from some
        ; location onwards in the memory (this starting location is
        ; taken from H-L pair). The number of elements to be searched is
        ; assumed to be in register B. If X matches one of these
        ; elements, register D is set to 1, otherwise it is set to 0.
SERCH:  MVI D, 0H       ; Initialize FOUND (reg. D) to 0.
        ; check if FOUND ≠ 1 (reg. D ≠ 1).
LOOP:   MOV A,D         ; Move contents of D to ACC.
        CPI 1           ; Compare with 1.
        JZ OUT          ; If same, return to caller.
        MOV A,M         ; Get next element of A into ACC.
        CMP C           ; Compare with X (contents of reg. C).
        JNZ INCR        ; Go to increment H-L pair.
        MVI D, 01H      ; Set FOUND = 1 (reg. D = 1).
        JMP INCR        ; Go to increment H-L pair.
INCR:   INX H           ; Increment H-L to point to the next element
                        ; of the array.
        DCR B           ; Decrement B to reflect the fact that one
                        ; more comparison is over.
        JNZ LOOP        ; If not zero then repeat the process
OUT:    RET             ; else, return

We shall now write a main program which calls the SERCH subroutine to search for a given value in an array of five values. This program appears in Program 3.7. In fact, before executing Program 3.7, both Program 3.6 and Program 3.7 should be placed one after the other and then translated into a machine language program.

Note how the array has been defined in Program 3.7. Symbol A has been used as a label for the first element. This implies that the addresses of the successive elements are A+1, A+2, ... and so on. When the LXI H, A instruction is executed, the address of A, that of the first array element, gets loaded into the H-L pair.

Program 3.7

        ; This is a mainline program which simply defines an array A
        ; of N integers and an integer value X and uses the SERCH
        ; subroutine to find if X occurs in A. A 0 or a 1 is sent
        ; out to display before the program halts.
        ; Define N, X, and array elements
X:      DB 7
N:      DB 5
A:      DB 3
        DB 2
        DB 7
        DB 4
        DB 8
        ; Set up parameters and their addresses in various registers
        ; according to the convention established earlier.
BEGIN:  LXI SP, 0E000H
        LXI H, A        ; Bring address of A, i.e. the first
                        ; element, in H-L pair.
        LDA N           ; Get length of array A.
        MOV B,A         ; Move it to reg. B.
        LDA X           ; Get search argument.
        MOV C,A         ; Move it to reg. C.
        CALL SERCH      ; Call the SERCH subroutine.
        MOV A,D         ; Get FOUND into ACC.
        OUT 2           ; Send it for display.
        HLT
        END BEGIN

The starting address of the program (also known as the execution address) may be specified in the END instruction as in Programs 3.5 and 3.7. Note also that before the program is executed, the stack pointer must be initialized with a suitable value.

3.8 THE ZILOG Z80

The 8085 from Intel and the Z80 from Zilog have both been very popular microprocessors in the 8-bit arena. Both of them have the same ancestor, the Intel 8080. However, the two are quite different in their design. In this section we shall introduce the Z80 architecture.
We shall also highlight those features of the Z80 that make it different from the 8085.

It is worth noting that any program written for the 8085 will execute without modification on the Z80. This is under the assumption that the 8085 program does not use the RIM and SIM instructions (the instructions that access the SID and SOD lines). However, the same cannot be said for a program written for the Z80. This implies that, excluding the RIM and SIM instructions, the 8085 instruction set is a subset of the Z80 instruction set. Below, we shall introduce the additional Z80 instructions that are not available in the 8085.

3.8.1 Organization of the Z80

Fig. 3.12 shows the busses available in the Z80. It has a 16-bit address bus and an 8-bit data bus. Note that the busses are non-multiplexed. This is unlike the 8085, in which the address and data busses are multiplexed.

Like the 8085, the Z80 supports both memory mapped and I/O mapped I/O. When the address placed on the address bus is for the memory, the MREQ signal is asserted and IORQ is negated. On the other hand, when the address placed on the address bus is for an I/O device, the IORQ signal is asserted and MREQ is negated.

Figure 3.12: Busses in Z80.

The register architecture of the Z80 is more innovative than that of the 8085. Fig. 3.13 exhibits the complete register set of the Z80.

Figure 3.13: Register set of Z80.

The register set is divided into general purpose registers and special purpose registers. The general purpose set is further subdivided into two sets of registers: the main register set and the alternate register set. These two sets are identical. Each set consists of eight 8-bit registers. These are the accumulator A, the flag register F, and six more registers denoted by B, C, D, E, H, and L.
The registers in the alternate register set are distinguished from those in the main register set by priming them. For example, A' (to be read as A prime) is the accumulator in the alternate register set, and D' is the alternate set register corresponding to D in the main register set.

Of the main and alternate register sets, only one can be in use at any time. A simple instruction can be used to exchange the contents of the entire main set with that of the alternate set. As in the 8085, B-C, D-E, and H-L can be used as three 16-bit register pairs.

The special purpose register set consists of four 16-bit registers and two 8-bit registers. The program counter PC and the stack pointer SP have the same function as in the 8085. The two 16-bit index registers, denoted by IX and IY, are used for providing useful addressing modes to be introduced below. The functions of the interrupt vector register I and the memory refresh register R are introduced in Chapter 6. The advantages of having a main register set and an alternate register set are also brought out in Chapter 6.

We shall now introduce the addressing modes available in the Z80. Appendix B lists all Z80 instructions and their binary equivalents.

3.8.2 Z80 Addressing Modes

Immediate, Immediate Extended and Register

The immediate and immediate extended addressing modes can be used to load an 8- or a 16-bit constant, respectively, into a register or a memory location. For example, the following two instructions load the constants 0FEH and 45EAH into register B and register pair H-L, respectively.

        LD B, 0FEH      ; Load B with 0FEH.
        LD HL, 45EAH    ; Load HL with 45EAH.

The above instructions use the register addressing mode for the destination operand. The first of the above instructions uses the immediate addressing mode for the source operand. The second instruction uses the immediate extended mode for the source operand. Note
that we are writing the addresses in their normal order, not in reversed order.

The LD mnemonic is used in the Z80 for all data movement instructions that move data from a source location to a destination location. Thus, in terms of its function, LD is similar to the MOV and MVI mnemonics of the 8085.

Modified Page 0 Addressing

This addressing mode is used for a special single-byte call instruction which has the mnemonic RST (an abbreviation for restart). To understand how it works, consider the instruction

    RST 5

The above instruction will force the Z80 to execute a subroutine call to location 5 × 8 = 40 (0028H). As this is only a single-byte instruction, the maximum value of the operand of RST is 7. Thus, an RST instruction can be used to call a subroutine located at any one of eight possible locations in memory, from 0000H to 0038H. As shown in Fig. 3.14, each of the eight possible RST instructions corresponds to a segment of eight memory locations that may contain a subroutine or a part of a subroutine. This 64-byte area of memory is also known as page 0, and hence the name modified page 0 addressing mode.

Figure 3.14: Page 0 locations in Z80 (entry points at 0000H, 0008H, 0010H, 0018H, 0020H, 0028H, 0030H, and 0038H; part of the subroutine code could be at the entry point, with a jump to the remainder of the code).

The advantage of using the RST instruction to call a subroutine is that some frequently used subroutines can be called by a single-byte instruction instead of a regular 3-byte CALL instruction.

Relative Addressing

This mode permits using only an 8-bit offset within an instruction to specify a complete 16-bit memory address. It is used only in branch instructions. Consider the following example:

    LOOP:   DEC A            ; Decrement A.
            JR NZ, LOOP-$    ; Jump to LOOP if the Z flag is not set,
                             ; else continue.

The first instruction above decrements register A. The second instruction checks the Z flag.
If the Z flag is set (Z=1), implying that the previous instruction caused register A to decrement to 0, then the jump instruction does not branch to LOOP. If Z=0, then the branch takes place.

Now suppose that the DEC instruction is located at the memory location 0050H. As DEC is a 1-byte instruction, the JR instruction is located at the memory locations 0051H and 0052H. Note that JR is a 2-byte instruction. The first byte of the JR instruction contains the binary opcode for JR. What does the second byte contain? In this example, the second byte will contain the value (-3). This 8-bit value is known as the displacement. The assembler computes the value of the displacement as follows:

    displacement = address of LOOP - address of the byte following the JR instruction

The $ symbol in JR stands for the address of the byte immediately following the current instruction. In the above example, the value of $ is 0053H. Thus, the displacement is computed by the assembler as:

    0050H - 0053H = -3

Assume that the Z flag is not set. Now, when JR is executed, the new value of PC will be computed as PC ← PC + (-3). When the execution of JR begins, the value of PC is 0051H. By the time the Z80 has decided that it has to branch, PC has been incremented by 2, making it 0053H. Thus, the displacement of (-3) will be added to 0053H. The new value of PC is 0050H! This forces the Z80 to resume execution from the DEC instruction.

The address of the target instruction is computed by adding the displacement to PC. Therefore, this addressing mode is called the relative addressing mode.

What has been achieved by this complex looking addressing mode? We can cite the following advantages:

1. Instead of using the 3-byte jump instruction, as would have to be done in the 8085, only a 2-byte jump instruction is needed. This results in reduced code size.

2. The 2-byte jump instruction will execute faster than the 3-byte jump instruction, as will be explained in Chapter 5.
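The displacement arithmetic described above can be checked with a short Python sketch. The function names are hypothetical helpers introduced only for this illustration, and the sketch assumes a 2-byte JR instruction whose opcode sits at the given address, as in the example with DEC at 0050H and JR at 0051H.

```python
def jr_displacement(jr_address, target):
    """Displacement for a 2-byte JR whose opcode is at jr_address (hypothetical helper)."""
    next_address = jr_address + 2            # address of the byte following JR
    d = target - next_address
    if not -128 <= d <= 127:
        raise ValueError("target out of range for relative addressing")
    return d


def encode(d):
    """Two's complement encoding of the displacement in one byte."""
    return d & 0xFF


def branch_target(jr_address, disp_byte):
    """PC after a taken JR: sign-extend the byte and add it to PC, which is already past JR."""
    d = disp_byte - 256 if disp_byte >= 128 else disp_byte
    return (jr_address + 2 + d) & 0xFFFF


d = jr_displacement(0x0051, 0x0050)
print(d, hex(encode(d)), hex(branch_target(0x0051, encode(d))))   # prints: -3 0xfd 0x50
```

The second byte of the JR instruction therefore holds 0FDH, the two's complement form of -3, and adding it to 0053H brings PC back to 0050H.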
Can all the jumps in a program be just 2 bytes long? Obviously not. Note that the displacement is stored in just 1 byte. Thus, the maximum displacement is +127 and the minimum displacement is -128. (In case you have forgotten how to compute the maximum and minimum representable number in a byte, refer back to Chapter 2.) This implies that the relative addressing mode can be used only when the target instruction of the jump is within +129 and -126 bytes of the jump opcode address. Thus, relatively short jumps can be coded in 2 bytes. All other jumps will still need to be coded as 3-byte jumps.

Extended Addressing

This is the same as the direct addressing we learned earlier for the 8085. It permits using a full 16-bit address for the operand. Here are a few instructions that use extended addressing.

    JP TARGET       ; Jump to the instruction
                    ; located at TARGET.
    LD (DATA), A    ; Transfer the contents of register A to
                    ; memory location DATA. Note that (DATA) refers to the
                    ; contents of location DATA. Just writing DATA
                    ; would mean transfer contents of register A
                    ; to the address DATA. This is obviously
                    ; not possible. The µP can transfer a value to a location,
                    ; not to the address of that location!

Indexed Addressing

An instruction using the indexed addressing mode contains an 8-bit displacement. The effective address of the operand is computed by adding the displacement to the index register, IX or IY, specified in the instruction. Consider the following instruction:

    LD A, (IX+10H)  ; Load register A with the contents of
                    ; the memory location (IX+10H).

Assume that the index register IX contains 0055H just before the above instruction is executed. The effective address of the source operand will then be 0055H + 10H = 0065H. Thus, the above LD instruction will load the accumulator with the contents of the memory location 0065H. The contents of the index register remain unchanged as a result of executing the above LD instruction.
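The effective address computation for the indexed mode can be sketched in Python as follows. The function name, the memory model, and the sample value 7EH stored at 0065H are assumptions made only for this illustration. The sketch also relies on the fact that the Z80 interprets the displacement byte as a signed value in the range -128 to +127.

```python
def effective_address(index_reg, disp_byte):
    """Effective address of (IX+d) or (IY+d); disp_byte is the 8-bit displacement."""
    d = disp_byte - 256 if disp_byte >= 128 else disp_byte   # signed displacement
    return (index_reg + d) & 0xFFFF


memory = bytearray(65536)      # a simple model of the 64K address space
memory[0x0065] = 0x7E          # sample value, assumed for the illustration

ix = 0x0055
a = memory[effective_address(ix, 0x10)]                       # models LD A, (IX+10H)
print(hex(effective_address(ix, 0x10)), hex(a), hex(ix))      # prints: 0x65 0x7e 0x55
```

Note that the model, like the instruction, leaves the index register unchanged: only the effective address is computed from it.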
The index registers can be incremented or decremented by 1 using the INC and DEC instructions. The ADD and SUB instructions can be used for adding or subtracting any number from the range 0 to 3 to or from the index registers.

Register Indirect Addressing

This is the same addressing mode as we learned earlier for the 8085. Here, the effective address of the operand is specified in a register pair. Consider the following examples:

    LD B, (HL)      ; Load register B with contents of memory
                    ; location pointed at by register pair HL.
    INC (HL)        ; Add 1 to the contents of memory location
                    ; pointed to by HL.
    LD (BC), A      ; Save contents of A in memory location
                    ; pointed to by BC.

Note that not all combinations of source and destination operands are possible. For example, there are no instructions such as:

    LD (BC), D      ; Invalid in Z80. The correct way to do this
    LD A, D         ; would be to get D into A.
    LD (BC), A      ; Now save A to memory location (BC).

Register indirect addressing is similar to indexed addressing. In indexed addressing, a displacement can also be specified, which cannot be specified when using register indirect addressing.

Bit Addressing

Certain instructions in the Z80 permit direct access to any bit of any register or any memory location. These instructions use the bit addressing mode. Consider the following examples:

    BIT 3, C           ; Set the Z flag to the
                       ; complement of bit 3 of register C.
    BIT 5, (HL)        ; Set the Z flag to the complement of bit 5
                       ; of memory location (HL).
    SET 2, D           ; Set bit 2 of register D to 1.
    RES 6, (IX+100)    ; Reset to 0, bit 6 of memory location
                       ; (IX+100).

We shall now present some programming examples to illustrate the use of Z80 instructions.

Example 3.20

This is a simple sorting subroutine. It uses the bubble sort method for sorting an array of 8-bit numbers into descending order.
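For readers who have not met the bubble sort method before, a high-level sketch in Python may help. It is only an illustration under stated assumptions: the function name is hypothetical, the elements are treated as unsigned 8-bit values, and an exchange flag is used to decide whether another pass is needed.

```python
def bubble_sort_descending(values):
    """Sort a list of 8-bit unsigned values in place, largest first."""
    n = len(values)
    if n == 0:
        return values                    # nothing to sort
    exchanged = True
    while exchanged:                     # one iteration = one pass through the array
        exchanged = False                # reset the exchange flag
        for i in range(n - 1):           # compare each adjacent pair
            if values[i] < values[i + 1]:
                # The pair is out of place for descending order, so exchange it.
                values[i], values[i + 1] = values[i + 1], values[i]
                exchanged = True         # remember that this pass exchanged something
    return values


print(bubble_sort_descending([3, 200, 7, 7, 0]))   # prints: [200, 7, 7, 3, 0]
```

Sorting stops after the first pass in which no exchange takes place. The Z80 subroutine of this example follows the same plan.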
We assume that this subroutine is called with the following parameters:

• address of the first element of the array to be sorted in the HL pair,
• number of elements in the array in register C.

The program with explanatory comments is given below. For those who do not understand the bubble sort technique, these explanatory comments should prove to be sufficient.

Program 3.8

    ; This subroutine sorts an array of 8-bit integers
    ; into descending order. The address of the first element
    ; of the array is available in HL pair. The total
    ; number of elements to be sorted is in register C.
    ; Sorting is done in-place which means that
    ; no extra memory is used for sorting.
    ARAD:    DEFS 2             ; Reserve a word for the
                                ; address of the first array element.
    FLAGBIT  EQU  0             ; This is the bit number
                                ; of the exchange flag.
    SORT:    LD   (ARAD), HL    ; Store array address in memory.
    ; The main loop begins here. One execution
    ; of this loop corresponds to one pass
    ; through the entire array.
             LD   A, C          ; Get the element count into A.
             OR   0             ; Logically OR 0 with it to set the Z flag.
             RET  Z             ; Return if C is 0 which implies that
                                ; there are no elements in the array!
    PASS:    LD   IX, (ARAD)    ; IX now points to first array element.
             RES  FLAGBIT, H    ; Reset bit 0 of register H. This indicates that
                                ; no elements of the array have been exchanged
                                ; till now.
             LD   B, C          ; Get the array length into register B.
    ; We do so because C will be needed in the next pass
    ; through the array.
    ; B now indicates the number of array elements
    ; remaining to be scanned in this pass.
             DEC  B             ; B ← B-1. Decrement B.
             JR   Z, OVER-$     ; If B was 1, only one element in the array, then
                                ; sorting is over!
    ; Now compare the next two elements of the array at (IX) and (IX+1).
    ; If these are out of place, then exchange them.
    NXTCMP:  LD   A, (IX)       ; Get the array element pointed at by IX.
             LD   D, A          ; Save it in D for later comparison.
             LD   E, (IX+1)     ; Get the element immediately following the previous one.
             SUB  E             ; Compare the two elements.
             JR   NC, NOEX      ; If the previous one
                                ; is greater than the next one, then
                                ; do not exchange them.
    ; The elements are out of place, so exchange them.
             LD   (IX), E       ; Send second element
                                ; to the first one's position.
             LD   (IX+1), D     ; Send the first one to
                                ; the second one's position.
             SET  FLAGBIT, H    ; Set exchange bit of H to indicate that
                                ; there has been an exchange in this pass.
    NOEX:    INC  IX            ; Point to the next element of the array.
             DJNZ NXTCMP        ; Decrement B, if > 0
                                ; then go compare next two elements.
    ; One pass over. Check if any exchange occurred. If it did,
    ; then another pass is needed, otherwise sorting is over.
             BIT  FLAGBIT, H    ; Test the FLAGBIT of H.
             JR   NZ, PASS-$    ; If not 0 then go for another pass.
    OVER:    RET                ; Otherwise return to calling program.
                                ; Sorting over!

The Z80 has some powerful block move instructions. These instructions can be used for moving a block of data from one place in memory to another. LDI, LDIR, LDD, LDDR, CPI, CPIR, CPD, and CPDR are the block move and compare instructions. Each of these instructions assumes that the address of the source operand is in HL and that of the destination operand is in DE. Register pair BC is used as a byte counter. The next example illustrates the use of these instructions.

Example 3.21

We are given two strings of characters. Let us denote them by SOURCE and DEST. Both of them have N characters each. We wish to compare SOURCE with DEST starting from the first character in each string and moving to the next character in each string. When a character in SOURCE differs from the corresponding character in DEST, the comparison should stop. Any remaining characters in SOURCE should be transferred to another string T. For example, suppose that SOURCE and DEST are:

    SOURCE=abcdefgh
    DEST=abepqrra
    N=8

Then, we must get T=cdefgh. Program 3.9 performs this task.

Program 3.9

    ; This routine compares successive characters in
    ; string SOURCE with those in string DEST.
    ; As soon as a mismatch is found, all the remaining characters in SOURCE
    ; are transferred to string T.
    ; When this routine is called, we have:
    ;    HL: Points to first element of SOURCE.
    ;    DE: Points to first element of DEST.
    ;    BC: Points to first element of T.
    ;    A:  Contains the value of N<128 (string length).
    TADDR:   DEFS 2             ; Temporary storage
                                ; for address of string T.
    STRINGS: LD   (TADDR), BC   ; Save address of T in memory.
             LD   B, 0          ; Set B to 0.
             LD   C, A          ; Now BC contains the string length.
    LOOP:    LD   A, C          ; Get remaining string length in A.
             OR   0             ; Logically OR it with 0 to set zero flag.
             RET  Z             ; If no more remaining characters,
                                ; then return to caller.
             LD   A, (DE)       ; Get next character of
                                ; string DEST in accumulator.
             INC  DE            ; Make DE point to next element in DEST.
             CPI                ; Compare next character of SOURCE
                                ; with character in A and make HL point
                                ; to the next character in SOURCE.
             JR   Z, LOOP-$     ; Go back to compare more if same.
             DEC  HL            ; Comparison over. Move HL to the character
                                ; that did not match.
             INC  C             ; CPI has decremented BC, reset it.
             LD   DE, (TADDR)   ; Get address of string T into DE.
             LDIR               ; Move remaining elements of SOURCE to T.
                                ; Note that LDIR automatically moves
                                ; elements of SOURCE to T. The number of
                                ; remaining elements is given by BC.
             RET

In the above program, CPI (a) compares the contents of register A with those of (HL), (b) increments HL so that it points to the next element of SOURCE, and (c) decrements the character count held in BC. Note that as N<128, register B will always be 0. Only C will contain the count (actually these are the least significant 8 bits of the count).

The LDIR instruction (a) moves (HL) to (DE), (b) increments HL and DE so that they point to the next locations in the SOURCE and T strings, respectively, and (c) decrements the count in BC. If the count is not 0, these three steps are repeated. Note that if the count is 0 before the execution of LDIR, then the LDIR instruction will move a total of 65536 bytes from SOURCE to T!
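The three steps performed by LDIR can be modelled with a short Python sketch. The function name, the memory model, and the sample addresses 1000H and 2000H are assumptions made only for this illustration; the sketch models the data movement and the register updates, not the timing or the flags.

```python
def ldir(memory, hl, de, bc):
    """Model of LDIR: move, increment HL and DE, decrement BC, repeat until BC is 0.
    Returns the final (hl, de, bc) and the number of bytes moved."""
    moved = 0
    while True:
        memory[de] = memory[hl]          # (a) move (HL) to (DE)
        hl = (hl + 1) & 0xFFFF           # (b) increment HL and DE
        de = (de + 1) & 0xFFFF
        bc = (bc - 1) & 0xFFFF           # (c) decrement the 16-bit count
        moved += 1
        if bc == 0:                      # the count is tested after the decrement
            break
    return hl, de, bc, moved


mem = bytearray(65536)
mem[0x1000:0x1008] = b"abcdefgh"         # SOURCE, placed at an assumed address
print(ldir(mem, 0x1002, 0x2000, 6)[3], bytes(mem[0x2000:0x2006]))   # prints: 6 b'cdefgh'
print(ldir(bytearray(65536), 0x1000, 0x2000, 0)[3])                 # prints: 65536
```

Because the count is tested only after it has been decremented, a count of 0 on entry wraps around to 0FFFFH, and the model moves 65536 bytes before stopping.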
Obviously, this would be a programming error. It is therefore wise to modify the above program so that if BC is 0, the LDIR instruction is not executed.

3.8.3 Input and Output Instructions

The Z80 provides the IN and OUT instructions for data transfer between the CPU and the I/O devices. However, these instructions are much more powerful than the ones available in the 8085. A few examples of I/O instructions in the Z80 are given below.

    IN  A, (4)      ; Get a byte from port 4
                    ; into the accumulator.
                    ; The destination in this case
                    ; can only be register A.
    IN  D, (C)      ; Get a byte from port
                    ; whose address is in register C.
                    ; Any of the seven general purpose
                    ; registers can act as the destination.
    INI             ; Transfer a byte from port
                    ; number (C) to memory location (HL).
                    ; Then decrement the byte count
                    ; in B and increment HL.
                    ; This is useful in block
                    ; data transfer from an input device.
    OUT (123), A    ; Transfer contents of A to port 123.
    OUT (C), B      ; Transfer contents of B to port (C).

Note that the port number can be specified as a constant within the IN or OUT instruction, or it can be a variable in register C. This feature is very useful when data is to be obtained from successive port numbers. Problems like the one given in Exercise 3.30 can be programmed very easily with this feature.

3.8.4 Subroutine Calls in Z80

The subroutine calling mechanism in the Z80 works as in the 8085. The CALL instruction causes the program counter to be pushed onto the stack and the program counter to be loaded with the address of the called subroutine. Just as with RET, all possible conditions listed in Appendix B can be specified in CALL. For example, CALL Z, SUB will branch to SUB only if the Z flag is set.

3.9 SUMMARY

In this chapter, certain aspects of the organization and programming of microprocessors and microcomputers have been introduced. We have used the 8085 and Z80 to illustrate various concepts.
Though the programming concepts introduced in this chapter are quite general in nature, much more needs to be explained about the architecture of microprocessors. However, it is our contention that a thorough grasp of the concepts and techniques introduced in this chapter should aid the study of subsequent chapters.

EXERCISES

3.1 What in your opinion are the features of a microcomputer that distinguish it from a general purpose computer (mini, midi or maxi)?

3.2 In a particular microcomputer system, all I/O devices are treated as memory locations. There are two input and two output devices connected to the microprocessor. The designer of this system has used the least significant bit (LSB) of the address together with the read (RD) and write (WR) lines of the microprocessor to select a particular I/O device. Table 3.4 summarizes the combinations used. In Table 3.4, I1 and I2 designate the two input devices and O1 and O2 designate the two output devices.

Table 3.4: I/O Addresses

What do you think is the flaw with the above design? Can you suggest an alternative scheme to address the I/O devices in the memory mapped mode? Note that RD and WR denote two control lines. RD is active (low) when a memory read operation is to be performed, and WR is active (low) when a memory write operation is to be performed.

3.3 Why is it that in the I/O mapped mode, only 256 input and 256 output devices can be addressed?

3.4 What feature should be available in a microprocessor so that a microcomputer designer can use both memory mapped and I/O mapped I/O?

3.5 In the 8085, why are 2-bit codes used for specifying register pairs and 3-bit codes for specifying individual registers?

3.6 What is the difference between the SP and other register pairs?

3.7 Write an assembly program for the 8085 for evaluating the expression (X/2 + Y/4) and displaying the result. Assume that X and Y are single-byte two's complement integers.
3.8 Write an assembly program for the 8085 that finds the two's complement of a given 8-bit integer and displays the result. Code your program into an MLP.

3.17 Three integers, each 2 bytes in length, are given. Each integer is in the two's complement form with the higher order byte (lower memory address) containing the seven most significant bits and the lower order byte (higher memory address) containing the eight least significant bits. The leftmost bit of the higher order byte is the sign bit (see Fig. 3.15). Develop an assembly program for the 8085 that finds the maximum of these three integers and stores it in the register pair B-C.

Figure 3.15: 16-bit two's complement integer format (high order byte, low order byte).

3.18 Rewrite Program 3.2 if all the > operators in Algorithm 3.2 are replaced by >. Which of the two programs do you think is more efficient (say, in terms of the total number of instructions)?

3.19 (a) Why is it necessary to have an END directive follow the last instruction of an assembly program?
(b) Can there be more than one HLT in an assembly program?
(c) How does the microprocessor know which is the first instruction to be executed in a program?

3.20 A sequence of 100 bytes has been reserved by the following pseudo instruction:

    DATA:   DS 100

Assuming that each of these bytes contains an integer, write an assembly program that finds and displays the maximum of all the 100 integers.

3.21 (a) Analogous to the subroutine that was written for multiplication (see Program 3.5), write a subroutine that divides a single-byte integer X by another single-byte integer Y (assume X, Y > 0). Your subroutine should leave the remainder and quotient in registers B and C respectively.
(b) Write an assembly program to evaluate the expression (P+Q+R/S) where P, Q, R, and S denote single-byte integers.
Use the multiply and divide subroutines for this purpose.

3.22 In Program 3.3, the intermediate values of PROD and X are held in registers inside the microprocessor and not in memory locations. Why is that so?

3.23 Design a subroutine for the 8085 that multiplies two single-byte integers X and Y, and produces a 16-bit product. X and Y need not be natural numbers.

3.24 Compute the number of states required by Program 3.3 when the contents of the ACC are multiplied by 10, i.e. Y=10 and X denotes any other single-byte integer. Compare the number of states you have computed with the number of states required by the two different instruction sequences given in Example 3.11. Which approach is better?

3.25 Two decimal numbers, N digits each, are stored in BCD packed format. Each number occupies a sequence of bytes in the memory. Write an assembly program that adds these two numbers and produces the (N+j) digit sum (j = 0 or 1) stored in the same format. For example, if the two numbers are given as shown in Fig. 3.16(a), then the sum should be as shown in Fig. 3.16(b).

Figure 3.16: BCD addition (a) Numbers to be added (b) Sum of the numbers.

3.26 In Exercise 3.25 assume that the given numbers are of N and M digits respectively. Now write an assembly program to add these and produce a (K+j) digit sum where K = maximum (N,M) and j = 0 or 1.

3.27 Solve Exercise 3.26 assuming that the numbers are stored in unpacked form, i.e. one digit per byte.

3.28 An assembly program has the structure shown below. Assuming that the addresses corresponding to FIRST, SECOND, and MUL are 0AA1H, 0AF0H and 0B20H respectively, show what would be the stack contents before and after each CALL and each RET statement. Assume that calling begins when the main program calls subroutine FIRST. Also assume that there is no other CALL in the program.
The addresses of instructions immediately following the three CALL statements for SECOND, MUL, and FIRST are 0AB0H, 0AF9H, and 0B02H, respectively.

    ; Subroutines
    FIRST:   CALL SECOND
             RET
    SECOND:  CALL MUL
             RET
    MUL:     RET
    ; Main Program
    START:   CALL FIRST
             HLT
             END

Figure 3.17

3.29 A recursive subroutine is one that may call itself one or more times for the purpose of performing the desired task. (Those familiar with the notion of a function in programming languages may note that we are using the term subroutine to refer to both subroutines and functions known to FORTRAN users.) To illustrate, let us consider the following definition of the factorial operation:

    FACT(N) = N * FACT(N-1),   N > 1
            = 1,               N = 1

In this definition, assuming that FACT is a subroutine that takes an integer N as input and produces its factorial as the output, we see that the computation of FACT of N involves the computation of FACT of (N-1).

Write a recursive subroutine named FACT in the 8085 assembly language that uses the above mentioned formula to compute the factorial. (Hint: Before FACT calls itself, the current value of its input parameter must also be pushed onto the stack along with the return address.)

3.30 A microprocessor is monitoring the temperatures of N similar processes. The value of each temperature variable is converted into 8 bits and stored in one of the N memory locations reserved for each variable. The microprocessor samples these values after every t time units and checks which of these have exceeded a preset value (say TMAX). If, say, the Kth value exceeds TMAX, an output signal is sent to an alarm bell having K as the address. This signal is sent by executing the OUT K instruction in the 8085 or the OUT (C), A instruction in the Z80, after storing (0FF)16 in the accumulator.

Write an assembly program for the 8085 and for the Z80 that could be used to perform the task of checking each value and sending an alarm signal if necessary.
What problems are likely to arise with the 8085 program?

3.31 If the CALL and RET instructions were not provided in the 8085, would it be possible to write subroutines for this microprocessor? If so, how?

3.32 Modify the program given in Example 3.21, so that if all characters in SOURCE and DEST are the same, the LDIR instruction is not executed.

FURTHER READINGS

Intel 8085

Intel Corp., "8080/8085 Assembly Language Programming Manual." For all the details of 8085 machine and assembly language programming, this is the best concise source.

Zilog Z80

Zilog Inc., "Z80-CPU Z80A-CPU Technical Manual," September 1978. This is a concise source for programming and timing details of the Z80.

Chapter 4

SEMICONDUCTOR MEMORIES

4.1 INTRODUCTION

Semiconductor memories are used for storing information in microprocessor-based systems. This chapter presents the organizational and timing characteristics of such memories. To understand the place of semiconductor memories in a system, let us examine Fig. 4.1, which exhibits the hierarchical structure of memory. Though a total of six levels are shown in this figure, in many systems there may be only three levels of memory, namely, registers, primary memory, and mass storage. From the point of view of speed of operation, the memory types listed towards the top of the pyramid are faster than those listed towards the lower end. For example, the time required to read a word of information from a fast primary memory may be as low as 100 ns, as compared to the 10 ms required to read a word from a hard disk.

Below, we first describe the three most often used memory levels, followed by a description of the remaining levels.

Registers

At the highest level shown in Fig. 4.1, we have the registers inside the CPU itself. As we have seen by now, the number and type of these registers vary from one µP to another. Generally, however, the amount of register storage is limited from a few hundred bits to a few thousand bits.
For example, in the 8085, the register memory accessible to a programmer is 101 bits. In addition to these programmer accessible registers, the 8085 has a few more registers as described in Chapter 5.

Primary Memory

Registers require space on the µP chip and hence only a limited number of them can be provided. This number is generally not sufficient in most systems. Thus, for program and data storage, the primary memory level is used. As shown in Fig. 4.1, this is the third level from the top. The primary memory consists of several chips. The total size of primary memory may vary from a few kilo-bytes in small systems to several mega-bytes in large systems.

Figure 4.1: Memory hierarchy in a µP based system.

In some systems, there is an extra level of primary memory. This extra level consists of a memory of a much larger size, though an order of magnitude slower in speed than the higher level primary memory. However, this extra level is generally found in supercomputers such as the Cray X-MP series machines. In such machines, this extra level of primary memory is known as the solid state device or simply SSD.

Mass Storage

In most systems, we need several programs and data to be resident within the system so that they can be loaded for execution into the primary memory without much delay. Examples of these programs include compilers, assemblers, text editors, and other utility programs. Examples of such data include payroll data of a company, data generated from the simulation of an automobile accident, and data on the road map of a city.

Holding these programs and data may require several megabytes of memory. Further, these programs and data may not be accessed very frequently. Thus, one or more mass storage devices are used for storing this information. Hard disks, floppy disks, and optical disks are some of the devices used for mass storage. The total amount of mass storage in a system
The total amount of mass storage in a system 130 SEMICONDUCTOR MEMORIES may vary from as low as a few mega-bytes to as high as several giga-bytes. ‘This kind of mass storage is also called on-line storage as all the information is accessible to the uP, though at a comparatively slower rate. Cache In certain systems, the primary memory may be much slower than the CPU for cost or other reasons. This would imply that the CPU wait for the primary memory. to send or receive data. This wait time eventually results in decreased performance of the CPU. In order to avoid the CPU operating at lower than its rated speed, some designers use cache memory at the second level. The cache is generally of the same type as the primary memory, though much faster. However, as the cache consists of faster memory chips, it is expensive too. Cost therefore becomes one of the size limiting factors of cache. Cache memory size is typically in the few kilobytes range. In some advanced j:Ps, for example the Motorola 68030, the cache is located on one of the CPU chips itself }. In other systems, it consists of fast memory chips sharing the same board as the uP and the primary memory. Off-line Backup What does one do when all the mass storage available in a system gets used up? There are several options available to the user. One is to buy more mass storage. This should work until the problem crops up once again! There is a limit to which a user or a designer may like to invest in on-line mass storage. At some point, it may be cost effective to have a removable storage device in the system such as a removable hard disk or a tape drive. Once such a device is available, the user can perform periodic backup operations. A backup operation removes some of the very infrequently used data and programs and saves them on a backup tape or a hard disk cartridge. These tapes or cartridges can then be archived into a library only to be retrieved when necessary or may be never! 
Such storage is what we term as off-line backup. It is off-line because the information on these tapes and cartridges cannot be accessed by the µP just as easily as it would access information from the primary or other on-line memory.

4.1.1 Memory Types

As mentioned earlier, the primary and cache memories consist of many chips. Depending on the type of access allowed to the information stored in a chip, these chips have been categorized as shown in Fig. 4.2.

¹ Many µPs consist of more than one chip. Motorola's 88000 and Intergraph's Clipper are two examples of such chips.

Figure 4.2: Semiconductor memory types (numbers in parentheses denote sample chips).

Semiconductor memories with which we are concerned in this book consist of several cells for information storage. Each cell is used for recording one bit of information. The time to access any cell on a chip is the same. For example, if a memory chip has 1,048,576 cells, then the time to access cell 1 will be the same as the time to access cell 1,048,576. Such accessing is known as random access. Information stored on magnetic tapes, for example, cannot be accessed randomly. It is accessed sequentially. In this chapter we are concerned only with random access memories. However, the term random access memory, abbreviated as RAM, is now widely used to refer to special types of semiconductor memories, namely those that can be used for reading and writing information into their cells. Throughout this book, the abbreviation RAM is used in the same sense as used in the industry.

Static and Dynamic RAMs

There are two types of RAMs used in a µP-based system.
These are: static RAM (also known as SRAM) and dynamic RAM (also known as DRAM). A static RAM chip is characterized by the fact that once a bit of information is written into a cell, the cell retains this information until it is overwritten or electrical power is taken off the chip. The cell itself is a flip-flop and may consist of four to six transistors. In this chapter we shall examine some of the widely used static RAMs, namely the Toshiba 2016, which is pin compatible with the Hitachi 6116, and the Toshiba 2063, which is pin compatible with the Hitachi 6264.

A dynamic RAM chip has a much smaller cell than a static RAM. One bit of information is stored as the charge on a capacitor. Typically, a dynamic RAM can store about four times as much information as a static RAM in the same area due to the smaller cell structure. This also leads to lower cost per bit for dynamic RAMs. However, because the information is stored as charge on a capacitor, the dynamic RAM requires refresh once every few milliseconds in order to retain the stored information. This refreshing needs extra circuitry and makes the interfacing of dynamic RAMs to µPs more complex than the interfacing of static RAMs.

Generally, systems that require large memory capacity use dynamic RAMs to lower the memory cost. Most personal computers, for example, use dynamic RAMs as primary memories. Static RAMs are used where speed of operation is of prime concern, and either the memory size is not too large, or cost is not the prime criterion during system design. In this chapter, we shall learn about the Toshiba 41256 (equivalent to Hitachi 50256) and Toshiba/Hitachi 511000 dynamic RAMs.

ROMs and Their Variations

A ROM, an abbreviation for Read Only Memory, is a preprogrammed chip and can only be read by the µP. Thus, once the information has been recorded in the ROM, generally by the manufacturer, the chip can either be used with whatever it contains or has to be discarded.
A user programmable ROM, also known as a PROM or one-time programmable ROM, can be programmed by the user just once. After being programmed, the PROM behaves just like the ROM. Both the ROM and the PROM consist of some kind of fuse in each cell. Blowing or retaining the fuse decides whether the cell contains a 1 or a 0.

A PROM that can be erased by ultraviolet light and then reprogrammed is known as an EPROM. In order to erase the EPROM, it has to be taken out of its normal circuit and placed in front of a special ultraviolet eraser for several minutes. The typical erasure time varies from 15 min to 60 min. It depends on the product of the ultraviolet light intensity and the time for which the chip is exposed to this intensity. For example, the Intel 27256 EPROM can be erased in about 15-20 min if subjected to an integrated dose² of 15 watt-s/cm². The EPROM can be programmed with the desired information using an EPROM programming device. We shall learn about the Intel 27256 EPROM in this chapter.

The inconvenience and other technical problems associated with the removal of the EPROM from its normal circuit of operation are taken care of in Electrically Erasable and Programmable ROMs, more popularly known as EEPROMs. The EEPROM can be erased and programmed while under normal operation. This makes the EEPROM ideal for applications where some parametric data needs to change over a period of time, perhaps as the system that is being controlled by the µP-based controller ages. Typical erasure times vary between 1 and 10 s.

In this chapter we are concerned with semiconductor memories that comprise the cache or the primary memory in a µP-based system. In the remainder of this chapter, we shall describe the organization and timing characteristics of these memories. The knowledge gained by reading this chapter should be useful in selecting and designing memory subsystems.
Interfacing of memory chips to µPs is described in Chapter 6.

4.2 CHARACTERISTICS OF MEMORIES

When selecting a memory chip for use in a µP-based system, a designer normally has a wide selection from which to choose. Following is a list of features that are commonly examined when selecting a specific memory chip:

• Capacity and organization,
• Timing characteristics, also known as AC characteristics,
• Power consumption and bus loading, also known as DC characteristics,
• Physical dimensions and packaging,
• Cost,
• Reliability, and
• Availability.

In this chapter, we are concerned mainly with the first three of the above items. Though the above mentioned attributes depend on memory types, there are some general concepts and terminology that are common to almost all memory types and well accepted amongst different manufacturers. In this section, we present these concepts and terminology.

2 The integrated dose is defined as (ultraviolet light intensity x exposure time in seconds).

Table 4.1: Organization of Some Memory Chips

Manufacturer | Chip | Memory Type | Organization
Hitachi | HM6116 | Static RAM | 2K x 8
Hitachi | HM6264 | Static RAM | 8K x 8
Hitachi | HM6287 | Static RAM | 64K x 1
 | µPD43256 | Static RAM | 32K x 8
 | | | 256K x 1
Toshiba | TC511000 | |
Hitachi | HM53462 | Multi Port Dynamic RAM | 64K x 4
Intel | 27256 | EPROM | 32K x 8
Toshiba | TC571000 | EPROM | 128K x 8
Toshiba | TC534000 | | 512K x 8

4.2.1 Memory Chip Capacity and Organization

The capacity of a memory chip is generally measured in terms of the number of bits or bytes it contains. As most chips contain more than a few thousand bits, the memory capacity is expressed in kilobits (Kb) or megabits (Mb). Note that the b in these units denotes bits, not bytes. Many advertisers are more careful and use the notation K bits or M bits to avoid any confusion.

Besides the capacity, one has to consider the word size of the chip. A word, as we already know, consists of one or more bits.
The word size of a chip determines how many bits can be accessed by the µP in a single access to the chip. A commonly used notation that indicates both the word size and the capacity of the chip is N x s. Here, N is the total number of words and s is the number of bits per word. The capacity of the memory is N x s bits. Table 4.1 lists the organization of some widely used memory chips. A memory chip having s bits per word is also referred to as an s-bit wide memory.

Each word inside a chip has a unique address. For example, the HM6264 static RAM has a total of 8192 words. The word addresses for this chip range from 0 to 8191. Note that each address corresponds to a word, which consists of 8 bits for the HM6264. Thus, when a µP sends an address, say 5, to the HM6264 chip and requests a read operation, the chip will send the contents of the word at address 5 to the µP. As another example, the TC511000 1M bit dynamic RAM has a total of 1,048,576 words of 1 bit each. The addresses of these words range from 0 to 1,048,575.

Generally, the addresses as recognized by the µP are different from the addresses as recognized by the individual chip. In the next section and later in Chapter 6, this concept will be elaborated further.

[...] determines the word size of the memory subsystem. The 8088, for example, is an 8/16-bit µP. Internally it has 16-bit registers and can perform operations on 16-bit data. Externally, however, it has only an 8-bit data bus. Thus, a memory subsystem designed for an 8088-based system will be 8 bits wide. The 386SX is a 16/32-bit µP. Internally it has 32-bit registers and can operate on 32-bit data. Externally, it has a data bus 16 bits wide.

Combining the chips together

A chip can receive or send exactly s bits of data in one read or write operation if the chip is s bits wide.
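The N x s notation lends itself to a small calculation. The following is a minimal sketch, not taken from any data book: the function names chip_geometry and chips_needed are our own, and the sketch assumes that N is a power of two and that K stands for 1024. It reports the capacity, the number of address lines and the highest word address of a chip, and counts how many chips of a given organization are needed to build a larger subsystem.

```python
K = 1024  # "K" in organizations such as 8K x 8 is taken to mean 1024 words


def chip_geometry(words, bits_per_word):
    """Describe an N x s chip: (capacity in bits, address lines, last address)."""
    capacity_bits = words * bits_per_word
    # Number of bits needed to count from 0 to words - 1.
    address_lines = (words - 1).bit_length()
    return capacity_bits, address_lines, words - 1


def chips_needed(sub_words, sub_width, chip_words, chip_width):
    """Chips required to build a sub_words x sub_width memory subsystem."""
    # Chips stacked along the address range (ceiling division).
    banks = -(-sub_words // chip_words)
    # Chips placed side by side to fill one subsystem word.
    chips_per_word = -(-sub_width // chip_width)
    return banks * chips_per_word


print(chip_geometry(8 * K, 8))      # 8K x 8 chip: (65536, 13, 8191)
print(chip_geometry(1024 * K, 1))   # 1M x 1 chip: (1048576, 20, 1048575)
print(chips_needed(16 * K, 8, 8 * K, 8))        # 16K x 8 from 8K x 8 chips: 2
print(chips_needed(1024 * K, 32, 1024 * K, 1))  # 1M x 32 from 1M x 1 chips: 32
```

The two factors in chips_needed correspond to the two ways chips are combined: several chips cover different parts of the address range, and several chips each contribute some of the bits of one word.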
If the word size of a chip is smaller than the word size of the memory subsystem, then more than one chip will contribute to the word accessed by the µP. The next few examples illustrate this point.

Example 4.1

We would like to have a 16K word memory subsystem that is 8 bits wide. If we construct this subsystem using the HM6264 8K x 8 static RAM, we need two HM6264 chips. Fig. 4.4 shows how the two chips can be connected to the eight data bus lines D0-D7. Later in Chapter 6, we shall describe the address decoding techniques to design this memory subsystem so that one half of the 16K addresses fall into chip #0 and the other half into chip #1.

Example 4.2

We would like to have a 1M word memory system, each word being 32 bits wide. Let us assume that we need to construct this subsystem using the TC511000 1M x 1 bit dynamic RAM chips. How many chips do we need? As each TC511000 can contribute 1 bit to the 32-bit word, we need 32 of these to make up 1M words. Fig. 4.5 shows how these chips would contribute to one accessed word. In this design, one bit of every word that is accessed by the µP comes from one chip. Bit 0 of the word always comes from chip #0, bit 1 from chip #1, and so on. Thus, one bit of each one of the 1M words of the memory system resides in one memory chip.

Figure 4.4: A 16K byte subsystem using HM6264 static RAM chips.

Figure 4.5: A 1M word subsystem using TC511000 dynamic RAM chips.

Dynamic or static?

One might ask: When designing the main memory subsystem, how do I decide when to use a bit wide against a nibble or byte wide RAM? One begins by deciding whether to select a static or dynamic RAM. Static RAMs are preferred when the memory size is not too large. On the other hand, dynamic RAMs are preferred when the memory system is going to be quite large.
As an example, if a system with an 8K byte RAM is needed, the static RAM would be a good choice. In case an 8M byte subsystem is to be designed, and cost is an important issue, the dynamic RAM would be an obvious choice.

Once the static/dynamic RAM decision has been taken, one considers which chip within the static or dynamic RAM category is to be selected. Here, one can use the best fit rule. For example, if a 256K byte subsystem is desired using dynamic RAMs, we would obviously not use the 1M x 1 chips. Instead, the 256K x 4 or 256K x 1 chips would be better. On the other hand, if a 1M byte subsystem is desired, we would select the 1M x 1 chip instead of the 256K x 4.

4.2.2 Electrical Signals

In this section, we describe the input and output signals common to most memory chips. Signals that are typical of a class of memories are described in the next section. Fig. 4.6 shows different signal categories found in memory chips. To understand these signals, we mention that the two most common operations performed with memory chips are the read and the write operations. A read operation consists of the following sub-operations:

1. Select the chip using the chip enable/disable controls.
2. Place the address of the word to be read on the address input lines of the chip.
3. Set the appropriate line in the read and write controls, to indicate to the chip that this is a read operation.
4. After some time delay, during which the chip actually reads the information from the addressed word, the data is available on the data output lines.

The above sequence of actions must take place in accordance with certain memory-dependent timing constraints. These timing constraints will be described in detail in the remainder of this chapter.

The sequence of actions for a write operation is similar to the one described above for a read operation. The read and write controls should be set to indicate that it is a write operation.
The data must be placed on the data input lines by the µP or by any other device that desires to perform the write operation.⁴

3 The words large and small are used in a relative sense here and their exact meaning will depend on the state of memory technology at the time the system is being designed.

4 A read or write operation can be carried out by a device other than a µP. We shall explain one such instance in Chapter 6.

Figure 4.6: Typical signals in a memory chip.

The set of lines marked as other controls in Fig. 4.6 corresponds to some special control signals that are needed in dynamic RAMs and other types of memory chips. We shall describe these special signals while discussing the chips. The power supply lines generally consist of a line for the 5 V supply. This is termed VCC. There is at least one line for the ground, termed VSS or simply GND.

Generic chip numbers

One often uses generic chip numbers such as 6264 or 511000 to refer to a class of chips. In reality, there is no chip with 6264 or 511000 as the part number. Hitachi, for example, has over 24 SRAMs with part numbers starting with 6264. All these chips, however, are 8K x 8 bit static RAMs and have the same pin configuration. They differ in their timing details. In this chapter, we will often refer to a chip by its generic name rather than by a specific part number. Whenever we refer to the timing parameters, we will mention which part number the parameters correspond to, if the distinction is necessary.

4.3 STATIC RAMs

In this section we examine the organization and timing of selected static RAM chips. We have selected the 6264. The Toshiba TMM2064 8K x 8 and the Hitachi HM6264 are pin compatible chips. In addition, we shall briefly describe the 6116 (2K x 8) and the 62256 (32K x 8) static RAMs.
The concepts we illustrate using the 6264 are quite general and should be sufficient for understanding the timings of other static RAMs.

Figure 4.7: Signals in 6264 static RAM.

Organization of 6264

Fig. 4.7 shows the signals found on the 6264 static RAM. There are thirteen address inputs denoted by A0-A12. This enables accessing of any one of the 8K bytes within the chip (2^13 = 8192). The eight I/O pins are for data transfer to and from the chip. The 6264 is in the standby mode when CS1 or CS2 is inactive. CS1 and CS2 act as two chip enable or chip select signals for the 6264. Note that CS1 being inactive corresponds to an input level of logic 1, and CS2 being inactive corresponds to an input level of logic 0. In the standby mode, the chip draws only 10 mA of current, which is about one-eighth of its normal operating current.

An active WE input indicates a write operation. An active OE is needed during the read operation. It enables the data output lines of the chip to place data on the system data bus. While OE is inactive, the data output lines are in the high impedance state.

The 6264 operates from a single 5 V supply. It needs one ground connection (GND pin). There is one unconnected pin, marked N.C. in Fig. 4.7. This is used in the 62256 static RAM.
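The way the control inputs of the 6264 combine can be summarized in a few lines of code. The sketch below is only an illustration under stated assumptions: the function name mode_6264 is our own, and WE and OE are assumed to be active low (logic 0 means active), as is usual for this family of chips. The levels for CS1 and CS2 follow the description above.

```python
def mode_6264(cs1, cs2, we, oe):
    """Return the operating mode of a 6264 for the given pin levels (0 or 1).

    Assumptions: CS1 is active at logic 0 and CS2 is active at logic 1,
    as described in the text; WE and OE are assumed to be active at logic 0.
    """
    if cs1 == 1 or cs2 == 0:
        # Either chip select inactive puts the chip in the standby mode.
        return "standby"
    if we == 0:
        # An active WE indicates a write operation.
        return "write"
    if oe == 0:
        # An active OE enables the data outputs during a read.
        return "read"
    # Chip selected, but neither writing nor driving its outputs.
    return "selected, outputs in high impedance"


print(mode_6264(cs1=1, cs2=1, we=1, oe=1))
print(mode_6264(cs1=0, cs2=1, we=1, oe=0))
print(mode_6264(cs1=0, cs2=1, we=0, oe=1))
```

Note that the standby check comes first: no read or write can take place unless both chip selects are active.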
[...]

Table 4.2: Read Cycle Parameters for the HM6264P-10

Symbol | Description
tAA | Address access time
tCO1 | CS1 access time
 | Output data hold time from address change
tLZ1, tLZ2 | CS1 or CS2 active to data output in low impedance
tHZ1, tHZ2 | CS1 or CS2 inactive to data output in high impedance
 | OE active to data output in low impedance
 | OE inactive to data output in high impedance

Write cycle of 6264

The write cycle timings of the 6264 are shown in Fig. 4.9. The write operation begins with the µP placing the address of the word into which data is to be written. From this point onwards, the write operation can be controlled by the

• WE control input, or
• CS1 chip select input, or
• CS2 chip select input.

Fig. 4.9 exhibits the CS1 controlled write operation. The parameter values listed in Table 4.2 are independent of this control.

Once the address has been placed on the address inputs, the chip enable inputs are asserted after a minimum of tAS time units. This is followed by asserting the write control input, WE. The data to be written may be placed on the data input lines of the chip any time before or after the write input is enabled. However, it takes a minimum of tDS time units for data set-up. Thus, the data should be available on the data input lines for at least this much time before WE is negated. After WE has been negated, the data must be held on the input lines for a minimum of tDH time units. The value of this parameter is 0 for many static RAMs. The write cycle ends when WE is negated.

Figure 4.9: Write cycle of 6264.

The write cycle time is denoted by tWC. This implies that data can be written at a maximum rate of one word every tWC time units. Parameter tWR indicates the time that must elapse from the negation of WE to the change in address inputs. This parameter, known as the write recovery time, generally has a value of 0 ns, implying that the address can change as soon as WE is negated, provided the tWC requirement has been satisfied.
The write pulse width must be a minimum of tWP time units. Note that OE is inactive during the write cycle, and therefore the data output lines of the chip are in the high impedance state.

Table 4.3: Write Cycle Parameters for the HM6264P-10

Symbol | Description
tWP | Write pulse width
tWR | Write recovery time from CS1 and WE
 | Data to write time overlap
 | Output data hold time till write negation
 | Output active from end of write
 | OE to output in high impedance
tWHZ | WE inactive to data output in high impedance

Example 4.4

The write cycle timing parameters for the HM6264P-10 are listed in Table 4.3. It is clear from this table that data can be written into the chip at the maximum rate of one byte every 100 ns. Note that this is the same as the read cycle time. The write pulse width, tWP, must be at least 60 ns. These two values, together with tDW, decide by how much, at most, the µP can delay the placement of output data on the data input lines of the chip.

The read and write operations can be carried out in any order so long as the timing constraints described above are satisfied. As both the read and write cycle times are 100 ns, we can say that this chip can be used for data I/O at a maximum rate of 10 MHz (1/10 MHz = 100 ns).

4.3.1 Other Static RAMs

Besides the 64K bit RAM examined above, Fig. 4.7 exhibits the signals for two other static RAMs, namely the 6116 and the 62256. The 6116 is a 2K x 8 RAM with cycle times varying from 90 ns to 150 ns. The 62256 is a 32K x 8 static RAM with cycle times varying from 85 ns to 120 ns. Which of these chips is selected by a designer will depend on the several factors outlined earlier.

Note that the 62256 requires a total of 15 address lines (A0-A14). To maintain a 28 pin package, the designers of the chip had to forego one of the chip select inputs. Thus, the 62256 uses only one chip select input.
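The figures quoted in this section follow from two simple relations: the maximum transfer rate is the reciprocal of the cycle time, and the number of address lines is the base-2 logarithm of the number of words. The following sketch checks them; the function names max_rate_mhz and address_lines are our own, and N is assumed to be a power of two.

```python
K = 1024  # "K" in organizations such as 32K x 8 is taken to mean 1024 words


def max_rate_mhz(cycle_time_ns):
    """Maximum number of accesses per microsecond, i.e. the rate in MHz."""
    return 1000.0 / cycle_time_ns


def address_lines(words):
    """Number of address inputs needed to select one of `words` words."""
    return (words - 1).bit_length()


# A 100 ns cycle time corresponds to 10 MHz.
print(max_rate_mhz(100))

# Cycle time ranges quoted above for the 6116 and the 62256.
for name, fastest_ns, slowest_ns in (("6116", 90, 150), ("62256", 85, 120)):
    print(name, round(max_rate_mhz(slowest_ns), 2), "to",
          round(max_rate_mhz(fastest_ns), 2), "MHz")

# Address lines for the 2K, 8K and 32K word chips.
for name, words in (("6116", 2 * K), ("6264", 8 * K), ("62256", 32 * K)):
    print(name, address_lines(words), "address lines")
```

Going from 8K to 32K words adds two address lines, which is why the larger chip has to reuse pins that serve other purposes on the 6264.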
4.3.2 Fast Static RAMs

In certain applications, such as the design of cache memories, one needs a much faster RAM than the ones described earlier. The 6264 chip can operate at about a 10 MHz data transfer rate. There are several µPs available now which operate at above 25 MHz. Some chips, like the Clipper from Intergraph Inc.⁵, operate at 50 MHz. Several of these µPs need memories with access times of less than 50 ns.

The Advanced Micro Devices AM2168, Hitachi's HM6168, and several others are 4K x 4 bit high-speed static RAMs. Their read and write cycle times range between 25 ns and 45 ns. Such high speed chips are often used in the design of cache memory subsystems for most 32-bit µP-based systems.

4.4 DYNAMIC RAMs

In this section we shall examine the organization and timings of selected dynamic RAM chips. We have selected the 511000, 511001, and 511002 chips. Each one of these is a 1M bit dynamic RAM, with the only difference being in the accessing modes they provide. The accessing modes are described later in this section. We shall refer to all three chips as 51100x, x = 0, 1, or 2.

4.4.1 Organization of 51100x

As shown in Fig. 4.10, these chips are 18-pin devices. The 51100xJP series chips are 20-pin devices with two additional unconnected pins. Each one of these chips has 10 address inputs labeled A0-A9. To address 1M bits, a total of 20 address bits are needed. Therefore, the address of each word to be accessed is sent to the chip as a sequence of two 10-bit parts, as is described later in this section. This is also known as address multiplexing. Most dynamic RAMs use some form of address multiplexing in order to conserve the number of pins needed.

There is one line for data input, DIN, and one for data output, DOUT. These two lines are sufficient to access a 1-bit word. The WRITE input is used for indicating a read or a write operation. The chip operates from a single 5 V supply input at the VCC pin. The TF input is used for testing the chip as described later.
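Address multiplexing amounts to splitting a 20-bit address into two 10-bit parts that share the same ten pins. The sketch below illustrates the idea. It is a sketch under an assumption: we take the low order 10 bits to be sent first, which is one possible choice, and the names multiplex and demultiplex are our own.

```python
ADDRESS_PINS = 10                      # A0-A9
PART_MASK = (1 << ADDRESS_PINS) - 1    # ten 1 bits, 0x3FF


def multiplex(address):
    """Split a 20-bit address into the two 10-bit parts sent in sequence."""
    if not 0 <= address < (1 << (2 * ADDRESS_PINS)):
        raise ValueError("address does not fit in 20 bits")
    first = address & PART_MASK                       # assumed: low order bits first
    second = (address >> ADDRESS_PINS) & PART_MASK    # remaining high order bits
    return first, second


def demultiplex(first, second):
    """Rebuild the 20-bit address from the two 10-bit parts."""
    return (second << ADDRESS_PINS) | first


first, second = multiplex(1_048_575)   # highest address of a 1M x 1 chip
print(first, second)
print(demultiplex(first, second))
```

Ten pins used twice carry the same information as twenty pins used once, at the cost of one extra step in every access.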
Internally, the 1M cells are organized as a matrix consisting of 512 rows of 2048 cells each. Fig. 4.11 shows the partial internal organization of a dynamic RAM as it appears in the Toshiba 511000P series. To select any cell within the chip, the µP first sends 10 bits of the 20-bit address. These could be the low order 10 bits of the 20-bit address. These bits are held internally in ten row address buffers. The nine least significant of these 10 bits are used by the chip internally to select one of the 512 rows. This is followed by the µP sending the remaining 10 bits of the address. This part of the address, the column address, is stored in the column address buffers. These 10 bits, together with the 1 bit that was not used during row selection, are used for selecting the desired column within the selected row. Once the addressed cell in a row and column has been selected, the read or write operation is performed. Note that the internal organization of the Hitachi 511000 series is slightly different from the organization described above.

5 Originally, Clipper was introduced by Fairchild Inc.

Figure 4.10: Organization of 511000 series.

The row and column addresses are strobed into the internal buffers using the RAS and CAS inputs, respectively. In the case of the 511002, the CAS is replaced by the chip select input, CS.

4.4.2 Timings of 51100x

In this section we shall examine the read, write and refresh cycles of the 51100x. The timing parameters of the TC511000P/J/Z85 chip are listed in Table 4.4.

Read cycle

As shown in Fig. 4.12, the read cycle begins with the µP placing the 10-bit row address at the A0-A9 inputs. This address is strobed into the row buffers located inside the chip by activating the RAS input.
The row address set up time, tASR, is 0 ns for this chip. This implies that the row address can be placed on the address inputs at the same time as RAS is activated. However, the row address must remain valid for at least tRAH time units, which is 20 ns for the chip we are examining.

Figure 4.11: Internal organization of a 1M x 1 bit dynamic RAM.

The column address is strobed into the column address buffers by placing the 10 bits of the address on the address inputs and activating the CAS input. Note that the RAS input remains active during the column address strobe. The column address set up time, tASC, is 0 ns. At this point, an inactive WRITE indicates that the µP desires a read operation. The read operation begins and the data is available at the DOUT line after tAA time units, which is specified to be at most 45 ns.

The column address should be placed on the address bus only after a specified minimum of tRAD time units, which is 20 ns as in Table 4.4. This parameter also has a maximum value, 40 ns in our case. If this maximum requirement is not met, then the access time specification tRAC cannot be met, implying that the memory will operate slower than its rated maximum.

The read cycle terminates at most tOFF time units after the CAS has been negated.
A new read cycle can begin after the RAS has been negated and then activated once again. However, the RAS must remain negated for at least tRP time units, 70 ns in our case. Thus, in the best case, data can be read from a dynamic RAM at the maximum rate of one word every tRC time units. You are urged to examine all the parameters listed in Table 4.4 before proceeding further.

Table 4.4: Read and Write Cycle Parameters for the TC511000P/J/Z85

Symbol | Description
tRC | Random read and write cycle time
tRP | RAS precharge time
 | RAS to CAS delay time
 | RAS/CAS pulse width
 | CAS to RAS precharge time
 | Column address hold time with reference to RAS
tRAD | RAS to column address delay
tASC | Column address set up time
 | Column address to RAS lead time
 | Column address hold time
tRAH | Row address hold time
tASR | Row address set up time
 | Read command set up time
tWCS | Write command set up time
tAA | Access time from column address

Write cycle

The write cycle of the 51100x is similar to the read cycle we examined above. Fig. 4.13 exhibits the write cycle for the 51100x. The timing parameters are listed in Table 4.4. The write operation is indicated by asserting the WRITE signal. This can be done immediately after the column strobe has been activated. The data to be written at the addressed word can be placed on the DIN input prior to activating the column strobe. Notice that both tWCS and tDS have a minimum value of 0 ns for the chip we are examining. The input data must remain valid on the DIN input for at least tDH time units, which is 20 ns for the TC511000P/J/Z85. The data must also be valid for at least tDHR time units from the time the RAS is activated.
Note that during the write cycle, the DOUT signal from the chip is in the high impedance state.

Table 4.5: Read and Write Cycle Parameters for the TC511000P/J/Z85 (continued)

Symbol | Description
tRAC | Access time from RAS
 | Access time from CAS
tCLZ | CAS to output in low impedance
tOFF | Output buffer turn off delay
 | Read command hold time
 | Write command hold time
 | Read command hold time referenced to RAS
 | Write command to CAS lead time
tRWL | Write command to RAS lead time
tWP | Write command pulse width
tWCR | Write command hold time referenced to RAS
tDS | Data set up time
tDHR | Data hold time referenced to RAS
tRASP | Page mode RAS pulse width

4.4.3 Refreshing the Dynamic RAM

As was mentioned earlier, the data in each cell of a dynamic RAM is held only for a short period of time after it is written there. This time varies typically from 2 to 8 ms. In order to retain data in the cells, the cells need to be refreshed once every few milliseconds. For the 51100x, the tREF parameter (= 8 ms) specifies the maximum time period during which all the cells must be refreshed at least once. It may, however, be more reliable to design a refresh circuit that performs refresh every 6 or 7 ms.

RAS only refresh

The refresh operation can be carried out in several ways. The simplest method to refresh all cells of a dynamic RAM is known as the RAS only refresh. The timing for the RAS only refresh cycle is shown in Fig. 4.14. A RAS only refresh cycle begins with the assertion of the RAS input. The CAS input is held high. The WRITE input may be high or low. The A0-A9 inputs are set to the row address. This causes all words (or cells) in that row to be refreshed. A refresh cycle can be initiated once every tRC time

[...]

Figure 4.13: Write cycle of the 51100x dynamic RAM.
urged to examine the relevant memory data books to learn about some of these modes.

Burst and distributed refresh

The µP cannot read or write using a chip that is executing refresh cycles. Thus, if the memory system, consisting of all the DRAMs, is to be refreshed once every 8 ms, the µP will have to wait for the entire duration of the refresh operation. The entire refresh operation may last as long as 84 µs for large DRAMs or as short as 42 µs for smaller sized DRAMs. The actual refresh time depends, among other factors, on the read cycle time and the memory size. This method of refreshing the entire memory before the µP can resume normal access is known as burst or concentrated refresh. Fig. 4.15(a) shows this scheme.

Another way to refresh the memory is by using distributed refresh. In this case, the refresh control circuit still refreshes every row once in each tREF time period. However, the refresh cycles are distributed over time as shown in Fig. 4.15(b). For example, if tREF = 8 ms, then for a 1M DRAM with cells organized as 512 rows, one refresh cycle can be executed once every 15.6 µs. It is obvious that refreshing slows down the µP, though not significantly.

Figure 4.14: RAS only refresh cycle.

When using burst refresh, there will be long time intervals during which the µP cannot respond to any critical external event. Such external events, also known as interrupts, are described in Chapter 6. In applications requiring rapid response to critical events, distributed refresh, instead of burst refresh, should be used.

Which part of the circuit performs the refresh?

In the simplest case, the µP itself can perform refresh. Some µPs, such as the Zilog Z80, have one or more pins dedicated to memory refresh. Such a facility alleviates the need for additional circuitry required for controlling the refresh operation. Chapter 5 describes these chips and the refresh signals.
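The refresh figures quoted in this section can be reproduced with a short calculation. The following is a minimal sketch; the function names are our own, and the 165 ns cycle time is an assumed value, chosen because 512 rows at 165 ns per refresh cycle give about 84 µs.

```python
def burst_refresh_us(rows, cycle_ns):
    """Time in microseconds to refresh all rows back to back (burst refresh)."""
    return rows * cycle_ns / 1000.0


def distributed_interval_us(t_ref_ms, rows):
    """Interval in microseconds between refresh cycles in distributed refresh."""
    return t_ref_ms * 1000.0 / rows


def refresh_overhead(rows, cycle_ns, t_ref_ms):
    """Fraction of time the memory is busy refreshing (same for both schemes)."""
    return (rows * cycle_ns) / (t_ref_ms * 1_000_000.0)


ROWS = 512       # rows in the 1M DRAM described above
T_REF_MS = 8     # every row must be refreshed within this period
CYCLE_NS = 165   # assumed cycle time of one refresh cycle

print(f"burst refresh lasts {burst_refresh_us(ROWS, CYCLE_NS):.2f} us")
print(f"smaller chip, 256 rows: {burst_refresh_us(256, CYCLE_NS):.2f} us")
print(f"distributed: one cycle every {distributed_interval_us(T_REF_MS, ROWS):.3f} us")
print(f"overhead: {100 * refresh_overhead(ROWS, CYCLE_NS, T_REF_MS):.2f} percent")
```

Both schemes spend the same total time on refresh. They differ only in whether that time is taken in one block or in many short, evenly spaced cycles, which is what matters for the response to interrupts.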
More often, memory refresh is performed by special dynamic RAM controllers such as the Intel 8208 or by a custom refresh control circuit.

Pseudo Static RAMs and Automatic Refresh

Pseudo static RAMs (abbreviated as PSRAMs) are dynamic RAMs with built-in refresh logic. Use of such chips significantly reduces the complexity of the circuitry required for the refresh operation. Examples of PSRAMs include Hitachi's HM658128 and Toshiba's TC51832. Fig. 4.16 shows the pinout of the 658128 PSRAM.

[...]

Figure 4.16: Pinout of the Hitachi HM658128 PSRAM.

4.4.4 Page Mode Operation of Dynamic RAMs

In several instances, the µP addresses data or instructions from sequential memory locations. Suppose that 100 words are to be read. The first word is located at address 1024, the next at location (1024 + 512), the next one at location (1024 + 512 + 512) and so on. It would require a total of 16.5 µs to complete this operation using a dynamic RAM with a 165 ns read cycle time. Notice that the least significant 9 bits of all the addresses to be sent to the memory are the same. The addresses are different only in the most significant 11 bits.

The above mentioned read operation can be performed faster if the page mode read operation is used instead of the RAS read that was described earlier. In the page mode read, the first read cycle is similar to the RAS read cycle. Thus, the row address is sent to the memory, and RAS is asserted. Then, the column address is sent to the memory, CAS asserted, and WRITE negated.
This causes one location to be read. After one location has been read, then, if page mode read is used, the RAS remains asserted and CAS is negated. After a time period equal to the column address precharge time, tCP, CAS is asserted again and the next column address is placed on the address bus. Note that the row address remains the same as was supplied in the first read cycle. This time the

[...]

Figure 4.17: Nibble mode read operation.

is similar to the page mode read-write cycle except that the read and write take place for a complete nibble.

4.4.6 Static Column Mode

Some dynamic RAMs offer the static column mode of operation. Toshiba's TC511002 and Hitachi's HM511002 are two such chips. These chips replace the CAS input with the CS input, where CS stands for chip select. This mode of operation provides read and write operations faster than in page mode. Just as in page and nibble mode, one can now use the static column read cycle, static column write cycle and static column read-write cycle. We shall describe only the static column read cycle.

Fig. 4.18 shows the timing relationship of various signals in the static column mode read. The cycle begins with the placement of the row address at the address inputs and the assertion of the RAS strobe. This is followed by placing the column address on the address inputs and asserting the CS input. Assertion of the chip select causes the column address to be latched inside the chip. WRITE is negated to indicate the read operation. The data read is available on the DOUT line after the access time requirements have been satisfied.
While the row and column address strobes are asserted, the column address can be cycled once every tSC time units, which is at least 40 ns for the Toshiba chip and 45 ns for the Hitachi chip. Thus, once the first

[...]

rewritten. A hard error is one that cannot be corrected. Thus, a chip with a hard error needs to be discarded. A chip in which a soft error has been reported can continue to operate in the system.

Dynamic RAMs are well known for soft errors. The major source of these errors has been traced to the α-particles that are emitted by radioactive substances. These α-particles are emitted by the uranium and thorium present in small quantities in the packaging materials. They induce additional electron-hole pairs in the silicon substrate, causing the data in a cell to be reversed. Fortunately, the error rates are very low, and manufacturers have improved chip reliability through improvements in packaging materials and cell circuitry.

4.5 REPROGRAMMABLE ROMs

The static and dynamic memories described above are volatile memories. Thus, when electrical power is removed from these chips, they lose the stored information. In many applications, this is certainly not a desirable feature. Imagine what would happen if the program that loads the operating system, and other useful programs, from the disk into the primary memory was itself stored in a volatile memory and there was a power failure. Non-volatile memories are very useful in situations where we would not like the information to be destroyed when the power to the memory is removed. Disks and tapes are non-volatile memories too. However, these are too slow to serve as primary memories in a µP-based system.

Read Only Memories that are erasable have been the most popular non-volatile semiconductor memories amongst designers.
This is because data stored in these chips can be easily and inexpensively altered. Chips that permit erasure of data by exposure to ultraviolet light and can then be reprogrammed are known as UV EPROMs. These are generally organized as a sequence of bytes. The Intel 2716 was one of the most popular EPROMs during the early days of µP development. The 2716 is a 2K x 8 chip. As technology has advanced, the density of EPROMs has also increased. Some of the densest EPROMs available now are Hitachi's HN27C301 and Toshiba's TC571001. Both of these chips are organized as 128K x 8. We shall examine the timings of Hitachi's 27512, a 64K x 8 UV EPROM.

4.5.1 Organization of EPROMs

Fig. 4.19 shows the pinouts for EPROMs from as small as 2K x 8 to as large as 512K x 8. These pinouts represent an industry standard. Chips from Intel, Toshiba, Hitachi, and several other manufacturers follow this standard. Note that the smaller chips are 24-pin devices and the larger
