

Course Code: MCS-012
Course Title: Computer Organisation and Assembly Language Programming
Assignment Number: MCA(I1)/012/Assign/12
Maximum Marks: 100
Weightage: 25%

Note: There are four questions in this assignment, which carry 80 marks. The remaining 20 marks are for viva voce. You may use illustrations and diagrams to enhance the explanations. Please go through the guidelines regarding assignments given in the Programme Guide for the format of presentation. The answer to each part of a question should be confined to about 300 words.
Question 1: (Covers Block 1)

(a) Perform the following arithmetic operations using binary signed 2's complement notation for integers. You may assume that the maximum size of integers is 12 bits including the sign bit. (Please note that the numbers given here are in decimal notation.) (3 Marks)
i) Add -512 and 298
ii) Subtract 512 from -64
iii) Add 1025 and 1023
Please indicate the overflow if it occurs.

(b) Convert the hexadecimal number ABCDEF into its binary, octal and decimal equivalents.

Solution: Convert into binary:
A=1010 B=1011 C=1100 D=1101 E=1110 F=1111
(ABCDEF)16 = (101010111100110111101111)2

Convert into decimal: A x 16^5 + B x 16^4 + C x 16^3 + D x 16^2 + E x 16^1 + F x 16^0
= 10 x 1048576 + 11 x 65536 + 12 x 4096 + 13 x 256 + 14 x 16 + 15 x 1
= 10485760 + 720896 + 49152 + 3328 + 224 + 15
= (11259375)10

Convert into octal: A B C D E F
1010 1011 1100 1101 1110 1111
Regroup into 3-bit groups: 101 010 111 100 110 111 101 111
Now place the equivalent octal digit under each group: 5 2 7 4 6 7 5 7
= (52746757)8

(c) Convert the following string into the equivalent "UTF-8" code: "Copyright sign is © and you must check it prior to using copyrighted material". Are these codes the same as those used in ASCII?

(d) Design a logic circuit that takes a four-digit binary input, counts the number of 1s in it, and produces the count as the output. For example, if the input is 1101, then the output will be 11 (as there are three ones in the input). Draw the truth table and use K-maps to design the Boolean expressions for each of the output bits. Draw the resulting circuit diagram using AND-OR-NOT gates.

(e) Design a two-bit counter (a sequential circuit) that counts as 0, 2, 0, 2, ... and so on. You should show the state table, the state diagram, the K-maps for the circuit design, and the logic diagram of the resultant design using D flip-flops.

(f) Design a floating point representation of 16 bits close to the IEEE 754 format. The number should have a biased exponent of 5 bits. You may assume that the mantissa is in normalised form; the exponent bias is 15; and one bit is used for the sign bit in the mantissa. Represent the number (24.125)10 using this format.

Question 2 (Covers Block 2)

(a) A RAM has a capacity of 32K x 16.
(i) How many data input and data output lines does this RAM need to have?
(ii) How many address lines will be needed for this RAM?

Ans: 32K = 32 x 1024 = 32768 words. Hence, there are 32768 memory addresses. Since 32768 = 2^15, a 15-bit address code is required to specify one of the 32768 addresses, i.e. 15 address lines. As each word is 16 bits, the RAM needs 16 data input lines and 16 data output lines.

(b) Consider a RAM of 512 words with a word size of 32 bits. Assume that this memory has a cache memory of 8 blocks with a block size of 64 bits. For the given memory and cache, draw a diagram to show the address mapping of RAM and cache if a two-way set associative memory-to-cache mapping scheme is used.

Solution (worked with the following figures):
Main memory size = 64 words
Main memory word size = 16 bits
Cache memory size = 8 blocks
Cache memory block size = 32 bits
1 block of cache = 2 words of RAM
Memory location address 25 is equivalent to block address 12.
Total number of possible blocks in main memory = 64/2 = 32 blocks

(i) Associative mapping: the block can be anywhere in the cache.
(ii) Direct mapping: size of cache = 8 blocks; location of block 12 in cache = 12 modulo 8 = 4.
(iii) 2-way set associative mapping: number of blocks in a set = 2; number of sets = size of cache in blocks / number of blocks in a set = 8/2 = 4. Block 12 can be located anywhere in set (12 modulo 4), that is, set 0.

(c) Explain which of the input/output techniques will be used for the following operations. Also explain the I/O techniques.
(i) Reading data from a keyboard
(ii) Reading data from a file.

Solution:
(i) Reading data from a keyboard

Reading data from a keyboard is normally handled by interrupt-driven I/O: the keyboard interface raises an interrupt when a key is pressed, so the processor need not poll the device continuously.

In computing, input/output or I/O is the communication between an information processing system (such as a computer) and the outside world, possibly a human or another information processing system. Inputs are the signals or data received by the system, and outputs are the signals or data sent from it. The term can also be used as part of an action; to "perform I/O" is to perform an input or output operation.

I/O devices are used by a person (or other system) to communicate with a computer. For instance, a keyboard or a mouse may be an input device for a computer, while monitors and printers are considered output devices. Devices for communication between computers, such as modems and network cards, typically serve for both input and output.

Note that the designation of a device as either input or output depends on the perspective. Mice and keyboards take as input the physical movement that the human user outputs and convert it into signals that a computer can understand. The output from these devices is input for the computer. Similarly, printers and monitors take as input the signals that a computer outputs; they then convert these signals into representations that human users can see or read. For a human user, the process of reading or seeing these representations is receiving input. These interactions between computers and humans are studied in a field called human-computer interaction.

In computer architecture, the combination of the CPU and main memory (i.e. memory that the CPU can read and write to directly, with individual instructions) is considered the brain of a computer, and from that point of view any transfer of information from or to that combination, for example to or from a disk drive, is considered I/O. The CPU and its supporting circuitry provide memory-mapped I/O that is used in low-level computer programming, such as the implementation of device drivers.
An I/O algorithm is one designed to exploit locality and perform efficiently when data reside on secondary storage, such as a disk drive.

Interface
An I/O interface is required whenever the I/O device is driven by the processor. The interface must have the necessary logic to interpret the device address generated by the processor. Handshaking should be implemented by the interface using appropriate commands (like BUSY, READY, and WAIT), and the processor can communicate with an I/O device through the interface. If different data formats are being exchanged, the interface must be able to convert serial data to parallel form and vice-versa. There must be provision for generating interrupts and the corresponding type numbers for further processing by the processor, if required. A computer that uses memory-mapped I/O accesses hardware by reading and writing to specific memory locations, using the same assembly language instructions that the computer would normally use to access memory.

Higher-level implementation
Higher-level operating system and programming facilities employ separate, more abstract I/O concepts and primitives. For example, most operating systems provide application programs with the concept of files. The C and C++ programming languages, and operating systems in the Unix family, traditionally abstract files and devices as streams, which can be read or written, or sometimes both. The C standard library provides functions for manipulating streams for input and output. In the context of the ALGOL 68 programming language, the input and output facilities are collectively referred to as transput. The ALGOL 68 transput library recognizes the following standard files/devices: stand in, stand out, stand error and stand back. An alternative to special primitive functions is the I/O monad, which permits programs to just describe I/O, with the actions carried out outside the program. This is notable because the I/O functions would introduce side-effects into any programming language, but this approach allows purely functional programming to be practical.

Addressing mode
There are many ways through which data can be read from or stored in memory. Each method is an addressing mode, and has its own advantages and limitations. There are many types of addressing modes, such as direct addressing, indirect addressing, immediate addressing, index addressing, based addressing, based-index addressing, implied addressing, etc.

Direct addressing
In this mode, the address of the data is part of the instruction itself. When the processor interprets the instruction, it gets the memory address from which the required information can be read or written. For example:

MOV register, [address]   ; to read
MOV [address], register   ; to write
; similarly
IN register, [address]    ; to read as input
OUT [address], register   ; to write as output

Here the address operand points to a memory location that holds the data, which is copied into/from the specified register. The pair of brackets is a dereference operator.

Indirect addressing
Continuing the above example, the address can instead be stored in another register, so the instruction carries the register that holds the address. To fetch the data, the instruction is interpreted and the appropriate register is selected; the value of that register is used to address the memory location, and the data is then read or written. This addressing method has an advantage over the direct mode: since the register's value is changeable, the memory location can be selected dynamically.
(ii) Reading data from a file.

A data file is a computer file which stores data for use by a computer application or system. It generally does not refer to files that contain instructions or code to be executed (typically called program files), or to files which define the operation or structure of an application or system (which include configuration files, directory files, etc.), but specifically to information used as input to, or written as output by, some other software program. This is especially helpful when debugging a program. Most computer programs work with files, because files help in storing information permanently. Database programs create files of information; compilers read source files and generate executable files. A file itself is a bunch of bytes stored on some storage device like tape, magnetic disk, optical disk, etc. Data files are the files that store data pertaining to a specific application, for later use.

Storage types of Data file


The data files can be stored in two ways:
1. Text files. 2. Binary files.

A text file (also called an ASCII file) stores information in ASCII characters. A text file contains visible characters; we can see the contents of the file on the monitor or edit it using any of the text editors. In text files, each line of text is terminated (delimited) with a special character known as the EOL (End of Line) character. In text files, some internal translations take place when this EOL character is read or written. Examples of text files:
A file containing a C++ program

A binary file is a file that contains information in the same format in which the information is held in memory, i.e. in binary form. In a binary file, there is no delimiter for a line. Also, no translations occur in binary files. As a result, binary files are faster and easier for a program to read and write than text files. As long as the file doesn't need to be read by people or ported to a different type of system, binary files are the best way to store program information. Examples of binary files:
An executable file An object file

In C++, a file, at its lowest level, is interpreted simply as a sequence, or stream, of bytes. One aspect of the file I/O library manages the transfer of these bytes; at this level, the notion of a data type is absent. On the other hand, a file, at the user level, consists of a sequence of possibly intermixed data types: characters, arithmetic values, class objects. A second aspect of the file I/O library manages the interface between these two views.

Stream
A stream is a sequence of bytes; a stream is the general name given to a flow of data. Different streams are used to represent different kinds of data flow. Each stream is associated with a particular class, which contains member functions and definitions for dealing with that particular kind of data flow. The stream that supplies data to the program is known as the input stream: it reads the data from the file and hands it over to the program. The stream that receives data from the program is known as the output stream: it writes the received data to the file. The following figure illustrates this.

File Input and Output using streams

When the main function of your program is invoked, it already has three predefined streams open and available for use. These represent the "standard" input and output channels that have been established for the process.

The fstream.h header file


In C++, file input/output facilities are implemented through a component header file of C++ standard library. This header file is fstream.h. The fstream library predefines a set of operations for handling file related input and output. It defines certain classes that help one perform file input and output. For example, ifstream class ties a file to the program for input; ofstream class ties a file to the program for output; and fstream class ties a file to the program for both input and output. The classes defined inside fstream.h derive from classes under iostream.h, the header file that manages console I/O operations in C++. Following figure shows the stream class hierarchy.

Stream class Hierarchy

The functions of these classes have been summarised in the following table -

Functions of File Stream Classes

Opening and Closing files in C++


In C++, opening of files can be achieved in two ways:
1. Using the constructor function of the stream class.
2. Using the function open().

The first method is preferred when a single file is used with a stream; however, for managing multiple files with the same stream, the second method is preferred.
Opening files using Constructors
ifstream input_file("DataFile");

The data being read from DataFile has been channelised through the input stream as shown:

Data being read from 'datafile' using an input stream

The above statement creates an object (input_file) of the input file stream. The object name is a user-defined name. After creating the ifstream object input_file, the file DataFile is opened and attached to the input stream input_file. Now the data being read from DataFile is channelised through the input stream object. The connection with a file is closed automatically when the input and output stream objects expire, i.e., when they go out of scope. (For instance, a global object expires when the program terminates.) You can also close a connection with a file explicitly by using the close() method:
input_file.close();

Closing such a connection does not eliminate the stream, it just disconnects it from the file; the stream still remains there. Closing a file flushes the buffer, which means the data remaining in the buffer (input or output stream) is moved out of it in the direction it ought to go.
Opening files using open() function
ifstream filin;             // create an input stream
filin.open("Master.dat");   // associate filin stream with file Master.dat
..                          // process Master.dat
filin.close();              // terminate association with Master.dat
filin.open("Tran.dat");     // associate filin stream with file Tran.dat
..                          // process Tran.dat
filin.close();              // terminate association

A stream can be connected to only one file at a time.

The concept of File Modes


The filemode describes how a file is to be used: to read from it, to write to it, to append to it, and so on.
stream_object.open("filename", filemode);

The following table lists the filemodes available and their meaning:

(d) Find the average disk access time to read or write a 1024-byte sector. Assume that the disk rotates at 18000 rpm, each track of the disk has 128 sectors, and the data transfer rate of the disk is 100 MB/second. (Please calculate the data transfer time, assume a suitable seek time, and calculate the average latency time.)

(e) What is the purpose of the FAT in Windows? What construct do you use in Linux/Unix instead of the FAT? Explain the differences between the two.

Solution: The FAT maps the usage of the data space of the disk. It contains information about the space used by each individual file, the unused disk space, and the space that is unusable due to defects in the disk. Since the FAT contains vital information, two copies of the FAT are stored on the disk, so that in case one gets destroyed, the other can be used. A FAT entry can contain any of the following:

unused cluster
reserved cluster
bad cluster
last cluster in file
next cluster number in the file
In the UNIX system, the information related to all these fields is stored in an inode table on the disk. For each file, there is an inode entry in the table. Each entry is made up of 64 bytes and contains the relevant details for that file. These details are:
a) Owner of the file
b) Group to which the owner belongs
c) File type
d) File access permissions
e) Date & time of last access
f) Date & time of last modification
g) Size of the file
h) No. of links
i) Addresses of blocks where the file is physically present.

(f) Define each of the following terms. Explain the main purpose / use / advantage. (Word limit for the answer to each part is 50 words ONLY) (i) ZBR in the context of disks (ii) SCSI (iii) Colour Depth (iv) Graphics Accelerators (v) Monitor Resolution (vi) Active matrix display

Solution: (i) ZBR in the context of disks

With zone bit recording (ZBR), the inner tracks are packed as densely as the particular drive's technology allows, whereas on a constant angular velocity (CAV) drive the data on the outer tracks are less densely packed. Using ZBR, the drive divides all the tracks into a number of zones; the inner track of each zone is packed as densely as it can be, and the other tracks in that same zone are recorded with the same read/write rate. This permits the drive to store more bits in each track outside the innermost zone than a drive not using this technique. Storing more bits per track equates to a higher total data capacity on the same disk area.

On a hard disk using ZBR, the data on the tracks in the outermost zone have the highest data transfer rate. Since both hard disks and floppy disks typically number their tracks beginning at the outer edge and continuing inward, and since operating systems typically fill the lowest-numbered tracks first, this is where the operating system typically stores its own files during its initial installation onto an empty drive. Testing disk drives when they are new or empty, or after defragmenting them, with some benchmarking applications will therefore often show their highest performance. Later, when more data is stored on the inner tracks, the average data transfer rate will drop because the transfer rate in the inner zones is slower, often making people think their disk drive is slowing down over time.

Some other ZBR drives, such as the 800-kilobyte 3.5" floppy drives in the Apple IIGS and older Macintosh computers, don't change the data rate but rather spin the medium faster when reading or writing the outer tracks, thus approximating constant linear velocity drives.
(ii) SCSI

Small Computer System Interface (SCSI) is a set of standards for physically connecting and transferring data between computers and peripheral devices. The SCSI standards define commands, protocols, and electrical and optical interfaces. SCSI is most commonly used for hard disks and tape drives, but it can connect a wide range of other devices, including scanners and CD drives, although not all controllers can handle all devices. The SCSI standard defines command sets for specific peripheral device types; the presence of "unknown" as one of these types means that in theory it can be used as an interface to almost any device, but the standard is highly pragmatic and addressed toward commercial requirements.

SCSI is an intelligent, peripheral, buffered, peer-to-peer interface. It hides the complexity of the physical format. Every device attaches to the SCSI bus in a similar manner. Up to 8 or 16 devices can be attached to a single bus. There can be any number of hosts and peripheral devices, but there should be at least one host. SCSI uses handshake signals between devices; SCSI-1 and SCSI-2 have the option of parity error checking. Starting with SCSI-U160 (part of SCSI-3), all commands and data are error-checked by a CRC32 checksum.

The SCSI protocol defines communication from host to host, host to peripheral device, and peripheral device to peripheral device. However, most peripheral devices are exclusively SCSI targets, incapable of acting as SCSI initiators, i.e. unable to initiate SCSI transactions themselves. Therefore peripheral-to-peripheral communications are uncommon, but possible in most SCSI applications. The Symbios Logic 53C810 chip is an example of a PCI host interface that can act as a SCSI target.

(iii) Colour Depth

Color depth or bit depth is the number of bits used to indicate the color of a single pixel in a bitmapped image or video frame buffer. This concept is also known as bits per pixel (bpp), particularly when specified along with the number of bits used. A higher color depth gives a broader range of distinct colors. Color depth is only one aspect of color representation, expressing how finely levels of color can be expressed; the other aspect is how broad a range of colors can be expressed.

(iv) Graphics Accelerators

Graphics accelerators have their own memory, which is reserved for storing graphical representations. The amount of memory determines how much resolution and how many colors can be displayed. Some accelerators use conventional DRAM, but others use a special type of video RAM (VRAM), which enables both the video circuitry and the processor to access the memory simultaneously.

Bus: Each graphics accelerator is designed for a particular type of video bus. As of 1995, most are designed for the PCI bus.

Register width: The wider the register, the more data the processor can manipulate with each instruction. 64-bit accelerators are already becoming common, and we can expect 128-bit accelerators in the near future.

(v) Monitor Resolution

The display resolution of a digital television or display device is the number of distinct pixels in each dimension that can be displayed. It can be an ambiguous term, especially as the displayed resolution is controlled by different factors in cathode ray tube (CRT), flat panel or projection displays using fixed picture-element (pixel) arrays. It is usually quoted as width x height, with the units in pixels: for example, "1024 x 768" means the width is 1024 pixels and the height is 768 pixels. This example would normally be spoken as "ten twenty-four by seven sixty-eight" or "ten twenty-four by seven six eight".

One use of the term "display resolution" applies to fixed-pixel-array displays such as plasma display panels (PDPs), liquid crystal displays (LCDs), digital light processing (DLP) projectors, or similar technologies, and is simply the physical number of columns and rows of pixels creating the display (e.g., 1920 x 1080). A consequence of having a fixed-grid display is that, for multi-format video inputs, all displays need a "scaling engine" (a digital video processor that includes a memory array) to match the incoming picture format to the display.

Note that the use of the word resolution here is a misnomer, though common. The term "display resolution" is usually used to mean pixel dimensions, the number of pixels in each dimension (e.g., 1920 x 1080), which does not tell anything about the resolution of the display on which the image is actually formed: resolution properly refers to the pixel density, the number of pixels per unit distance or area, not the total number of pixels. In digital measurement, the display resolution would be given in pixels per inch. In analog measurement, if the screen is 10 inches high, then the horizontal resolution is measured across a square 10 inches wide. This is typically stated as "lines of horizontal resolution, per picture height"; for example, analog NTSC TVs can typically display about 340 lines of "per picture height" horizontal resolution from over-the-air sources, which is equivalent to about 440 total lines of actual picture information from left edge to right edge.
(vi) Active matrix display

Alternatively referred to as Thin Film Transistor (TFT) and Active-matrix LCD (AMLCD), an active-matrix display is a liquid crystal display (LCD) first introduced with the IBM ThinkPad in 1992. With active-matrix displays, each pixel is controlled by one to four transistors that can make the screen faster, brighter, more colorful than passive-matrix, and capable of being viewed at different angles. Because of this improved technology, active-matrix screens are often more expensive, but of better quality, than a passive-matrix display. While active-matrix displays have a crisp display because each pixel has its own transistor, you will find that, when running off a battery, the power drains faster. Also, because of the number of transistors, there is more of an opportunity for dead pixels.
Question 3 (Covers Block 3)

(a) Assume that a new machine has been developed which has only 16 general purpose registers, but has a big high-speed RAM. The machine uses a stack for procedure calls. The machine is expected to handle all the object oriented languages. List four addressing modes that must be supported by such a machine. Give justification for the selection of each of the addressing modes.

Solution: A stack machine implements a stack with registers. The operands of the arithmetic logic unit (ALU) are always the top two registers of the stack, and the result from the ALU is stored in the top register of the stack. 'Stack machine' commonly refers to computers which use a Last-In, First-Out stack to hold short-lived temporary values while executing individual program statements. The instruction set carries out most ALU actions with postfix (Reverse Polish notation) operations that work only on the expression stack, not on data registers or main memory cells.

For a typical instruction like Add, both operands implicitly come from the topmost (most recent) values of the stack, and those two values get replaced by the result of the Add. The instruction's operands are 'popped' off the stack, and its result(s) are then 'pushed' back onto the stack, ready for the next instruction. Most stack instructions are encoded as just an opcode, with no additional fields to specify a register number, memory address or literal constant. This encoding is easily extended to richer operations with more than two inputs or more than one result. Integer constant operands are pushed by separate Load Immediate instructions. All accessing of program variables in main memory RAM is segregated into separate Load or Store instructions containing one memory address, or some way to calculate that address from stacked operands.

The stack machine style is in contrast to register file machines, which hold temporary values in a small fast visible array of similar registers; accumulator machines, which have only one visible general-purpose temp register; and memory-to-memory machines, which have no visible temp registers.

Some machines have a stack of very limited size, implemented as a register file and a dynamic register renumbering scheme. Some machines have a stack of unlimited size, implemented as an array in RAM accessed by a 'top of stack' address register; its topmost N values may be cached in invisible data registers for speed. A few machines have both an expression stack in memory and a separate visible register stack. Stack machines may have their expression stack and their call-return stack as separate things or as one integrated structure.

Some technical handheld calculators use reverse Polish notation in their keyboard interface, instead of having parenthesis keys. This is a form of stack machine: the Plus key relies on its two operands already being at the correct topmost positions of the user-visible stack.

Advantages of Stack Machine Instruction Sets


Very Compact Object Code

Stack machines have much smaller instructions than the other styles of machines. But operand loads are separate, and so stack code requires roughly twice as many instructions as the equivalent code for register machines. The total code size (in bytes) is still less for stack machines.

In stack machine code, the most frequent instructions consist of just an opcode and can easily fit in 6 bits or less. Branches, load immediates, and load/store instructions require an argument field, but stack machines often arrange that the frequent cases of these still fit together with the opcode into a single byte or syllable. The selection of operands from prior results is done implicitly by the ordering of the stack ops. In contrast, register machines require two or three register-number fields per ALU instruction to select its operands; the densest register machines average about 16 bits per instruction.

The instructions for accumulator or memory-to-memory machines are not padded out with multiple register fields. Instead, they use compiler-managed anonymous variables for subexpression values. These temps require extra memory reference instructions, which take more code space than for the stack machine.

All stack machines have variants of the load/store opcodes for accessing local variables and formal parameters without explicit address calculations. This can be by offsets from the current top-of-stack address, or by offsets from a stable frame-base register. Register machines handle this with a register+offset address mode, but use a wider offset field.

Dense machine code was very valuable in the 1960s, when main memory was very expensive and very limited even on mainframes. It became important again on the initially tiny memories of minicomputers and then microprocessors. Density remains important today, for smart phone apps and for Java apps downloaded into browsers over slow internet connections. Density also improves the effectiveness of caches and instruction prefetch cycles, and allows smaller ROMs in embedded applications. But extreme density often comes with compromised program performance, and extreme performance often requires code that is several times bigger than stack code.

Some of the density of Burroughs B6700 code was due to moving vital operand information elsewhere, to 'tags' on every data word or into tables of pointers. The Add instruction itself was generic or polymorphic: it had to fetch the operand to discover whether this was an integer add or a floating point add. The Load instruction could find itself tripping on an indirect address, or worse, a disguised call to a call-by-name thunk routine. The generic opcodes required fewer opcode bits but made the hardware more like an interpreter, with less opportunity to pipeline the common cases.

Simple Compilers

Compilers for stack machines are simpler and quicker to build than compilers for other machines. Code generation is trivial and independent of prior or subsequent code, and can be easily integrated into the parsing pass. No register management is needed, and no optimizations for constants or repeated simple memory references are needed (or even allowed). The same opcode that handles the frequent common case of Add, Indexed Load, or Function Call will also handle the general case involving complex subexpressions and nested calls. The compiler and the machine need not deal separately with corner cases. This simplicity has allowed compilers to fit onto very small machines.
The simple compilers allowed new product lines to get to market quickly, and allowed new operating systems to be written entirely in a new high-level language rather than in assembly. The UCSD p-System supported a complete student programming environment on early 8-bit microprocessors with poor instruction sets and little RAM, by compiling to a virtual stack machine rather than to the actual hardware. The downside to the simplicity of compilers for stack machines is that pure stack machines have not benefited much from subsequent advancements in compiler optimizer technology. However, optimisation of compiled stack code is quite possible. Back-end optimisation of compiler output has been demonstrated to significantly improve code, and potentially performance, whilst global optimisation within the compiler itself achieves further gains.

Simple Interpreters

Some stack machine instruction sets are intended for interpretive execution of a virtual machine, rather than driving hardware directly. Interpreters for virtual stack machines are easier to build than interpreters for register or memory-to-memory machines; the logic for handling memory address modes is in just one place rather than repeated in many instructions. Stack machines also tend to have fewer variations of an opcode; one generalized opcode will handle both frequent cases and obscure corner cases of memory references or function call setup. (But code density is often improved by adding short and long forms for the same operation.)
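The claim that stack interpreters are easy to build because operand routing is implicit can be illustrated with a toy interpreter. The opcode set here is a made-up minimal one, not any real virtual machine; note that memory addressing appears in exactly two places (LOAD and STORE).

```python
# Toy stack-machine interpreter.  All ALU operands come implicitly from
# the stack, so each opcode handler is a few lines and address handling
# is confined to LOAD/STORE.

def run(program, memory):
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":                 # load immediate
            stack.append(instr[1])
        elif op == "LOAD":               # read a named memory cell
            stack.append(memory[instr[1]])
        elif op == "STORE":              # write top of stack to memory
            memory[instr[1]] = stack.pop()
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "SUB":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        else:
            raise ValueError(f"unknown opcode {op}")
    return memory

# X = X + 1, compiled as: Load X; Load 1; Add; Store X
mem = run([("LOAD", "X"), ("PUSH", 1), ("ADD",), ("STORE", "X")], {"X": 41})
```

A register-machine interpreter would instead need operand-register decoding in every one of these handlers.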

Minimal Processor State

A machine with an expression stack can get by with just two visible registers: the top-of-stack address and the next-instruction address. The minimal hardware implementation has few bits of flip-flops or registers. Faster implementations buffer the topmost N stack cells into invisible temporary registers to reduce memory stack cycles. Responding to an interrupt involves pushing the visible registers and branching to the interrupt handler. This is faster than storing most or all of the visible registers of a register machine, giving a quicker response to the interrupt. Some register machines deal with this by having multiple register files that can be instantly swapped, but this increases costs and slows down the register file.

Performance Disadvantages of Stack Machines


More Memory References

Some in the industry believe that stack machines execute more data cache cycles for temporary values and local variables than do register machines. On stack machines, temporary values often get spilled into memory, whereas on machines with many registers these temps usually remain in registers. (However, these values often need to be spilled into "activation frames" at the end of a procedure's definition or basic block, or at the very least into a memory buffer during interrupt processing.) Values spilled to memory add more cache cycles. This spilling effect depends on the number of hidden registers used to buffer top-of-stack values, upon the frequency of nested procedure calls, and upon host computer interrupt processing rates. Some simple stack machines or stack interpreters use no top-of-stack hardware registers. Those minimal implementations are always slower than standard register machines. A typical expression like X+1 compiles to 'Load X; Load 1; Add'. This does implicit writes and reads of the memory stack which weren't needed:
Load X, push to memory
Load 1, push to memory
Pop 2 values from memory, add, and push result to memory

for a total of 5 data cache references. The next step up from this is a stack machine or interpreter with a single top-of-stack register. The above code then does:
Load X into empty TOS register (if hardware machine), or: Push TOS register to memory, Load X into TOS register (if interpreter)
Push TOS register to memory, Load 1 into TOS register
Pop left operand from memory, add to TOS register and leave it there

for a total of 5 data cache references, worst-case. Generally, interpreters don't track emptiness, because they don't have to: anything below the stack pointer is a non-empty value, and the TOS cache register is always kept hot. Typical Java interpreters do not buffer the top-of-stack this way, however, because the program and stack have a mix of short and wide data values. If the hardwired stack machine has N registers to cache the topmost memory stack words, then all spills and refills are avoided in this example and there is only 1 data cache cycle, the same as for a register or accumulator machine. On register machines using optimizing compilers, it is very common for the most-used local variables to live in registers rather than in stack frame memory cells. This eliminates all data cache cycles for reading and writing those values, except for their initial load and final store upon procedure termination. The development of 'stack scheduling' for performing live-variable analysis, and thus retaining key variables on the stack for extended periods, goes a long way to answering this concern. On the other hand, register machines must spill many of their registers to memory across nested procedure calls. The decision of which registers to spill, and when, is made statically at compile time rather than on the dynamic depth of the calls. This can lead to more data cache traffic than in an advanced stack machine implementation.

Factoring Out Common Subexpressions Has High Cost

In register machines, a subexpression which is used multiple times with the same result value can be evaluated just once and its result saved in a fast register. The subsequent reuses have no time or code cost, just a register reference that would have happened anyhow. This optimization wins for common simple expressions (e.g. loading variable X or pointer P) as well as less-common complex expressions. With stack machines, in contrast, the results of a subexpression can be stored in one of two ways. The first way involves a temporary variable in memory. Storing and subsequent retrievals cost additional instructions and additional data cache cycles.
Doing this is only a win if the subexpression computation costs more in time than fetching from memory, which in most stack CPUs almost always is the case. It is never worthwhile for simple variables and pointer fetches, because those already have the same cost of one data cache cycle per access. It is only marginally worthwhile for expressions like X+1. These simpler expressions make up the majority of redundant, optimizable expressions in programs written in non-concatenative languages. An optimizing compiler can only win on redundancies that the programmer could have avoided in the source code. The second way involves just leaving a computed value on the data stack and duplicating it on an as-needed basis. This requires some amount of stack permutation, at the very least an instruction to duplicate the results. This approach wins only if you can keep your data stack depth shallow enough for a "DUP", "ROT", or "OVER" type of instruction to gain access to the desired computed value. Some virtual machines support a general-purpose permutation primitive, "PICK", which allows one to select arbitrarily any item in the data stack for duplication. Despite how limiting this approach sounds, hand-written stack code tends to make extensive use of it, resulting in software with runtime overheads comparable to those of general-purpose register-register architectures. Unfortunately, algorithms for optimal "stack scheduling" of values aren't known to exist in general, making such stack optimizations difficult to impossible to automate for non-concatenative programming languages.
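The second approach, reusing a value left on the data stack via a duplication instruction, can be sketched like this. The DUP/OVER semantics follow the usual Forth-style conventions, and the evaluator itself is a toy:

```python
# Toy evaluator with the stack-permutation ops DUP and OVER, showing
# how a common subexpression can be reused without a memory temporary.

def run(program, memory):
    stack = []
    for instr in program:
        op = instr[0]
        if op == "LOAD":
            stack.append(memory[instr[1]])
        elif op == "DUP":                # duplicate top of stack
            stack.append(stack[-1])
        elif op == "OVER":               # copy second-from-top to the top
            stack.append(stack[-2])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack

# (A + B) squared: the subexpression A + B is computed once and reused
# with DUP instead of being recomputed or spilled to memory.
result = run([("LOAD", "A"), ("LOAD", "B"), ("ADD",), ("DUP",), ("MUL",)],
             {"A": 3, "B": 4})
```

As the text notes, this only works while the wanted value stays within reach of DUP/OVER-style permutation; once it sinks deeper, a PICK-style primitive or a memory temporary is needed.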

As a result, it is very common for compilers for stack machines to never bother applying code-factoring optimizations. It is too much trouble, despite the significant payoff.

Rigid Code Order

In modern machines, the time to fetch a variable from the data cache is often several times longer than the time needed for basic ALU operations. A program runs faster without stalls if its memory loads can be started several cycles before the instruction which needs that variable, while also working on independent instructions. Complex machines can do that with a deep pipeline and "out-of-order execution" that examines and runs many instructions at once. Register machines can also do this with much simpler "in-order" hardware, a shallow pipeline, and slightly smarter compilers. The load step becomes a separate instruction, and that instruction is statically scheduled much earlier in the code sequence. The compiler puts independent steps in between. This scheduling trick requires explicit, spare registers. It is not possible on stack machines without exposing some aspect of the micro-architecture to the programmer. For the expression A-B, the right operand must be evaluated and pushed immediately prior to the Minus step. Without stack permutation or hardware multithreading, relatively little useful code can be put in between while waiting for the Load B to finish. Stack machines can work around the memory delay by having a deep out-of-order execution pipeline covering many instructions at once; or, more likely, they can permute the stack such that they can work on other workloads while the load completes; or they can interlace the execution of different program threads, as in the Unisys A9 system. Today's increasingly parallel computational loads suggest, however, that this might not be the disadvantage it has been made out to be in the past.
Hides a Faster Register Machine Inside

Some simple stack machines have a chip design which is fully customized all the way down to the level of individual registers. The top-of-stack address register and the N top-of-stack data buffers are built from separate individual register circuits, with separate adders and ad hoc connections. However, most stack machines are built from larger circuit components where the N data buffers are stored together within a register file and share read/write buses. The decoded stack instructions are mapped into one or more sequential actions on that hidden register file. Loads and ALU ops act on a few topmost registers, and implicit spills and fills act on the bottommost registers. The decoder allows the instruction stream to be compact. But if the code stream instead had explicit register-select fields which directly manipulated the underlying register file, the compiler could make better use of all registers and the program would run faster. Microprogrammed stack machines are an example of this. The inner microcode engine is some kind of RISC-like register machine or a VLIW-like machine using multiple register files. When controlled directly by task-specific microcode, that engine gets much more work completed per cycle than when controlled indirectly by equivalent stack code for that same task. The object code translators for the HP 3000 and Tandem T/16 are another example. [15][16] They translated stack code sequences into equivalent sequences of RISC code. Minor 'local' optimizations removed much of the overhead of a stack architecture. Spare registers were used to factor out repeated address calculations. The translated code still retained plenty of emulation overhead from the mismatch between original and target machines. Despite that burden, the cycle efficiency of the translated code matched the cycle efficiency of the original stack code. And when the source code was recompiled directly to the register machine via optimizing compilers, the efficiency doubled. This shows that the stack architecture and its non-optimizing compilers were wasting over half of the power of the underlying hardware. Register files are good tools for computing because they have high bandwidth and very low latency, compared to memory references via data caches. In a simple machine, the register file allows reading two independent registers and writing of a third, all in one ALU cycle with one-cycle or less latency. Whereas the corresponding data cache can start only one read or one write (not both) per cycle, and the read typically has a latency of two ALU cycles. That's one third of the throughput at twice the pipeline delay. In a complex machine like Athlon that completes two or more instructions per cycle, the register file allows reading of four or more independent registers and writing of two others, all in one ALU cycle with one-cycle latency. Whereas the corresponding dual-ported data cache can start only two reads or writes per cycle, with multiple cycles of latency. Again, that's one third of the throughput of registers. It is very expensive to build a cache with additional ports.

More Instructions, Slower Interpreters

Interpreters for virtual stack machines are often slower than interpreters for other styles of virtual machine. This slowdown is worst when running on host machines with deep execution pipelines, such as current x86 chips. A program has to execute more instructions when compiled to a stack machine than when compiled to a register machine or memory-to-memory machine.
Every variable load or constant requires its own separate Load instruction, instead of being bundled within the instruction which uses that value. The separated instructions may be simple and faster running, but the total instruction count is still higher. In some interpreters, the interpreter must execute an N-way switch jump to decode the next opcode and branch to its steps for that particular opcode. Another method for selecting opcodes is threaded code. The host machine's prefetch mechanisms are unable to predict and fetch the target of that indexed or indirect jump. So the host machine's execution pipeline must restart each time the hosted interpreter decodes another virtual instruction. This happens more often for virtual stack machines than for other styles of virtual machine. Android's Dalvik virtual machine for Java uses a virtual-register 16-bit instruction set instead of Java's usual 8-bit stack code, to minimize instruction count and opcode dispatch stalls. Arithmetic instructions directly fetch or store local variables via 4-bit (or larger) instruction fields. Version 5.0 of Lua replaced its virtual stack machine with a faster virtual register machine.
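The two dispatch techniques mentioned above, the N-way switch and threaded code, can be contrasted in a small sketch. In the threaded version, each opcode is resolved to its handler once, before execution, so the per-instruction table lookup disappears. The handler names and opcode set are illustrative:

```python
# Sketch of two opcode-dispatch styles for a virtual machine.
# "Switch" dispatch looks the opcode up on every step; threaded code
# resolves each opcode to a handler reference ahead of time.

def op_push(stack, arg):
    stack.append(arg)

def op_add(stack, arg):                  # arg unused for ADD
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

HANDLERS = {"PUSH": op_push, "ADD": op_add}

def run_switch(program):
    stack = []
    for op, arg in program:
        HANDLERS[op](stack, arg)         # table lookup on every instruction
    return stack

def thread(program):
    # "Threading" the code: replace each opcode by its handler reference.
    return [(HANDLERS[op], arg) for op, arg in program]

def run_threaded(threaded_program):
    stack = []
    for handler, arg in threaded_program:
        handler(stack, arg)              # indirect call, no lookup
    return stack

prog = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]
```

On real hardware the remaining cost is the indirect jump itself, which is exactly the branch the host's prefetcher struggles to predict.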

Hybrid Machines
Pure stack machines are quite inefficient for procedures which access multiple fields from the same object. The stack machine code must reload the object pointer for each pointer+offset calculation. A common fix for this is to add some register-machine features to the stack machine: a visible register file dedicated to holding addresses, and register-style instructions for doing loads and simple address calculations. It is uncommon to have the registers be fully general purpose, because then there is no strong reason to have an expression stack and postfix instructions. Another common hybrid is to start with a register machine architecture, and add another memory address mode which emulates the push or pop operations of stack machines: 'memaddress = reg; reg += instr.displ'. This was first used in DEC's PDP-11 minicomputer. This feature was carried forward in VAX computers and in Motorola 6800 and M68000 microprocessors. This allowed the use of simpler stack methods in early compilers. It also efficiently supported virtual machines using stack interpreters or threaded code. However, this feature did not help the register machine's own code to become as compact as pure stack machine code. And the execution speed was less than when compiling well to the register architecture. It is faster to change the top-of-stack pointer only occasionally (once per call or return) rather than constantly stepping it up and down throughout each program statement. And it is even faster to avoid memory references entirely. More recently, so-called second-generation stack machines have adopted a dedicated collection of registers to serve as address registers, off-loading the task of memory addressing from the data stack. For example, MuP21 relies on a register called "A", while the more recent GreenArrays processors rely on two registers: A and B. The Intel x86 family of microprocessors has a register-style instruction set for most operations, but uses stack instructions for its oldest x87 form of floating point arithmetic.

(b) Assume a hypothetical machine that has only PC, AC, MAR, IR, DR and Flag registers. (You may assume the roles of these registers are the same as those defined in general for a von Neumann machine.) The instructions of this machine can take two operands - one of these operands must be a register operand. It has an instruction: SUB AC, X; // it performs the operation AC ← AC - Content of location X Write and explain the sequence of micro-operations that are required to fetch and execute the instruction. Make and state suitable assumptions, if any. Solution Program counter or PC: The program counter, or PC (also called the instruction pointer in Intel x86 processors, or the instruction address register, or just part of the instruction sequencer in some computers) is a processor register that indicates where the computer is in its instruction sequence. Depending on the details of the particular computer, the PC holds either the address of the instruction being executed, or the address of the next instruction to be executed. In most processors, the program counter is incremented automatically after fetching a program instruction, so that instructions are normally retrieved sequentially from memory, with certain instructions, such as branches, jumps and subroutine calls and returns, interrupting the sequence by placing a new value in the program counter. Such jump instructions allow a new address to be chosen as the start of the next part of the flow of instructions from the memory; they allow new values to be loaded (written) into the program counter register. A subroutine return is then achieved by writing the saved value back into the program counter again. Accumulator (AC): In a computer's central processing unit (CPU), an accumulator is a register in which intermediate arithmetic and logic results are stored.
Without a register like an accumulator, it would be necessary to write the result of each calculation (addition, multiplication, shift, etc.) to main memory, perhaps only to be read right back again for use in the next operation. Access to main memory is slower than access to a register like the accumulator because the technology used for the large main memory is slower (but cheaper) than that used for a register. The canonical example for accumulator use is summing a list of numbers. The accumulator is initially set to zero, then each number in turn is added to the value in the accumulator. Only when all numbers have been added is the result held in the accumulator written to main memory or to another, non-accumulator, CPU register. An accumulator machine, also called a 1-operand machine, or a CPU with accumulator-based architecture, is a kind of CPU in which, although it may have several registers, the CPU always stores the results of most calculations in one special register, typically called "the" accumulator of that CPU. Historically almost all early computers were accumulator machines; and many microcontrollers still popular as of 2010 (such as the 68HC12, the PICmicro, the 8051 and several others) are basically accumulator machines.

Modern CPUs are typically 2-operand or 3-operand machines; the additional operands specify which one of many general-purpose registers (also called "general purpose accumulators"[1]) are used as the source and destination for calculations. These CPUs are not considered "accumulator machines". The characteristic which distinguishes one register as being the accumulator of a computer architecture is that the accumulator (if the architecture were to have one) would be used as an implicit operand for arithmetic instructions. For instance, a CPU might have an instruction like: ADD memaddress This instruction would add the value read from the memory location at memaddress to the value from the accumulator, placing the result in the accumulator. The accumulator is not identified in the instruction by a register number; it is implicit in the instruction, and no other register can be specified in the instruction. Some architectures use a particular register as an accumulator in some instructions, but other instructions use register numbers for explicit operand specification.
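The implicit-accumulator ADD memaddress behaviour described above can be sketched as a tiny simulator; the opcode names and memory layout are illustrative assumptions, not any real instruction set.

```python
# Sketch of an accumulator machine: the accumulator is never named in
# an instruction -- it is the implicit source and destination.

def run_acc(program, memory):
    acc = 0                              # the single implicit accumulator
    for op, addr in program:
        if op == "LOAD":                 # ACC <- M[addr]
            acc = memory[addr]
        elif op == "ADD":                # ACC <- ACC + M[addr]
            acc = acc + memory[addr]
        elif op == "STORE":              # M[addr] <- ACC
            memory[addr] = acc
    return acc

# The canonical accumulator example from the text: summing a list of
# numbers held in memory locations 0..2.
mem = {0: 10, 1: 20, 2: 12}
total = run_acc([("LOAD", 0), ("ADD", 1), ("ADD", 2)], mem)
```

Each instruction names only a memory address, which is what makes 1-operand encodings so short.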

MAR:
The Memory Address Register (MAR) is a CPU register that either stores the memory address from which data will be fetched to the CPU or the address to which data will be sent and stored. In other words, MAR holds the memory location of data that needs to be accessed. When reading from memory, data addressed by MAR is fed into the MDR (memory data register) and then used by the CPU. When writing to memory, the CPU writes data from MDR to the memory location whose address is stored in MAR. The Memory Address Register is half of a minimal interface between a microprogram and computer storage; the other half is a memory data register. Far more complex memory interfaces exist, but this is the least that can work. IR (Instruction register): The instruction register is the part of a CPU's control unit that stores the instruction currently being executed or decoded. In simple processors each instruction to be executed is loaded into the instruction register, which holds it while it is decoded, prepared and ultimately executed, which can take several steps. More complicated processors use a pipeline of instruction registers where each stage of the pipeline does part of the decoding, preparation or execution and then passes it to the next stage for its step. Modern processors can even do some of these steps out of order, as decoding of several instructions is done in parallel. Decoding the opcode in the instruction register includes determining the instruction, determining where its operands are in memory, retrieving the operands from memory, allocating processor resources to execute the command (in superscalar processors), etc. DR: Data registers are used to hold numeric values such as integer and floating-point values. In some older and low-end CPUs, a special data register, known as the accumulator, is used implicitly for many operations. Flag Registers:

The FLAGS register is the status register in Intel x86 microprocessors that contains the current state of the processor. This register is 16 bits wide. Its successors, the EFLAGS and RFLAGS registers, are 32 and 64 bits wide, respectively. The wider registers retain compatibility with their smaller predecessors.

Steps for instruction execution

Step 1: The first step of instruction execution is to fetch the instruction that is to be executed. To do so we require the address of the instruction to be fetched; normally the Program Counter (PC) register stores this information. This address is converted to a physical machine address and put on the address bus with the help of a buffer register sometimes called the Memory Address Register (MAR). This, coupled with a request from the control unit for reading, fetches the instruction on the data bus and transfers the instruction to the Instruction Register (IR). On completion of the fetch, the PC is incremented to point to the next instruction.

Step 2: The IR is decoded. Let us assume that the Instruction Register contains the instruction "ADD memory location B with general purpose register R1 and store the result in R1". The control unit will first instruct that the data of memory location B be brought to the buffer register for data (DR) using the buffer address register (MAR), by issuing a memory read operation. This data may be stored in a general purpose register if so needed; let us say R2.

Step 3: Now the ALU performs the addition of R1 and R2 under the command of the control unit, and the result is put back in R1. The status of the ALU operation, for example a zero/non-zero result or overflow/no overflow, is recorded in the status register.
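For the SUB AC, X instruction asked about in this part, the same fetch and execute micro-operations can be sketched in simulation. The register names come from the question; the instruction encoding and the memory contents are assumptions made purely for illustration.

```python
# Sketch of the micro-operation sequence for "SUB AC, X" on the
# hypothetical machine of the question (PC, AC, MAR, IR, DR, Flags).
# Assumption: one instruction per memory word, encoded ("SUB", X).

def step_sub(cpu, mem):
    # --- fetch cycle ---
    cpu["MAR"] = cpu["PC"]               # MAR <- PC
    cpu["IR"] = mem[cpu["MAR"]]          # IR  <- M[MAR]
    cpu["PC"] += 1                       # PC  <- PC + 1
    # --- decode ---
    opcode, x = cpu["IR"]                # opcode and address part of IR
    assert opcode == "SUB"
    # --- execute cycle ---
    cpu["MAR"] = x                       # MAR <- IR(address part)
    cpu["DR"] = mem[cpu["MAR"]]          # DR  <- M[MAR]
    cpu["AC"] = cpu["AC"] - cpu["DR"]    # AC  <- AC - DR
    cpu["Z"] = int(cpu["AC"] == 0)       # update the zero flag
    return cpu

cpu = {"PC": 100, "AC": 50, "MAR": 0, "IR": None, "DR": 0, "Z": 0}
mem = {100: ("SUB", 200), 200: 20}       # location X = 200 holds 20
step_sub(cpu, mem)
```

Each assignment in the function corresponds to one micro-operation, and the comment on each line gives the register-transfer notation.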

Similarly, the other instructions are fetched and executed using the ALU and registers under the control of the Control Unit.

The number and the nature of registers is a key factor that differentiates among computers. For example, Intel Pentium has about 32 registers. Some of these registers are special registers and others are general-purpose registers. Some of the basic registers in a machine are:

All von-Neumann machines have a program counter (PC) (or instruction counter IC), which is a register that
contains the address of the next instruction to be executed.

Most computers use special registers to hold the instruction(s) currently being executed. They are called
instruction register (IR).

There are a number of general-purpose registers. With these three kinds of registers, a computer would be able to
execute programs.

Other types of registers: Memory-address register (MAR) holds the address of next memory operation (load or store). Memory-buffer register (MBR) holds the content of memory operation (load or store).

Processor status bits indicate the current status of the processor. Sometimes they are combined with other processor status bits into a register called the program status word (PSW). A few factors to consider when choosing the number of registers in a CPU are:

CPU can access registers faster than it can access main memory. For addressing a register, depending on the number of addressable registers, only a few address bits are needed in an instruction. These address bits are definitely far fewer than a memory address requires. For example, for addressing 256 registers you just need 8 bits, whereas a common memory size of 1 MB requires 20 address bits, a difference of 60%.

Compilers tend to use a small number of registers because large numbers of registers are very difficult to use effectively. A good number of registers for a general machine is 32.

Registers are more expensive than memory but far fewer in number. From a user's point of view the register set can be classified under two basic categories. Programmer Visible Registers: These registers can be used by machine or assembly language programmers to minimize the references to main memory. Status and Control Registers: These registers cannot be used by the programmers but are used to control the CPU or the execution of a program.

(c) Assume that you have a machine as shown in section 3.2.2 of Block 3 having the micro-operations as given in Figure 10 on page 62 of Block 3. Consider that R1 and R2 both are 8 bit registers and contain 11110101 and 10101001 respectively. What will be the values of select inputs, carry-in input and result of operation if the following micro-operations are performed? (For each micro-operation you may assume the initial value of R1 and R2 as defined above) 1) Decrement R1 2) R1 Exclusive OR R2 3) Subtract R1 from R2 with borrow 4) Shift Right R2

Solution (the select inputs are as encoded in Figure 10 of Block 3 for each operation; the carry-in and 8-bit results are worked out below):

1) Decrement R1: R1 - 1, performed as R1 + 11111111 (all ones) with carry-in 0. Result = 11110100; the carry-out of 1 is ignored.

2) R1 Exclusive OR R2: 11110101 XOR 10101001 = 01011100. This is a logic micro-operation, so the carry-in is not used.

3) Subtract R1 from R2 with borrow (borrow-in assumed to be 1): R2 - R1 - 1, performed as R2 + 00001010 (1's complement of R1) with carry-in 0. In decimal, 169 - 245 - 1 = -77, so the result is 10110011 (the 2's complement representation of -77) with borrow-out = 1, since R1 plus the borrow exceeds R2.

4) Shift Right R2: 10101001 shifted right by one position gives 01010100; the bit shifted out is 1.
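The four micro-operations asked for in part (c) can be verified with a few lines of arithmetic. This sketch assumes a borrow-in of 1 for the subtract-with-borrow case and a logical (zero-fill) right shift; the 8-bit mask models the register width.

```python
# Verifying the four micro-operations on the 8-bit registers of the
# question: R1 = 11110101, R2 = 10101001.
MASK = 0xFF                              # keep every result in 8 bits
R1, R2 = 0b11110101, 0b10101001

dec_r1   = (R1 - 1) & MASK               # 1) Decrement R1
xor_r1r2 = R1 ^ R2                       # 2) R1 Exclusive OR R2
sub_wb   = (R2 - R1 - 1) & MASK          # 3) R2 - R1 - borrow (borrow = 1)
shr_r2   = R2 >> 1                       # 4) logical Shift Right R2
shr_out  = R2 & 1                        # bit shifted out of R2
```

Printing the results in binary (`format(value, "08b")`) reproduces the answers given above.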

(d) Explain the functions performed by the Micro-programmed Control Unit with the help of a diagram. Also explain the role of the sequencing logic component of the Control Unit. Solution: THE MICRO-PROGRAMMED CONTROL

An alternative to a hardwired control unit is a micro-programmed control unit, in which the logic of the control unit is specified by a micro-program. A micro-program is also called firmware (midway between the hardware and the software). It consists of: (a) one or more micro-operations to be executed; and (b) the information about the micro-instruction to be executed next. In the general configuration of a micro-programmed control unit, the micro-instructions are stored in the control memory. The address register for the control memory contains the address of the next micro-instruction that is to be read, and the control memory Buffer Register receives the micro-instruction that has been read. A micro-instruction execution primarily involves the generation of the desired control signals and of signals used to determine the next micro-instruction to be executed. The sequencing logic section loads the control memory address register and issues a read command to control memory. The following functions are performed by the micro-programmed control unit:
1. The sequencing logic unit specifies the address of the control memory word that is to be read, in the Address Register of the Control Memory. It also issues the READ signal.
2. The desired control memory word is read into the control memory Buffer Register.
3. The content of the control memory buffer register is decoded to create control signals and next-address information for the sequencing logic unit.
4. The sequencing logic unit finds the address of the next control word on the basis of the next-address information from the decoder and the ALU flags.
As we have discussed earlier, the execute cycle steps of micro-operations differ from instruction to instruction, and in addition the addressing mode may differ. All such information generally depends on the opcode held in the Instruction Register (IR). Thus, an IR input to the Address Register for Control Memory is desirable.
Thus, there exists a decoder from the IR to the Address Register for control memory. This decoder translates the opcode of the IR into a control memory address.
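The four functions of the micro-programmed control unit described above can be sketched as a loop over a control memory. The control words, signal names, and next-address fields here are illustrative placeholders, not a real micro-program:

```python
# Sketch of the micro-programmed control loop: each control word holds
# the control signals to issue and the address of the next word.

CONTROL_MEMORY = {
    0: {"signals": ["MAR<-PC"],                 "next": 1},
    1: {"signals": ["IR<-M[MAR]", "PC<-PC+1"],  "next": 2},
    2: {"signals": ["decode"],                  "next": 0},  # back to fetch
}

def run_sequencer(steps):
    car = 0                              # Control Address Register (1. set address)
    issued = []
    for _ in range(steps):
        cbr = CONTROL_MEMORY[car]        # 2. read word into the Buffer Register
        issued.extend(cbr["signals"])    # 3. decode -> issue control signals
        car = cbr["next"]                # 4. sequencing logic picks next address
    return issued

signals = run_sequencer(3)
```

In real hardware, step 4 would also consult the ALU flags and the opcode-to-address decoder from the IR, which this sketch omits.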

(e) What are the advantages of instruction pipeline? Explain with the help of a diagram for a 3 stage instruction pipeline having cycles IFD (Instruction Fetch and Decode), OF (Operand Fetch) and ES (Execute and store results). What can be the problems of such an instruction pipeline? Solution:

An instruction pipeline is a technique used in the design of computers to increase their instruction throughput (the number of instructions that can be executed in a unit of time). Pipelining does not reduce the time to complete an instruction, but increases the number of instructions that can be processed at once.

Each instruction is split into a sequence of dependent steps. The first step is always to fetch the instruction from memory; the final step is usually writing the results of the instruction to processor registers or to memory. Pipelining seeks to let the processor work on as many instructions as there are dependent steps, just as an assembly line builds many vehicles at once, rather than waiting until one vehicle has passed through the line before admitting the next one. Just as the goal of the assembly line is to keep each assembler productive at all times, pipelining seeks to keep every portion of the processor busy with some instruction. Pipelining lets the computer's cycle time be the time of the slowest step, and ideally lets one instruction complete in every cycle. The term pipeline is an analogy that stems from the fact that each part of the processor is doing work at once, just as there is fluid in each link of a pipeline.

Basic five-stage pipeline in a RISC machine (IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory access, WB = Register write back): in the fourth clock cycle, the earliest instruction is in the MEM stage, and the latest instruction has not yet entered the pipeline.
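For the 3-stage IFD/OF/ES pipeline asked about in this question, the timing behaviour can be generated programmatically: instruction i simply enters IFD one cycle behind its predecessor, so a chart of which stage each instruction occupies in each cycle follows directly.

```python
# Timing chart for the 3-stage pipeline of the question:
# IFD (Instruction Fetch and Decode), OF (Operand Fetch),
# ES (Execute and Store results).

STAGES = ["IFD", "OF", "ES"]

def pipeline_chart(n_instructions):
    """Map each instruction to {cycle: stage} assuming no stalls."""
    chart = {}
    for i in range(n_instructions):
        # instruction i+1 occupies stage s during cycle (i + s + 1)
        chart[f"I{i + 1}"] = {i + s + 1: STAGES[s] for s in range(len(STAGES))}
    return chart

chart = pipeline_chart(4)
# Total cycles for n instructions in a k-stage pipeline: k + (n - 1)
total_cycles = len(STAGES) + (4 - 1)
```

The chart shows the ideal case; the problems noted in the question (operand dependencies between OF and ES, branches redirecting IFD, and memory contention between the stages) insert stall cycles into exactly this table.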

(f) Assume that a RISC machine has 64 registers out of which 16 registers are reserved for the Global variables. Assuming that 8 of the registers are to be used for one function, explain how the remaining registers will be used as overlapped register windows. How will these registers be used for parameter passing for subroutine calls? Explain with the help of diagram. Solution: In general, the register storage is faster than the main memory and the cache. Also the register addressing uses much shorter addresses than the addresses for main memory and the cache. However, the numbers of registers in a machine are less as generally the same chip contains the ALU and control unit. Thus, a strategy is needed that will optimize the register use and, thus, allow the most frequently accessed operands to be kept in registers in order to minimize register-memory operations. Such optimisation can either be entrusted to an optimising complier, which requires techniques for program analysis; or we can follow some hardware related techniques. The hardware approach will require the use of more registers so that more variables can be held in registers for

longer periods of time. This technique is used in RISC machines. On the face of it, a large register set should lead to fewer memory accesses; in practice, about 32 registers per window were considered optimum. Since most operand references in a language like C are to the local variables of a function, those are the obvious candidates for registers; some registers can also hold global variables. The difficulty is that programs follow a call-return discipline: the relevant local variables are those of the most recently called function, and each call requires saving the context of the calling program together with the return address, as well as passing parameters. On return, the variables of the calling program must be restored and the results passed back to it. The RISC register file supports such call-returns with register windows. The register file is broken into multiple small sets of registers, and each function call is assigned a different set: a call automatically switches the CPU from one fixed-size window of registers to another, rather than saving registers in memory as is done in CISC machines. The windows of adjacent procedures overlap, which allows parameters to be passed without moving the data at all. The following table illustrates the concept. Assumptions: the register file contains 64 registers, numbered 0-63. The table shows the register usage when function A (fA) calls function B (fB), which in turn calls function C (fC).

Register Nos.       Used for Function A          Used for Function B            Used for Function C
-------------       -------------------          -------------------            -------------------
0 - 7               Global variables required by fA, fB, and fC (shared by all three)
8 - 9               Unused
10 - 15 (6 regs)                                                                Temporary variables of fC
                                                                                (parameters of fC that may be
                                                                                passed to the next call)
16 - 25 (10 regs)                                                               Local variables of fC
26 - 31 (6 regs)                                 Temporary variables of fB      Parameters passed from fB to fC
32 - 41 (10 regs)                                Local variables of fB
42 - 47 (6 regs)    Temporary variables of fA    Parameters passed from fA to fB
48 - 57 (10 regs)   Local variables of fA
58 - 63 (6 regs)    Parameters passed to fA

Question 4 (a) Write a program in 8086 assembly Language (with proper comments) to find if a given sub-string is prefix of a given string. For example, the sub-string "Assembly" is the prefix in the string "Assembly Language Programming." You may assume that the sub-string as well as the string is available in the memory. You may also assume that the end of the strings is the character $. Make suitable assumptions, if any.

Solution:

DATA SEGMENT
    STR1    DB "ENTER FIRST STRING HERE ->$"
    STR2    DB "ENTER SECOND STRING HERE ->$"
    STR3    DB "CONCATED STRING :->$"
    STR11   DB "FIRST STRING : ->$"
    STR22   DB "SECOND STRING: ->$"
    INSTR1  DB 20 DUP("$")
    INSTR2  DB 20 DUP("$")
    N       DB ?
    N1      DB ?
    NEWLINE DB 10,13,"$"
DATA ENDS

CODE SEGMENT
    ASSUME DS:DATA, CS:CODE
START:
    MOV AX, DATA
    MOV DS, AX
    LEA SI, INSTR1
    LEA DI, INSTR2

    ; GET THE TWO STRINGS
    MOV AH, 09H
    LEA DX, STR1
    INT 21H
    MOV AH, 0AH
    MOV DX, SI
    INT 21H
    MOV AH, 09H
    LEA DX, NEWLINE
    INT 21H

    MOV AH, 09H
    LEA DX, STR2
    INT 21H
    MOV AH, 0AH
    MOV DX, DI
    INT 21H
    MOV AH, 09H
    LEA DX, NEWLINE
    INT 21H

    ; PRINT THE STRINGS
    MOV AH, 09H
    LEA DX, STR11
    INT 21H
    MOV AH, 09H
    LEA DX, INSTR1+2
    INT 21H
    MOV AH, 09H
    LEA DX, NEWLINE
    INT 21H
    MOV AH, 09H
    LEA DX, STR22
    INT 21H
    MOV AH, 09H
    LEA DX, INSTR2+2
    INT 21H
    MOV AH, 09H
    LEA DX, NEWLINE
    INT 21H

    ; CONCATENATION OF THE STRINGS
    LEA SI, INSTR1
    LEA DI, INSTR2
    MOV CX, 00
    INC SI
L1: INC SI
    CMP BYTE PTR [SI], "$"
    JNE L1
    ADD DI, 2
    MOV BX, 0
L2: MOV BL, BYTE PTR [DI]
    MOV BYTE PTR [SI], BL
    INC SI
    INC DI
    CMP BYTE PTR [DI], "$"
    JNE L2
L8: DEC SI
    CMP SI, 2
    JNE L8

    MOV AH, 09H
    LEA DX, NEWLINE
    INT 21H
    MOV AH, 09H
    LEA DX, STR3
    INT 21H
    MOV AH, 09H
    LEA DX, NEWLINE
    INT 21H
L6: MOV BL, BYTE PTR [SI]
    MOV AH, 02H
    MOV DL, BL
    INT 21H
    INC SI
    CMP BYTE PTR [SI], "$"
    JNE L6

    ; MOV AH, 09H
    ; LEA DX, INSTR1+2
    ; INT 21H
    MOV AH, 4CH
    INT 21H
CODE ENDS
END START

(b) Write a program in 8086 assembly language to convert a two digit packed BCD number into equivalent ASCII digits. Your program should print the two ASCII digits. You may assume that the BCD number is in the AL register. Solution:

name "convert"

; this program uses a subroutine written in 8086 assembly language
; that can be used for converting a string of numbers
; (max of 4 ascii digits) to equivalent packed bcd digits.
; bcd is binary coded decimal.
; this program does no screen output.
; to see results click "vars".

org 100h

jmp start

string     db '1234'   ; 4 ascii digits.
packed_bcd dw ?        ; two bytes (word) to store 4 digits.

start:
    lea bx, string
    lea di, packed_bcd
    call pack_to_bcd_and_binary
    ret                ; return to operating system.

; subroutine written in 8086 assembly language that can be used for
; converting a string of numbers (max of 4 ascii digits) to
; equivalent packed bcd digits.
; input parameters:
;   bx - address of source string (4 ascii digits).
;   di - must be set to the address for the packed bcd (2 bytes).
pack_to_bcd_and_binary proc near
    pusha
    ; point to the 2 upper digits of the packed bcd
    ; (it is assumed that we have 4 digits):
    add di, 1
    ; loop only twice, because each iteration
    ; reads 2 digits (2 x 2 = 4 digits):
    mov cx, 2
    ; reset packed bcd:
    mov word ptr [di], 0
    ; to convert a char ('0'..'9') to a digit we need to subtract
    ; 48 (30h) from its ascii code, or just clear the upper nibble
    ; of the byte with the mask 00001111b (0fh):
next_digit:
    mov ax, [bx]       ; read 2 digits.
    and ah, 00001111b
    and al, 00001111b
    ; 8086, like all other Intel microprocessors, stores the less
    ; significant byte at the lower address:
    xchg al, ah
    ; move the first digit to the upper nibble:
    shl ah, 4
    ; pack bcd:
    or ah, al
    ; store 2 digits:
    mov [di], ah
    ; next packed bcd byte:
    sub di, 1
    ; next word (2 digits):
    add bx, 2
    loop next_digit
    popa
    ret
pack_to_bcd_and_binary endp

(c) Write a simple subroutine that receives one parameter value. The subroutine checks if the passed parameter value is 0 or otherwise. In case, the value is 0 then it prints FALSE, otherwise it prints TRUE. Make suitable assumptions, if any. Solution:

/* ********************************************************************
   A simple subroutine that accepts a parameter value and checks
   whether the passed value is zero (0).  If the value is zero, the
   subroutine prints "Divide Overflow" and terminates execution;
   otherwise it allows the calling program to continue.
   ******************************************************************** */
#include <stdio.h>
#include <stdlib.h>

int chk_input(int in)
{
    if (in == 0) {
        printf("Divide Overflow\n");
        exit(1);               /* terminate program */
    }
    return 0;                  /* continue execution */
}

int main(void)
{
    int i = 89, in, flg;

    printf("A program to divide 89 by any number supplied\n"
           "Please enter the divisor: ");
    scanf("%d", &in);
    flg = chk_input(in);
    if (flg == 0)
        printf("%d divided by %d = %f\n", i, in, (float)i / in);
    return 0;
}
/* End of program */
