[Figure: source program + data; machine language program + data]
[Figure: Lexical specification → LEX → Scanner; Syntax specification → YACC → Parser]
YACC: - The parser generated by YACC performs reductions according to this grammar. The
actions associated with a string specification are executed when a reduction is made
according to that specification. An attribute is associated with
every nonterminal symbol. The attribute can be given any user-designed structure.
Yacc assists in the next phase of the compiler: it creates a parser. Yacc is available as a
utility on the Unix system. A parser can be
created using Yacc in the following manner.
→ A file Parser.y is prepared, containing a Yacc specification.
→ The Unix system command for Yacc translates the file
Parser.y into a ‘C’ program called y.tab.c, which is a
representation of the parser written in the ‘C’ language.
→ y.tab.c is run through the ‘C’ compiler, which produces the
object program a.out. a.out performs the
translation specified by the original Yacc program.
Data structures: -
The data structures used in language processing can be classified on the basis of the
following criteria:-
1. Nature of a data structure: whether it is a linear or a nonlinear data structure.
2. Purpose of a data structure: whether it is a search data structure or an allocation data structure.
3. Lifetime of a data structure: whether it is used during language processing or during target
program execution.
A linear data structure consists of a linear arrangement of elements in memory.
Characteristics of linear data structures:-
Dynamic memory allocation is generally not allowed.
Sequential as well as binary searching is allowed.
Data is stored and accessed on the basis of location name.
NOTES BY – KANAK KUMAR KANAK E-MAIL ID – kkkanak@sify.com
SYSTEM SOFTWARE
Examples of linear data structures are array, stack, etc.
Non-linear: -
→ Allocation of non-contiguous memory locations.
→ Dynamic memory allocation is frequently and efficiently performed.
→ Sequential as well as binary search can be performed.
→ Data is stored and accessed on the basis of address, i.e. using pointers.
→ Examples of non-linear data structures are linked list, tree, graph, etc.
Array: - An array is a feature that allows multiple similar elements to be represented under a single
name.
→ It reduces the complexity of a program.
→ It reduces the size of a program.
→ Contiguous memory locations are essentially required.
→ An array can be single-dimensional or multidimensional.
→ An array is considered a static data structure, because its size is specified in
advance, i.e. before program execution begins.
Linked list: - A linked list is basically a collection of nodes. Each node contains the
address/reference of the next node. The address of the first node is assigned to the list pointer. A node
consists of at least two fields, i.e. a data field and a pointer field. The pointer field of the last node contains
a null value.
→ Several operations can be performed on a linked list, like creation of the list, display of the
items, insertion of items, etc.
→ The size of a linked list can be increased or decreased dynamically (i.e. it can grow and shrink).
→ A linked list can be of several types, e.g. a doubly linked list.
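The node and list pointer described above can be sketched as follows. This is a minimal illustration of a singly linked list; the names Node, LinkedList, insert_front, delete and items are chosen here for the example and do not come from the notes.

```python
class Node:
    """One node: a data field and a pointer field (reference to the next node)."""

    def __init__(self, data):
        self.data = data
        self.next = None  # pointer field; None marks the last node


class LinkedList:
    def __init__(self):
        self.head = None  # list pointer: reference to the first node

    def insert_front(self, data):
        """Insert a new node at the front; the list grows by one node."""
        node = Node(data)
        node.next = self.head
        self.head = node

    def delete(self, data):
        """Remove the first node holding `data`; return True if one was removed."""
        prev, cur = None, self.head
        while cur is not None:
            if cur.data == data:
                if prev is None:
                    self.head = cur.next
                else:
                    prev.next = cur.next
                return True
            prev, cur = cur, cur.next
        return False

    def items(self):
        """Display operation: collect the data fields in list order."""
        out = []
        cur = self.head
        while cur is not None:
            out.append(cur.data)
            cur = cur.next
        return out
```

Nodes are created only when an item is inserted, which is why the list can grow and shrink during execution without any size being fixed in advance.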
Memory Management
Static partitioning
Dynamic partitioning.
Static Partitioning: -
→ The user’s memory area is divided into fixed-size partitions in advance.
→ Two mechanisms are used to select an appropriate free partition from the available
partitions for a requesting process, i.e. first fit and best fit.
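The two selection mechanisms can be compared with a small sketch. It assumes each partition is described by a (size, is_free) pair; the partition sizes below are made-up values for illustration only.

```python
def first_fit(partitions, request):
    """Return the index of the first free partition large enough, else None."""
    for i, (size, free) in enumerate(partitions):
        if free and size >= request:
            return i
    return None


def best_fit(partitions, request):
    """Return the index of the smallest free partition large enough, else None."""
    best = None
    for i, (size, free) in enumerate(partitions):
        if free and size >= request:
            if best is None or size < partitions[best][0]:
                best = i
    return best


# Hypothetical fixed partitions: (size in KB, is_free)
partitions = [(100, True), (500, True), (200, True), (300, False), (600, True)]
```

For a request of 150 KB, first fit picks the 500 KB partition (the first that is large enough), while best fit picks the 200 KB partition (the one that wastes the least space).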
Dynamic partitioning: -
→ Partitions are not created in advance.
→ Instead, a partition is created for a process after its request has been made.
Drawback – External fragmentation, which calls for a compaction process.
Tree: - A tree is considered a non-linear data structure. The nodes of a tree are organized in
hierarchical order, starting from the root.
A tree can be of: -
→ Binary Tree
→ Binary Search tree
→ AVL Tree (Height Balance Tree)
→ B-Tree (M-way Search Tree)
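As an illustration of one of the tree types listed above, here is a minimal sketch of a binary search tree. The names TreeNode, bst_insert, bst_search and bst_inorder are assumptions made for this example; balancing (as in an AVL tree) is not attempted.

```python
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None   # subtree with smaller keys
        self.right = None  # subtree with larger keys


def bst_insert(root, key):
    """Insert `key` below `root` and return the (possibly new) root."""
    if root is None:
        return TreeNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root  # duplicate keys are ignored


def bst_search(root, key):
    """Walk down from the root, going left or right at each node."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None


def bst_inorder(root):
    """An in-order traversal visits the keys in ascending order."""
    if root is None:
        return []
    return bst_inorder(root.left) + [root.key] + bst_inorder(root.right)
```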
Graph: - A graph consists of vertices and edges. A graph can be implemented using an
adjacency matrix or adjacency linked lists.
Graph can be of:
→ Un-directed Graph
→ Directed Graph
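The two representations can be sketched as follows. The example assumes vertices are numbered 0 to n-1 and that edges are given as (u, v) pairs; Python lists stand in for the linked lists.

```python
def adjacency_matrix(n, edges, directed=False):
    """matrix[u][v] == 1 when there is an edge from u to v."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        if not directed:
            matrix[v][u] = 1  # an undirected edge works in both directions
    return matrix


def adjacency_list(n, edges, directed=False):
    """adj[u] lists the vertices reachable from u by one edge."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)
    return adj


edges = [(0, 1), (1, 2), (0, 2)]
```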
Hashing: - Hashing is a technique for locating information in a file or directory on the basis of a
given key value. In hashing, a function, namely hash(), takes the key value as its
argument and returns the address of the exact location.
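A minimal sketch of the idea follows. The particular hash function (sum of character codes modulo the table size) and the use of per-location lists to handle colliding keys are assumptions made for this example; the notes do not prescribe either.

```python
class HashTable:
    def __init__(self, size=8):
        self.size = size
        self.buckets = [[] for _ in range(size)]  # one list per location

    def hash(self, key):
        """Map a string key to a location: sum of character codes mod size."""
        return sum(ord(ch) for ch in key) % self.size

    def put(self, key, value):
        bucket = self.buckets[self.hash(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))

    def get(self, key):
        """Look only in the location computed by hash(); no full scan is needed."""
        for k, v in self.buckets[self.hash(key)]:
            if k == key:
                return v
        return None
```

Two different keys may hash to the same location (for example "ab" and "ba" here), which is why each location keeps a short list of entries.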
Stack: – A stack is a linear data structure that organizes data on a Last In First Out (LIFO)
basis. This means that both PUSH and POP operations are performed at one end, i.e. the
TOP of the stack.
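A minimal sketch of the PUSH and POP operations, assuming a Python list whose last element plays the role of TOP:

```python
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)  # the new item becomes the TOP

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()  # remove and return the TOP

    def is_empty(self):
        return not self._items
```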
Scanning: - Scanning is the process in which a legal string is scanned character by character
and similar characters are grouped together for a specific purpose. In other words, scanning is
the process of recognizing the lexical components in a source string.
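The grouping of characters into lexical components can be sketched as below. The token kinds ("id", "num", "op") and the set of operator characters are assumptions chosen for this example, not a definition of any particular language.

```python
def scan(source):
    """Group the characters of `source` into (kind, lexeme) tokens."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1  # blanks separate tokens but are not tokens themselves
        elif ch.isalpha():
            j = i
            while j < len(source) and source[j].isalnum():
                j += 1
            tokens.append(("id", source[i:j]))  # letters/digits form an identifier
            i = j
        elif ch.isdigit():
            j = i
            while j < len(source) and source[j].isdigit():
                j += 1
            tokens.append(("num", source[i:j]))  # consecutive digits form a number
            i = j
        elif ch in "+-*/=()":
            tokens.append(("op", ch))
            i += 1
        else:
            raise ValueError(f"illegal character {ch!r} at position {i}")
    return tokens
```

For the source string "a = b + 25" the scanner recognizes five lexical components: the identifiers a and b, the operators = and +, and the number 25.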
Parsing: - The main purpose of parsing is to check the validity of a source string and
to determine its syntactic structure. For an invalid string, the parser issues diagnostic
messages. For a valid string, it builds a parse tree.
A parse tree demonstrates the steps in parsing; hence it is useful for understanding the
process of parsing.
Top down Parsing: - Top down parsing, according to a grammar, attempts to derive a string
matching the source string through a sequence of derivations starting from the
distinguished symbol of the grammar.
Implementing Top down parsing:-
The following features are needed to implement top-down parsing:-
1. SSM (Source String Marker): - It points to the first unmatched symbol in the source string.
2. PMM (Prediction Making Mechanism): - This mechanism systematically selects the R.H.S.
alternatives of a production during prediction making.
3. Matching and backtracking mechanism: - This mechanism matches every terminal symbol
generated during a derivation with the source symbol pointed to by the SSM. Backtracking is
performed when matching fails.
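The three features above can be sketched together in a small backtracking parser. The grammar (E, T, V over the terminals id, + and *) is a made-up example with no left recursion; GRAMMAR, derive and parse are names chosen for this sketch. The integer ssm plays the role of the source string marker.

```python
# Hypothetical grammar: each nonterminal maps to its R.H.S. alternatives.
GRAMMAR = {
    "E": [["T", "+", "E"], ["T"]],
    "T": [["V", "*", "T"], ["V"]],
    "V": [["id"]],
}


def derive(symbols, tokens, ssm):
    """Yield every SSM position reachable after matching `symbols` from `ssm`."""
    if not symbols:
        yield ssm
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Prediction making: try the R.H.S. alternatives one after another.
        for alternative in GRAMMAR[first]:
            yield from derive(alternative + rest, tokens, ssm)
    elif ssm < len(tokens) and tokens[ssm] == first:
        # Matching: the terminal equals the symbol under the SSM, so advance.
        yield from derive(rest, tokens, ssm + 1)
    # Otherwise matching failed: yielding nothing makes the caller
    # backtrack to its next alternative.


def parse(tokens, start="E"):
    """A string is valid if some derivation consumes all of its symbols."""
    return any(end == len(tokens) for end in derive([start], tokens, 0))
```

For the string id + id * id the parser first predicts E → T + E; for the single symbol id that prediction fails at "+", so it backtracks and succeeds with E → T.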
Chapter – 5 [Assemblers]
ADVANTAGE OF A TRANSLATOR:
A translator allows a programmer to express his thoughts (algorithm) in a language
other than the machine language in which the algorithm is to be executed. The reason is that
machine language is very complex to understand and to use for solving a problem. Some of the benefits
of writing a program in a language other than machine language are: -
Increase in the programmer's productivity
Machine independence
The input to a translator is expressed in a source language. A source
language might be assembly language or any high level language. The output of a translator
is expressed in a target language.
[Figure: Source Language → Translator → Target Code]
Types of Translator: -
There are three types of translators:
Interpreter
Compiler
Assembler
ASSEMBLER: -
An assembler is a program that accepts an assembly language program as input and
produces its machine language equivalent along with information for the loader.
For the simplest assembler, the target language is machine language and its source
language has instructions in one-to-one correspondence with those of machine language, but
with symbolic names for both operators and operands.
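The one-to-one correspondence can be illustrated with a toy assembler. Everything specific here is an assumption made for the example: the mnemonics and opcode numbers in OPCODES, the "label:" syntax, the DATA statement, one memory word per statement, and the use of two passes so that symbols may be used before they are defined.

```python
# Hypothetical operation codes for an imaginary machine.
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3, "HALT": 0}


def assemble(lines):
    """Translate symbolic statements into (opcode, address) words."""
    symbols = {}      # symbolic operand name -> address
    statements = []
    # Pass 1: assign an address to every statement and record the labels.
    for line in lines:
        parts = line.split()
        if parts[0].endswith(":"):
            symbols[parts[0][:-1]] = len(statements)
            parts = parts[1:]
        statements.append(parts)
    # Pass 2: replace each symbolic name by its numeric equivalent.
    code = []
    for parts in statements:
        mnemonic = parts[0]
        if mnemonic == "DATA":
            code.append(int(parts[1]))
        else:
            operand = symbols[parts[1]] if len(parts) > 1 else 0
            code.append((OPCODES[mnemonic], operand))
    return code, symbols


program = ["LOAD A", "ADD B", "STORE C", "HALT",
           "A: DATA 5", "B: DATA 7", "C: DATA 0"]
```

Each symbolic statement produces exactly one machine word: LOAD A becomes (1, 4) because LOAD has the assumed opcode 1 and A was placed at address 4.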
ASSEMBLER and its related Program: -
The assembler program contains three types of entities: - absolute entities, relative
entities and the object program.
Absolute Entities: - These include operation codes, numeric and string constants, and fixed
addresses.
Relative Entities: - These include the addresses of instructions and of working storage areas.
The Object Program: - It includes identification of which addresses are relative, which
symbols are defined externally, and which internally defined symbols are expected to
be referenced externally. These external references are resolved for two or more
MACRO EXPANSION
The use of a macro name with a set of actual parameters is replaced by some code
generated from its body. This is called macro expansion. Two kinds of expansion can be
readily identified: -
(i) Lexical Expansion: - Lexical expansion implies replacement of a character string by
another character string during program generation. Lexical expansion is typically
employed to replace occurrences of formal parameters by the corresponding actual
parameters.
(ii) Semantic Expansion: - Semantic expansion implies generation of instructions tailored to
the requirements of a specific usage, for example, generation of type-specific
instructions for manipulation of byte and word operands.
A macro call leads to macro expansion. During macro expansion, the macro call statement
is replaced by a sequence of assembly statements.
Two key notions concerning macro expansion are : -
1. Expansion time control flow: - This determines the order in which model statements are
visited during macro expansion.
2. Lexical substitution: - Lexical substitution is used to generate an assembly statement
from model statements.
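Lexical substitution of formal parameters by actual parameters can be sketched as follows. The macro INCR and the mnemonics MOVER, ADD and MOVEM are hypothetical and serve only as sample text; the model statements are visited in their textual order, i.e. the simplest possible expansion time control flow.

```python
import re


def expand_model_statement(statement, bindings):
    """Replace every &NAME in a model statement by its actual parameter."""
    return re.sub(r"&(\w+)", lambda m: bindings[m.group(1)], statement)


def expand_macro(formals, body, actuals):
    """Pair formal with actual parameters by position, then expand the body."""
    bindings = dict(zip(formals, actuals))
    return [expand_model_statement(s, bindings) for s in body]


# Hypothetical macro: INCR &MEM_VAL, &INCR_VAL, &REG
formals = ["MEM_VAL", "INCR_VAL", "REG"]
body = ["MOVER &REG, &MEM_VAL",
        "ADD &REG, &INCR_VAL",
        "MOVEM &REG, &MEM_VAL"]
```

Under these assumptions the call INCR A, B, AREG expands to the three statements MOVER AREG, A; ADD AREG, B; and MOVEM AREG, A.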
Macro Expansion: -
The macro expansion process is somewhat similar to language translation. The source program
containing macro definitions and calls is translated into an assembly language program without any
macro definitions or calls. This program is handed over to the assembler to obtain the target program. The
translator which performs macro expansion in this manner is called a macro pre-processor.
[Figure: Source Program → Macro pre-processor → Assembler → Target Code]
Issues Related to the Design of a Macro Pre-processor
The design issues of a macro pre-processor concern the definition and use
of macros in an assembly language program. To generate a statement during expansion,
we need a simple scheme for substituting each appearance of a formal parameter
with its value. A correspondence between a formal parameter and its value has to be
established for this purpose.
MACRO DEFINITION AND CALL
Macro Definition: - A macro definition is enclosed between a macro header statement
and a macro end statement. Macro definitions are typically located at the start of a
program. A macro definition consists of: -
(i) A macro prototype statement
(ii) One or more model statements
(iii) Macro preprocessor statements
The macro prototype statement declares the name of a macro and the names and
kinds of its parameters.
A model statement is a statement from which an assembly language statement
may be generated during macro expansion.
A preprocessor statement is used to perform auxiliary functions during macro
expansion.
The macro prototype statement has the following syntax: -
<macro name> [<formal parameter spec>[,..]]
where <macro name> appears
in the mnemonic field of an assembly statement and <formal parameter spec> is of
the form &<parameter name> [<parameter kind>].
Macro Call: - A macro is called by writing the macro name in the mnemonic field of an
assembly statement. The macro call has the syntax: -
<macro name>[<actual parameter spec>[,..]] where an actual parameter typically
resembles an operand specification in an assembly statement.
Nested Macro Calls: - A model statement in a macro may constitute a call on another
macro. Such calls are known as nested macro calls. We refer to the macro containing the
nested call as the outer macro and the called macro as the inner macro. Expansion of
nested macro calls follows the last-in-first-out (LIFO) rule. Thus, in a structure of nested
macro calls, expansion of the latest macro call (i.e. the innermost macro call in the
structure) is completed first.
ADVANCED MACRO FACILITIES
Advanced macro facilities are aimed at supporting semantic expansion. These facilities
can be grouped into: -
(i) Facilities for alteration of flow of control during expansion: - Two features are
provided to facilitate alteration of flow of control during expansion: -
(a) Expansion time sequencing symbols
(b) Expansion time statements AIF, AGO and ANOP.
A sequencing symbol (SS) has the syntax: .<ordinary string>
An SS is defined by putting it in the label field of a statement in the macro body. It is used as
an operand in an AIF or AGO statement to designate the destination of an expansion time
control transfer. An AIF statement has the syntax: -
AIF (<expression>) <sequencing symbol>
where <expression> is a relational expression involving ordinary strings, formal parameters
and their attributes, and expansion time variables. If the relational expression evaluates to
true, expansion time control is transferred to the statement containing <sequencing symbol> in
its label field. An AGO statement has the syntax: -
AGO <sequencing symbol>
and unconditionally transfers expansion time control to
the statement containing <sequencing symbol> in its label field. An ANOP statement is written as
<sequencing symbol> ANOP
and simply has the effect of defining the sequencing symbol.
(ii) Expansion Time Variables: - Expansion time variables (EVs) are variables which can
only be used during the expansion of macro calls. A local EV is created for use only during
a particular macro call. A global EV exists across all macro calls situated in a program and
can be used in any macro which has a declaration for it. Local and global EVs are created
through declaration statements with the following syntax: -
LCL <EV specification>
GBL <EV specification>
where <EV specification> has the syntax &<EV name>, and <EV name> is an ordinary
string.
Values of EVs can be manipulated through the preprocessor statement SET. A SET
statement is written as:
<EV specification> SET <SET-expression>
where <EV specification> appears in the
label field and SET in the mnemonic field. A SET statement assigns the value of
<SET-expression> to the EV specified in <EV specification>. The value of an EV can be used in any
field of a model statement and in the expression of an AIF statement.
(iii) Attributes of formal parameters: - An attribute is written using the syntax:
[Figure: Program with macro definitions and calls → Macro preprocessor → Program without macros → Assembler → Target program]
Figure: - A schematic of a macro preprocessor.
Design Overview: -
We begin the design by listing all tasks involved in macro expansion:-
i) Identify macro calls in the program
ii) Determine the values of formal parameters.
iii) Maintain the values of expansion time variables declared in a macro.
iv) Organize expansion time control flow
v) Determine the values of sequencing symbols
vi) Perform expansion of a model statement.
Identify macro calls: - A table called the macro name table (MNT) is designed to hold the
names of all macros defined in a program. A macro name is entered in this table when a macro definition is processed.
Determine values of formal parameters: - A table called the actual parameter table (APT) is
designed to hold the values of formal parameters during the expansion of a macro call.
Each entry in the table is a pair (<formal parameter name>, <value>). Two items of
information are needed to construct this table: the names of the formal parameters and the default
values of keyword parameters.
Maintain the values of expansion time variables: - An expansion time variables table (EVT)
is maintained for this purpose. The table contains pairs of the form
(<EV name>, <value>). The value field of a pair is accessed when a preprocessor
statement or a model statement under expansion refers to an EV.
Organize expansion time control flow: - The body of a macro, i.e. the set of preprocessor
statements and model statements in it, is stored in a table called the macro definition
table (MDT) for use during macro expansion.
Determine values of sequencing symbols: - A sequencing symbol table (SST) is maintained
to hold this information. The table contains pairs of the form
(<sequencing symbol name>, <MDT entry#>), where <MDT entry#> is the number of the
MDT entry which contains the model statement defining the sequencing symbol.
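The tables above can be tied together in a small sketch of the expansion loop. It is an illustration under several assumptions, not a complete preprocessor: each macro keeps its own MDT and SST, a statement is a (label, mnemonic, operands) triple, sequencing symbols start with '.', expressions are limited to "<int>" or "<int> <op> <int>", and the CLEAR macro with its MOVER/MOVEM mnemonics is hypothetical. Keyword parameters, default values and global EVs are left out.

```python
import re


def define_macro(mnt, name, formals, body):
    """Enter a macro in the MNT and build its MDT and SST from the body."""
    mdt = list(body)
    sst = {label: i for i, (label, _, _) in enumerate(mdt) if label.startswith(".")}
    mnt[name] = {"formals": formals, "mdt": mdt, "sst": sst}


def substitute(text, apt, evt):
    """Lexical substitution: replace &NAME by its value from the APT or EVT."""
    def value(match):
        name = match.group(1)
        return str(apt[name]) if name in apt else str(evt[name])
    return re.sub(r"&(\w+)", value, text)


def eval_expr(text):
    """Evaluate '<int>' or '<int> <op> <int>', optionally in parentheses."""
    parts = text.strip("() ").split()
    if len(parts) == 1:
        return int(parts[0])
    left, op, right = int(parts[0]), parts[1], int(parts[2])
    return {"+": left + right, "-": left - right,
            "EQ": left == right, "NE": left != right, "LT": left < right}[op]


def expand(mnt, name, actuals):
    macro = mnt[name]
    apt = dict(zip(macro["formals"], actuals))  # actual parameter table
    evt = {}                                    # expansion time variables table
    output, mec = [], 0                         # mec: number of the next MDT entry
    while mec < len(macro["mdt"]):
        label, mnemonic, operands = macro["mdt"][mec]
        mec += 1
        if mnemonic == "LCL":
            evt[operands.lstrip("&")] = 0
        elif mnemonic == "SET":
            evt[label.lstrip("&")] = eval_expr(substitute(operands, apt, evt))
        elif mnemonic == "AIF":
            condition, target = operands.rsplit(" ", 1)
            if eval_expr(substitute(condition, apt, evt)):
                mec = macro["sst"][target]      # conditional control transfer
        elif mnemonic == "AGO":
            mec = macro["sst"][operands]        # unconditional control transfer
        elif mnemonic != "ANOP":
            output.append(f"{mnemonic} {substitute(operands, apt, evt)}")
    return output


mnt = {}
define_macro(mnt, "CLEAR", ["X", "N"], [
    ("", "LCL", "&M"),
    ("&M", "SET", "0"),
    ("", "MOVER", "AREG, ='0'"),
    (".MORE", "MOVEM", "AREG, &X+&M"),
    ("&M", "SET", "&M + 1"),
    ("", "AIF", "(&M NE &N) .MORE"),
])
```

With these definitions, expand(mnt, "CLEAR", ["B", "3"]) visits the MOVEM model statement three times, because the AIF statement keeps transferring control back to .MORE until &M equals &N.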
ASPECT OF COMPILATION
Two aspects of compilation are:
Generate code to implement the meaning of a source program.
Provide diagnostics for violations of programming language semantics in a source program.
MEMORY ALLOCATION
Memory allocation involves three important tasks: -
(i) Determine the amount of memory required to represent the value of a data item.
(ii) Use an appropriate memory allocation model to implement the lifetimes and scopes of
data items.
(iii) Determine appropriate memory mappings to access the values in a non-scalar data item,
e.g. values in an array.
Types of Memory Allocation: - Memory allocation is typically of two types:
(i) Static Memory allocation
(ii) Dynamic Memory allocation
Memory binding: - A memory binding is an association between the ‘memory address’
attribute of a data item and the address of a memory area.
Memory allocation is the procedure used to perform memory binding.
Memory binding can be static or dynamic in nature, giving rise to the static and dynamic
memory allocation models.
In static memory allocation, memory is allocated to a variable before the execution of a
program begins.
Static memory allocation is typically performed during compilation.
No memory allocation or deallocation actions are performed during the execution of a
program.
In dynamic memory allocation, memory bindings are established and destroyed during the
execution of a program. Typical examples of the use of these memory allocation models
are Fortran for static allocation and block structured languages like PL/I, Pascal, Ada etc. for
dynamic allocation.
Dynamic memory allocation has two flavors: -
a. Automatic allocation
b. Program controlled allocation.
The former implies memory binding performed at the execution initiation time of a program unit,
while the latter implies memory binding performed during the execution of a program
unit.
In automatic dynamic allocation, memory is allocated to the variables declared in a
program unit when the program unit is entered during execution, and is deallocated when the program unit is
exited. Thus the same memory area may be used for the variables of different program
units. It is also possible that different memory areas may be allocated to the same variable
in different activations of a program unit.
In program controlled dynamic allocation, a program can allocate or deallocate memory at
arbitrary points during its execution.
It is obvious that in both automatic and program controlled allocation, the address of the
memory area allocated to a program unit cannot be determined at compilation time.
Dynamic memory allocation is implemented using stacks and heaps, thus necessitating
pointer based access to variables.
Automatic dynamic allocation is implemented using a stack, since entry to and exit from
program units is LIFO in nature.
Program controlled dynamic allocation is implemented using a heap. A pointer is now
needed to point to each allocated memory area.
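Stack-based automatic allocation can be sketched with a simple simulation. The class AutomaticAllocator, the one-word-per-variable assumption and the unit names are made up for this example; a real implementation works with activation records in memory rather than Python dictionaries.

```python
class AutomaticAllocator:
    """Memory is bound when a program unit is entered and freed when it exits."""

    def __init__(self):
        self.top = 0       # next free address on the stack
        self.frames = []   # one (unit name, base address) per activation

    def enter(self, unit, variables):
        """Allocate one word per variable; return each variable's address."""
        base = self.top
        self.frames.append((unit, base))
        self.top += len(variables)
        return {name: base + i for i, name in enumerate(variables)}

    def exit(self):
        """Exit is LIFO: the most recently entered unit is released first."""
        unit, base = self.frames.pop()
        self.top = base    # the area becomes available for reuse
        return unit
```

If unit f is entered and exited and then unit g is entered, g's variable receives the address that f's variable just released. If f is entered again while g is still active, f's variable receives a different address than in its first activation, which is why such addresses cannot be fixed at compilation time.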
CODE OPTIMIZATION
The main purpose of code optimization is to improve the efficiency of program execution. It
can be achieved in two ways:
(i) Redundancies in a program are eliminated.
(ii) Computations in a program are rearranged to make it execute efficiently.
Two points concerning the scope of optimization should also be considered:
Optimization seeks to improve a program rather than the algorithm of the program.
Efficient code generation for a target machine.
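One way of eliminating a redundancy is common subexpression elimination within a straight-line sequence of statements, sketched below. This technique is chosen here only as an illustration; the statement format (target, op, arg1, arg2) and the "copy" operation are assumptions of the example.

```python
def eliminate_common_subexpressions(code):
    """Replace a recomputation of an available expression by a copy."""
    available = {}  # (op, arg1, arg2) -> variable that already holds the value
    result = []
    for target, op, a, b in code:
        key = (op, a, b)
        holder = available.get(key)
        if holder is not None:
            result.append((target, "copy", holder, None))  # redundancy removed
        else:
            result.append((target, op, a, b))
        # Assigning to `target` invalidates every expression that reads it
        # and every expression whose value was held in it.
        available = {k: v for k, v in available.items()
                     if target not in k[1:] and v != target}
        if holder is None and target not in (a, b):
            available[key] = target
    return result


code = [
    ("t1", "*", "a", "b"),
    ("t2", "*", "a", "b"),   # same computation as t1: redundant
    ("a", "+", "a", "1"),    # changes a, so a*b is no longer available
    ("t3", "*", "a", "b"),   # must be computed again
]
```

The second statement becomes a copy of t1, while the last one is kept because the value of a changed in between. The program is improved without touching its algorithm.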
INTERPRETERS
Another type of software which also performs translation is called an interpreter.
The compiler and the interpreter take different approaches to translation. An interpreter translates
the program line by line. Each time the program is executed, the interpreter checks every
line for syntax errors and then converts it to equivalent machine code. An interpreter allows fast
debugging, but its execution time is greater than that of a compiled program.
Compiler Vs Interpreter
1. Compiler: Scans the entire program first and then translates it into machine code.
   Interpreter: Translates the program line by line.
2. Compiler: Converts the entire program to machine code; execution takes place when all the syntax errors are removed.
   Interpreter: Each time the program is executed, every line is checked for syntax errors and then converted to equivalent machine code.
3. Compiler: Slow for debugging.
   Interpreter: Good for fast debugging.
4. Compiler: Execution time is less.
   Interpreter: Execution time is more.
5. Compiler: Associated with object code.
   Interpreter: Not associated with object code.
6. Compiler: (a) If there is no error, generates object/executable code. (b) If there are errors, displays a listing of the errors.
   Interpreter: If an error is encountered in a particular line, program execution is temporarily suspended and control is transferred to that line for debugging. After debugging, program execution continues.
Chapter – 8 [Linkers]
LINKER
The need for linking a program with other programs arises because a program written
by a programmer is rarely of a stand-alone nature. That is, a program generally cannot
execute on its own, without requiring the presence of some other programs in the computer's
memory. For example, consider a program written in a high level language like ‘C’. Such a
program may contain calls to certain input/output functions like printf() and scanf(), which
are not written by the programmer himself. During program execution those standard
functions must reside in memory.
The linking function makes the addresses of programs known to each other.
RELOCATION
Another function commonly performed by a loader is relocation. This function can
be explained as follows:
Suppose a program written in ‘C’ calls the standard function printf(). Assume that the
program which calls printf() is named ‘A’. ‘A’ and printf() would have to be
linked with each other, but where in main storage shall we load ‘A’ and printf()? A possible
solution would be to load them according to the addresses assigned when they were translated.
It should be noted that relocation is simply moving a program from one area of storage to
another. It refers to the adjustment of address fields. The part of a loader which performs
relocation is called a relocating loader.
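The adjustment of address fields can be sketched as follows. The instruction format (mnemonic, address), the list of relocatable positions and the origins 500 and 900 are assumptions made for the example.

```python
def relocate(code, relocatable, translated_origin, load_origin):
    """Adjust the address fields listed in `relocatable` for a new load origin."""
    offset = load_origin - translated_origin
    relocated = list(code)
    for index in relocatable:
        opcode, address = relocated[index]
        relocated[index] = (opcode, address + offset)
    return relocated


# Hypothetical program translated as if it would be loaded at address 500.
code = [("LOAD", 504), ("ADD", 505), ("STORE", 506), ("HALT", 0)]
relocatable = [0, 1, 2]  # HALT has no address field that needs adjusting
```

Moving the program from origin 500 to origin 900 adds 400 to every relocatable address field, so LOAD 504 becomes LOAD 904, while HALT is left unchanged.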
LOADER
A loader is a program which accepts object code and prepares it for execution. The
object code produced by an assembler/compiler cannot be executed without modification.
As many as four more functions must be performed. These functions are performed by the
loader. These functions are:-
(i) Allocation of space in main memory for the programs.
(ii) Linking of programs with each other, e.g. with library functions.
(iii) Adjustment of all address-dependent locations, called relocation.
(iv) Physically loading the machine instructions and data into memory.
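The four functions can be seen together in a toy loader. The object module layout used here (a dictionary with "code", "defs", "relocs" and "externs"), the mnemonics and the memory origin 100 are all assumptions for the sketch; real object formats carry the same kinds of information in a different shape.

```python
def load(modules, memory_start=0):
    """Allocate, link, relocate and load a list of object modules."""
    # (i) Allocation: give each module its own area of memory.
    origins, next_free = {}, memory_start
    for m in modules:
        origins[m["name"]] = next_free
        next_free += len(m["code"])
    # Collect the load-time address of every externally visible symbol.
    symbols = {}
    for m in modules:
        for symbol, offset in m["defs"].items():
            symbols[symbol] = origins[m["name"]] + offset
    memory = {}
    for m in modules:
        origin = origins[m["name"]]
        for i, (op, operand) in enumerate(m["code"]):
            if i in m["externs"]:
                operand = symbols[m["externs"][i]]  # (ii) linking
            elif i in m["relocs"]:
                operand += origin                   # (iii) relocation
            memory[origin + i] = (op, operand)      # (iv) loading
    return memory


# Hypothetical object modules: program A calls the library function printf.
module_a = {"name": "A",
            "code": [("CALL", 0), ("LOAD", 3), ("HALT", 0), ("DATA", 7)],
            "defs": {}, "relocs": [1], "externs": {0: "printf"}}
module_printf = {"name": "printf",
                 "code": [("OUT", 0), ("RET", 0)],
                 "defs": {"printf": 0}, "relocs": [], "externs": {}}
```

Loading both modules from address 100 places A at 100 to 103 and printf at 104 to 105; the CALL in A is linked to address 104, and the LOAD operand is relocated from 3 to 103.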
LOADER SCHEMES
There are several schemes for accomplishing the four loading functions. These schemes
are:
(i) Absolute Loader: - An absolute loader simply accepts the machine language code produced by
the assembler and places it into main memory.
(ii) Relocating Loader: - To avoid possible reassembling of all sub-routines when a single
sub-routine is changed, and to perform the tasks of allocation and linking for the programmer, the
general class of relocating loaders was introduced. The output of a relocating loader is the
object program and information about all other programs it references.
(iii) Direct Linking Loader: - It is a general relocatable loader and is perhaps the most popular
loading scheme presently used. It has the advantage of allowing the programmer multiple
procedure segments and multiple data segments, and of giving him complete freedom in
referencing data or instructions contained in other segments.
(iv) Dynamic Loader: - In each of the previous loader schemes we have assumed that all of the
sub-routines needed are loaded into main memory at the same time. If the total amount
of memory required by all these sub-routines exceeds the amount available, then an error
message is displayed. There are several hardware techniques, such as paging and
segmentation, that attempt to solve this problem. Here the problem can be solved by
dynamic loading schemes in the following manner.
Usually the sub-routines of a program are needed at different times. So it is possible
to produce an overlay structure that identifies mutually exclusive sub-routines.
Software Tools - Computing involves two main activities: program development and use of
application software. Language processors and operating systems play an obvious role in these
activities. A less obvious but vital role is played by programs that help in developing and using
other programs. These programs, called software tools, perform various housekeeping tasks
involved in program development and application usage.
A software tool is a system program which:
Interfaces a program with the entity generating its input data, or
Interfaces the results of a program with the entity consuming them.
Figure shows a schematic of a software tool: