
Hardware abstract models –

structures and logic networks


HARDWARE MODELING LANGUAGES
• Hardware description languages are primarily motivated by the need to
specify circuits. Several HDLs exist, with different features and goals
• some evolved from programming languages, like AHPL (A Hardware
Programming Language), which was based upon APL (A Programming
Language), and VHDL (VHSIC Hardware Description Language), which was
derived from Ada
• Ada is a structured, statically typed, imperative and object-oriented high-
level computer programming language, extended from Pascal and other
languages
• HDLs must also capture the specific nature of hardware circuits, which is
fairly different from that of commonly used software programming languages.
Circuit models, synthesis and optimization: a
simplified view
Difference between Hardware and Software Languages
• Hardware circuits can execute operations with a wide degree of
concurrency; software programs are most commonly executed on
uni-processors, and hence operations are serialized
• A hardware language is closer to a programming language for parallel
computers; in addition, hardware circuits entail some structural
information
• The interface of a circuit with the environment may require the
definition of the input/output ports and of the data format for
input/output
• HDLs must support both behavioral and structural views, to be used
efficiently for circuit specification
• Detailed timing of the operations is very important in hardware,
because of interface requirements; on the other hand, the specific
execution time frames of the operations in software programs are of
less concern
• Circuits can be modeled under different views, and consequently HDLs
support architectural- and logic-level modeling, with behavioral or
structural views
• Some languages support combined views, thus allowing a designer to specify
implementation details for desired parts
• Synthesis tools support the computer-aided transformation of behavioral
models into structural ones, at both the architectural and logic levels
• HDLs serve also the purpose of exchange formats among tools and
designers.
• Circuit models require validation by simulation or verification methods.
Synthesis methods use HDL models as a starting point. As a result, several
goals must be fulfilled by HDLs.
• The multiple goals of HDLs cannot be achieved by programming languages
applied to hardware specification. Standard programming languages have
been used
• for functional modeling of processors that can be validated by compiling
and executing the models.
• The enhancement of the C programming language for simulation and
synthesis has led to new HDLs, such as ESIM
• We use the term HDL model as the counterpart of program in software
Distinctive Features of Hardware
Languages
• A language can be characterized by its syntax, semantics and
pragmatics.
• The syntax relates to the language structure and it can be specified by
a grammar.
• The semantics relates to the meaning of a language. The semantic
rules associate actions to the language fragments that satisfy the
syntax rules
• The pragmatics relate to the other aspects of the language, including
implementation issues
• Languages can be broadly classified as procedural and declarative
languages
• Procedural programs specify the desired action by describing a sequence of
steps whose order of execution matters
• declarative models express the problem to be solved by a set of
declarations without detailing a solution method
• Alternative classifications of languages exist. Languages with an imperative
semantics are those where there is an underlying dependence between the
assignments and the values that the variables take
• Languages with an applicative semantics are those based on function
invocation.
• Languages for hardware specification are classified on the basis of the
description view that they support (e.g., physical, structural or behavioral)
• Most HDLs support both structural and behavioral views, because circuit
specifications often require both
• Structural Hardware Languages
• Models in structural languages describe an interconnection of components.
Hence their expressive power is similar to that of circuit schematics
• Hierarchy is often used to make the description modular and compact.
VHDL language, using its structural modeling
capability
Behavioral Hardware Languages
• We consider behavioral modeling for circuits at increasing levels of complexity.
Combinational logic circuits can be described by a set of ports (inputs
and outputs) and by equations that relate variables to logic expressions.
• The declarative paradigm applies best to combinational circuits, which are
by definition memoryless
• These models differ from structural models in that there is not a one-to-
one correspondence between expressions and logic gates, because for
some expressions there may not exist a single gate implementing them.
• Procedural languages can be used to describe combinational logic
circuits. Most procedural hardware languages, except for the early ones,
allow for multiple assignments to variables.
structures and logic networks
• Structural representations can be modeled in terms of incidence
structures.
• An incidence structure consists of a set of modules, a set of nets and
an incidence relation among modules and nets
• A simple model for the structure is a hypergraph, where the vertices
correspond to the modules and the edges to the nets
• The incidence relation is then represented by the corresponding
incidence matrix
• Note that a hypergraph is equivalent to a bipartite graph, where the
two sets of vertices correspond to the modules and nets.
• Consider the example of Figure 3.5(a). The corresponding hypergraph and bipartite
graphs are shown in Figures 3.5(b) and (c), respectively. A module-oriented
netlist is the following:
• ml: nl,n2,n3
• m2: nl,n2
• m3: n2, n3
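As an illustration, the module-oriented netlist above can be turned into the hypergraph's incidence matrix programmatically. This is a minimal sketch (the netlist is the one listed above; the function and variable names are ours):

```python
# Sketch: building the incidence matrix of the module-oriented netlist
# m1: n1,n2,n3; m2: n1,n2; m3: n2,n3.
netlist = {
    "m1": ["n1", "n2", "n3"],
    "m2": ["n1", "n2"],
    "m3": ["n2", "n3"],
}

nets = sorted({n for conns in netlist.values() for n in conns})

def incidence_matrix(netlist, nets):
    """Rows = modules, columns = nets; entry 1 if the module touches the net."""
    return [[1 if n in conns else 0 for n in nets]
            for conns in netlist.values()]

A = incidence_matrix(netlist, nets)

# Each hyperedge (net) is the set of modules incident to it; this is also
# one side of the equivalent bipartite graph.
hyperedges = {n: [m for m, conns in netlist.items() if n in conns] for n in nets}
```

The same dictionary also yields the bipartite view directly: modules and nets are the two vertex sets, and each (module, net) incidence is an edge.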
• Incidence structures can be made hierarchical in the following way. A leaf module is a
primitive with a set of pins. A non-leaf module is a set of modules, called its submodules,
a set of nets and an incidence structure relating the nets to the pins of the module itself
and to those of the submodules.
• For example, module m2 consists of submodules m21 and m22, subnets n21 and n22
and internal pins p21, p22, p23, p24 and p25.
Logic Networks
• A generalized logic network is a structure where each leaf module is
associated with a combinational or sequential logic function. While this
concept is general and powerful, we consider here two restrictions to this
model: the combinational logic network and the synchronous logic network.
• The combinational logic network, called also logic network or
Boolean network, is a hierarchical structure where:
• Each leaf module is associated with a multiple-input, single-output
combinational logic function, called a local function.
• Pins are partitioned into two classes, called inputs and outputs. Pins
that do not belong to submodules are also partitioned into two
classes, called primary inputs and primary outputs.
• Each net has a distinguished terminal, called a source, and an
orientation from the source to the other terminals. The source of a
net can be either a primary input or a primary output of a module at
the inferior level.
• In general, a logic network is a hybrid structural/behavioral
representation, because the incidence structure provides a structure
while the logic functions denote the terminal behavior of the leaf
modules
• In most cases, logic networks are used to represent multiple-input,
multiple-output logic functions in a structured way. Indeed, logic networks
have a corresponding unique input/output combinational logic
function
State Diagrams
• The behavioral view of sequential circuits at the logic level can be expressed by
finite-state machine transition diagrams
• A set of primary input patterns, X.
• A set of primary output patterns, Y.
• A set of states, S.
• A state transition function, δ : X × S → S.
• An output function, λ : X × S → Y for Mealy models or λ : S → Y for Moore
models.
• An initial state.
• The state transition table is a tabulation of the state transition and output
functions. Its corresponding graph-based representation is the state transition
diagram.
• The state transition diagram is a labeled directed multi-graph G(V, E), where
the vertex set V is in one-to-one correspondence with the state set S
and the directed edge set E is in one-to-one correspondence with the
transitions specified by δ.
• An edge exists for each input pattern x causing a transition from state si to
state sj. In the Mealy model, such an edge is labeled by x/λ(x, si); in the
Moore model, that edge is labeled by x only.
Data-flow and Sequencing Graphs
• We describe here a Mealy-type finite-state machine that acts as a synchronizer
• between two signals. The primary inputs are a and b, plus the reset signal r.
• There is one primary output o that is asserted when both a and b are
simultaneously true or
• when one is true and the other was true at some previous time. The finite-state
• machine has four states. A reset state So. A state memorizing that a was true
while b
• was false. called S1, and a similar one for b. called s2. Finally, a state
corresponding to
• both a and b being, or having been, true, called s3. The state transitions and
the output
• function are annotated on the diagram
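A minimal executable sketch of this synchronizer follows. The exact transitions are an assumption consistent with the description above, not a transcription of the book's figure, and the reset input r is ignored for brevity:

```python
# Sketch of the synchronizer as a Mealy machine over inputs (a, b).
# States: s0 = reset, s1 = "a seen", s2 = "b seen", s3 = "both seen".
def next_state_and_output(state, a, b):
    """Return (next_state, o) for one clock step of the synchronizer."""
    if state == "s0":
        if a and b:
            return "s3", 1        # both true simultaneously: assert o
        if a:
            return "s1", 0        # remember that a was true
        if b:
            return "s2", 0        # remember that b was true
        return "s0", 0
    if state == "s1":             # a was true, waiting for b
        return ("s3", 1) if b else ("s1", 0)
    if state == "s2":             # b was true, waiting for a
        return ("s3", 1) if a else ("s2", 0)
    return "s3", 1                # s3: both seen, o stays asserted

def run(inputs, state="s0"):
    outs = []
    for a, b in inputs:
        state, o = next_state_and_output(state, a, b)
        outs.append(o)
    return outs

# a true at step 0, b true at step 2: o rises at step 2 and stays high.
print(run([(1, 0), (0, 0), (0, 1), (0, 0)]))
```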
• State transition diagrams can be made hierarchical by means of calling
vertices: a calling vertex of a diagram at the higher level in the hierarchy
stands for a lower-level diagram. Each transition to a calling vertex
is equivalent to a transition into the entry state of the corresponding finite-
state machine diagram. A transition into an exit state corresponds to a return
to the calling vertex.
A hierarchical state transition diagram
• A hierarchical state transition diagram is shown in Figure 3.10. There
are two levels in the diagram: the top level has three states, the other
four.
• A transition into the calling state is equivalent to a transition to the entry
state of the lower level of the hierarchy. A transition into the exit state
corresponds to a transition back to the calling state.
• In simple words, the dotted edges of the diagrams are traversed
immediately
COMPILATION AND BEHAVIORAL
OPTIMIZATION
• We explain in this section how circuit models, described by HDL programs, can
be transformed into the abstract models that will be used as a starting point
for synthesis
• Most hardware compilation techniques have analogues in software
compilation. Since hardware synthesis followed the development of
software compilers, many techniques were borrowed and adapted from
the rich field of compiler design
• Nevertheless, some behavioral optimization techniques are applicable only
to hardware synthesis
• A software compiler consists of a front end that transforms a program into
an intermediate form and a back end that translates the intermediate form
into the machine code for a given architecture
• The front end is language dependent, and the back end varies
according to the target machine. Most modern optimizing compilers
improve the intermediate form, so that the optimization is neither
language nor machine dependent
• Similarly, a hardware compiler can be seen as consisting of a front
end, an optimizer and a back end
• The back end of a hardware compiler is much more complex than that of a
software compiler, because of the requirements on the timing and interfaces
of the internal operations.
• The back end exploits several techniques that go under the generic
names of architectural synthesis, logic synthesis and library binding
Compilation Techniques
• The front end of a compiler is responsible for lexical and syntax analysis,
parsing and creation of the intermediate form.
• A lexical analyzer is a component of a compiler that reads the source model
and produces as output a set of tokens that the parser then uses for syntax
analysis.
• A lexical analyzer may also perform ancillary tasks, such as stripping
comments and expanding macros.
• A parser receives a set of tokens. Its first task is to verify that they satisfy
the syntax rules of the language. The parser has knowledge of the grammar
of the language and it generates a set of parse trees
• A parse tree is a tree-like representation of the syntactic structure of a
language
Whereas the front ends of a compiler for software and hardware are very similar,
the subsequent steps may be fairly different. In particular, for hardware languages,
diverse strategies are used according to their semantics and intent
Optimization Techniques
• Behavioral optimization is a set of semantic-preserving
transformations that minimize the amount of information needed to
specify the partial order of tasks.
• No knowledge about the circuit implementation style is required at
this stage. The latitude of applying such optimization depends on the
freedom to rearrange the intermediate code.
• Models that are highly constrained to adhere to a time schedule or to
an operator binding may benefit very little from the following
techniques.
• Behavioral optimization can be implemented in different ways. It can
be applied directly to the parse trees, during the generation of the
intermediate form, or even on the intermediate form itself.
• For the sake of explanation, we consider here these transformations
as applied to sequences of statements, i.e., as program-level
transformations
• Algorithms for behavioral optimization of HDL models can be
classified as data-flow and control-flow oriented.
DATA-FLOW-BASED TRANSFORMATIONS.
• Tree-height reduction. This transformation applies to arithmetic
expression trees and strives to split the expression into two-operand
expressions, so that the parallelism available in hardware can be
exploited at best. It can be seen as a local transformation, applied to
each compound arithmetic statement.
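The effect can be sketched in a few lines: a chain of n operands is rebuilt as a balanced binary tree of two-operand additions, reducing the tree height from n-1 to about log2(n). The code below is an illustrative sketch, not an actual compiler pass:

```python
# Sketch of tree-height reduction for an n-operand sum: split the operand
# list in half recursively, producing a balanced tree of 2-operand adds.
def balance(operands):
    """Return (expr, height) for a balanced binary addition tree."""
    if len(operands) == 1:
        return operands[0], 0
    mid = len(operands) // 2
    left, hl = balance(operands[:mid])
    right, hr = balance(operands[mid:])
    # One adder level on top of the taller of the two subtrees.
    return f"({left} + {right})", 1 + max(hl, hr)

expr, height = balance(["a", "b", "c", "d"])
print(expr, height)   # height 2, versus 3 for serial chaining ((a+b)+c)+d
```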
Constant and variable propagation.
• Constant propagation, also called constant folding, consists of
detecting constant operands and pre-computing the value of the
operation with that operand. Since the result may again be a
constant, the new constant can be propagated to those operations
that use it as input.
• Example: consider the following fragment:
• a = 0; b = a + 1; c = 2 * b;.
• It can be replaced by a = 0; b = 1; c = 2;.
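A sketch of this pass over straight-line code follows. The statement encoding (tuples of target, operator and operands) is our own illustrative choice:

```python
# Sketch of constant folding: statements are (target, op, arg1, arg2);
# known constants are substituted and constant operations pre-computed.
def fold(statements):
    consts, out = {}, []
    for target, op, a, b in statements:
        a = consts.get(a, a)          # substitute known constant operands
        b = consts.get(b, b)
        if isinstance(a, int) and (b is None or isinstance(b, int)):
            value = a if op == "=" else {"+": a + b, "*": a * b}[op]
            consts[target] = value    # result is constant: propagate it
            out.append((target, "=", value, None))
        else:
            out.append((target, op, a, b))
    return out

# a = 0; b = a + 1; c = 2 * b;  becomes  a = 0; b = 1; c = 2;
prog = [("a", "=", 0, None), ("b", "+", "a", 1), ("c", "*", 2, "b")]
print(fold(prog))
```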
Dead code elimination.
• Dead code consists of all those operations that cannot be reached, or
whose result is never referenced elsewhere. Such operations are
detected by data-flow analysis and removed. Obvious cases are those
statements following a procedure return statement.
• Example 3.4.9. Consider the following fragment: a = x; b = x + 1; c = 2 * x;.
If variable a is not referenced in the subsequent code, the first
assignment can be removed.
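A sketch of the data-flow analysis behind this: one backward pass over straight-line code keeps only the assignments whose targets are live. This is a simplification (no control flow) with our own statement encoding:

```python
# Sketch: remove assignments whose target is never referenced later.
# Statements are (target, [operand variables]); one backward pass.
def eliminate_dead(statements, live_outputs):
    live, kept = set(live_outputs), []
    for target, expr_vars in reversed(statements):
        if target in live:
            kept.append((target, expr_vars))
            live.discard(target)      # this definition satisfies the use
            live.update(expr_vars)    # its operands become live
    return list(reversed(kept))

# a = x; b = x + 1; c = 2 * x;  with only b and c referenced afterwards:
prog = [("a", ["x"]), ("b", ["x"]), ("c", ["x"])]
print(eliminate_dead(prog, ["b", "c"]))   # the assignment to a is removed
```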
Operator strength reduction.
• Operator strength reduction means reducing the cost of
implementing an operator by using a simpler one. Even though in
principle some notion of the hardware implementation is required,
very often general considerations apply. For example, a multiplication
by 2 (or by a power of 2) can be replaced by a shift; shifters are
faster and smaller than multipliers in many implementations.
• Example 3.4.10. Consider the following fragment: a = x^2; b = 3 * x;. It can be
replaced by a = x * x; t = x << 1; b = x + t;.
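The rewriting above can be checked numerically; the helper name below is ours:

```python
# Sketch: multiplication by a power of two replaced by a shift gives
# the same value, which is the basis of the strength reduction above.
def times_pow2(x, k):
    return x << k                     # x * 2**k without a multiplier

x = 7
a = x * x                             # x^2 rewritten as a single multiply
t = times_pow2(x, 1)                  # 2 * x  ->  x << 1
b = x + t                             # 3 * x  ->  x + (x << 1)
print(a, b)
```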
CONTROL-FLOW-BASED TRANSFORMATIONS.
• The following transformations are typical of hardware compilers. In
some cases these transformations are automated, in others they are
user driven.
• Model expansion.
• Writing structured models by exploiting subroutines and functions is useful
for two main reasons: modularity and re-usability. Modularity helps in
highlighting a particular task (or set of tasks). Often, models are called only
once; in that case, expanding (inlining) the called model in place is convenient.
• Example 3.4.12. Consider the following fragment: x = a + b; y = a * b;
z = foo(x, y);, where foo(p, q) { t = q - p; return(t); }. Then, by expanding
foo, we have x = a + b; y = a * b; z = y - x;.
Conditional expansion.
• A conditional construct can always be transformed into a parallel
construct with a test at the end. Under some circumstances this
transformation can increase the performance of the circuit. For
example, this happens when the conditional clause depends on some
late-arriving signals
• Example 3.4.13. Consider the following fragment: y = ab; if (a) { x = b +
d; } else { x = bd; }. The conditional statement can be flattened to x =
a(b + d) + a'bd; and, by some logic manipulation, the fragment can be
rewritten as y = ab; x = y + d(a + b);.
Loop expansion
• Loop expansion, or unrolling, applies to an iterative construct with
data-independent exit conditions. The loop is replaced by as many
instances of its body as the number of iterations
• Example 3.4.14. Consider the following fragment:
• x = 0; for (i = 1; i <= 3; i++) {x = x + a[i];}.
• The loop can be flattened to x = 0; x = x + a[1]; x = x + a[2]; x =
x + a[3]; and then transformed to x = a[1] + a[2] + a[3] by propagation
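The unrolling step can be sketched as simple template instantiation; the statement-string encoding and array values below are illustrative assumptions:

```python
# Sketch of loop unrolling: the data-independent loop
#   x = 0; for (i = 1; i <= 3; i++) x = x + a[i];
# is replaced by three instances of its body.
def unroll(body_template, lo, hi):
    """Return the list of statements obtained by expanding the loop."""
    return [body_template.format(i=i) for i in range(lo, hi + 1)]

stmts = unroll("x = x + a[{i}]", 1, 3)
print(stmts)

# After unrolling, propagation collapses the chain to x = a[1]+a[2]+a[3]:
a = {1: 10, 2: 20, 3: 30}
x = a[1] + a[2] + a[3]
```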
Block-level transformations
• Branching and iterative constructs segment the intermediate code
into basic blocks. Such blocks correspond to the sequencing graph
entities.
• Block-level transformations include block merging and expansions of
conditionals and loops; model expansion can be handled as a
straightforward extension.
• Collapsing blocks may provide more parallelism and enhance the
average performance. Five transformations have been proposed to
find the optimum number of expansions to be performed.
UNIT -3
Architectural Synthesis
• Architectural synthesis means constructing the macroscopic structure of a digital
circuit,
• starting from behavioral models that can be captured by data-flow or sequencing
graphs.
• The outcome of architectural synthesis is both a structural view of the circuit,
in particular of its data path, and a logic-level specification of its control unit.
• Architectural synthesis may be performed in many ways, according to the desired
circuit implementation style. Therefore a large variety of problems, algorithms
and tools have been proposed that fall under the umbrella of architectural
synthesis.
• Circuit implementations are evaluated on the basis of the following objectives:
area, cycle-time (i.e., the clock period) and latency (i.e., the number of cycles to
perform all operations) as well as throughput (i.e., the computation rate)
CIRCUIT SPECIFICATIONS FOR
ARCHITECTURAL SYNTHESIS
• Specifications for architectural synthesis include behavioral-level
circuit models, details about the resources being used and
constraints. Behavioral models are captured by sequencing graphs
• Resources
• Functional resources process data. They implement arithmetic or logic
functions and can be grouped into two subclasses:
• Primitive resources are sub circuits that are designed carefully once and often used.
Examples are arithmetic units and some standard logic functions, such as encoders and
decoders. Primitive resources can be stored in libraries
• Application-specific resources are sub circuits that solve a particular subtask. An
example is a sub circuit servicing a particular interrupt of a processor. In general such
resources are the implementation of other HDL models.
• Memory resources store data. Examples are registers and read-only and
read-write memory arrays. Requirement for storage resources are implicit
in the sequencing graph model
• Interface resources support data transfer. Interface resources include
busses that may he used as a major means of communication inside a data
path. External interface resources are IO pads and interfacing circuits
• The major decisions in architectural synthesis are often related to the
usage of functional resources
• formulating architectural synthesis and optimization problems, there is no
difference between primitive and application-specific functional resources.
Both types can be characterized in terms of area and performance and
used as building blocks
Constraints
• Constraints in architectural synthesis can be classified into two major
groups: interface constraints and implementation constraints
• Interface constraints are additional specifications to ensure that the circuit
can be embedded in a given environment. They relate to the format and
timing of the I/O data transfers.
• The timing separation of I/O operations can be specified by timing
constraints that can ensure that a synchronous I/O operation
• Implementation constraints reflect the desire of the designer to achieve a
structure
• with some properties. Examples are area constraints and performance
constraints, e.g., cycle-time and/or latency bounds.
THE FUNDAMENTAL ARCHITECTURAL
SYNTHESIS PROBLEMS
• We consider now the fundamental problems in architectural synthesis and
optimization. We assume that a circuit is specified by:
• A sequencing graph.
• A set of functional resources, fully characterized in terms of area and execution
delays.
• A set of constraints.
• We assume that storage is implemented by registers and interconnections by
wires; usage of internal memory arrays and busses is not considered here.
• We shall consider next non-hierarchical graphs with operations having bounded
and known execution delays, and present then extensions to hierarchical models
and unbounded delays.
• Sequencing graphs are polar and acyclic, the source and sink vertices being
labeled as v0 and vn respectively, where n = nops + 1.
• Architectural synthesis and optimization consists of two stages.
• First, placing the operations in time and in space, i.e., determining the
time interval for their execution and their binding to resources.
• Second, determining the detailed interconnections of the data path
and the logic-level specifications of the control unit.
• We show now that the first stage is equivalent to annotating the
sequencing graph with additional information
The Temporal Domain: Scheduling
• We denote the execution delays of the operations by the set D = {di; i = 0,
1, . . . , n}
• We assume that the delay of the source and sink vertices is zero.
• We define the start time of an operation as the time at which the
operation starts its execution. The start times of the operations are
represented by the set T = {ti; i = 0, 1, . . . , n}.
• Scheduling is the task of determining the start times, subject to the
precedence constraints specified by the sequencing graph
• The latency of a scheduled sequencing graph is denoted by λ, and it is the
difference between the start time of the sink and the start time of the
source, i.e., λ = tn - t0.
• A scheduled sequencing graph is a vertex-weighted sequencing
graph, where each vertex is labeled by its start time. A schedule may
have to satisfy timing and/or resource usage constraints. Different
scheduling algorithms have been proposed, addressing unconstrained
and constrained problems
Example: all operations are assumed to have unit execution delay. A scheduled
sequencing graph is shown in Figure 4.3; the start times of the operations are
summarized in a table, and the latency of the schedule is λ = tn - t0 = 5 - 1 = 4.
Consider again the sequencing graph of Figure 4.2, where all operations have
unit execution delay. A schedule with a bound on the resource usage of one
resource per type yields the scheduled sequencing graph of Figure 4.4; the
latency of this schedule is λ = tn - t0 = 8 - 1 = 7.
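An unconstrained schedule of the kind just described can be sketched as ASAP (as soon as possible) scheduling: each operation starts as soon as all its predecessors finish. The example graph and its unit delays below are illustrative, not those of Figure 4.2:

```python
# Sketch of unconstrained ASAP scheduling on a polar sequencing graph.
def asap(succs, delay, source="v0", sink="vn"):
    # Build predecessor sets from the successor lists.
    preds = {v: set() for v in succs}
    for u, vs in succs.items():
        for v in vs:
            preds[v].add(u)
    start, ready = {}, [source]
    while ready:
        u = ready.pop()
        # Start when the last predecessor finishes (source starts at 0).
        start[u] = max((start[p] + delay[p] for p in preds[u]), default=0)
        for v in succs[u]:
            if all(p in start for p in preds[v]):
                ready.append(v)       # all predecessors scheduled
    return start, start[sink] - start[source]   # schedule and latency

succs = {"v0": ["v1", "v2"], "v1": ["v3"], "v2": ["v3"], "v3": ["vn"], "vn": []}
delay = {"v0": 0, "v1": 1, "v2": 1, "v3": 1, "vn": 0}
start, latency = asap(succs, delay)
print(start, latency)
```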
Hierarchical Models
• When hierarchical graphs are considered, the concepts of scheduling and
binding must be extended accordingly
• A hierarchical schedule can be defined by associating a start time to
each vertex in each graph entity. The start times are now relative to
that of the source vertex in the corresponding graph entity. The start
times of the link vertices denote the start times of the sources of the
linked graphs
• The latency computation of a hierarchical sequencing graph, with
bounded delay operations, can be performed by traversing the
hierarchy bottom up
• The execution delay of a model call vertex is the latency of the
corresponding graph entity.
• The execution delay of a branching vertex is the maximum of the
latencies of the corresponding bodies.
• The execution delay of an iteration vertex is the latency of its body
times the maximum number of iterations.
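The three bottom-up rules above translate directly into code; the numeric latencies used in the example call are assumptions for illustration:

```python
# Sketch: execution delays of non-operation vertices in a hierarchical
# sequencing graph, computed bottom-up from the latencies of the bodies.
def call_delay(body_latency):
    return body_latency               # latency of the called graph entity

def branch_delay(body_latencies):
    return max(body_latencies)        # worst-case branch body

def iteration_delay(body_latency, max_iterations):
    return body_latency * max_iterations   # body latency times iterations

# A call to a 3-cycle entity, a branch with 2- and 5-cycle bodies, and a
# loop whose 4-cycle body runs at most 10 times:
print(call_delay(3), branch_delay([2, 5]), iteration_delay(4, 10))
```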
• A hierarchical binding can be defined as the ensemble of bindings of
each graph entity, restricted to the operation vertices. Operations in
different entities may share resources, which can be beneficial in
improving the area and performance of the circuit.
The Synchronization Problem
• There are operations whose delay is unbounded and not known at
synthesis time. Examples are external synchronization and data-dependent
iteration
• Scheduling unbounded-latency sequencing graphs cannot be done with
traditional techniques.
• Different methods can be used. The simplest one is to modify the
sequencing graph by isolating the unbounded-delay operations, i.e., by
splitting the graph into bounded-latency subgraphs; these subgraphs can
then be scheduled. Techniques exist for isolating the synchronization
points and implementing them in the control unit.
SCHEDULING ALGORITHMS
• Scheduling is a very important problem in architectural synthesis. Whereas
a sequencing graph prescribes only dependencies among the operations,
the scheduling of a sequencing graph determines the precise start time of
each task
• The start times must satisfy the original dependencies of the sequencing
graph, which limit the amount of parallelism of the operations
• Scheduling determines the concurrency of the resulting implementation,
and therefore it affects its performance. By the same token, the maximum
number of concurrent operations of any given type at any step of the
schedule is a lower bound on the number of required hardware resources
of that type
A MODEL FOR THE SCHEDULING
PROBLEMS
• We recall that the sequencing graph is a polar directed acyclic
graph G(V, E), where the vertex set V = {vi; i = 0, 1, . . . , n} is in one-
to-one correspondence with the set of operations and the edge set
E = {(vi, vj); i, j = 0, 1, . . . , n} represents dependencies.
• We recall also that n = nops + 1 and that we denote the source
vertex by v0 and the sink by vn; both are No-Operations. Let
D = {di; i = 0, 1, . . . , n} be the set of operation execution delays.
Unit 5
Physical Design
FLOORPLANNING
• The input to floorplanning is the output of system partitioning and
design entry: a netlist.
• As feature sizes decrease, both average interconnect delay and average
gate delay decrease, but at different rates, because interconnect
capacitance tends to a limit that is independent of scaling.
• Interconnect delay now dominates gate delay.
Floorplanning Goals and Objectives
• Floorplanning is a mapping between the logical description (the
• netlist) and the physical description (the floorplan).
• Goals of floorplanning:
• • arrange the blocks on a chip,
• • decide the location of the I/O pads,
• • decide the location and number of the power pads,
• • decide the type of power distribution, and
• • decide the location and type of clock distribution
• Objectives of floorplanning are:
• • to minimize the chip area, and
• • minimize delay.
Measurement of Delay in Floorplanning
• We don’t yet know the parasitic capacitance of the interconnect; we
know only the fanout (FO) of a net and the size of the block.
• Fan-out is a term that defines the maximum number of digital inputs
that the output of a single logic gate can feed.
• We estimate interconnect length from predicted-capacitance tables
(wire-load tables).
Predicted capacitance.
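A wire-load lookup of this kind can be sketched as a small table indexed by fanout; the capacitance values and the linear extrapolation beyond the table are invented for illustration:

```python
# Sketch of a predicted-capacitance (wire-load) table: the fanout of a
# net indexes an estimated net capacitance. Values are illustrative.
WIRE_LOAD = {            # fanout -> estimated net capacitance in pF
    1: 0.02, 2: 0.035, 3: 0.05, 4: 0.07,
}

def estimated_cap(fanout, pf_per_extra_fo=0.02):
    """Table lookup, extrapolating linearly beyond the last entry."""
    if fanout in WIRE_LOAD:
        return WIRE_LOAD[fanout]
    last = max(WIRE_LOAD)
    return WIRE_LOAD[last] + (fanout - last) * pf_per_extra_fo

print(estimated_cap(2), estimated_cap(6))
```

Real wire-load tables are also indexed by block size, since longer average wires are expected in larger blocks; that second index is omitted here for brevity.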
Floorplanning Tools
• We start with a random floorplan generated by a floorplanning tool.
• Key terms and concepts: flexible blocks • fixed blocks • seeding •
seed cells • wildcard symbol • hard seed • soft seed • seed connectors •
rat's nest • bundles • flight lines • congestion • aspect ratio •
die cavity • congestion map • routability • interconnect channels •
channel capacity • channel density
Channel Definition
• Key terms and concepts: channel definition or channel allocation •
channel ordering • slicing floorplan • cyclic constraint • switch box •
merge • selective flattening • routing order
