
NETWORK ON CHIP (NoC)-DESIGN METHODOLOGIES AND TEST COMPLEXITIES

T. Rajavenkatesan (1), R. Srinivasan (2)
Assistant Professor, K.S.Rangasamy College of Technology, Tiruchengode.
urs_raja@ymail.com (1), cnivasan1986@gmail.com (2)
Abstract
Increasing complexity and the short life cycles of
embedded systems are pushing current system-on-chip
design towards a rapid increase in the number of
programmable processing units, while decreasing the gate
count for custom logic. Designers face new challenges
due to the complexity of present multiprocessor
system-on-chip technology. The large design space calls for
exploring many alternatives during architecture design and
performance tuning. This bottleneck can be reduced
effectively by a Network-on-chip (NoC).
NoCs are proposed to address the communication
challenges present in system-on-chip (SoC) designs in
nanoscale technologies. The Network-on-chip design
paradigm has paved the way for integrating an
exceedingly high number of computational and storage
blocks in a single chip. The success of the NoC design paradigm
greatly depends on architectural design, standardization of
the integration between the cores, and the interconnection
fabric. However, its adoption and practical
implementation face important and unsolved issues related
to design methodologies, test strategies, and dedicated CAD
tools. Any methodology can be widely accepted only if it is
supported by an efficient Test Access Mechanism (TAM). This
paper gives an overview of Network-on-chip design, the
complexity of design integration, and effective test
scheduling methodologies.

Keywords: System-on-chip (SoC), Network-on-Chip (NoC),
Æthereal, Design styles, Interconnection complexities,
Network interfacing, Testing.
1. Introduction
Today's SoCs contain multiple programmable
processor cores, hardware accelerators, and dedicated
peripherals. In addition, a growing amount of embedded
software runs on SoCs, causing both hardware and
software complexity to increase rapidly. Many SoCs are
built using a globally asynchronous, locally synchronous
(GALS) design style to support tens of different clock
domains and to facilitate layout, power management, and
interfacing to the outside world. Before silicon is
manufactured, an SoC's correctness must be confirmed
through formal verification and simulation. These
techniques provide confidence that no design errors were
introduced and that the resulting chip will behave
according to its specification.
However, the number of use cases verified must
be traded off against the amount of design detail (that is,
the level of abstraction) included in the verification.
Therefore, functional and electrical problems might go
undetected at this stage, because it is impossible to verify
all use cases at the detail level of a physical
implementation. For GALS SoCs in particular, verifying
an SoC design's behaviour for all combinations of clock
frequencies and phases is not feasible. Prototype silicon,
therefore, can still contain errors that manifest themselves
only in the product, outside the controlled test and
verification environment. Any remaining error must be
found and removed as quickly as possible in post-silicon
validation and debug. Industry benchmarks show that, on
average, post-silicon debug consumes more than 50% of
total project time. Debugging an SoC is difficult
because it involves three nontrivial tasks: observing its
state, obtaining a consistent state, and directing the SoC
to the erroneous trace and state.
Observability
The first difficulty lies in the limited
observability that an SoC provides of what happens inside
it when it executes in its target environment and of why it
doesn't exhibit its specified behavior (i.e., the problem's
root cause). Ideally, we would use simulator-like
functionality to inspect the state and operation of each
intellectual property (IP) block in the chip, in as much
detail as needed to analyze the erroneous behavior.
In practice, two factors limit observability in
SoC implementations: the limited amount of
debug information that we can stream through the device
output pins in real time and the limited amount of on-chip
memory that we can dedicate to capturing debug
information without affecting system functionality or
adding too much to final product cost. One popular debug
approach that overcomes the observability problem is the
so-called interactive (or run/stop) technique, which stops
the execution of the SoC before its state is inspected in
detail. An advantage of this technique is that we can
inspect the SoC's full state without running into the
device pins' speed limitations. It also requires only a small
amount of additional debug logic in the SoC. The main
disadvantage of interactive debug is that, because the SoC
must be stopped prior to observing its state, the technique
is intrusive.
2. Network-on-chip (NoC)
Future System-on-Chip (SoC) architectures
are predicted to become communication-bound.
Traditionally, SoCs utilize topologies
based on shared buses. Dally and Towles [5] proposed
replacing dedicated, design-specific wires with a
general-purpose (packet-switched) network, marking the
beginning of the network-on-chip (NoC) era. The basic
properties of the NoC paradigm are:
• Separates communication from computation.
• Avoids a global, centralized controller for
  communication.
• Allows an arbitrary number of terminals.
• Has a topology that allows the addition of links as
  the system size grows (offers scalability).
• Does not utilize long, global wires spanning the
  whole chip.
• Allows customization (link width, buffer sizes, even
  topology).
• Allows multiple voltage and frequency domains.
• Delivers data in order, either naturally or via a
  layered protocol.
• Offers varying guarantees for transfers.
• Offers support for system testing.
However, there is still no commonly agreed
definition of the minimum network configuration that is to
be used in a real NoC. Some authors [6, 7] consider
packet switching a key property of a NoC. There are,
however, quite a few circuit-switched approaches dubbed
NoCs as well. Naturally, certain properties, such as bit
error rate, energy, and area, have to be minimized, as
noted by several authors.
2.1 Network on chip architecture
The NoC architecture provides the
communication infrastructure for the resources. It has
two main objectives. Firstly, it is possible to develop the
hardware of resources independently as stand-alone
blocks and create the NoC by connecting the blocks as
elements in the network. Secondly, the scalable and
configurable network is a flexible platform that can be
adapted to the needs of different workloads, while
maintaining the generality of application development
methods and practices.
In this section we first describe the prerequisites
for a design flow, independent of the NoC services. To be
able to generate an application-specific NoC, the NoC
must be modular, i.e. be constructed of simpler, re-usable
parameterised components: the router and network
interface (NI). At design time, these components must be
instantiated and connected in an appropriate topology.
Moreover, the IP ports must be connected to particular NI
ports (mapping). The result is a structural description
(hardware) of the NoC. The router and NI of the Æthereal
NoC have been documented in [8, 9].
The NoC as a whole is parameterised by the size
of the slot table, and by the operating speed (500 MHz in
all examples, which is the speed of the router and NI
implementations). All instances of the Æthereal NoC are
(re)configurable at run time. This means that the NIs can
be (re)programmed at run time, using standard
memory-mapped IO ports on the NoC, to support a variety of
connections [7]. (Routers are stateless and require no
configuration.) Within the NoC's hardware limits
(number of connections per port, slot table size, credit
counter bit widths, etc.) connections can be configured
with different (guaranteed) properties, such as throughput
and latency, by programming the path from master to
slave, the number of slots and flow-control credits, etc. A
configuration for a use case is a list of NoC memory
registers and their values. The Æthereal NoC offers both
best-effort and guaranteed services. The design flow
described in this paper can be used for any mix of
services. However, the advantage of NoCs with
guaranteed services is that they implement router and NI
arbitration schemes that allow analytical reasoning about
the performance of guaranteed connections independently
of the behaviour of other connections. This prerequisite is
essential for correct-by-construction NoC generation and
configuration, as well as compositional NoC performance
verification. The final prerequisite for a design flow is the
description of the application communication
requirements. It is not possible to generate a NoC without
knowing what the requirements of the application using it
will be. These requirements are described in the
following section, because they are given as an
input to the design flow. The prerequisites are therefore: a
modular NoC offering guaranteed services with
parameterised components (router, NI), and a description
of the application requirements. The next section uses
these foundations to offer a NoC design flow.
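
As a rough illustration of how the slot-table size relates to a
guaranteed throughput, the sketch below computes how many slots a
connection needs and reserves them in a table. It is a simplification
under stated assumptions: the 4-byte link width, the equal-share-per-slot
model, and the names slots_needed and reserve_slots are hypothetical and
are not taken from the NoC implementation; only the 500 MHz clock comes
from the text. Header and flow-control overhead are ignored.

```python
def slots_needed(required_mb_s, table_size, clock_mhz=500, link_bytes=4):
    """Slots a connection needs, assuming every slot carries an equal
    share of the raw link bandwidth (clock_mhz * link_bytes MB/s).
    The 4-byte link width is an assumption made for this sketch."""
    link_mb_s = clock_mhz * link_bytes
    # Integer ceiling of required / (link bandwidth / table size).
    needed = -(-required_mb_s * table_size // link_mb_s)
    if needed > table_size:
        raise ValueError("requirement exceeds the link bandwidth")
    return needed


def reserve_slots(table, connection, count):
    """Mark the first `count` free entries of `table` as owned by
    `connection` and return their indices (hypothetical helper)."""
    free = [i for i, owner in enumerate(table) if owner is None]
    if len(free) < count:
        raise ValueError("not enough free slots")
    for i in free[:count]:
        table[i] = connection
    return free[:count]


slot_table = [None] * 8                      # 8-entry slot table
video = reserve_slots(slot_table, "video", slots_needed(600, 8))
audio = reserve_slots(slot_table, "audio", slots_needed(250, 8))
```

With these assumed numbers the link carries 2000 MB/s, each of the 8
slots is worth 250 MB/s, and a 600 MB/s connection therefore occupies 3
slots. A larger slot table gives finer allocation granularity at the
cost of more configuration state.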
2.2. NoC design flow
The starting point of the design flow is the
description of the application's communication
requirements. An application consists of a number of task
graphs, or use cases. Each of these contains a number of
tasks, to be executed in hardware or software, using
storage, and communicating using the NoC. For the
design flow only the communication is relevant, i.e.
which ports on which IPs communicate with each other.
Figure.1 shows the NoC design flow, which is fully
implemented.
Although a major motivation for NoCs is their
promise to improve back-end issues, such as global
timing closure, we omit details of the RTL synthesis and
back-end. Note, however, that NoC generation and
configuration are interdependent, and part of one
complex optimisation problem (find the topology, mapping,
and throughput assignments that minimise the number of
routers, NIs, buffer sizes, and latencies). If this is done
correctly (by construction), no performance verification
and simulation is required (for guaranteed connections).
Simulation is still useful, e.g. to check if the
communication behaviour of IPs has been correctly
characterised.
With guaranteed services, this can be checked
independently for every connection. The design flow is
split into several tools for a number of reasons. First,
breaking the design flow into smaller steps simplifies
steering or overriding the heuristics used in each of the
individual tools, enhancing user control. Second, it reduces
the complexity of the optimisation problem, so that simpler,
faster heuristics can be used. Higher-level optimisation
loops involving multiple tools can then be easily added,
such as the smallest mesh loop. Third, parts of the flow can
be more easily customised, added, or replaced by the user to
tailor the flow or improve its performance. Finally,
redundancy, in the sense of checking what should be generated
automatically and correct by construction, such as
simulation and performance verification (of guaranteed
connections), minimises the impact of potential
programming errors, and acts as a safety net when
the user is allowed to manually create or modify
intermediate results. The design flow is very simple for
the user. It is based on a makefile with a few targets
corresponding to the major activities: generate,
configure, verify, and simulate.
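
The "smallest mesh loop" mentioned above can be pictured as an outer
loop that tries meshes in order of increasing router count until one is
accepted by the inner tools. The following is only a sketch of that
idea: the function smallest_mesh, the ips_per_router capacity model, and
the feasible callback (standing in for the generate/configure/verify
steps) are hypothetical and are not part of the described tool flow.

```python
def smallest_mesh(n_ips, ips_per_router=1, max_dim=16,
                  feasible=lambda rows, cols: True):
    """Return (rows, cols) of the mesh with the fewest routers that can
    host n_ips IP ports and passes the feasibility check.

    `feasible` is a placeholder for the inner tools of the flow; by
    default every candidate is accepted. Ties in router count are
    broken in favour of the most square mesh."""
    candidates = sorted(
        (rows * cols, abs(rows - cols), rows, cols)
        for rows in range(1, max_dim + 1)
        for cols in range(rows, max_dim + 1)
    )
    for routers, _, rows, cols in candidates:
        if routers * ips_per_router >= n_ips and feasible(rows, cols):
            return rows, cols
    raise ValueError("no mesh within max_dim satisfies the requirements")


# Nine IP ports, one per router: the smallest candidate is a 3x3 mesh.
mesh = smallest_mesh(9)
# Reject single-row meshes, as a stand-in for a failed inner step.
mesh_2d = smallest_mesh(5, feasible=lambda rows, cols: rows > 1)
```

Keeping the feasibility check as a callback mirrors the point made in
the text: the individual steps stay simple, and the outer loop only
decides which candidate to try next.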

Figure.1. Æthereal NoC design flow


2.3. Interconnect
As a result of the increasing degree of integration
now possible in silicon and required for NoC, several
research groups are striving to develop efficient on-chip
communication infrastructures. There already exist many
SoC designs that contain multiple processors for
applications such as set-top boxes, wireless base stations,
HDTV, mobile handsets, and image processing. Following
on from this, new trends in the design of communication
architectures in multi-core SoCs have appeared in recent
research papers. Some suggest that multi-core SoCs can be
built around different but regular interconnect structures
that have their roots in architectures used for parallel
computing. Custom-built application-specific interconnect
architectures are also promising.
Such communication-centric interconnect fabrics
are characterized by different trade-offs with regard to
latency, throughput, reliability, energy dissipation, and
silicon area requirements. The nature of the application
will dictate the selection of a specific template for the
communication medium. To give a sense of the alternatives
now being tabled, Figure.2 below shows a representative
set of interconnect templates proposed by different
research groups. A complex SoC can be viewed as a micro
network of multiple blocks, and hence, models and
techniques from networking and parallel processing can be
borrowed and applied.

Figure.2. NoC interconnect templates

For NoC, the micro network must meet quality of
service requirements (such as reliability, guaranteed
bandwidth/latency), and deliver energy efficiency. And it
must do this under the limitation of intrinsically unreliable
signal transmission media. Such limitations are due to the
increased likelihood of timing and data errors, the
variability of process parameters, crosstalk, and
environmental factors such as electro-magnetic
interference (EMI) and soft errors. To address these tasks,
current simulation methods and tools can be ported to
networked SoCs to validate functionality and performance
at various abstraction levels, ranging from the electrical to
the transaction levels. NoC libraries including
switches/routers, links and interfaces will provide
designers with flexible components to complement
processor/storage cores. Nevertheless, the usefulness of
such libraries to designers will depend heavily on the
maturity of the corresponding synthesis/optimization tools
and flows. Efficient micro network synthesis is needed to
enable NoC/SoC design much as logic synthesis enabled
efficient semicustom design in the 1980s.
Though the design process of NoC-based systems
borrows some of its aspects from the parallel computing
domain, it is driven by a significantly different set of
constraints. From the performance perspective, high
throughput and low latency are desirable characteristics of
multi-processor SoC platforms. However, from a VLSI
design perspective, the energy dissipation profile of the
interconnect architectures is of prime importance, as the
latter can represent a significant portion of the overall
energy budget. The silicon area overhead due to the
interconnect fabric is important too. The common
characteristic of these kinds of architectures is that the
processor/storage cores communicate with each other
through high-performance links and intelligent switches
and that the communication design can be represented at a
high abstraction level.
The exchange of data among the processor/storage
cores is becoming an increasingly difficult task with
growing system size and nonscalable global wire delay. To
cope with these issues, the end-to-end communication
medium needs to be divided into multiple pipelined stages,
with the delay in each stage comparable with the clock-cycle
budget. In NoC architectures, the inter-switch wire
segments, together with the switch blocks, constitute a
highly pipelined communication medium characterized by
link pipelining, deeply pipelined switches, and
latency-insensitive component design.
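
The requirement that each stage fit the clock-cycle budget can be made
concrete with a small calculation: a wire whose delay exceeds one clock
period has to be cut into ceil(delay / period) register-to-register
stages. The sketch below assumes illustrative delays in picoseconds; the
function names and the three-cycle switch depth are hypothetical. A
2000 ps period corresponds to the 500 MHz operating speed quoted
earlier.

```python
def pipeline_stages(wire_delay_ps, clock_period_ps):
    """Number of pipeline stages so that no stage exceeds one clock
    period (integer ceiling, with a minimum of one stage)."""
    if clock_period_ps <= 0:
        raise ValueError("clock period must be positive")
    return max(1, -(-wire_delay_ps // clock_period_ps))


def path_latency_cycles(link_delays_ps, switch_depth_cycles, clock_period_ps):
    """Latency in cycles of a path, assuming each link is followed by
    one switch with a fixed pipeline depth (an assumption of this
    sketch, not a property of any particular NoC)."""
    link_cycles = sum(pipeline_stages(d, clock_period_ps)
                      for d in link_delays_ps)
    return link_cycles + switch_depth_cycles * len(link_delays_ps)


# A 4500 ps wire at a 2000 ps clock period needs three stages.
stages = pipeline_stages(4500, 2000)
latency = path_latency_cycles([4500, 1500], 3, 2000)
```

Because latency now depends on how many stages each link receives, the
attached cores have to tolerate a variable number of cycles, which is
why latency-insensitive component design appears alongside link
pipelining in the text.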
3. Network interfacing
The success of the NoC design paradigm relies greatly on
the standardization of the interfaces
between intellectual property (IP) cores and the
interconnection fabric. Using a standard interface should
not impact the methodologies for IP core development. In
fact, IP cores wrapped with a standard interface will
exhibit a higher reusability and greatly simplify the task of
system integration.

Figure.3. IP core interconnect with network


The Open Core Protocol (OCP) is a plug-and-play
interface standard receiving wide acceptance. As shown in
figure.3 above, for a core having both master and slave
interfaces, the OCP-compliant signals of the functional IP
blocks are packetized by a second interface, the network
interface. The network interface has two functions:
1. Injecting/absorbing the flits leaving/arriving at the
functional/storage blocks,
2. Packetizing/depacketizing the signals coming
from/reaching the OCP-compatible cores in the form of
messages/flits.
All OCP signals are unidirectional and
synchronous, simplifying core implementation, integration
and timing analysis. The OCP defines a point-to-point
interface between two communicating entities, such as the
IP core and the communication medium. One entity acts as
the master of the OCP instance, and the other as the slave.
OCP unifies all inter-core communications.
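
The packetizing/depacketizing role of the network interface can be
sketched as follows. The flit layout used here (a head flit with
destination and payload length, body flits, and a tail flit) is a
hypothetical format chosen for illustration; OCP defines the core-side
signals, not the flit format, and the names packetize and depacketize
are assumptions of this example.

```python
def packetize(dest, payload, flit_bytes=4):
    """Split a message into flits: one head flit carrying routing
    information, then the payload in flit-sized chunks. The last
    payload flit is marked as the tail."""
    flits = [("HEAD", dest, len(payload))]
    chunks = [payload[i:i + flit_bytes]
              for i in range(0, len(payload), flit_bytes)]
    for index, chunk in enumerate(chunks):
        kind = "TAIL" if index == len(chunks) - 1 else "BODY"
        flits.append((kind, chunk))
    return flits


def depacketize(flits):
    """Rebuild (destination, payload) from a flit sequence and check
    the length announced in the head flit."""
    kind, dest, length = flits[0]
    if kind != "HEAD":
        raise ValueError("packet must start with a head flit")
    payload = b"".join(chunk for _kind, chunk in flits[1:])
    if len(payload) != length:
        raise ValueError("payload length does not match head flit")
    return dest, payload


flits = packetize(3, b"abcdefghij")   # head + 3 payload flits
dest, data = depacketize(flits)       # (3, b"abcdefghij")
```

Because only the network interface knows this format, the IP core sees
nothing but its OCP signals, which is the reuse benefit argued for in
this section.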
4. Test challenges in NoC
Many key challenges for testing NoCs have been
identified already. These were based on the technology and
complexity levels of the NoCs at the time, when NoCs
comprised several heterogeneous cores, including logic and
memories, and a few analog cores.

Figure.4. BIST Interconnect NoC testing


An embedded core is not manufactured and tested as a
standalone component. The challenge here was to ensure
that the test engineer has basic knowledge of all the
interconnects of a core, its communication
with the adjacent cores, and the interconnections
within the core itself. Figure.4 above shows a BIST
interconnect NoC test setup, which is able to test the core
internally and externally with high fault coverage. But
new levels of design complexity and nanometre
technology have introduced a new set of NoC challenges. Most
of these new challenges do not have widely accepted
solutions yet. These challenges are similar to those of
SoC testing:
1. Testing a NoC-based system includes testing of
embedded cores and testing of the on-chip network.
2. Testing of embedded cores is similar to conventional
SoC testing.
3. Testing of the on-chip network covers interconnects,
switches/routers, input/output ports, and mechanisms
other than the cores.
5. Testing of NoC
Developing efficient infrastructure for the NoC
paradigm is a serious challenge. Specifically, the design of
specialized Test Access Mechanisms (TAMs) for
distributing test vectors and novel Design for Testability
(DFT) schemes assumes major importance. Moreover, in a
communication-centric design environment like that
provided by NoCs, fault tolerance and reliability of the
data transmission medium are two major requirements in
safety-critical applications. As figure.5 shows, testing the
functional/storage blocks and their network interfaces
requires a TAM to transport the test data. It transports test
stimuli from a test pattern source to the core under test. It
also transmits test responses from the core under test to a
test pattern sink.
The test strategy of NoC-based systems must
address three problems:
1. Testing of the functional/storage blocks and
corresponding network interfaces.
2. Testing of the interconnect infrastructure itself.
3. Testing of the integrated system.
There is a major advantage in using NoCs as
TAMs: the ability to reuse existing resources
and the availability of several parallel paths to transmit test
data to each core. Thereby, system test time can be reduced
through the extensive use of test parallelization (i.e. more
functional blocks can be tested in parallel as more test
paths are available).
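
The effect of test parallelization on system test time can be
illustrated with a simple greedy schedule: cores are sorted by test
length and each is assigned to the test path that becomes free first.
This is a generic longest-processing-time heuristic used here only as a
sketch; it is not a scheduling method from the paper, it ignores path
conflicts and power limits, and the core names and test times are
made-up values.

```python
import heapq


def schedule_tests(test_times, n_paths):
    """Assign each core test to the earliest-free path, longest test
    first. Returns ({core: (path, start_time)}, total_test_time)."""
    if n_paths < 1:
        raise ValueError("at least one test path is required")
    paths = [(0, p) for p in range(n_paths)]   # (busy_until, path id)
    heapq.heapify(paths)
    assignment = {}
    ordered = sorted(test_times.items(), key=lambda kv: (-kv[1], kv[0]))
    for core, duration in ordered:
        busy_until, path = heapq.heappop(paths)
        assignment[core] = (path, busy_until)
        heapq.heappush(paths, (busy_until + duration, path))
    return assignment, max(busy for busy, _ in paths)


times = {"cpu": 8, "dsp": 6, "mem": 4, "io": 2}   # hypothetical values
_, serial_time = schedule_tests(times, 1)          # one path
plan, parallel_time = schedule_tests(times, 2)     # two parallel paths
```

With these values a single path needs 20 time units, while two parallel
paths finish in 10, which is the kind of reduction the paragraph above
refers to.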

Figure.5. Test ports and routing paths


The controllability/observability of NoC interconnects is
relatively reduced because they are deeply embedded and
spread across the chip. Pin-count limitations restrict the use
of I/O pins dedicated to test of the different components of
the data-transport medium. So, the NoC infrastructure
should be progressively used for testing its own
components in a recursive manner (i.e. the good, already
tested NoC components should be used to transport test
patterns to the untested elements).
This strategy minimizes the use of additional
mechanisms for transporting data to the NoC elements
under test, while allowing reduction of test time through
the use of parallel test paths and test data multicast. Testing
the functional/storage blocks and the interconnect
infrastructure separately is not sufficient to ensure
adequate test quality. The interaction between the
functional/storage cores and the communication fabric has
to undergo extensive functional testing. This functional
system testing should encompass testing of I/O functions
of each processing element and the data routing functions.
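
The progressive strategy described above, in which already-tested NoC
components carry test patterns to untested ones, can be sketched as a
breadth-first traversal that starts at the router attached to the test
port. The adjacency-list representation and the function name
progressive_test_order are assumptions made for this illustration; a
real flow would also have to test the links themselves and handle
faulty components.

```python
from collections import deque


def progressive_test_order(links, access_router):
    """Return (order, via): the order in which routers are tested and,
    for each router, the already-tested neighbour through which its
    test data is delivered (None for the access router)."""
    order = [access_router]
    via = {access_router: None}
    queue = deque([access_router])
    while queue:
        router = queue.popleft()
        for neighbour in sorted(links.get(router, ())):
            if neighbour not in via:
                via[neighbour] = router      # reached through a tested router
                order.append(neighbour)
                queue.append(neighbour)
    return order, via


# 2x2 mesh: routers 0..3, test port attached to router 0.
mesh_links = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
order, via = progressive_test_order(mesh_links, 0)
```

Routers at the same distance from the test port (here routers 1 and 2)
could be tested in parallel or with multicast test data, which is where
the test-time reduction mentioned above comes from.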

6. Reliability
Many SoCs are used within embedded systems,
where reliability is a figure of merit. At the same time,
beyond the 65 nm node, transistor and wire failures are
more likely to happen due to a variety of effects, such as
soft (cosmic) errors, crosstalk, process variations,
electromigration, and material aging.
In general, we can distinguish between transient
and permanent failures, and the design of reliable SoCs
must encompass techniques that address both types. From
a reliability point of view, one of the advantages of
packetized communication is the possibility of
incorporating error control information into the transmitted
data stream. Effective error detection and correction
methods borrowed from fault-tolerant computing and
communications engineering can be applied to cope with
uncertainty in on-chip data transmission. Such methods
need to be evaluated and optimized in terms of area, delay
and power trade-offs. Permanent failures may be due to
material aging (e.g. oxide), electromigration and
mechanical/thermal stress.
Failures can incapacitate a processing/storage core
and/or a communication link. Different fault-tolerant
multiprocessor architectures and routing algorithms have
been proposed in the parallel processing domain. Some can
be adapted to the NoC domain, but their effectiveness
needs to be evaluated in terms of defect/error coverage
versus throughput, delay, energy dissipation and silicon
overhead.
7. Efficient Reuse of Network
One challenge in this reuse-based approach is that
the channel width is determined by the system
performance requirements during the design process and
hence cannot be optimized for test purposes. In the context
of network reuse in NoC test, the available TAM or channel
width for wrapper scan chain design is already determined by
the bandwidth requirements of cores in mission mode, not in
test mode.
Figure.6 below shows the reuse of the on-chip network in
both functional and test mode. For example, if the channel
width is predesigned to be 4, then half of the channel wires
will be idle during core test if the core under test has
only two scan chains.

Figure.6. (a) Functional mode (b) Test mode
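
The channel-width example from the network-reuse discussion (a channel
of width 4 and a core with two scan chains) can be written out as a
small calculation. The helper name channel_usage is hypothetical, and
the model assumes one channel wire per scan chain, which is a
simplification made for this sketch.

```python
def channel_usage(channel_width, scan_chains):
    """Return (used_wires, idle_wires, utilization) when a core with
    `scan_chains` chains is tested over a channel of `channel_width`
    wires, assuming one wire per scan chain."""
    if channel_width <= 0:
        raise ValueError("channel width must be positive")
    used = min(channel_width, scan_chains)
    idle = channel_width - used
    return used, idle, used / channel_width


used, idle, utilization = channel_usage(4, 2)   # the example in the text
```

For the values in the text this gives two used wires, two idle wires,
and a utilization of 0.5. When a core has more scan chains than the
channel has wires, the chains would have to be merged into longer
wrapper chains, which lengthens the test instead of leaving wires idle.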

8. Summary
Several prototype NoCs have been designed and
analyzed in both industry and academia, but only a few have
been implemented on silicon. Many challenging
research problems remain to be solved at all levels, from
the physical link level through the network level, and all
the way up to the system architecture and application
software.

9. References
[1] Y. Zorian, "Test requirements for embedded core-based systems and
IEEE P1500," Proc. Int'l Test Conference, IEEE CS Press, Los
Alamitos, 1997, pp. 191-199.
[2] Y. Zorian, E. J. Marinissen, S. Dey, "Testing embedded-core-based
system chips," IEEE Computer, Vol. 32, No. 6 (June 1999), pp. 52-60.
[3] R. Kapur, CTL for Test Information of Digital ICs, Kluwer
Academic Publishers, 2003.
[4] IEEE 1500 Standard for Embedded Core Test, IEEE, 2005.
[5] W. J. Dally and B. Towles, Principles and Practices of
Interconnection Networks. Morgan Kaufmann Publishers, 2004.
[6] R. Cardoso et al., "Design space exploration on heterogeneous
network-on-chip," in ISCAS, May 2005, pp. 428-431.
[7] K.-C. Chang, J.-S. Shen, and T.-F. Chen, "Evaluation and design
trade-offs between circuit-switched and packet-switched NOCs for
application-specific SOCs," in DAC, Jul. 2006, pp. 143-148.
[8] S. G. Pestana, E. Rijpkema, A. Radulescu, K. Goossens, and O. P.
Gangwal, "Cost-performance trade-offs in networks on chip: A
simulation-based approach," in Proceedings of Design, Automation and
Test in Europe Conference, Feb. 2004.
[9] J. Xu, W. Wolf, J. Henkel, S. Chakradhar, and T. Lv, "A case
study in networks-on-chip design for embedded video," in Proceedings
of Design, Automation and Test in Europe Conference (DATE),
IEEE, 2004, pp. 770-775.
[10] S. Sathe, D. Wiklund, and D. Liu, "Design of a switching
node (router) for on-chip networks," in Proceedings of the 5th
International Conference on ASIC, IEEE, 2003, pp. 75-78.
[11] J. Chan et al., "NoCGEN: a template based reuse methodology for
networks on chip architecture," in Proc. Int'l Conference on VLSI
Design, 2004.
