
PRESENTATION ON WNOC

Outline
Introduction
Multi-core & Network-on-Chip paradigm
Performance limitations of conventional planar NoCs
Some alternatives
Possibility of designing wireless NoC (WiNoC)
CNT-based antennas
Possible communication schemes
Advantages and challenges of WiNoCs
Summary and road ahead
Multi-core applications
Nokia Sparrow
Intel LARRABEE
The network-on-chip paradigm
Driven by
Increased levels of integration
Complexity of large SoCs
New designs counting 100s of embedded cores
Need for platform-based design methodologies
DSM constraints (power, delay, time-to-market, etc.)
Decoupling of functionality from communication
Dedicated infrastructure for data transport
NoC features
NoC infrastructure
switch link
NoC limitations
Predominantly multi-hop communication
High Latency and energy dissipation
Use of Express Virtual Channels
Core 1
Core 2
- IP core
- NoC interface
- NoC switch
Lower Latency and Energy Dissipation
Three Dimensional Integration
Optical Interconnects
Wireless/RF Interconnects
Novel interconnect paradigms for Multicore designs
3D NoC
Brett S. Feero, Partha Pratim Pande, Networks-On-Chip in a Three Dimensional Environment:
A Performance Evaluation, IEEE Transactions on Computers (TC), vol.58, no. 1, pp. 32-45,
January 2009.
Stacking multiple active layers
Manufacturability
Mismatch between various layers
Yield is currently quite low
Temperature concerns
Despite power advantages, reduced footprint increases power density
Photonic NoC
High bandwidth photonic links for high payload transfers
Limitations on switch architecture
More than 4-port designs are complex
On-chip integration of photonic components
A. Shacham et al., Photonic Network-on-Chip for Future Generations of Chip Multi-Processors,
IEEE Transactions on Computers, Vol. 57, issue 9, pp. 1246-1260.
On-Chip RF/Wireless Interconnects
Replace long distance wires
Use of waveguides out of package or IC structures like parallel metal wires
Chang et al. demonstrated Transmission Line based RF interconnect for on-chip communication
Not really wireless
RF NoC
Bank of high frequency oscillators and filters
FDM
On-chip transmission line acting as data freeways
Routing of long transmission lines without eliminating any existing links
M. F. Chang et al. CMP Network-on-Chip Overlaid With Multi-Band RF-Interconnect, Proc. of
IEEE International Symposium on High-Performance Computer Architecture, 16-20 February,
2008, pp. 191-202.
Wireless Network-on-Chip (WiNoC)
Among several options, some may be possible without a revolutionary technology
Use of on-chip wireless links
  - High bandwidth
  - Speed of light
  - Long distance
Reduce latency and energy dissipation in communication between distant nodes
Early example of on-chip wireless interconnects
First utilized for distribution of clock signal
Technology: 0.18 um CMOS
Operating frequency: 15 GHz
Single Tone
Modulation and channelization are not a concern
~ 2 mm
B. A. Floyd, H. Chih-Ming, and K. K. O, IEEE Journal of Solid-State Circuits, vol. 37, pp. 543-552, 2002.
Propagation mechanisms of radio waves over intra-chip channels
Characterization of on-chip radio communications
Monopole Antennas
Measurement is done for the frequency range of 10-100 GHz
Y. P. Zhang et al., Propagation Mechanisms of Radio Waves Over Intra-Chip Channels with Integrated Antennas:
Frequency-Domain Measurements and Time-Domain Analysis, IEEE Transactions on Antennas and Propagation, Vol.
55, No. 10, October 2007, pp. 2900-2906.
CNT antennas
To make the antennas small, we need small wavelengths → light (IR, visible, UV)
MWCNT as Optical Antennae
Directional radiation characteristics are in an excellent and quantitative agreement with
conventional radio antenna theory and simulations
K. Kempa, et al., "Carbon Nanotubes as Optical Antennae," Advanced Materials, vol. 19, 2007, pp.
421-426
G. Y. Slepyan, et al. , "Theory of optical scattering by achiral carbon nanotubes and their potential as
optical nanoantennas," Physical Review B (Condensed Matter and Materials Physics), vol. 73, pp.
195416-11, 2006
CNT bundle dipole antennas
SWCNT bundle dipole antennas
The efficiency of a bundle antenna can be 30-40 dB higher than that of a single SWCNT dipole antenna
Y. Huang et al., Performance Prediction of Carbon Nanotube Bundle Dipole Antennas, IEEE
Transactions on Nanotechnology, Vol. 7, No. 3, May 2008, pp. 331-337
Why nanotubes for antenna application?
Already made by nature! How else would we want to make such small structures?
Ballistic transport and quantum conductance → low resistive loss
Smooth, defect-free, stable and chemically complete structure → no power loss due to defects
or edge and surface roughness
Structural strength and high conductivity → high current carrying capacity (10^9 A/cm^2)
Light absorption and generation in nanotubes
J. A. Misewich, R. Martel, P. Avouris, J. C. Tsang, S. Heinze, and J. Tersoff, Science, vol. 300, pp. 783-786, 2003.
M. Freitag, V. Perebeinos, J. Chen, A. Stein, J. C. Tsang, J. A. Misewich, R. Martel, and P. Avouris, Nano Letters, vol. 4, pp.
1063-1066, 2004.
The CNT is expected to be a linearly polarized dipole radiation source
Conceptual transmitter and receiver structures
About 10 different frequency channels are available.
There is a strong polarization dependence.
Modulation and demodulation are performed by the antenna itself!
Courtesy: Alireza Nojeh, University of British Columbia
Hybrid Wired/Wireless NoC (WiNoC)
On-chip wireless nodes have associated overhead
Hybrid architecture
Divide the whole NoC into multiple subnets
Communication within the subnets is still through traditional wires
Utilize wireless links for inter-subnet data exchange
Each subnet will have a wireless base station (WB)
Subnet architectures may vary and even be heterogeneous on the same chip
Network optimization
Limited Wireless Resources
Wireless part of the network should be simple
Position of the WB within the subnet is important
Connectivity in the wireless part of the network
Avoid multi-hop communication in the wireless channels
  - Take advantage of speed of light data transfer
Point-to-point wireless links
Adopt small-world network features
  - Enable easy scalability for larger system sizes
Minimize the overhead
Avoid complicated MAC protocols
Connecting the subnets
regular lattice
small-world
random graph
Courtesy: Christof Teuscher, Portland State University
Small-World Nets: The Watts-Strogatz Model
Establish high speed long distance links among distant blocks on the chip
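As a rough illustration of the Watts-Strogatz construction named on this slide, the sketch below builds a ring lattice and rewires each edge with probability p. Reading the graph nodes as subnets and the rewired edges as candidate long-distance wireless links is an assumption made here for illustration; the function name and parameters are hypothetical and not taken from the presentation.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Return the undirected edges of a Watts-Strogatz graph as (u, v) with u < v.

    n: number of nodes (assumed here to stand for subnets)
    k: even number of nearest neighbours per node in the starting ring lattice
    p: probability of rewiring each lattice edge to a random far endpoint
    """
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)

    def key(a, b):
        return (a, b) if a < b else (b, a)

    # Step 1: regular ring lattice, each node linked to k/2 neighbours per side.
    edges = {key(u, (u + j) % n) for u in range(n) for j in range(1, k // 2 + 1)}

    # Step 2: visit every lattice edge once and rewire it with probability p.
    for j in range(1, k // 2 + 1):
        for u in range(n):
            if rng.random() >= p:
                continue
            candidates = [w for w in range(n) if w != u and key(u, w) not in edges]
            if candidates:
                edges.remove(key(u, (u + j) % n))
                edges.add(key(u, rng.choice(candidates)))
    return edges
```

With p = 0 the result is the regular lattice, with p = 1 it approaches a random graph, and small p gives the small-world regime in between, matching the three cases shown on the previous slide.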
Communication mechanisms with CNT antennas
Use multiband lasers to excite the antennas
Electroluminescence phenomenon will eliminate this overhead
Laser sources of different frequencies
Establishes a form of FDM
Optical modulators/demodulators
Different frequency channels can be assigned to pairs of communicating subnets
Antenna elements tuned to different frequencies for each pair
WiNoC with small-world connections
Wireless port
Overall channelization scheme
32-bit flit width
4 distinct frequency channels
Combination of FDM and TDM.
Simple on-off keying
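One possible reading of this channelization scheme is sketched below: a 32-bit flit is spread over the 4 frequency channels (FDM), so each channel carries 8 bits in 8 consecutive time slots (TDM), with each bit sent as carrier on or off. The round-robin bit-to-channel mapping and the function names are assumptions for illustration only; the slides do not specify them.

```python
FLIT_WIDTH = 32     # bits per flit (from the slide)
NUM_CHANNELS = 4    # distinct frequency channels (from the slide)


def channelize(flit):
    """Split a flit into per-channel on/off sequences, one bit per time slot.

    Assumed mapping: bit i (most significant first) is sent on channel
    i % NUM_CHANNELS during time slot i // NUM_CHANNELS.
    """
    if not 0 <= flit < 2 ** FLIT_WIDTH:
        raise ValueError("flit does not fit in FLIT_WIDTH bits")
    bits = [(flit >> (FLIT_WIDTH - 1 - i)) & 1 for i in range(FLIT_WIDTH)]
    return [bits[ch::NUM_CHANNELS] for ch in range(NUM_CHANNELS)]


def dechannelize(channels):
    """Rebuild the flit from the per-channel on/off sequences."""
    flit = 0
    for slot in range(len(channels[0])):
        for ch in range(NUM_CHANNELS):
            flit = (flit << 1) | channels[ch][slot]
    return flit
```

Under this mapping one flit occupies FLIT_WIDTH / NUM_CHANNELS = 8 time slots on every channel.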
Establishing wireless links
Throughput and Latency
WiNoC is capable of improving performance of wireline architectures
Throughput
Latency
Scaling trend
Throughput degrades more if the subnet size is increased rather than increasing the number of subnets
Summary
WiNoCs are promising alternatives to conventional planar on-chip networks
Capable of improving NoC performance significantly
CNTs demonstrate interesting optical antenna properties
WiNoCs designed with CNT antennas will have low overhead.
Road ahead
Overall network design
Development of scalable wireless network
Network optimization
Partitioning of wireless and wired network
Reliability of the wireless channel
Novel ECC schemes
CNT antennas are promising
But some unknowns!
Explore the possibility of NoC with mm-wave wireless links
Acknowledgements
Dr. Benjamin Belzer, WSU
Dr. Deuk Heo, WSU
Dr. Christof Teuscher, PSU
Dr. Alireza Nojeh, UBC
Mr. Amlan Ganguly
Mr. Kevin Chang, WSU
Mr. Sujay Deb, WSU
Survey of Wireless Network-on-Chip Systems
by
Xi Li
A report submitted to the Graduate Faculty of
Auburn University
in partial fulfillment of the
requirements for the Degree of
Master of Electrical Engineering
Auburn, Alabama
May 10, 2012
Keywords: wireless, NoC
Copyright 2012 by Xi Li
Approved by
Vishwani Agrawal, Chair, James J. Danaher Professor of Electrical and Computer
Engineering
Shiwen Mao, Associate Professor of Electrical and Computer Engineering
Jitendra Tugnait, James B. Davis and Alumni Professor of Electrical and Computer
Engineering
Mark Nelms, Professor and Chair of Electrical and Computer Engineering
Abstract
Network-on-chip (NoC) systems are becoming increasingly popular because of their
advantages over conventional systems-on-chip (SoC). An increasing number of researchers
and organizations therefore focus on the study and development of NoC techniques, and
many results have been achieved so far. Furthermore, considering the dominant position
of wireless communication and the weaknesses of wired communication, researchers have
also begun to insert wireless links into NoC systems in order to solve the multi-hop problem.
This report briefly describes some notable developments in NoC and WNoC (Wireless
NoC) systems, including important techniques and the results achieved, mainly related to
the required hardware and communication protocols. In addition, the report contains my
experiments on NoC and WNoC systems. I use the Booksim simulator to measure their
performance, make some comparisons, and then analyze the results and draw conclusions.
Finally, the report summarizes the finished work and suggests further directions for the
development of NoC and WNoC systems.
Acknowledgments
I would like to express my gratitude to the people who helped me in this project.
The first person I must thank is my advisor, Dr. Vishwani D. Agrawal, the James J.
Danaher Professor in the Electrical Engineering Department. Throughout the project
he gave me a great deal of help. When I met problems, he guided me toward solutions,
either through emails or in face-to-face meetings, and always with great patience.
Dr. Agrawal's kindness and patience made me interested in this project, gave me
the energy to do my work, and gave me hope when I felt frustrated.
I am also deeply grateful to my committee members, Dr. Shiwen Mao and Dr. Jitendra
Tugnait, for their great teaching and patience during my study.
Two other people I must thank are Dr. Alireza Babaei at Auburn and Dr. Partha
Pratim Pande from Washington State University. Dr. Babaei helped throughout the project.
Dr. Pande offered me many useful references. From his help and references, I gained a
significant amount of the information that I needed.
I greatly appreciate my colleague and friend, Suraj Sindia, for his help with using
LaTeX.
Last but not least, I must thank my parents. Though they are not experts in this area,
they encourage me so much that when I meet difficulties, no matter how big they are, I never
lose hope and in the end I can overcome them.
Many thanks to my dearest friends in Auburn. Without them, my life in Auburn could
not have been so wonderful.
Table of Contents
Abstract
Acknowledgments
List of Figures
List of Tables
1 Background
2 Problems in Traditional NoC
3 Some Solutions and Motivation for Studying Wireless NoC
3.1 Ultra Wide Band (UWB) Based WNoC System
3.2 A mm-Wave WNoC System
3.3 CNT Based WNoC System
3.4 Other Needed Devices
4 Technique Used in a WNoC System
4.1 Structure and Topology
4.1.1 Structure of a Pure Wireless NoC System
4.1.2 Structure of Hybrid Wireless NoC System
4.2 Wireless Link Insertion
4.3 Routing and Communication Protocol
4.3.1 Protocols in Pure Wireless NoC System
4.3.2 Protocols in a Hybrid Wireless NoC System
5 A Simulator for NoC System
6 Factors and Parameters
7 Experimental Processes, Results and Analysis
7.1 Experimental Process
7.2 Experimental Result and Analysis
7.2.1 Latency vs. Injection Rate
7.2.2 Latency vs. Virtual Channels
7.2.3 An Analysis of WNoC System
8 Conclusion and Future Work
8.1 Work Completed in This Report
8.1.1 Device-Level
8.1.2 System-Level
8.1.3 Performance
8.2 Future Work
Bibliography
List of Figures
7.1 Example configuration file for simulating a mesh NoC system.
7.2 Simulator output from running the examples/mesh88_lat configuration file.
7.3 A 4 × 4 mesh topology.
7.4 Latency vs. injection rate in mesh topology.
7.5 A 4 × 4 mesh topology with concentration 4.
7.6 Average hops in 8 × 8 mesh.
7.7 Average hops in 8 × 8 mesh with concentration 4.
7.8 Latency vs. injection rate in cmesh topology.
7.9 Comparison between regular mesh and concentrated mesh.
7.10 (a) Flattened butterfly topology consisting of 64 nodes; (b) corresponding router layout.
7.11 Latency vs. injection rate in flattened butterfly topology consisting of (a) 64 nodes
and (b) 256 nodes.
7.12 Simulated latency vs. number of virtual channels for (a) regular mesh and (b)
flattened butterfly topologies.
7.13 Comparison between regular mesh and flattened butterfly with (a) 4 virtual channels,
(b) 8 virtual channels and (c) 16 virtual channels.
List of Tables
3.1 WNoC system parameters.
Chapter 1
Background
Many publications [4], [28], [55] state that the emergence of SoC (System-on-Chip)
platforms consisting of a large number of embedded processors is an outstanding solution
for some industrial tasks. As described in [23], [76], an SoC is a type of microsystem that
integrates many components such as processor cores, DSP cores, memories (or storage and
control interfaces that may be off the chip) and many other hardware cores, which would
otherwise perform specialized tasks on separate dies. However, according to [23], [65], [76],
an SoC has two limitations. First, the communication among the Intellectual Property
(IP) blocks impedes development at the system level, aggravating the complexity issue.
Global wire delays are the most important factor here, since they typically do not scale with
technology scaling even after repeaters are inserted [29]. Second, an SoC usually
integrates several different hardware cores for different applications on the same chip,
but those applications often have diverse requirements, such as different communication
standards and specific design constraints. Therefore, it can be difficult to design
a common SoC for many applications.
NoC (Network-on-Chip) is a good paradigm for solving the problems stated above
[3], [4], [23], [64], [76]. An NoC is an integrated network that uses routers to allow
communication among those blocks. It applies networking theory and methods to on-chip
communication so that the blocks can exchange information on a chip just as terminals
do in conventional networks. The blocks here are typically processors and caches and,
for simplicity, a block is called a node in the remainder of this report.
In a system, the nodes are arranged according to a specific topology. Some
popular topologies have been studied, such as SPIN (Scalable, Programmable, Integrated
Network), CLICHÉ (Chip-Level Integration of Communicating Heterogeneous Elements),
regular mesh, torus, folded torus, Octagon and BFT (Butterfly Fat Tree) [26], [65]. Nodes
communicate through routers and, since a large number of routers must be integrated on
a centimeter-size chip, each router should be small and its structure as simple as
possible. In general, a router in an NoC system consists of 1) buffers to store
data temporarily, 2) arbiters to decide the order of data transmission and 3) switches to
forward data in the right direction. So far, the wormhole router is the most popular router
used in NoC systems. The advantage of this type of router is that data can be transmitted
more fluently. With wormhole routing, the unit of transmitted data in an NoC
is usually called a flit. A flit contains dozens of bits. Each flit has a header, which
contains the address of the destination, followed by a string of data bits. During
transmission, the flits run like a stream through the routers.
A link in an NoC is point-to-point. Communication between two nodes
is generally based on packet switching, although there are other NoC proposals that use
circuit-switching techniques. As each channel that connects nodes is duplex and shared by
only two nodes, the NoC improves performance over the traditional system, which uses
a shared bus to transmit data.
In an NoC system, latency, power consumption and throughput are the main
parameters used to evaluate performance. Latency is defined as the time (in clock cycles)
that elapses between the injection of a message header into the network at the
source node and the reception of the message tail at the destination node [63]. Therefore,
for a given message, the latency L is given by [65]:
L = sender overhead + transport latency + receiver overhead (1.1)
The transport latency is small enough that we can ignore it. Sender overhead and receiver
overhead mainly depend on the message waiting time in a router, which is proportional
to the number of nodes on a single chip times the injection rate. For example, if a node
needs to transmit data to another node, the data must pass through all routers along the
path between those two nodes. Although a path can readily be determined by any kind of
routing protocol, the transmitted data must pass through every router on the path. Assuming
a standard mesh network, if node A, positioned at (1, 1), intends to transmit data to node
B at position (N, M), where N > 2 and M > 2, then with high probability other
transmission paths cut across the long path between nodes A and B. That means the
routers on the path from node A to node B will be used by other transmissions at the same
time. Since a router can process only one transmission at a time, when several transmissions
arrive at the same router, only one of them can be processed immediately while the others
wait until the router is free. In such cases, the NoC system experiences high latency.
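The contention argument above can be made concrete with a small sketch. It assumes dimension-order (XY) routing on the mesh, which the report does not prescribe; the function names and the example coordinates are hypothetical. The sketch lists the routers on two paths and reports the routers they share, which is where one transmission would have to wait.

```python
def xy_path(src, dst):
    """Routers visited from src to dst, moving along X first and then along Y."""
    (x, y), (dst_x, dst_y) = src, dst
    path = [(x, y)]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


def shared_routers(path_a, path_b):
    """Routers that appear on both paths, i.e. possible points of contention."""
    return sorted(set(path_a) & set(path_b))


# Node A at (1, 1) sends to node B at (N, M) = (4, 4): 6 hops through 7 routers.
path_ab = xy_path((1, 1), (4, 4))
# A second, shorter transmission crosses the first one.
path_cd = xy_path((3, 3), (3, 1))
contended = shared_routers(path_ab, path_cd)
```

In this example the two paths meet at router (3, 1), so if both flits arrive there in the same cycle one of them is delayed. The longer the A-to-B path, the more such crossings it is likely to have.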
Power consumption is another parameter that affects the performance of an NoC system.
Since a path contains many node-to-node hops and each hop consumes some energy, a
multi-hop path costs more total energy. For each hop, the router is the main source of
power consumption. When a flit arrives at a router, it is stored in an input buffer to wait
for processing. If more than one flit is stored in different input buffers, an
arbiter decodes the destination information of each flit and decides which flit to send
first according to the routing protocol in use. Then, the chosen flit is transmitted from
the input buffer to the output buffer through a crossbar switch. If only one flit arrives at
the output of the router, it can be sent directly. But if there are several flits, they have to
be stored in the output buffer temporarily until their turn. The power dissipated in an NoC
can be calculated as follows [44]:
\[ P = P_{r\_buf} + P_{r\_arbiter} + P_{r\_crossbar} + P_{r\_link} \tag{1.2} \]
Here $P_{r\_buf}$ is the average power consumption in buffers, including both dynamic and
static power; $P_{r\_arbiter}$ is the average power consumption in routing computation;
$P_{r\_crossbar}$ is the average power consumption in the switches used to move data from
input to output through a router; and $P_{r\_link}$ is the average link power consumption
between two neighboring routers. Since these costs are incurred at every hop, the total
power consumption will be large if there are many hops in the transmission path.
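A minimal numerical sketch of this relationship follows. It assumes every hop on a path has the same per-router power components, so the path total is simply the hop count times the per-hop sum of the four terms; the class name, function name and all numbers are hypothetical illustration values, not measurements from the report or from [44].

```python
from dataclasses import dataclass


@dataclass
class RouterPower:
    """Average per-hop power components, in arbitrary units (hypothetical values)."""
    buf: float        # buffers, dynamic plus static
    arbiter: float    # routing computation
    crossbar: float   # switch from input to output
    link: float       # link to the neighboring router

    def total(self):
        """Sum of the four components for one hop."""
        return self.buf + self.arbiter + self.crossbar + self.link


def path_power(per_hop, hops):
    """Power attributed to a path, assuming identical routers on every hop."""
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return hops * per_hop.total()


per_hop = RouterPower(buf=1.0, arbiter=0.5, crossbar=0.75, link=0.25)
short_path = path_power(per_hop, hops=2)
long_path = path_power(per_hop, hops=6)
```

With these made-up values a 6-hop path costs three times as much as a 2-hop path, which is the scaling the paragraph above describes.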
Throughput is another performance parameter, defined as the amount of data arriving
successfully at the destination per unit of time. There are many ways to define it depending
on the specifics of the implementation. In general, the throughput T is defined as [65]:
\[ T = \frac{(\text{Total messages completed}) \times (\text{Average message length})}{(\text{Number of IP blocks}) \times (\text{Total time})} \tag{1.3} \]
where total messages completed is the number of messages that have arrived at the
destination node successfully; the average message length is measured in flits (in this
report, we assume a flit width of 20 bits); number of IP blocks refers to the number of
routers that the message passes through; and total time is the whole time the message
spends in transmission, which is equal to the latency in (1.1). Thus, we see that if the
number of IP blocks and the total time increase, the throughput will decrease.
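As a worked instance of (1.3), the short sketch below evaluates the formula for made-up inputs. The function name, argument names and numbers are hypothetical; only the 20-bit flit width comes from the report.

```python
FLIT_WIDTH_BITS = 20  # flit width assumed in this report


def throughput(messages_completed, avg_message_length_flits, num_ip_blocks, total_time_cycles):
    """Throughput per equation (1.3), in flits per cycle per IP block."""
    if num_ip_blocks <= 0 or total_time_cycles <= 0:
        raise ValueError("num_ip_blocks and total_time_cycles must be positive")
    return (messages_completed * avg_message_length_flits) / (num_ip_blocks * total_time_cycles)


# Hypothetical run: 1000 messages of 16 flits each, 64 IP blocks, 1000 cycles.
t_flits = throughput(1000, 16, 64, 1000)
t_bits = t_flits * FLIT_WIDTH_BITS
```

Here the throughput is 0.25 flits per cycle per IP block, or 5 bits per cycle per IP block. Doubling either the number of IP blocks or the total time halves the result, as stated above.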
Chapter 2
Problems in Traditional NoC
According to [25], [62], although an NoC has many important advantages,
it has a serious performance limitation because planar metal interconnects essentially
require multi-hop communication between any non-neighboring nodes. This causes high latency
and power consumption. In other words, the large number of functional nodes is the main
factor that limits the performance of a traditional NoC with wired communication. In the
future, the number of processors and memories in a single multi-core system will increase
to hundreds or even thousands [34]. Continuing to use wires as the communication channel
may cause high latency, high power consumption and low throughput. To be specific, the
longer the path, the more interference it suffers from other crossing paths, which increases
latency and power to unreasonable levels. In addition, the throughput decreases. If hundreds
or thousands of nodes are integrated on a single chip, the severity of the problem becomes
evident.
Chapter 3
Some Solutions and Motivation for Studying Wireless NoC
In order to solve the problems outlined in the previous chapter, several solutions have
been proposed. In this chapter, I give a brief introduction to the development roadmap
and some notable solutions to these problems. In addition, some representative
examples are discussed. The description here serves as an overview; the details of the
techniques appear in the next chapter.
Some solutions, for example, insert long-range direct links into a regular mesh network
using conventional metal wires [61] to provide ultralow-latency and low-power express
channels between selected nodes [46], [48]. However, such solutions are still based on wired
communication channels. Although this is an emerging methodology that significantly
improves performance compared with an NoC system using traditional wire
channels, it is not enough if the number of nodes on a chip increases to a thousand or more
and the injection load is very high. According to the International Technology Roadmap for
Semiconductors (ITRS) [35], in the longer term, improvements in metal wire characteristics
alone will not satisfy the performance requirements, and new interconnection paradigms are
needed [25].
Wireless offers an innovative solution. As mentioned, if we can decrease the number of
hops in the transmission path of a message, we may improve latency,
power consumption and throughput. With wireless links, the transmission range of a single
hop can increase. Therefore, for the same transmission distance between two nodes, a message
may pass through a smaller number of wireless nodes.
Some approaches worth considering are 3D and photonic NoCs and NoC
architectures with multiband RF interconnect (RF-I) transmission lines [11], [69], [77], [99].
The basic idea of these alternatives is to insert some kind of express transmission
channel to reduce latency and power consumption. Although they can undoubtedly improve
performance over that of a traditional NoC system, effective techniques for implementing
the hardware components are lacking. For example, a transmitter, receiver, oscillator
and filter must be designed, and a highly reliable integrated light source may
be needed. Although CMOS-based technologies can alleviate some manufacturing challenges,
they still need long physical lines to work as waveguides. Such unsolved problems are the
bottleneck in the use of on-chip wireless [12], [62]. So, although such emerging paradigms
can improve the performance of the traditional wired NoC system to some extent, especially
in latency and power consumption, they are not mature and need further research
before they can be deployed at a large scale.
This means that researchers should continue to look for other ways to realize wireless
communication in NoC systems. Here, I introduce three leading approaches, classified
by transmission method: ultra wide band (UWB) based WNoC, mm-wave based WNoC and
carbon nanotube (CNT) based WNoC systems.
3.1 Ultra Wide Band (UWB) Based WNoC System
Compared with several other RF interconnect technologies, UWB is a better
choice since it allows a low-power and low-cost implementation [98].
At the transmitter, UWB-based interconnection uses a Gaussian monocycle pulse
(GMP) generator to produce an ultra-short pulse and thereby achieve an extremely low power
spectral density (PSD). The modulation scheme used for the transmitted pulse can be pulse
position modulation (PPM) or biphase modulation (BPM). In addition, multiple access
capability can be provided by modulation techniques for channelization and separation of
users, for example time-hopped PPM (TH-PPM) and direct sequence coding [71]. The
receiver is made up of a wideband low-noise amplifier (LNA), a correlator that
consists of a multiplier and an integrator, an analog-to-digital converter (ADC) and
synchronization circuits. It should be mentioned that an ADC consisting of a comparator
and an inverter buffer has been developed for the wireless NoC architecture [74]. In this
way, the performance bottleneck at the receiver can be alleviated.
As with any CMOS-based technology, the efficient integration of on-chip hardware
components is also a big challenge in UWB. To address this, using a higher frequency
for short-range wireless communication allows smaller antennas and increases the
possibility of on-chip integration. As a result, the data rate can reach 1.16 Gbps for a single
channel at a central frequency of 3.6 GHz [12], [24]. The data transmission protocol for a
UWB-based architecture is designed as a cross-layer protocol. Its layers accomplish most of
the functions of the medium access control (MAC), network and transport layers in a
traditional open system interconnection (OSI) layered structure. Because of the
characteristics of data transmission in WNoC, these layers should be designed specifically
for it. For the MAC layer, the key task is resolving channel contention and minimizing the
collision probability. A synchronous and distributed MAC (SD-MAC) protocol [85], [99],
based on synchronized time frames, has therefore been proposed to guarantee high
efficiency, simplicity, robustness, fairness and quality of service (QoS) capability. The
transport layer does not handle flow control or error detection, so it just works as an
interface between the network-independent layer and the network-dependent layers. The main
function of the network layer is to choose the data transmission path. In some recent
studies [97], a simple location-based routing (LAR) protocol has been used. In this method,
the path is determined only by the locations of the current node, its neighbors and the
destination, with no need to maintain routing tables or network topology,
which saves power to some extent. In addition, to achieve 100% (guaranteed) delivery and
meet the QoS requirement in the on-chip environment, a region-aided routing (RAR) protocol
[96] has been proposed.
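The idea of routing from locations alone can be sketched with generic greedy geographic forwarding: each node hands the message to the neighbor closest to the destination. This is an assumed stand-in for illustration, not the LAR protocol of [97], whose details are not given in this report; the function names and the neighbor map are hypothetical.

```python
import math


def next_hop(current, neighbors, destination):
    """Neighbor closest to the destination, or None if none is closer than current."""
    best = min(neighbors, key=lambda n: math.dist(n, destination), default=None)
    if best is None or math.dist(best, destination) >= math.dist(current, destination):
        return None
    return best


def route(source, destination, neighbor_map):
    """Greedy path from source to destination using only node locations.

    neighbor_map maps each node's (x, y) location to the locations of the nodes
    within its radio range. Returns None when greedy forwarding gets stuck.
    """
    path = [source]
    node = source
    while node != destination:
        node = next_hop(node, neighbor_map[node], destination)
        if node is None:
            return None
        path.append(node)
    return path


# Hypothetical four-node layout with 1-unit radio range.
neighbor_map = {
    (0, 0): [(1, 0)],
    (1, 0): [(0, 0), (2, 0)],
    (2, 0): [(1, 0), (2, 1)],
    (2, 1): [(2, 0)],
}
path = route((0, 0), (2, 1), neighbor_map)
```

No routing table is kept anywhere: each decision uses only the current node, its neighbors and the destination. Pure greedy forwarding can fail when no neighbor is closer to the destination, in which case this sketch returns None rather than guaranteeing delivery.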
UWB has its own disadvantages. The achieved transmission range of a UWB-based
antenna is about 1 mm [99], which means that on a chip with an area of 20 mm × 20 mm,
multi-hop communication is still needed when transmitting data.
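The arithmetic behind this claim is short enough to check directly. The sketch assumes straight-line transmission and evenly spaced relays, which gives a lower bound on the number of wireless hops; the function and constant names are hypothetical.

```python
import math

CHIP_SIDE_MM = 20.0   # chip edge length from the text
UWB_RANGE_MM = 1.0    # approximate UWB transmission range from the text


def min_hops(distance_mm, range_mm):
    """Lower bound on wireless hops needed to cover a straight-line distance."""
    return math.ceil(distance_mm / range_mm)


edge_hops = min_hops(CHIP_SIDE_MM, UWB_RANGE_MM)
diagonal_mm = math.hypot(CHIP_SIDE_MM, CHIP_SIDE_MM)
corner_hops = min_hops(diagonal_mm, UWB_RANGE_MM)
```

Crossing the chip along one edge takes at least 20 hops, and corner to corner (about 28.3 mm) takes at least 29, so a 1 mm range does not remove the multi-hop problem.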
3.2 A mm-Wave WNoC System
Another type of WNoC architecture uses millimeter-wave interconnections [18].
In order to achieve as much power gain as possible for a small area overhead, a zigzag
antenna [52] has been proposed. The data transmission path also includes a
modulator/demodulator, a serializer/deserializer and an amplifier. The modulation used in
this architecture is on-off keying (OOK), which requires a specific modulator and
demodulator [50]. Other hardware includes a low-noise amplifier (LNA), which consists of a
two-stage cascade amplifier with a shunt-peaked load and an output buffer. The
serializer/deserializer (SERDES) devices are implemented with an oscillator block and a
multiplexer (MUX). Also needed is a single-pole double-throw (SPDT) switch, which switches
the transceiver between its transmitting and receiving modes.
The structure of this type of WNoC system can be divided into two levels: several
nodes form a subnet, and several such subnets form the whole WNoC system. Each
subnet connects to other subnets through either wired or wireless communication. Within
a subnet, nodes connect to one another through wired interconnects. Therefore, the
whole WNoC system is a hierarchical architecture with two levels.
A shortcoming of this type of WNoC system is that, because of the structure of the antenna
and the simple communication protocol, all communication channels work at the same
frequency. On a single-channel link, only one wireless communication can take place at a
time to avoid interference. In other words, although there are many wireless links in the
WNoC system, only one can be active while the others are idle, which is inefficient.
Table 3.1: WNoC system parameters.
Parameter            UWB-based                    mm-wave-based     CNT-based
Bandwidth            3.6 GHz (central frequency)  tens of GHz       around 500 GHz
Device size          millimeter order             millimeter order  micrometer order
Transmission range   not enough                   enough            enough
Number of channels   multiple                     single            multiple
3.3 CNT Based WNoC System
Carbon nanotubes (CNTs) have proven to be a much better choice for use as antennas
for WNoC [40]. Recent research has shown that a single CNT can be used to make
several kinds of circuit components, such as antennas, modulators/demodulators [38] and
transmitters [39]. In [25], the CNT antenna has been used in a WNoC system.
CNTs are grown by a chemical vapor deposition (CVD) method. In the fabrication
process, a localized heater is applied so that the preexisting CMOS layers are not damaged
by the high-temperature CVD. The frequency range of a CNT antenna can reach the terahertz
(THz) level, which has been investigated both theoretically and experimentally [40], [8]. If
the frequency range is in THz, the antenna will be small enough that the area
overhead is small. The bandwidth of a CNT antenna can be high, unlike the mm-wave
antenna, whose bandwidth is tens of GHz [31]. The bandwidth of a CNT antenna is around
500 GHz [25], which can afford a higher data rate.
What is more, the CNT-based antenna has three additional advantages. First, it
can provide excellent directional properties. Second, since the skin effect in CNTs can be
ignored even at such a high operating frequency, the power dissipation of this type of
antenna is quite low [31]. Third, a multiband laser source allows CNT antennas to
be assigned different frequencies in one communication channel. Table 3.1 compares the
UWB, mm-wave and CNT based WNoC systems.
On the down side, a CNT-based WNoC system is not readily deployable, as it needs more
investigation. It nevertheless has clear advantages over the other two options mentioned
above, especially its hardware overhead, which is significantly lower. So in the remainder of
the report, the description of the WNoC system is mainly based on this type, plus some additional
information based on other wireless types.
3.4 Other Needed Devices
Besides antennas, other devices are also indispensable for an integrated WNoC system.
After years of study and research, and with the fast development of fabrication technologies, they may
prove to be practical.
The high-speed silicon integrated Mach-Zehnder optical modulator and demodulator
are commercially available currently [27], which can convert signals between electrical and
optical types.
For the oscillator, a 324 GHz oscillator in a 90 nm CMOS process [33] and a
410 GHz oscillator in a 6M 45 nm CMOS process [75] have been used in on-chip wireless
interconnection transceivers. In [9], a low-power wideband wireless transceiver is designed.
The transmitter circuit consists of an up-conversion mixer and a power amplifier. The
receiver, which uses direct-conversion technology, is made up of a low noise amplifier, a
baseband amplifier and a down-conversion mixer. A voltage-controlled oscillator is used for
both transmitter and receiver.
Chapter 4
Technique Used in a WNoC System
This chapter introduces a general architecture of a WNoC system, including the
topology/structure, wireless link insertion and transmission protocol. We provide a summary of
design topologies, communication protocols and routing schemes. Given that there are many
existing options, in this and in later chapters, I will describe some of the most popular ones
according to my recent study.
4.1 Structure and Topology
In a WNoC system, the wireless link is without any doubt a key component. In general,
according to the combination of wired and wireless links in a system, the structure can be
divided into two types. One type uses pure wireless links for data transmission between
nodes and the other type contains both wired and wireless links. I should point out that a
link, whether wired or wireless, refers here only to the data transmission channel.
Other connections, for the control channel and channel arbitration, are not considered.
4.1.1 Structure of a Pure Wireless NoC System
Systems of this type mainly depend on UWB transmission [100]. Every data channel
between nodes is wireless (not considering the control link, since that does not carry data
traffic). Therefore, if the range of the wireless link is long enough, a node can connect
directly to any other node in the system by wireless, which means every data transmission
is completed without intermediate hops. In this case, the system forms
a fully-connected network.
However, in a single frequency system (or because the wireless source has limited
bandwidth), a problem arises as the transmission range increases: a fully-connected network
also implies that each node uses more links, which leads to more channel arbitration.
As a result, more latency is introduced. Additionally, it costs more area overhead for
more reserved wireless nodes. Therefore, a key point in the pure wireless WNoC system is
finding the best transmission range according to several factors, which include the topology,
number of nodes, injection load and routing algorithm.
4.1.2 Structure of Hybrid Wireless NoC System
Recent literature [9], [12], [25], [62] recommends the hybrid WNoC over the pure
WNoC. One reason is the limitation of the frequency band. Another major reason is the
efficiency of, and the need for, wireless transmission between two nodes a short distance apart. The most
valuable characteristic of wireless communication is that it can decrease the number of hops
in a long range transmission, so if there are no hops or just a few hops between two nodes, the
benefit of wireless over wired transmission is small. Other reasons, such as the
device area overhead and device integration, are also significant. In a hybrid wireless NoC
system, by designing a good combination of wireless and wired links and a suitable system
structure, one can alleviate the above problems.
The most popular structure of the hybrid wireless NoC is a hierarchical structure.
The whole system is divided into several subnets, which can be considered as the bottom
level. Each subnet contains fewer than 40 nodes (often the number of nodes in each
subnet will be 8, 16 or sometimes 32). The distance, measured as the number of hops,
between nodes within the same subnet is short, since a subnet can be regarded as a small world.
Typically, in a small world, the average path length is no larger than log N, where N is
the number of nodes, and this makes the topology interesting for efficient communication
with minimum resources [7], [82]. Applying this feature, nodes in a subnet are
connected by wires only, which is efficient in communication and also saves the wireless
resource. There is a hub in each subnet, whose function is to connect to other subnets
through both wireless and wired channels. The topology of the subnet can be 2-D mesh,
ring or star-ring, with sizes such as 2 by 4, 4 by 4 or 4 by 8.
The top level is made up of all of the subnets. As mentioned above, communication at
the top level is wireless or wired, and the hubs in the subnets are the connection points. Because
the wireless resource is limited, as will be discussed later, neighboring subnets are
connected by wire. The topology at this level is always a ring, which means all subnets are
arranged in a circle, and within every subnet the nodes are arranged in a 2-D mesh, ring or star-ring.
4.2 Wireless Link Insertion
In a chip, for a small world, although researchers can obtain a number of different
frequency bands to transmit data simultaneously, there are still not enough frequency
bands for wireless channels when the number of nodes in a WNoC system is large. Moreover,
it is impossible to integrate so many antennas and other wireless interfaces when we consider
the device overhead and integration. Since the distance in a subnet is constant once the source
node and destination node are determined, finding a way to allocate the limited number of
wireless links is a core problem in enhancing the performance.
The key task is minimizing the average distance between nodes. One way is
to use simulated annealing (SA) [45] to obtain an optimal configuration of the placements of
the wireless links and hubs [9], [25], [62]. Using SA, designers find the best pairs
of source hub and destination hub to connect by wireless, while the other pairs remain wired. In this way,
a configuration is selected that gives better performance.
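The placement search described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the method of [9], [25] or [62]: the hubs are assumed to sit on a wired ring, the cost is the average hop count over all hub pairs, a move relocates one wireless link, and the cooling schedule is geometric. The function names `average_hops` and `anneal` are hypothetical.

```python
import math
import random
from collections import deque
from itertools import combinations


def average_hops(n, shortcuts):
    """Average shortest-path length (hops) over all hub pairs of a ring plus shortcuts."""
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}  # wired ring (assumed)
    for a, b in shortcuts:                                   # wireless links
        adj[a].add(b)
        adj[b].add(a)
    total = 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:                                         # breadth-first search
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))


def anneal(n, num_links, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over wireless link placements (illustrative parameters)."""
    rng = random.Random(seed)
    # Candidate links: every hub pair that is not already a ring neighbour.
    candidates = [p for p in combinations(range(n), 2)
                  if (p[1] - p[0]) % n not in (1, n - 1)]
    current = rng.sample(candidates, num_links)
    cost = average_hops(n, current)
    best, best_cost = list(current), cost
    temp = t0
    for _ in range(steps):
        trial = list(current)
        trial[rng.randrange(num_links)] = rng.choice(candidates)  # move one link
        trial_cost = average_hops(n, trial)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if trial_cost <= cost or rng.random() < math.exp((cost - trial_cost) / temp):
            current, cost = trial, trial_cost
            if cost < best_cost:
                best, best_cost = list(current), cost
        temp *= cooling
    return best, best_cost
```

For example, `anneal(16, 3)` returns three hub pairs and the resulting average hop count, which can be compared with `average_hops(16, [])` for the purely wired ring.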
4.3 Routing and Communication Protocol
We will study this in two parts: the protocols in a pure wireless NoC system and
those in a hybrid wireless system. In either case, since so many solutions have been
proposed, and in order to limit the length of this report, we will introduce only some
basic and typical methods.
4.3.1 Protocols in Pure Wireless NoC System
LBR Routing Scheme
For RF nodes, the transmission range is fixed. A routing algorithm named
Location-Based Routing (LBR) has been proposed [100]. It is a type of static routing
scheme. The basic idea of this algorithm is that, at every step, a source
node sends data to the node within its transmission range that is nearest to the destination. To
improve the routing efficiency, the whole net is partitioned into four parts, namely quadrants,
depending on the positions of both the source node and the destination node. By calculating the
differences between the X coordinates and the Y coordinates of the source node and the destination node, we
determine which quadrant the destination is in. Next, we find which neighbor in this quadrant
will bring us closest to the destination.
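The quadrant test and the greedy choice of next hop can be sketched as below. This is an illustrative reading of the description rather than the algorithm of [100]: nodes are assumed to be (x, y) tuples, the range is a Euclidean distance, and the names `quadrant`, `lbr_next_hop` and `lbr_route` are hypothetical.

```python
import math


def quadrant(dx, dy):
    """Sign pair identifying the quadrant of the offset (dx, dy); zero counts as positive."""
    return (1 if dx >= 0 else -1, 1 if dy >= 0 else -1)


def lbr_next_hop(current, dest, nodes, tx_range):
    """Pick the in-range node in the destination's quadrant that is nearest to dest."""
    target = quadrant(dest[0] - current[0], dest[1] - current[1])
    best, best_dist = None, math.dist(current, dest)
    for node in nodes:
        if node == current or math.dist(current, node) > tx_range:
            continue  # not reachable in one wireless hop
        if quadrant(node[0] - current[0], node[1] - current[1]) != target:
            continue  # lies in another quadrant
        d = math.dist(node, dest)
        if d < best_dist:  # must make progress toward the destination
            best, best_dist = node, d
    return best


def lbr_route(src, dest, nodes, tx_range):
    """Apply the next-hop rule repeatedly; returns None if no progress is possible."""
    path = [src]
    while path[-1] != dest:
        nxt = lbr_next_hop(path[-1], dest, nodes, tx_range)
        if nxt is None:
            return None
        path.append(nxt)
    return path
```

On a 4 × 4 grid with `nodes = [(x, y) for x in range(4) for y in range(4)]`, calling `lbr_route((0, 0), (3, 2), nodes, 1)` takes one grid step per hop, while a range of 2 reaches the destination in fewer hops.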
X-Y Routing Scheme
This algorithm is an adaptation of pure XY-routing. In this scheme, the source node first
transmits data in the X direction. When the current node's X coordinate equals that of the
destination, data is sent in the Y direction until it reaches the destination node. In both the X and Y
phases, a node transmits data to the node within range that is nearest to the destination.
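A minimal sketch of this scheme is given below, assuming nodes on an integer grid and a transmission range expressed in grid steps (both assumptions for illustration; the name `xy_route` is hypothetical). The X phase ends when the current X coordinate matches the destination's, after which the Y phase runs.

```python
def xy_route(src, dest, tx_range=1):
    """Return the nodes visited when routing first along X, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dest[0]:                      # X phase
        step = min(tx_range, abs(dest[0] - x))
        x += step if dest[0] > x else -step  # jump as far as the range allows
        path.append((x, y))
    while y != dest[1]:                      # Y phase
        step = min(tx_range, abs(dest[1] - y))
        y += step if dest[1] > y else -step
        path.append((x, y))
    return path
```

With a range of one grid step this reduces to ordinary wired XY routing; a longer range simply skips intermediate nodes in each dimension.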
In this structure, a node may be both a sender and a receiver simultaneously, and may
intend to receive from and send to multiple channels at the same time as well. However, a
sender/receiver can send/receive on only one channel at a time. Therefore, to avoid potential
collisions in both cases, an arbitration scheme is applied. In this scheme, each
node maintains a request arbitrator and an authorization arbitrator. From the point
of view of the receiver, the request arbitrator decides which sender to receive data
from when several senders want to send data to the same receiver; once the winning
sender is found, the authorization arbitrator sends an acknowledgment back to the winner.
In the meantime, the winner may also have requested to send data to others on multiple channels
and may have received several authorizations from different receivers. In that case, the sender
arbitrates among the authorization signals and accepts only one. Finally, the data is sent
out. Additionally, to simplify the request arbitration, researchers apply a receiver-based
coding scheme [80].
SD-MAC Protocol
Another proposed protocol is the synchronous and distributed medium access control
(SD-MAC) protocol [99], which operates at the MAC layer. In general, the principle of this
protocol is similar to that of the former one. It consists of control and data transmission
parts. One difference, however, is that the arbitration section, which is called competition
in SD-MAC, is specified in more detail. The competition is carried out in a competition
frame, which is further divided into three parts in the SD-MAC protocol.
The first part is called the initialization period (INIP). The function of this period is
to record the information of each node and its neighbors. It has two segments. The first
one records which of four states the current node is in: transmitter (TX), receiver (RX), inactive
node (InA) or not determined (SND). The state is recorded in a 2-bit register. The second
segment keeps the information about the current node's potential senders (PSR) and survived
senders (SSR). The potential senders are nodes intending to send data to the current node,
whereas the survived sender is the winner among the potential senders. This segment is
represented by an N-bit register, where N is the number of potential senders.
Following the INIP is the contention period (CP). This period handles the channel
contention process, which mainly focuses on providing the necessary data to decide upon the
survived sender. In other words, it prepares for the channel access. In this interval,
senders generate an N-bit random number to help make the decision; and by calculating the
bitwise AND of RX and SSR and updating the SSR register round by round, one gets the
final winner, the only survived sender.
The last part of the competition frame is the channel access authorization period (CAAP).
In this period the decision described above is carried out. When the CAAP
finishes, the link is established. The SD-MAC protocol avoids collisions very well. Additionally,
it can solve the exposed terminal problem when data is transmitted in parallel [99].
4.3.2 Protocols in a Hybrid Wireless NoC System
Since there are both wired (within the subnet or between subnets) and wireless (between
subnets) links in a hybrid wireless NoC system, the protocol and routing algorithm usually
depends on these two types of communication.
Routing Scheme Based on Comparison of the Path Length
A simple routing protocol described in [12] is based on comparing the numbers of hops
over the wireless and wired links.
If the two nodes are in the same subnet, they connect through wire only. If the two
nodes are in different subnets, the source node selects a method according to a signal sent
via wire from the wireless router in the subnet where the source node is located.
The purpose of the signal is to indicate whether the intended wireless link is busy or idle. If
the wireless link is busy, data is sent by wire; if it is not, we proceed to the next step.
The second step aims at balancing the usage of wired and wireless links. The traveling
distance over both wired and wireless links is counted in number of hops. Before choosing
the communication method, the source node calculates the distances through wireless and wired
links by using the destination ID and uses the smaller of the two.
When nodes are connected only by wire, the data simply traverses the wired
links; if nodes use a wireless link, the source node must still first reach the wireless router in the
same subnet via wire, and then transmit the data by wireless.
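The decision steps above can be summarized in a short sketch. It is an illustration only: the function name `choose_link` and its arguments are hypothetical, the hop counts are assumed to be computed elsewhere (with the wireless count including the wired hops to and from the wireless routers), and the tie-break in favor of the wired path is an assumption, since the text does not say what happens when both distances are equal.

```python
def choose_link(same_subnet, wireless_busy, wired_hops, wireless_hops):
    """Return "wired" or "wireless" following the path-length comparison scheme."""
    if same_subnet:
        return "wired"      # intra-subnet traffic never uses wireless
    if wireless_busy:
        return "wired"      # busy signal from the subnet's wireless router
    # Otherwise take the shorter path; ties go to wired (assumed tie-break).
    return "wireless" if wireless_hops < wired_hops else "wired"
```

For instance, `choose_link(False, False, wired_hops=9, wireless_hops=4)` selects the wireless path, whereas the same call with `wireless_busy=True` falls back to wire.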
Adopting a Routing Scheme
In [9], a method for adopting a routing strategy is presented for a mm-wave-based WNoC
system. For transmission within a subnet, two types of topologies are discussed, mesh and
star-ring. In a mesh subnet, deadlock-free dimension-order routing is used. In the star-ring
topology, two cases are distinguished. If the distance, that is, the number of hops, is less
than two, the routing path is along the ring. If data must travel more than two hops, the
flit should go through the center node so that the total distance decreases to two. In
order to avoid deadlock in this kind of topology, the virtual channel management scheme from the
Red Rover algorithm [20] is adopted. In this algorithm, the whole ring is divided into two
equal sets of nodes and each set has its own virtual channels.
For communication at the top level, that is, between subnets, the cases are classified by the
availability of a wireless interface (WI). If the subnets of both the source node and the
destination node have a WI, the pair connects over a wireless link. If only one of them has a
WI, there are two cases. When the source subnet has no WI but the destination subnet has one,
the data is sent to the nearest subnet that has a WI and then goes to the destination. When the
destination subnet has no WI but the source subnet has one, the data is sent to the WI subnet
nearest to the destination subnet and then travels to the destination by wire. If neither the
source nor the destination subnet has a WI, the routing path is chosen as the shorter of the
wireless and wired paths.
However, in this situation, the WIs become hot spots. Therefore, token flow control [47] is
used to resolve this problem. Basically, tokens reflect the states of the buffers of each input
of a WI node. Every input has a token. If the token is greater than a fixed threshold, the token
turns on and indicates that this input is available, whereas the token turns off if it is smaller
than the threshold, telling the sender that the corresponding input is busy. The routing strategy
used at this level is also of two kinds: for hubs that do not have WIs, dimension-order routing is
adopted; for hubs that have WIs, the South-East routing algorithm, which has been proven to
be deadlock-free [61], is used.
In this architecture, all wireless communications share the same channel. To avoid
interference and alleviate channel contention, an arbitration mechanism is designed
to guarantee that the source node can reach its true destination. A token strategy is also
proposed here. The difference between this token method and the one described above is
that the wireless token allows data to be broadcast in flits into the wireless medium, so that all other
WI antennas receive the signal. The one that matches the destination address will use
the channel. Then, the wireless token goes to the next hub that needs arbitration.
Multi-Channel Protocol and Routing Scheme
In [99], different frequency bands can be assigned by using a multiband laser source that
excites the CNT antennas. This creates an opportunity to use frequency division multiplexing (FDM)
to create dedicated channels between source and destination nodes. It is realized
by using CNTs of different lengths. Additionally, some researchers [40], [32] have
demonstrated that this kind of CNT has a high directional property, which makes it very suitable for
directed channels. The number of different frequencies available for dedicated channels
has been reported as 24 [49], which means 24 wireless channels can be used
simultaneously in a single WNoC system without any frequency interference or channel contention.
Besides FDM, time division multiplexing (TDM) is adopted in each frequency band. To
sum up, in this structure, a wireless antenna uses one of the twenty-four frequencies, and
in its dedicated frequency band, the antenna sends data for different destinations
by using several time slots. The modulation scheme in this system is non-coherent on-off
keying (OOK), which means the system does not need complex clock recovery and
synchronization circuits.
In a specific case reported in [99], for a top level with ring topology, a flit, the unit
of transmitted data, is assumed to contain 32 bits and the number of time slots is 8 per
dedicated channel. Therefore, each time slot contains 4 bits. With the optical modulator
proposed in [27], a 10 Gbps data rate can be provided per channel. Therefore, in this case, the
length of each time slot is 0.1 ns, corresponding to the 10 Gbps. The number of frequency
bands is 4. That means, in this specific WNoC system, the bits in each time slot can be
transmitted over four channels.
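The arithmetic of this example can be checked with a short calculation. The sketch assumes one particular reading of the numbers: the 4 bits of a time slot are spread evenly over the 4 frequency bands, so each band carries 1 bit per slot, which takes 0.1 ns at 10 Gbps. The function name `slot_timing` is hypothetical.

```python
def slot_timing(flit_bits=32, time_slots=8, freq_bands=4, rate_bps=10e9):
    """Return (bits per slot, bits per band per slot, slot length in ns)."""
    bits_per_slot = flit_bits // time_slots           # 32 / 8 = 4 bits
    bits_per_band = bits_per_slot / freq_bands        # 4 / 4 = 1 bit per band (assumed split)
    slot_length_ns = bits_per_band / rate_bps * 1e9   # time to send those bits on one band
    return bits_per_slot, bits_per_band, slot_length_ns
```

Under this reading, changing `freq_bands` to 1 would lengthen the slot to 0.4 ns, which shows how the parallel bands keep the slot short.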
For each transmission, the path is predetermined at the source hub so that there is no
possibility of deadlock. The information about the path, such as the addresses of the intermediate
hubs, is kept in the header flit, which is followed by the remaining data flits. When the data
is sent, it moves along the path like a worm. Since a wireless link can be treated as
a shortcut compared to the wired links, a path may initially include more than one wireless
link. Although there are several simultaneous wireless channels in this system, their number is still
small compared to the number of functional nodes. In order to alleviate hot spots and get
the best trade-off between router complexity and network performance [99], only a path with
one wireless link is selected as the actual transmission path. When there are several paths
with the same number of hops, the one with the wireless link is chosen, considering the low
power consumption of wireless communication.
Another routing mechanism, in contrast to the one above, can also be combined with
the FDM and TDM scheme. In this routing approach, the path is not
determined all at once at the source node. Instead, at each current node, an evaluation is done
to select the best possible next step.
Chapter 5
A Simulator for NoC System
Currently, there is no general purpose simulator for NoC systems. Most
simulators are designed and used by their users for a specific purpose. These simulators are usually
cycle accurate. Some of them are based on SystemC.
SystemC is maintained and developed by OSCI (Open SystemC Initiative). It is a kind
of system-level description language that is based on C++. SystemC is built on a series of
C++ class structures. In fact, it is a library in C++. SystemC is highly suitable for the
system-level simulation.
There are also some open source simulators on the Internet. One of them is
Noxim [60], which is developed by the University of Catania, Italy. This simulator is written
in C++ and is mainly based on SystemC. Using Noxim, users can obtain
some basic parameters of a NoC system, such as throughput, delay and power consumption.
Noxim should be run in a Linux environment. However, after some discussion between the
author and others, we found some errors during compilation if the version of g++ (a
C++ compiler) is later than g++ 4.0. The reason may be a conflict between
SystemC and g++, which is a bug of SystemC.
Nirgam is another cycle accurate simulator for NoC, which is also based on SystemC.
By specifying the topology, switching mechanism, buffer, clock rate, routing algorithm and
other parameters that depend on the objective of the simulation, the user can obtain the latency and
throughput of an NoC system. Nirgam also runs in the Linux environment.
The above mentioned simulators focus on traditional NoC systems. To simulate a WNoC
system, some changes are necessary since there are differences between the
WNoC and NoC architectures, such as the channel characteristics, the number of nodes, the combination
of protocols (intra- and inter-subnet) used in the system and the routing algorithms. For
example, the WNoC platform used in [12] contains a 100-core system. The system is further
divided into subnets and each of them is a 5 × 5 mesh network. To connect pairs of neighboring
subnets, the author used a high-speed wireless link, which differs from the wired link
in its channel characteristics. In each subnet, 24 wired routers are employed, which use a different
routing algorithm and protocol from a traditional NoC system. Still, there is no general
simulator for WNoC for public use. Most organizations use simulators designed by
themselves for a specific purpose or task.
In this work, I use the Booksim interconnection network simulator. Booksim, a cycle-
accurate interconnection network simulator, is developed by Stanford University. The current
major release, Booksim 2.0, supports a wide range of topologies such as mesh, torus and
flattened butterfly networks, provides diverse routing algorithms and includes numerous
options for customizing the network's router microarchitecture [5]. Similar to most NoC
simulators, Booksim is written in C++ and should be run under Linux. However, Windows
users can also run it with the newest version of Cygwin [15]. To download the simulator, we
use the Subversion (SVN) repository to get the source from the web. After the download,
typing make in the /src directory builds the simulator. To use Booksim to
simulate performance under different conditions (injection rate, topology, size, etc.), we
just set the corresponding values in the configuration file and then run the simulator with the modified
file.
When Booksim is running, the process can be divided into three stages: warm up,
measurement and drain. In practice, the initial state of a network system cannot be ideal,
where ideal means that no data is being transmitted in the network and all nodes are free;
there must already be some active nodes in the system. Therefore, in order to begin
the simulation under conditions close to practice, the system should be injected with some
data and left running before the simulation experiment starts. That is the function of the
warm up stage. In the measurement stage, the simulator performs the actual measurement. Once
the measurement stage is completed, all the measurement packets are drained from the
network before the results (latency and throughput) are printed. That is the drain stage.
The details about how to use Booksim will be introduced in Chapter 7.
Chapter 6
Factors and Parameters
As mentioned in the previous chapters, the performance of an NoC/WNoC system
is mainly characterized by its throughput, latency and energy consumption. There are many factors
that affect these parameters.
The topology factor determines how the system is connected. Additionally, the size,
or in other words, the number of nodes, can be set in the topology factor. For mesh or
torus topologies, we always use an N × N network. To be more specific, we can also set
the number of nodes that share a single router in the topology. Last but not least, for some
kinds of topologies, such as mesh and torus, there is even a probability of link failure, which
can also affect the performance.
Injection rate is the rate at which packets are injected into the simulator. In other words,
the injection rate tells the simulator how many packets to inject per simulation cycle
per node on average. If the rate is too high, which in practice means that a large amount of
data is injected into the WNoC/NoC system, the latency becomes very large.
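The meaning of an average per-node, per-cycle rate can be illustrated with a small sketch. It assumes a Bernoulli injection process, in which every node independently injects one packet in a given cycle with probability equal to the injection rate; this is a common modeling choice and is an assumption here, not something stated in the text. The function name `average_injection` is hypothetical.

```python
import random


def average_injection(rate, nodes, cycles, seed=0):
    """Simulate Bernoulli injection and return packets per node per cycle."""
    rng = random.Random(seed)
    injected = 0
    for _ in range(cycles):
        for _ in range(nodes):
            if rng.random() < rate:  # this node injects a packet in this cycle
                injected += 1
    return injected / (nodes * cycles)
```

Over many cycles the measured average approaches the configured rate, for example `average_injection(0.15, nodes=64, cycles=1000)` comes out close to 0.15.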
Flow control is another factor that should be considered. It refers to the number of
virtual channels [16] per physical channel and the depth of each virtual channel; the unit is the
flit. These factors affect the latency and throughput of a system. The routing algorithm
is another key aspect. For instance, if the routing algorithm itself is deadlock-free, which means it can
use the virtual channels freely, the latency will be smaller; if we have to partition the virtual
channels to avoid deadlock, some additional latency is introduced.
Chapter 7
Experimental Processes, Results and Analysis
This experiment covers both wired and wireless NoC systems. Based on the results,
some analysis and estimation of WNoC will be given. Besides considering the size as the
main factor, this experiment mainly focuses on the relationship between flow control
and performance, that is, latency, so that the results point out which technical elements
contribute to it. Finally, a comparison between the performance of traditional and wireless
NoC systems, plus some discussion, will be given.
7.1 Experimental Process
Booksim, an interconnection network simulator, is used in this project as mentioned
before. Once it has been downloaded and compiled successfully, the user can find some
important files in the /booksim/trunk/src directory. The folder named examples contains
several configuration files, which contain data on the different topologies that can be
simulated. In each configuration file, there are some parameters that should be set, as they affect
the simulation result. By using those files and changing or resetting parameters in them,
the user can simulate the performance of a variety of NoC systems. Another useful file
is booksim_config.cpp. It specifies default values in case the configuration file does not
specify any values.
Before we run the simulation, we may set some parameters of the network, such as the
injection rate. By typing vi [configfile] in the /src/examples directory, we can open
the file and set the parameters that we need. [configfile] is the name of the file that
contains the configuration information for the simulator. For example, the command vi
mesh88_lat lets the user set parameters in the mesh88_lat file, as shown in
Figure 7.1.
Figure 7.1: Example configuration file for simulating a mesh NoC system.
As this figure shows, the configuration file contains several parameters, such
as num_vcs, vc_buf_size and injection_rate. All of these parameters can be changed
depending on the specific task. To change parameters that do not appear in the configuration
files, the user can set them in the default file, booksim_config.cpp, as mentioned
previously. If the user changes parameters in the configuration file, it is enough to save
and quit the file; if the user sets a new value in the default file, he/she should rebuild the
simulator so that the new value takes effect. Some important
parameters are defined as follows: num_vcs is the number of virtual channels per
router; vc_buf_size is the size of the buffer, with the depth given in flits; traffic defines the
traffic pattern, where the option uniform means each source sends an equal amount of data
to each destination; injection_rate can be given in either flit/cycle/node or bit/cycle/node.
Each flit contains 20 bits. In this report, we use the former definition. For example, if the
injection rate is 0.15, it means that during each cycle, 0.15 bit is injected in every node.
Figure 7.2: Simulator output from running the examples/mesh88_lat configuration file.
Once the parameters have been set, we can begin to simulate. To start the simulation,
type the command ./booksim [configfile] in the /src directory. Let us take mesh88_lat
as an example. The outcome is shown in Figure 7.2. At the bottom of the figure, the
program prints the average latency, the average accepted rate, which is the average
throughput, the min accepted rate and the average hops. The unit of latency is the cycle; the unit
of throughput is flit/cycle/node.
Such a configuration generates a single data point. According to Figure 7.2, it is
one point on the injection rate vs. latency curve. This point indicates that when the
injection rate is 0.3 bit per cycle per node, the average latency is about 40 cycles. Then, by
running the simulation for many increments of the injection rate, an average latency graph
can be obtained.
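Preparing such a sweep can be sketched as below. The parameter names num_vcs, vc_buf_size, traffic and injection_rate are the ones named in this chapter; the remaining keys (topology, k, n, routing_function) and the `key = value;` syntax are assumptions about the configuration file format and should be checked against the files in the examples folder. The helper names `make_config` and `sweep` are hypothetical.

```python
def make_config(injection_rate, num_vcs=8, vc_buf_size=8, k=8):
    """Build the text of one configuration file for a k x k mesh (assumed syntax)."""
    lines = [
        "topology = mesh;",                # assumed key name
        f"k = {k};",                       # assumed: nodes per dimension
        "n = 2;",                          # assumed: number of dimensions
        "routing_function = dim_order;",   # assumed key name
        f"num_vcs = {num_vcs};",
        f"vc_buf_size = {vc_buf_size};",
        "traffic = uniform;",
        f"injection_rate = {injection_rate:.2f};",
    ]
    return "\n".join(lines) + "\n"


def sweep(start=0.20, stop=0.44, step=0.02):
    """Injection rates from start to stop inclusive, rounded to two decimals."""
    count = round((stop - start) / step) + 1
    return [round(start + i * step, 2) for i in range(count)]


# One configuration text per data point of the latency curve.
configs = {rate: make_config(rate) for rate in sweep()}
```

Each generated text would be saved to its own file and passed to ./booksim in turn, with the average latency read from the end of each output.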
The next section examines the relationships among injection rate, latency and number
of virtual channels for different topologies. Additionally, a discussion of the trade-off
between the number of nodes per router and the distance between two nodes is given. At the
end of the section, an analysis of the performance of the WNoC system is also presented.
Figure 7.3: A 4 × 4 mesh topology.
7.2 Experimental Result and Analysis
For all of the following latency simulations, I used the same basic conditions:
the size of each virtual channel's buffer is 8 flits, and the traffic pattern is uniform. The routing
algorithm is dimension-ordered. Other parameters, such as the input and output speed-up,
the delays in each stage and the type of allocator, all keep their default values.
7.2.1 Latency vs. Injection Rate
This part shows the simulated latency curves for different injection rate
levels. The first topology is the mesh, whose structure is illustrated in Figure 7.3. This figure
shows a 4 × 4 regular mesh topology. The squares represent the nodes, including caches and
processors; the black points represent routers, which are connected by wires. From the
figure, we can easily see that each node has its own individual router.
When we simulate the performance by changing the values of injection rate and repeat
the simulation for each case, we can get a series of data points. Connecting them together,
a latency graph is obtained as shown in Figure 7.4 for dierent numbers of virtual channels
(4, 8 and 16). Each latency curve represents the performance for a specic number of virtual
channels. There are 64 nodes in the system, which make up an 8 8 regular mesh topology.
[Figure 7.4 plot: latency (cycle) vs. injection rate (bit/cycle); one curve each for 4, 8 and 16 virtual channels.]
Figure 7.4: Latency vs. injection rate in mesh topology.
From the simulation result, we observe that the latency increases as the injection rate goes up. The latency starts from about 60 cycles when the injection rate is 0.2 bit per cycle. In general, the more virtual channels each router has, the lower the latency in the system. This confirms a known rule: for a regular mesh topology, virtual channels help reduce the latency. However, if the injection rate is below 0.32 bit per cycle, the number of virtual channels does not affect the latency, or affects it only mildly. Once the injection rate exceeds 0.36 bit per cycle, the difference in latency due to the number of virtual channels becomes more obvious. Moreover, 0.36 is the take-off point for all three curves: for an 8 × 8 regular mesh topology, once the injection rate exceeds 0.36 bit per cycle, the system latency rises rapidly. Still, the more virtual channels each router has, the flatter the curve becomes. In addition, when each router has 4 virtual channels and the injection rate goes beyond 0.4 bit per cycle, the system's latency becomes so large that we may treat it as infinite and the system as unusable. Thus, in Figure 7.4, the blue line stops when the injection rate equals 0.4. For systems whose routers each have 8 or 16 virtual channels, the maximum injection rate is set to 0.44. Beyond that level, the latency grows out of control.
Figure 7.5: A 4 × 4 mesh topology with concentration 4.
A variant of the regular mesh topology is the concentrated mesh topology (called cmesh for simplicity in this report). In a regular mesh each node has its own router, while in a concentrated mesh several nodes share a single router. The concentration is the number of nodes that share a single router. For example, if a system has concentration 4, then 4 nodes share a single router. The topology of a cmesh with concentration 4 is shown in Figure 7.5.
Here, each router is shared by four nodes, so the maximum distance between any two nodes is 2 (counted in hops). Compared with the regular mesh that contains the same number of nodes, whose maximum distance is 6 as can be verified from Figure 7.3, the distance has dropped significantly. I used Booksim to simulate the average hops in both these topologies. Figures 7.6 and 7.7 give simulation results for the 8 × 8 mesh and cmesh. For these two topologies, the longest distances are 14 and 6, respectively. From the figures, we can see that in the regular 8 × 8 mesh the average hop count exceeds 6, while it is around 3 in the concentrated mesh (cmesh) topology of the same size.
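The maximum distances quoted above can be checked by counting router-to-router links directly. The sketch below is an illustration under stated assumptions, not the simulator's method: it treats the routers as a square grid, measures Manhattan distance between routers, and averages over all ordered router pairs (a router paired with itself included). The function names are hypothetical.

```python
from itertools import product

def mesh_hop_stats(k):
    """Return (max, mean) Manhattan distance between routers of a k x k mesh."""
    routers = list(product(range(k), repeat=2))
    dists = [abs(ax - bx) + abs(ay - by)
             for (ax, ay), (bx, by) in product(routers, repeat=2)]
    return max(dists), sum(dists) / len(dists)

def router_grid_side(num_nodes, concentration):
    """Side of the square router grid when `concentration` nodes share a router."""
    side = round((num_nodes // concentration) ** 0.5)
    if side * side * concentration != num_nodes:
        raise ValueError("num_nodes / concentration must be a perfect square")
    return side

if __name__ == "__main__":
    for nodes in (16, 64):
        for conc in (1, 4):
            side = router_grid_side(nodes, conc)
            longest, mean = mesh_hop_stats(side)
            print(f"{nodes} nodes, concentration {conc}: {side}x{side} routers, "
                  f"max {longest} hops, mean {mean:.2f} hops")
```

The maxima reproduce the values 6 and 2 (16 nodes) and 14 and 6 (64 nodes) given in the text. The means are pure link counts between routers, so they need not equal the average hop figures reported by the simulator, which may count hops differently.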
Figure 7.6: Average hops in 8 × 8 mesh.
Figure 7.7: Average hops in 8 × 8 mesh with concentration 4.
Having seen the difference in average hops between the concentrated mesh and regular mesh topologies by running the cmeshconfig file, we then examined the performance of the concentrated mesh. This is shown in Figure 7.8. Once again, the figure plots latency against injection rate. Different lines stand for different system sizes. In each system, every router has 16 virtual channels. The general trend is the same as for the regular mesh topology: a higher injection rate leads to higher latency. When the injection rate is small, say, not larger than 0.11 bit per cycle, the latencies in systems of different sizes are almost the same, around 25 cycles. The bound on the injection rate depends on the size: in a 64-node system, the maximum injection rate is 0.26 bit per cycle; in a 144-node system, it is 0.17 bit per cycle; in a 256-node system, the maximum accepted injection rate is only 0.13 bit per cycle. For a specific injection rate, the 64-node system has the smallest latency while the 256-node system has the largest.
The preceding results allow us to verify another known rule: in the same topology, the larger the system, the larger the latency for the same injection rate. However, although these three systems have different injection rate bounds, their latency take-off points are almost the same, all at about 50 cycles. Below that level, the latencies increase slowly or barely increase; once the latencies exceed 50 cycles, they go up dramatically. That is, in a cmesh network, as long as the injection rate stays below a specific level, the latency increases very little as the injection rate increases.
[Figure 7.8 plot: latency (cycle) vs. injection rate (bit/cycle); one curve each for 64, 144 and 256 nodes.]
Figure 7.8: Latency vs. injection rate in cmesh topology.
Besides the injection rate, which is the main source of latency, the distance between nodes and the system's concentration are two other factors that increase latency. So it is natural to ask which factor affects a mesh system the most, or what the best trade-off values are. To find an answer, a further comparison between regular mesh and concentrated mesh was done. Both systems had 64 nodes and each router had 16 virtual channels. In the concentrated mesh system, every router is shared by 4 nodes. The result is shown in Figure 7.9.
In Figure 7.9, the regular mesh curve is flatter than the concentrated mesh curve. Below an injection rate of 0.24 bit per cycle, the regular mesh has larger latency than the concentrated mesh for every injection rate. Once the injection rate surpasses 0.24, the concentrated mesh curve rises above the regular mesh curve. Therefore, although the regular mesh starts with a larger latency, its maximum injection rate is much larger than that of the concentrated mesh. To sum up, when the injection rate is low, it is better to use a concentrated mesh; in a heavily loaded system, a regular mesh is the wiser choice.
[Figure 7.9 plot: latency (cycle) vs. injection rate (bit/cycle); curves for 4 nodes per router and 1 node per router.]
Figure 7.9: Comparison between regular mesh and concentrated mesh.
No matter how we improve the structure of a traditional NoC system, the performance still falls short of ideal. Therefore, a wireless NoC system is simulated next, using a flattened butterfly topology. The flattened butterfly is a cost-efficient topology [43]. The structure of the topology and its router placement are shown in Figure 7.10 [42].
For a 64-node system, each router is shared by 4 nodes. In Figure 7.10(a), each circle stands for one node. Four of them form one group and connect to the same router, which is represented by a square. Besides connecting to the 4 nodes, each router has 6 more ports. Three of them connect to the other 3 routers in the same dimension, say the x dimension; the other 3 connect to routers in dimension 2, that is, the y dimension. The router layout is shown in Figure 7.10(b). Thus, the routers in each row and in each column are fully connected. Although it is an 8 × 8 topology, the maximum distance between any two nodes is no larger than 2. That makes it closer to a wireless environment in several ways. First, the flattened butterfly topology is also a kind of concentrated topology (the concentration is 4); likewise, in a WNoC system every wireless interface is shared by several nodes in the same subnet. Second, the maximum distance in the flattened butterfly is 2, so it reduces the distance between nodes as the WNoC does. Therefore, I simulated the wireless system using a flattened butterfly network.
Figure 7.10: (a) Flattened butterfly topology consisting of 64 nodes; (b) corresponding router layout.
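The port count and the maximum distance stated for the flattened butterfly can be verified on a small graph model. The following is a minimal sketch assuming a 4 × 4 router grid (64 nodes at concentration 4) in which every router links to all other routers in its row and in its column; the function names are hypothetical and the node-to-router links are left out.

```python
from collections import deque
from itertools import product

def flattened_butterfly(side):
    """Adjacency sets for a side x side router grid with fully connected rows and columns."""
    routers = list(product(range(side), repeat=2))
    adj = {r: set() for r in routers}
    for (x, y) in routers:
        for other in range(side):
            if other != x:
                adj[(x, y)].add((other, y))  # routers sharing the y coordinate
            if other != y:
                adj[(x, y)].add((x, other))  # routers sharing the x coordinate
    return adj

def diameter(adj):
    """Longest shortest-path length in hops over all router pairs (BFS from each router)."""
    longest = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        longest = max(longest, max(dist.values()))
    return longest

if __name__ == "__main__":
    adj = flattened_butterfly(4)
    print("inter-router ports per router:", {len(n) for n in adj.values()})
    print("diameter in hops:", diameter(adj))
```

For the 4 × 4 grid every router has 6 inter-router links and no pair of routers is more than 2 hops apart, which agrees with the description above.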
Figure 7.11 shows the latency curves for different injection rates in two system sizes: one system consists of 64 nodes while the other has 256 nodes. A comparison of different numbers of virtual channels is again included in these graphs. There are some similarities between the two systems. First of all, since the topology in WNoC is also a kind of concentrated topology (in this experiment, the concentration is 4), the shapes of the curves in both graphs are similar to those in Figure 7.8. Even the take-off points appear at a similar place as in the concentrated mesh topology, at a latency of around 50 cycles. Secondly, a big improvement of the WNoC system is the maximum accepted injection rate. In these systems, the injection rates can reach 0.9 and 0.85 bit per cycle, respectively. Since the injection rate also has a significant impact on throughput, a WNoC system will have much higher throughput than a traditional NoC. The small difference in the maximum injection rate is caused by the different sizes of these two systems. This advantage is due to the reduced distance between any two nodes compared with other topologies: the largest distance between any two nodes is no bigger than 2 in this kind of topology. Therefore, we conclude that if a transmission path contains fewer nodes, fewer routers are used and hence less latency is introduced. Last but not least, when the injection rate remains below some upper bound, the number of virtual channels has little effect on the latency. In the 64-node system, the latencies are almost the same no matter how many virtual channels a router has until the injection rate reaches 0.7 bit per cycle. In the 256-node system, the curves diverge when the injection rate is 0.5.
[Figure 7.11 plots: latency (cycle) vs. injection rate (bit/cycle); panels (a) and (b), each with one curve for 4, 8 and 16 virtual channels.]
Figure 7.11: Latency vs. injection rate in flattened butterfly topology consisting of (a) 64 nodes and (b) 256 nodes.
However, there is still a difference between these two figures: the effect of the virtual channels on the upper bound of the injection rate. In the 64-node system, although the three curves stop at different points, they stay close to each other: the maximum injection rate is 0.8, 0.85 and 0.9 bit per cycle for 4, 8 and 16 virtual channels, respectively. However, in the 256-node system, there is a big difference between 4 and 8 virtual channels. This indicates that virtual channels contribute more to increasing the maximum injection rate in a larger system. In the above examples, the gap appears when the number of virtual channels increases from 4 to 8 in the 256-node system. Furthermore, in both figures, the shape of the 8-virtual-channel curve closely resembles that of the 16-virtual-channel curve. This suggests that once the number of virtual channels surpasses 8, the improvement in the maximum injection rate is no longer so obvious.
From the above comparison we conclude that, considering latency and throughput, a WNoC system provides a big improvement over a traditional wired NoC system.
7.2.2 Latency vs. Virtual Channels
A careful examination of the data in the previous section shows that virtual channels also contribute to latency under certain conditions. We now examine this issue in detail. Figure 7.12 shows simulated latency as a function of the number of virtual channels in regular mesh and flattened butterfly topologies. The latter can be considered similar to a wireless-like environment. Both systems have 64 nodes. For either topology, the injection rate is varied from some maximum value down to a value below which the shape of the latency curve does not change.
[Figure 7.12 plots: latency (cycle) vs. number of virtual channels (2 to 16); (a) injection rates 0.2, 0.25, 0.3, 0.35 and 0.4; (b) injection rates 0.4, 0.5, 0.6, 0.7 and 0.8.]
Figure 7.12: Simulated latency vs. number of virtual channels for (a) regular mesh and (b) flattened butterfly topologies.
An examination of Figure 7.12 reveals that for high injection rates, virtual channels can bring the latency down. In both topologies, when the injection rate drops to about half of its maximum value, i.e., 0.25 bit per cycle in regular mesh and 0.5 bit per cycle in the wireless-like topology, increasing the number of virtual channels does not contribute much.
Next, we examine differences between the two topologies. First, the latency for the wireless-like topology decreases more monotonically, whereas for the regular mesh the decrease shows fluctuations, especially when the injection rate is high. Another difference is that, in the regular mesh network, the best latency achievable by adding virtual channels rises with the injection rate. However, in the wireless-like network, the best achievable latency is more or less independent of the injection rate (except at some very high injection rates) once the number of virtual channels exceeds 4 or 5.
To sum up, virtual channels play a more important role when a system has a high injection rate. Moreover, the benefits vary between systems. In this simulated experiment, virtual channels are found to be more beneficial in a wireless-like network than in a regular mesh network.
7.2.3 An Analysis of WNoC System
As mentioned above, I treated the flattened butterfly topology as a representative model for a wireless-like network. Therefore, although all simulations in this experiment use a wired environment, we can draw some reasonable inferences applicable to wireless NoC systems. Recall that in the flattened butterfly topology the maximum distance is 2, similar to a link with two directly-connected nodes in a large network. Further, in WNoC every wireless router is shared by several, often more than 10, nodes. That is just like a traditional NoC with concentration, though the router is different from the one used in a wired NoC system. Indeed, there are some differences; to name a few, communication protocols and bandwidths differ.
Figure 7.13 shows the simulation of a 64-node system with 4, 8 and 16 virtual channels, respectively. No matter how many virtual channels are used, the WNoC-like system performs much better than the traditional NoC system. We observe that at the same injection rate, the latency of a traditional NoC (blue curve, regular mesh) is much higher than that of a WNoC-like system (red curve, flattened butterfly). The maximum permissible injection rate for a traditional NoC is much smaller than that for a WNoC-like system. Evidently, a WNoC system can process much more data in the same time. Additionally, in a WNoC-like system, the behavior at the take-off point is sharper: until some maximum injection rate is reached, an increase in injection rate hardly affects the latency. On the other hand, a traditional NoC system displays a continuously rising latency with increasing injection rate. In a real operational environment where the activity may be a function of time, these characteristics will produce a higher and more stable throughput for the WNoC system.
[Figure 7.13 plots: latency (cycle) vs. injection rate (bit/cycle); panels (a), (b) and (c), each with one curve for regular mesh and one for flattened butterfly.]
Figure 7.13: Comparison between regular mesh and flattened butterfly with (a) 4 virtual channels, (b) 8 virtual channels and (c) 16 virtual channels.
Chapter 8
Conclusion and Future Work
As a promising and burgeoning technique, NoC systems have been a hot topic of study for many researchers and organizations in recent times. This research ranges from the system level down to the device level, and numerous achievements have been reported. To address the problems of multi-hop communication and improve the performance of a large NoC system, the idea of inserting wireless links offers a potential solution. Therefore, our study of wireless network-on-chip (WNoC) systems is timely.
8.1 Work Completed in This Report
The present report is a survey of the field. It provides a large list of references. In addition, simulation experiments and conclusions, some of which may be original, are reported. For lack of a WNoC simulator, we have simulated a wired NoC with a butterfly topology that has an upper bound on the communication distance between nodes. We believe our conclusions will prove to be valid for WNoC.
8.1.1 Device-Level
At the device level, research on the basic components for NoC or WNoC systems began years ago and progressed through immense advancements in CMOS technology. For example, the antenna, one of the most important components in a WNoC system, whether it is a mm-wave antenna, a CNT-based antenna or any other type of RF antenna, is feasible today. Besides those, many other kinds of antenna have also been developed [21], [30], [70], [102]. The main purpose and contribution of such papers is to improve the performance of the antenna. Another key component of NoC and WNoC that has been a focus of research is the router [10], [17], [54], [72], [73], [87], [88]. These studies consider router performance and other characteristics such as power consumption, throughput, latency and fault tolerance. They also study routers with various topologies, including 3-D. Nowadays, the wormhole router is popular in NoC or WNoC systems since its architecture can increase the throughput of the whole system. Last but not least, the use of virtual channels also makes a significant contribution to the development of NoC and WNoC systems. With virtual channels, throughput and latency can improve greatly. Other communication devices have also been developed, including the transceiver circuit [9], [83], [90], which contains a transmitter, receiver, modulator and demodulator. There are also detailed designs available for the switch [65].
8.1.2 System-Level
At the system level, types of topologies, communication protocols, quality of service
(QoS) of the communication, routing schemes and wireless insertion have been worked out
at various levels of detail.
Topology
For WNoC systems, most research has focused on commonly known topologies, such as 2-D mesh, 3-D mesh, SPIN, folded torus, fat tree, ring and star-ring. Reference [26] gives a brief review of some of those architectures, focusing mainly on scalability and structure definition. The most popular topology used in research is 2-D mesh for wired NoC systems. For WNoC systems, 2-D mesh is used for wired parts (commonly known as subnets) and ring/star-ring is used for wireless parts.
Of course, there are other topologies for an NoC system. References [6], [14] and [42] study the tree topology (fat tree and flattened butterfly tree), focusing on throughput, latency and energy efficiency. Deadlock recovery and low power consumption with the torus topology are proposed in [78]. The Clos interconnection network is studied in [94], which contributes to improvements in the performance of NoC systems.
Besides those common topologies, other topologies are proposed and studied for specific purposes [1], [36], [41], [79], [84], [93]. The purpose can be throughput, mapping, routing algorithm, power consumption or link optimization. Although these topologies are not as common as those that focus on the overall performance of the system, they significantly enhance their focus area. Among them, research on reconfiguration for network-on-chip can be considered significant [53], [59], [81]. A key observation made there is that an NoC system with a reconfigurable topology has greater potential for latency reduction, low power and area efficiency.
Communication Protocol and Routing Algorithm
Due to the limited area and device complexity in NoC systems, the communication protocol should be as simple and as efficient as possible. Most of the protocols are somewhat like the Open Systems Interconnection (OSI) layered communication protocol. In other words, they are designed in a layered fashion [19], [58]. Some typical layers are called application, network control, network and physical. They are designed to improve the NoC system's performance, especially throughput and latency. Reference [103] proposes an easy-to-implement and reusable communication protocol whose purpose is to reduce congestion. It also introduces an adaptive and deadlock-free routing algorithm. Moreover, some protocols are also quality of service (QoS) aware [13], [99]. In [99], the medium access control (MAC) protocol is also collision-free, and the protocol in [13] is implemented with time division multiplexing (TDM). Other protocols, such as the one in [2], are designed according to the task structure.
Like the communication protocol, the routing scheme in NoC systems should also be simple. Routing schemes can be divided into two types, static and dynamic. Routing algorithms are designed for specific conditions. The main purpose of a routing scheme is to find the shortest possible path between two nodes. Therefore, some papers [86] provide complex algorithms. In [92], a power-aware adaptive routing scheme is proposed. It is a dynamic XY routing algorithm, which can decrease the power consumption. Since load imbalance can also degrade the performance of an NoC system, an adaptive routing algorithm that works on load balancing is proposed in [95]. In [56], [66], [67], [68], [91], routing schemes that can increase the fault tolerance of the whole system are described. Reference [89] studies deadlock-free routing algorithms while also considering energy efficiency. For multicasting, there is a specific routing scheme, described in [51]. It uses a look-ahead router, and the NoC system can save over 50% in area. Last but not least, a region-based routing mechanism [57] has also been designed for NoC. Despite so many routing schemes, for NoC systems the wormhole router (always including virtual channels) is the most widely used router due to its high throughput. Therefore, in research that focuses on performance, the wormhole router is the default.
Wireless Insertion
In a WNoC system, the number of wireless links is much smaller than the number of nodes. As a result, if we want to use several wireless links simultaneously, how to allocate the wireless links is an important topic. In [9], [99], [62], simulated annealing (SA) is used to allocate wireless links. The solution sought in SA is the shortest distance in number of hops between all hub pairs. Another procedure, named evolutionary algorithms (EAs) [22], [62], inserts a limited number of wireless links among a large number of nodes. Another design [101] uses a combination of simulated annealing and an evolutionary algorithm for a larger-scale transit route network optimization. The main difference between SA and EAs is that EAs can find better solutions while SA can reach comparably good results faster [37].
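To make the idea of SA-based link allocation concrete, the following is a minimal sketch and not the method of any cited work. Several details are assumptions introduced for illustration: the hubs are connected by a wired ring, the cost is the mean shortest-path hop count over all hub pairs, each move replaces one wireless link by a random hub pair, and the temperature follows a geometric cooling schedule. All names and default values are hypothetical.

```python
import math
import random
from collections import deque
from itertools import combinations

def average_hops(num_hubs, wireless_links):
    """Mean shortest-path hop count over all ordered pairs of distinct hubs.

    Hubs form a wired ring (assumed base topology); each wireless link adds
    one bidirectional edge.
    """
    adj = {h: {(h - 1) % num_hubs, (h + 1) % num_hubs} for h in range(num_hubs)}
    for a, b in wireless_links:
        adj[a].add(b)
        adj[b].add(a)
    total = 0
    for start in range(num_hubs):
        dist = {start: 0}
        queue = deque([start])
        while queue:                      # breadth-first search from start
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    return total / (num_hubs * (num_hubs - 1))

def anneal_links(num_hubs, num_links, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Place num_links wireless links by simulated annealing; return (links, cost)."""
    rng = random.Random(seed)
    candidates = list(combinations(range(num_hubs), 2))
    current = rng.sample(candidates, num_links)
    cost = average_hops(num_hubs, current)
    best, best_cost = list(current), cost
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        replacement = rng.choice(candidates)
        if replacement in current:
            continue                       # skip moves that duplicate a link
        trial = list(current)
        trial[rng.randrange(num_links)] = replacement  # move one link
        trial_cost = average_hops(num_hubs, trial)
        delta = trial_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cost = trial, trial_cost
            if cost < best_cost:
                best, best_cost = list(current), cost
    return best, best_cost

if __name__ == "__main__":
    links, cost = anneal_links(num_hubs=16, num_links=4)
    print("ring only:", average_hops(16, []))
    print("with wireless links", links, "->", cost)
```

Because extra links can only shorten paths, the returned cost is never worse than that of the bare ring; how close it gets to the optimum depends on the number of steps and the cooling parameters.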
8.1.3 Performance
Many researchers report the performance of WNoC systems. However, for systems using differing topologies, communication protocols, numbers of functional nodes and other attributes, the performance can be quite different. Here, we point out the best performances that have been achieved, classified according to the communication method used.
For a WNoC system that has 512 nodes, when CNT-based antennas are used, i.e., there can be up to 24 simultaneous wireless links, the maximum throughput can be about 0.7 flits per core per cycle, where a flit is the transmission unit that contains 32 bits. For a 256-node WNoC system, the throughput can reach the same level when the load injection is larger than 0.7. For NoC systems of the same sizes, the throughputs are 0.1 flits (512 nodes) and 0.2 flits (256 nodes) per core per cycle. The best packet energy consumption in a 512-node WNoC system is less than 40 nJ when there are 24 wireless links. Under the same condition, the packet energy consumption in a 256-node WNoC system is about 25 nJ.
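The per-core throughput figures can be converted into aggregate bits per cycle to compare the systems directly. The sketch below only performs this arithmetic with the numbers quoted above (32-bit flits, 0.7 vs. 0.1 and 0.2 flits per core per cycle); the function name is hypothetical.

```python
FLIT_BITS = 32  # flit size stated in the text

def aggregate_bits_per_cycle(flits_per_core_per_cycle, cores, flit_bits=FLIT_BITS):
    """Total number of bits delivered per cycle, summed over all cores."""
    return flits_per_core_per_cycle * cores * flit_bits

if __name__ == "__main__":
    # (cores, WNoC throughput, NoC throughput) in flits per core per cycle
    for cores, wnoc, noc in ((512, 0.7, 0.1), (256, 0.7, 0.2)):
        print(f"{cores} cores: "
              f"WNoC {aggregate_bits_per_cycle(wnoc, cores):.1f} bit/cycle, "
              f"NoC {aggregate_bits_per_cycle(noc, cores):.1f} bit/cycle, "
              f"ratio {wnoc / noc:.1f}x")
```

With these figures the WNoC advantage in throughput is a factor of about 7 at 512 cores and about 3.5 at 256 cores.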
Performance with mm-wave antennas is somewhat different from that of the CNT-based antenna WNoC system. In a 512-core system, in the best condition, say, mesh topology for the subnets and star-ring topology for the top level, the least packet energy is in the range 100 nJ to 105 nJ, which is consumed when the number of wireless interfaces is in the range 2 to 10. Under the same (also the best) condition, in a 256-core WNoC system, the least packet energy is about 90 nJ when the number of interfaces is in the range 2 to 6.
Since the number of wireless links is limited, there must be an upper bound on the total number of nodes (referring to functional nodes) in a WNoC system. When the number of nodes goes beyond this bound, the performance will not improve, and may even decrease as latency rises. In [9] and [25], the authors indicate that the upper bound is around 512 for a WNoC system according to the techniques available thus far.
8.2 Future Work
To sum up, there are four aspects of WNoC systems that need further development. These are outlined in this section.
First of all, although CMOS technology and fabrication techniques are very advanced and many system-on-chip (SoC) devices already exist, the mm-scale RF devices for WNoC still have a large development gap. To be specific, we need to increase the number of wireless links that can work simultaneously. This may relate to the manufacturing methods. Other devices, especially the router, are still being studied by many organizations. Again, to be more specific, for a router, designing a superior arbiter, a better structure and channel allocation mechanism, and an improved switch function can all contribute to enhanced performance for the whole WNoC system.
Additionally, at the system level, feasible techniques will need to be developed for optimizing WNoC architectures. More protocols remain to be designed in the future. The protocols include the communication protocol and the topology protocol. At the topology level, there is no consensus about which is the best, although some popular kinds and their combinations have been used and studied. Maybe there is a best choice for a mesh network. However, for some specific tasks, we would need other special and unique topologies. For the communication protocol, though to some extent one can borrow concepts and techniques from computer networks, WNoC has its own needs: small and simple. Therefore, it requires the schemes to be simple. Most of the existing communication protocols are layered, so layered and packet-based communication protocols are a possible future direction. Besides, QoS is also required to guarantee transmission quality.
Finally, fault tolerance and reconfiguration are two other aspects to be developed. These may relate to the system architecture and protocols.
To be frank, although a WNoC system has much improved performance compared with an NoC system, the techniques for WNoC are not as mature as they are for NoC systems. NoC is still the dominant research area. Notably, however, there is an increasing number of studies on WNoC systems, and the relevant techniques for them are gradually becoming available.
Bibliography
[1] C. Ababei, "Efficient Congestion-Oriented Custom Network-on-Chip Topology Synthesis," in Proc. International Conf. Reconfigurable Computing and FPGAs (ReConFig), 2010, pp. 352–357.
[2] N. Bagherzadeh and M. Matsuura, "Performance Impact of Task-to-Task Communication Protocol in Network-on-Chip," in Proc. Fifth International Conf. Information Technology: New Generations, 2008, pp. 1101–1106.
[3] L. Benini and D. Bertozzi, "Network-on-Chip Architectures and Design Methods," in Proc. Computers and Digital Techniques, Mar. 2005, pp. 261–272.
[4] L. Benini and G. DeMicheli, "Networks on Chips: A New SoC Paradigm," Computer, vol. 35, no. 1, pp. 70–78, Jan. 2002.
[5] Booksim. https://nocs.stanford.edu/cgi-bin/trac.cgi/wiki/Resources/BookSim.
[6] A. Bouhraoua, O. Diraneyya, and M. E. Elrabaa, "A Simplified Router Architecture for the Modified Fat Tree Network-on-Chip Topology," in Proc. NORCHIP, 2009, pp. 1–4.
[7] M. Buchanan, Nexus: Small Worlds and the Groundbreaking Theory of Networks. W. W. Norton & Company, 2003.
[8] P. J. Burke et al., "Quantitative Theory of Nanowire and Nanotube Antenna Performance," IEEE Trans. Nanotechnology, vol. 5, no. 4, pp. 314–334, July 2006.
[9] K. Chang, S. Deb, A. Ganguly, X. Yu, S. P. Sah, P. P. Pande, B. Belzer, and D. Heo, "Performance Evaluation and Design Trade-Offs for Wireless Network-on-Chip Architectures," accepted for publication in ACM Journal on Emerging Technologies in Computing Systems, Sept. 2011.
[10] Y. Chang, C. Chiu, S. Lin, and C. Liu, "On the Design and Analysis of Fault Tolerant NoC Architecture Using Spare Routers," in Proc. 16th Asia and South Pacific Design Automation Conference (ASP-DAC), 2011, pp. 431–436.
[11] M. F. Chang et al., "CMP Network-on-Chip Overlaid with Multi-Band RF-Interconnect," in Proc. IEEE International Symp. High-Performance Computer Architecture (HPCA), Feb. 2008, pp. 191–202.
[12] W. Chifeng, H. Wen-Hsiang, and N. Bagherzadeh, "A Wireless Network-on-Chip Design for Multicore Platforms," in Proc. 19th Euromicro International Conf. Parallel, Distributed and Network-Based Processing (PDP), Feb. 2011, pp. 409–416.
[13] N. Concer, A. Vesco, R. Scopigno, and L. P. Carloni, "A Dynamic and Distributed TDM Slot-Scheduling Protocol for QoS-Oriented Networks-on-Chip," in Proc. IEEE 29th International Conf. Computer Design (ICCD), 2011, pp. 31–38.
[14] J. Cong, Y. Huang, and B. Yuan, "A Tree-Based Topology Synthesis for on-Chip Network," in Proc. IEEE/ACM International Conf. Computer-Aided Design (ICCAD), 2011, pp. 651–658.
[15] Cygwin. http://www.cygwin.com/.
[16] W. J. Dally, "Virtual-channel Flow Control," IEEE Trans. on Parallel and Distributed Systems, vol. 3, pp. 194–205, Mar. 1992.
[17] F. Darve, A. Sheibanyrad, P. Vivet, and F. Petrot, "Physical Implementation of an Asynchronous 3D-NoC Router Using Serial Vertical Links," in Proc. IEEE Computer Society Annual Symp. VLSI (ISVLSI), 2010, pp. 25–30.
[18] S. Deb, A. Ganguly, K. Chang, P. P. Pande, B. Belzer, and D. Heo, "Enhancing Performance of Network-on-Chip Architectures with Millimeter-Wave Wireless Interconnects," in Proc. IEEE International Conference on ASAP, 2010, pp. 73–80.
[19] M. Dehyadgari, M. Nickray, A. Afzali-Kusha, and Z. Navabi, "A New Protocol Stack Model for Network on Chip," in Proc. IEEE Computer Society Annual Symp. Emerging VLSI Technologies and Architectures, 2006.
[20] J. Draper and F. Petrini, "Routing in Bidirectional k-ary n-cube switch the Red Rover Algorithm," in Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1997, pp. 1184–1193.
[21] T. Ehsan, T. Mahmoud, and K. Sara, "An Optimized Phased-Array Antenna for Intra-Chip Communications," in Proc. Loughborough Antennas and Propagation Conference (LAPC), 2011, pp. 1–4.
[22] A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing. Springer-Verlag (Berlin, Heidelberg), 2003.
[23] F. Fu, S. Sun, J. Song, J. Wang, and M. Yu, "A NoC Performance Evaluation Platform Supporting Designs at Multiple Levels of Abstraction," in Proc. 4th IEEE Conference on Industrial Electronics and Applications, May 2009, pp. 425–429.
[24] M. Fukuda, P. K. Saha, N. Sasaki, and T. Kikkawa, "A 0.18 µm CMOS Impulse Radio Based UWB Transmitter for Global Wireless Interconnections of 3D Stacked-Chip System," in Proc. International Conf. Solid State Devices and Materials, Sept. 2006, pp. 72–73.
[25] A. Ganguly, K. Chang, S. Deb, P. P. Pande, B. Belzer, and C. Teuscher, "Scalable Hybrid Wireless Network-on-Chip Architectures for Multicore Systems," Computer, vol. 60, no. 10, pp. 1485–1502, Oct. 2011.
[26] C. Grecu, P. P. Pande, A. Ivanov, and R. Saleh, "Timing Analysis of Network on Chip Architectures for MP-SoC Platforms," Microelectronics Journal, vol. 36, no. 9, pp. 833–845, Sept. 2005.
[27] W. M. J. Green et al., Ultra-Compact, Low RF Power, 10Gb/s Silicon Mach-Zehnder Mod-
ulator, Optics Express, vol. 15, no. 25, pp. 1710617113, 2007.
[28] M. Horowitz and W. J. Dally, How Scaling Will Change Processor Architecture, in Proc.
International Solid State Circuits Conf. (ISSCC), Feb. 2004, pp. 132133.
[29] M. A. Horowitz et al., The Feature of Wires, Proc. IEEE, vol. 89, no. 4, pp. 490504, Apr.
2001.
48
[30] J. Huang, J. Wu, Y. Chiou, and J. C. F. Jou, A 24/60GHz Dual-Band Millimeter-Wave
on-Chip Monopole Antenna Fabricated with a 0.13 CMOS Technology, in Proc. IEEE
International Workshop Antenna Technology, 2009, pp. 14.
[31] Y. Huang, W. Y. Yin, and Q. H. Liu, Performance Prediction of Carbon Nanotube Bundle
Dipole Antennas, IEEE Trans. Nanotech., vol. 7, no. 3, pp. 331337, Sept. 2008.
[32] Y. Huang et al., Performance Prediction of Carbon Nanotube Bundle Dipole Antennas,
IEEE Trans. Nanotechnology, vol. 7, no. 3, pp. 331337, May 2008.
[33] D. Hung et al., Terahertz CMOS Frequency Generator Using Linear Superposition Tech-
nique, IEEE Journal of Solid State Circuits, Dec. 2008.
[34] ITRS. International Technology Roadmap for Semiconductors, 2005 edition.
[35] ITRS, 2007. http://www.itrs.net/Links/2007ITRS/Home2007.htm.
[36] M. Janidarmian, V. S. Bokharaie, A. Khademzadeh, and M. Tavanpour, Sorena: New
on Chip Network Topology Featuring Ecient Mapping and Simple Deadlock Free Rout-
ing Algorithm, in Proc. IEEE 10th International Conference Computer and Information
Technology (CIT), 2010, pp. 22902299.
[37] T. Jansen and I. Wegener, A Comparison of Simulated Annealing with a Wimple Evolution-
ary Algorithm on Pseudoboolean Functions of Unitation, in Theoretical Computer Science
386, 2007, pp. 7393.
[38] K. Jensen et al., Nanotube Radio, Nano Letter, vol. 7, pp. 35083511, Oct. 2007.
[39] K. Jensen et al., Nanomechanical Radio Transmitter, Physica Status Solidi B, vol. 245,
no. 10, pp. 23232325, Sept. 2008.
[40] K. Kempa et al., Carbon Nanotubes as Optical Antenna, in Advanced Materials, 2007, pp.
421426.
[41] G. N. Khan and V. Dumitriu, Throughput-Based Network-on-Chip Topology Generation
and Analysis, in Proc. Canadian Conference Electrical and Computer Engineering, CCECE,
2009, pp. 180184.
[42] J. Kim, J. Balfour, and W. J. Dally, Flattened Buttery Topology for On-Chip Networks,
in Proc. 40th Annual IEEE/ACM International Symp. Microarchitecture, MICRO, 2007, pp.
172182.
[43] J. Kim, W. J. Dally, and D. Abts, Flattened Buttery: A Cost-Ecient Topology for High-
Radix Networks, in Proc. International Symposium on Computer Architecture (ISCA), 2007,
pp. 126137.
[44] J. Kim, D. Park, C. Nicopoulos, N. Vijaykrishnan, and C. R. Das, Design and analysis of
an NoC architecture from performance, reliability and energy perspective, in Architecture
for networking and communications systems, 2005. ANCS 2005. Symposium on, Oct. 2005,
pp. 173182.
[45] S. Kirkpatrick et al., Optimization by Simulated Annealing, Scince, vol. 220, pp. 671680,
1983.
[46] T. Krishna et al., NoC with Near-Ideal Express Virtual Channels Using Global Line Com-
munication, in Proc. IEEE Symp. High Performance Interconnects (HOTI), Aug. 2008, pp.
1120.
49
[47] A. Kumar, L.-S. Peh, and N. K. Jha, Token Flow Control, in Proc. 41st IEEE/ACM
International Symposium on Micro Architecture, 2008, pp. 342353.
[48] A. Kumar et al., Toward Ideal On-Chip Communication Using Express Virtual Channels,
IEEE Micro., vol. 28, no. 1, pp. 8090, Feb. 2008.
[49] B. G. Lee et al., Ultrahigh-Bandwidth Silicon Photonic Nanowire Waveguides for On-Chip
Networks, IEEE Photonics Technology Letters, vol. 20, no. 6, pp. 398400, Mar. 2008.
[50] J. Lee et al., A low-power fully integrated 60GHz transceiver system with OOK modulation
and on-board Antenna assembly, in Proceedings of IEEE Solid-State Circuits Conference,
ISSCC 2009, 2009, pp. 316317,317a.
[51] W. Lei, P. Kumar, R. Boyapati, H. Y. Ki, and J. K. Eun, Ecient lookahead routing
and header compression for multicasting in networks-on-chip, in Proc. ACM/IEEE Symp.
Architectures for Networking and Communications Systems (ANCS), 2010, pp. 110.
[52] J. Lin et al., Communication Using Antennas Fabricated in Silicon Integrated Circuits,
IEEE Journal of solid-state circuits, vol. 42, no. 8, pp. 16781687, Aug. 2007.
[53] A. Logvinenko and D. Tutsch, A Reconguration Technique for Area-Ecient Network-
on-Chip Topologies, in Proc. International Symp. Performance Evaluation of Computer &
Telecommunication Systems (SPECTS), 2011, pp. 259264.
[54] Y. Lu, J. McCanny, and S. Sezer, Generic Low-Latency NoC Router Architecture for FPGA
Computing Systems, in Proc. International Conf. Field Programmable Logic and Applica-
tions (FPL), 2011, pp. 8289.
[55] P. Magarhack and P. G. Paulin, System-on-Chip beyond the Nanometer Wall, Proc. Design
Automation Conf. (DAC), pp. 419424, June 2003.
[56] A. Mehranzadeh, A. Khademzadeh, and A. Mehran, FADyAD - Fault and Congestion
Aware Routing Algorithm Based on DyAD Algorithm, in Proc. 5th International Symp.
Telecommunications (IST), 2010, pp. 274279.
[57] A. Mejia, M. Palesi, J. Flich, S. Kumar, P. Lopez, R. Holsmark, and J. Duato, Region-
Based Routing: A Mechanism to Support Ecient Routing Algorithms in NoCs, IEEE
Trans. Very Large Scale Integration (VLSI) Systems, vol. 17, pp. 356369, 2009.
[58] M. Millberg, E. Nilsson, R. Thid, and A. Jantsch, The Nostrum Backbone-A Communica-
tion Protocol Stack for Networks on Chip, in Proc. 17th International Conf. VLSI Design,
2004, pp. 693696.
[59] M. Modarressi, A. Tavakkol, and H. Sarbazi-Azad, Application-Aware Topology Recong-
uration for On-Chip Networks, IEEE Trans. Very Large Scale Integration (VLSI) Systems,
vol. 19, pp. 20102022, 2011.
[60] Noxim. http://noxim.sourceforge.net/.
[61] U. Y. Orgras and R. Marculescu, Its a Small World After All: NoC Performance Opti-
mization via Long-Range Link Insertion, IEEE Trans. Very Large Scale Integration (VLSI)
Systems, vol. 14, no. 7, pp. 693706, July 2006.
[62] P. P. Pande, A. Ganguly, K. Chang, and C. Teuscher, Hybrid Wireless Network on Chip:
A New Paradigm in Multi-Core Design, in Proc. 2nd International Workshop Network on
Chip Architectures, NoCArc, Dec. 2009, pp. 7176.
50
[63] P. P. Pande, C. Grecu, A. Ivanov, and R. Saleh, Design of a Switch for Network on Chip
Applications, in Proc. International Symp. Circuits and Systems (ISCAS), volume 5, May
2003, pp. 217220.
[64] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, Eect of Trac Localization on
Energy Dissipation in NoC-based Interconnect, in Proc. IEEE International Symposium on
Circuits and Systems, May 2005, pp. 17741777.
[65] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, Performance Evaluation and
Design Trade-Os for Network-on-Chip Interconnect Architectures, Computer, vol. 54, no. 8,
pp. 10251040, Aug. 2005.
[66] S. Pasricha and Z. Yong, A Low Overhead Fault Tolerant Routing Scheme for 3D Networks-
on-Chip, in Proc. 12th International Symp. Quality Electronic Design (ISQED), 2011, pp.
18.
[67] S. Pasricha and Z. Yong, NS-FTR: A Fault Tolerant Routing Scheme for Networks on Chip
With Permanent and Runtime Intermittent Faults, in Proc. 16th Asia and South Pacic
Design Automation Conference (ASP-DAC), 2011, pp. 443448.
[68] S. Pasricha, Z. Yong, D. Connors, and H. J. Siegel, OE+IOE: A Novel Turn Model Based
Fault Tolerant Routing Scheme for Networks-on-Chip, in Proc. IEEE/ACM/IFIP Interna-
tional Conf. Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2010, pp.
8593.
[69] V. F. Pavlidis and E. G. Friedman, 3D Topologies for Networks-on-Chip, IEEE Trans.
Very Large Scale Integration (VLSI) Systems, vol. 15, no. 10, pp. 10811090, Oct. 2007.
[70] K. Payandehjoo and R. Abhari, Characterization of on-Chip Antennas for Millimeter-Wave
Applications, in Proc. IEEE International Symp. Antennas and Propagation, 2009, pp. 14.
[71] R. Qiu, H. Liu, and X. Shen, Ultra-Wideband for Multiple Acess Communication, IEEE
Comm. Magazine, vol. 44, no. 2, pp. 8087, Feb. 2005.
[72] A.-M. Rahmani, M. Daneshtalab, P. Liljeberg, and H. Tenhunen, Power-Aware NoC Router
Using Central Forecasting-Based Dynamic Virtual Channel Allocation, in Proc. IEEE In-
ternational Symposium Circuits and Systems (ISCAS), 2010, pp. 32243227.
[73] R. S. Ramanujam, V. Soteriou, B. Lin, and L.-S. Peh, Design of a High-Throughput
Distributed Shared-Buer NoC Router, in Proc. Fourth ACM/IEEE International Symp.
Networks-on-Chip (NOCS), 2010, pp. 6978.
[74] N. Sasaki, M. Fukuda, M. Nitta, K. Kimoto, and T. Kikkawa, A Single-Chip Ultra-
Wideband Receiver Using Silicon Integrated Antennas for inter-Chip Wireless Interconnec-
tion, in Pro. International Conf. Solid State Devices and Materials, Sept. 2006, pp. 7071.
[75] E. Seok et al., A 410GHz CMOS Push-Push Oscillator With an on-Chip Patch Antenna,
in Proc. International Solid State Circuits Conf. (ISSCC), 2008.
[76] J. Sepulveda, M. Strum, W. Chau, and G. Gogniat, A Multi-Objective Approach for Multi-
Application NoC Mapping, in 2011 IEEE Second Latin American Symposium on, Circuits
and Systems (LASCAS), Feb. 2011, pp. 14.
[77] A. Shacham et al., Photonic Network-on-Chip for Future Generations of Chip Multi-
Processors, IEEE Trans. Computers, vol. 57, no. 9, pp. 12461260, 2008.
51
[78] M. Shin and J. Kim, Leveraging Torus Topology with Deadlock Recovery for Cost-Ecient
on-Chip Network, in Proc. IEEE 29th International Conf. Computer Design (ICCD), 2011,
pp. 2530.
[79] D. A. Siguenza-Tortosa and J. Nurmi, Topology Design for Global Link Optimization in
Application Specic Network-on-Chips, in Proc. International Symp. System-on-Chip, 2004,
pp. 135138.
[80] E. Sousa and J. Silvester, Spreading Code Protocols for Distributed Spread-Spectrum Packet
Radio Networks, IEEE Trans. Communications, vol. 36, no. 3, pp. 272281, Mar. 1988.
[81] M. B. Stensgaard and J. Sparso, ReNoC: A Network-on-Chip Architecture with Recon-
gurable Topology, in Proc. Second ACM/IEEE International Symp. Networks-on-Chip,
NoCS, 2008, pp. 5564.
[82] C. Teuscher, Nature-Inspired Interconnects for Self-Assembled Large-Scale Network-on-
Chip Designs, Chaos, vol. 17, no. 2, 2007.
[83] V. M. Vidya, R. Thilagavathy, and M. Bhaskar, Low Power, High Performance Current
Mode Transceiver for Network-on-Chip Communication, in Proc. Internattional Conf. Sig-
nal Processing, Communication, Computing and Networking Technologies (ICSCCN), 2011,
pp. 223227.
[84] H. Wang, L.-S. Peh, and S. Malik, A Technology-Aware and Energy-Oriented Topology Ex-
ploration for on-Chip Networks, in Proc. Design, Automation and Test in Europe, volume 2,
2005, pp. 12381243.
[85] Y. Wang and D. Zhao, The Design and Synthesis of a Synchronous and Distributed MAC
Protocol for Wireless Network-on-Chip, in Proc. IEEE/ACM International Conf. Computer-
Aided Design, 2007, pp. 612617.
[86] Y. Wang and D. Zhao, Design and Implementation of Routing Scheme for Wireless Network-
on-Chip, in Proc. IEEE International Symp. Circuits and Systems, ISCAS, 2007, pp. 1357
1360.
[87] C. Wu, S. Chai, Y. Li, and Z. Yang, Design of a Dual-Switching Mode NOC Router Mi-
croarchitecture, in Proc. International Conf. Electrical and Control Engineering (ICECE),
2010, pp. 27332736.
[88] C. Wu, Y. Li, S. Chai, and Z. Yang, Lottery Router: A Customized Arbitral Priority NOC
Router, Proc. International Conf. Computer Science and Software Engineering, vol. 3, pp.
411418, 2008.
[89] R. Wu, Y. Wang, and D. Zhao, A Low-Cost Deadlock-Free Design of Minimal-Table
Rerouted XY-Routing for Irregular Wireless NoCs, in Proc. Fourth ACM/IEEE Interna-
tional Symp, Networks-on-Chip (NOCS), 2010, pp. 199206.
[90] Y. Xinmin, S. P. Sah, S. Deb, P. P. Pande, B. Belzer, and H. Deukhyoun, A Wideband
Body-Enabled Millimeter-Wave Transceiver for Wireless Network-on-Chip, in Proc. IEEE
54th International Midwest Symp. Circuits and Systems (MWSCAS), 2011, pp. 14.
[91] M. Yang, T. Li, Y. Jiang, and Y. Yang, Fault-Tolerant Routing Schemes in RDT(2,2,1)/-
Based Interconnection Network for Networks-on-Chip Design, in Proc. 8th International
Symp. Parallel Architectures,Algorithms and Networks, ISPAN, 2005.
52
[92] S.-G. Yang, L. Li, Y.-A. Zhang, B. Zhang, and Y. Xu, A Power-Aware Adaptive Routing
Scheme for Network on a Chip, in Proc. 7th International Conf. ASIC, ASICON, 2007, pp.
13011304.
[93] B. Zafar, J. Draper, and T. M. Pinkston, Cubic Ring Networks: A Polymorphic Topology
for Network-on-Chip, in Proc. 39th International Conf. Parallel Processing (ICPP), 2010,
pp. 443452.
[94] J. Zhang, H. Gu, and Y. Yang, A High Performance Optical Network on Chip Based on Clos
Topology, Proc. 2nd International Conf. Future Computer and Communication (ICFCC),
vol. 2, pp. 6368, 2010.
[95] Y. Zhang and Q. Cao, RIPNoC: A Distributed Routing Scheme for Balancing on-Chip
Network Load, in Proc. Asia Pacic Conf. Postgraduate Research in Microelectronics and
Electronics (PrimeAsia), 2010, pp. 351355.
[96] Y. Zhang and D. Zhao, Design and Implementation of Routing Scheme for Wireless Network-
on-Chip, in Proc. IEEE International Symp. Circuits and Systems, May 2007, pp. 1357
1360.
[97] D. Zhao and Y. Wang, MTNET: Design and Optimization of a Wireless SoC Test Frame-
work, in Proc. IEEE international SoC Conf., Sept. 2006, pp. 239242.
[98] D. Zhao and Y. Wang, Feasibility Investigation of RF/Wireless Technology for Intra-/Inter-
Chip Communication, in Technical Report TR-007-8-002, Center for Advanced Computer
Studies, Univ. of Louisiana at Lafayette, 2007.
[99] D. Zhao and Y. Wang, SD-MAC: Design and Synthesis of a Hardware-Ecient Collision-
Free QoS-Aware Protocol for Wireless Network-on-Chip, 2008.
[100] D. Zhao, Y. Wang, J. Li, and T. Kikkawa, Design of Multi-Channel Wireless NoC to Improve
On-Chip Communication Capacity, in Proc. Fifth IEEE/ACM International Symposium on
Networks on Chip (NoCS), May 2011, pp. 177184.
[101] F. Zhao and J. X. Zeng, Simulated Annealing-Genetic Algorithm for Transit Network Opti-
mization, in Compo. in Civ. Engrg. 20, 57 (2006), DOI:10.1061/(ASCE)08873801(2006)20:
1(57).
[102] H. Zhou, X. Chen, D. S. Espinoza, A. Mickelson, and D. S. Filipovic, Nanoscale Optical
Dielectric Rod Antenna for On-Chip Interconnecting Networks, IEEE Trans. Microwave
Theory and Techniques, vol. 59, pp. 26242632, 2011.
[103] L. Zi-di and J. Lin, A Congestion Avoidance Communication Protocol for Network on
Chip, in Proc. IEEE International Conf. Intelligent Computing and Intelligent Systems
(ICIS), volume 2, 2010, pp. 8892.
53