
CAP656 - MULTICORE ARCHITECTURE

TERM PAPER ON
Term Paper Topic:

New Challenges: Power and Reliability

Submitted By:
Vinay Awasthi
Reg. No. 11409182
MCA (Hons)
D1E06A10

Submitted To:
Mr. Raj sir

Introduction
A multi-core processor is a single computing component with two or more independent actual
processing units (called "cores"), which are the units that read and execute program instructions.
The instructions are ordinary CPU instructions such as add, move data, and branch, but the
multiple cores can run multiple instructions at the same time, increasing overall speed for
programs amenable to parallel computing. Manufacturers typically integrate the cores onto a
single integrated circuit die (known as a chip multiprocessor or CMP), or onto multiple dies in a
single chip package.
Processors were originally developed with only one core. In the mid-1980s Rockwell
International manufactured versions of the 6502 with two 6502 cores on one chip as the
R65C00, R65C21, and R65C29, sharing the chip's pins on alternate clock phases. Other
multi-core processors were developed in the early 2000s by Intel, AMD and others. Multi-core
processors may have two cores (dual-core CPUs, for example AMD Phenom II X2 and Intel
Core Duo), four cores (quad-core CPUs, for example AMD Phenom II X4, Intel's i5 and i7
processors), six cores (hexa-core CPUs, for example AMD Phenom II X6 and Intel Core i7
Extreme Edition 980X), eight cores (octa-core CPUs, for example Intel Xeon E7-2820 and
AMD FX-8350), ten cores (for example, Intel Xeon E7-2850), or more.
A multi-core processor implements multiprocessing in a single physical package. Designers
may couple cores in a multi-core device tightly or loosely. For example, cores may or may not
share caches, and they may implement message passing or shared memory inter-core
communication methods. Common network topologies to interconnect cores include bus, ring,
two-dimensional mesh, and crossbar. Homogeneous multi-core systems include only identical
cores, whereas heterogeneous multi-core systems have cores that are not identical. Just as with
single-processor systems, cores in multi-core systems may implement architectures such as
superscalar, VLIW, vector processing, SIMD, or multithreading. Multi-core processors are
widely used across many application domains including general-purpose, embedded, network,
digital signal processing (DSP), and graphics.
The improvement in performance gained by the use of a multi-core processor depends very
much on the software algorithms used and their implementation. In particular, possible gains are
limited by the fraction of the software that can be run in parallel simultaneously on multiple
cores; this effect is described by Amdahl's law. In the best case, so-called embarrassingly
parallel problems may realize speedup factors near the number of cores, or even more if the
problem is split up enough to fit within each core's cache(s), avoiding use of much slower main
system memory. Most applications, however, are not accelerated so much unless programmers
invest a prohibitive amount of effort in re-factoring the whole problem. The parallelization of
software is a significant ongoing topic of research.
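
The limit described by Amdahl's law can be made concrete with a short calculation. The sketch below is only an illustration: the function name amdahl_speedup and the sample parallel fractions are assumptions chosen for the example, not measurements of any real program.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the work scales across `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time on any number of cores;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for p in (0.50, 0.90, 0.99):
        row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (2, 4, 8))
        print(f"parallel fraction {p:.2f} -> {row}")
```

Even with 90% of the work parallelized, eight cores give a speedup of only about 4.7, and no number of cores can exceed a factor of 10. The formula does not model the cache effect mentioned above, which is why some embarrassingly parallel problems can exceed this bound in practice.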

Introduction to Multi-Core systems challenges


Multi-core design is one of the most important trends in raising processor performance, and
it is the direction on which manufacturers are now focusing. A multi-core processor has many
advantages, especially for users who want to boost the multitasking power of their systems.
Such a processor provides several complete execution cores instead of one, each with an
independent interface to the front side bus. Because each core has its own cache, a single
operating system with sufficient resources can handle intensive tasks in parallel and provide a
noticeable improvement in multitasking. Adding cores, however, brings drawbacks as well as
benefits. The main goal of this term paper is to describe some of the important challenges of
multi-core systems.
Power challenges
With the development of semiconductor technology, it has become possible to integrate
billions of transistors on a single die, and the complexity of the system-on-chip (SoC) is
increasing. More and more processors and IP cores can be implemented on a single die to build
a multiprocessor SoC. Along with this trend, electronic devices are becoming more sensitive to
external disturbances, which cause soft errors, and fault-tolerant architectures are often used to
obtain the required reliability. Since the very beginning of integrated circuit development,
processors have been built with ever-increasing clock frequencies and sophisticated built-in
optimization strategies. Due to physical limitations, this 'free lunch' of speedup has come to an
end. Physical constraints put up multiple barriers to achieving high-performance computing
within power budgets.
Having multiple cores on a single chip gives rise to several problems and challenges. Power
and temperature management are two concerns that can grow sharply with the addition of
multiple cores. Memory/cache coherence is another challenge, since multi-core designs have
distributed L1 and in some cases L2 caches which must be coordinated. Finally, using a
multi-core processor to its full potential is another issue. If programmers do not write
applications that take advantage of multiple cores there is no gain, and in some cases there is a
loss of performance. Applications need to be written so that different parts can run
concurrently (without any ties to another part of the application that is being run
simultaneously).

Power and temperature

If two cores were placed on a single chip without any modification, the chip would, in theory,
consume twice as much power and generate a correspondingly large amount of heat. In the
extreme case, a processor that overheats may even cause the computer to combust. To account
for this, multi-core designs typically run their cores at a lower frequency to reduce power
consumption. To combat unnecessary power consumption, many designs also incorporate a
power control unit that has the authority to shut down unused cores or limit the amount of
power they draw. Powering off unused cores reduces leakage in the chip, and clock gating
reduces the dynamic power of idle logic. To lessen the heat generated by multiple cores on a
single chip, the chip is architected so that the number of hot spots does not grow too large and
the heat is spread out across the chip. As seen in Figure 1, the majority of the heat in the CELL
processor is dissipated in the Power Processing Element and the rest is spread across the
Synergistic Processing Elements. The CELL processor follows a common trend to build
temperature monitoring into the system, with its one linear sensor and ten internal digital
sensors.

(Figure 1.)
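
The effect of running several cores at a lower frequency can be estimated with the standard first-order model of dynamic power, in which each core dissipates power proportional to C * V^2 * f. The sketch below is an illustration under stated assumptions, not data from any processor discussed here: it ignores leakage and shared logic, it assumes the supply voltage can be lowered together with the clock, and the 0.8 scaling factors and function names are hypothetical.

```python
def relative_dynamic_power(cores: int, voltage_scale: float, frequency_scale: float) -> float:
    """Dynamic power relative to one core at nominal voltage and frequency.

    Uses P_dyn ~ C * V^2 * f per core; leakage and shared logic are ignored.
    """
    if cores < 1 or voltage_scale <= 0 or frequency_scale <= 0:
        raise ValueError("cores must be >= 1 and scales must be positive")
    return cores * voltage_scale ** 2 * frequency_scale


def relative_peak_throughput(cores: int, frequency_scale: float) -> float:
    """Ideal throughput relative to one nominal core, assuming perfectly parallel work."""
    if cores < 1 or frequency_scale <= 0:
        raise ValueError("cores must be >= 1 and frequency_scale must be positive")
    return cores * frequency_scale


if __name__ == "__main__":
    # (cores, voltage scale, frequency scale); the scaled values are hypothetical.
    for cores, v, f in ((1, 1.0, 1.0), (2, 1.0, 1.0), (2, 0.8, 0.8)):
        power = relative_dynamic_power(cores, v, f)
        throughput = relative_peak_throughput(cores, f)
        print(f"{cores} core(s), V x{v}, f x{f}: power x{power:.3f}, throughput x{throughput:.2f}")
```

Under these assumptions, two unmodified cores double the dynamic power, while two cores at 80% voltage and frequency draw roughly the same dynamic power as one nominal core and still offer up to 1.6 times the throughput on perfectly parallel work.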

Reliability challenges
An emerging problem facing future high-performance multi-core processors is transient faults
caused by radiation, noise and other factors. These faults will likely make future multi-core
processors less reliable as chip features shrink and the number of cores increases, because
power and performance demands push semiconductor technologies to their limits. As transistors
shrink with each new generation of many-core processors, allowing for more cores on each
chip, soft errors will increase considerably. Soft errors and noise are expected to double with
each new processor generation. Projected increases in soft error rates over the next few years
range from several times to several orders of magnitude, so core reliability may eventually
become a critical issue. In addition, as high-performance cores run at higher temperatures, they
gradually suffer from wear-out, which also contributes to increased failure rates.
Reliability
To address the transient faults described above, we propose a new and practical systems
approach of managing and allocating reliability according to software process requirements.
The asymmetric multi-core architecture is based on cores with differing reliabilities. Critical
and non-critical software components are identified, and the critical ones are matched with the
higher-reliability cores. We show that by using asymmetrically reliable cores the overall system
failure rate can be reduced by several times when critical processes can be isolated and
executed by higher-reliability cores, while offering the same or better overall performance,
power utilization and chip area as symmetric cores.
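
The matching of critical processes to higher-reliability cores can be sketched as a simple assignment. This is a minimal illustration and not the allocation policy of any particular system: the Core and Process types, the function assign_processes, the one-process-per-core restriction, and the failure rates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    failure_rate: float  # hypothetical failures per hour; lower means more reliable


@dataclass(frozen=True)
class Process:
    name: str
    critical: bool


def assign_processes(processes: list, cores: list) -> dict:
    """Give each process its own core, placing critical processes on the most reliable cores."""
    if len(processes) > len(cores):
        raise ValueError("this sketch needs at least one core per process")
    by_reliability = sorted(cores, key=lambda core: core.failure_rate)
    # False sorts before True, so `not p.critical` puts critical processes first.
    by_priority = sorted(processes, key=lambda p: not p.critical)
    return {p.name: c.name for p, c in zip(by_priority, by_reliability)}


if __name__ == "__main__":
    cores = [Core("standard-0", 1e-5), Core("hardened-0", 1e-7), Core("standard-1", 1e-5)]
    processes = [Process("logger", False), Process("flight-control", True), Process("ui", False)]
    for process_name, core_name in assign_processes(processes, cores).items():
        print(f"{process_name} -> {core_name}")
```

In this example the single critical process lands on the one hardened core, and the non-critical processes share the remaining standard cores.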
Cache Coherence
Cache coherence is a concern in a multi-core environment because of distributed L1 and L2
caches. Since each core has its own cache, the copy of the data in that cache may not always be
the most up-to-date version. For example, imagine a dual-core processor where each core has
brought the same block of memory into its private cache. One core writes a value to a specific
location; when the second core attempts to read that value from its cache, it will not have the
updated copy unless its cache entry is invalidated and a cache miss occurs. This cache miss
forces the second core's cache entry to be updated. If this coherence policy were not in place,
stale data would be read and invalid results would be produced, possibly crashing the program
or the entire computer.
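
The dual-core example above can be reproduced with a toy model. The sketch below is a deliberate simplification and not a real coherence protocol: it assumes write-through private caches and a write-invalidate policy, and the class ToyCoherentSystem and its coherent switch are hypothetical names introduced for illustration.

```python
class ToyCoherentSystem:
    """Toy model of private write-through caches kept coherent by write-invalidate."""

    def __init__(self, num_cores: int, coherent: bool = True):
        self.memory = {}                              # address -> value
        self.caches = [{} for _ in range(num_cores)]  # one private cache per core
        self.coherent = coherent

    def read(self, core: int, address: int) -> int:
        cache = self.caches[core]
        if address not in cache:
            # Cache miss: fetch the current value from main memory.
            cache[address] = self.memory.get(address, 0)
        return cache[address]

    def write(self, core: int, address: int, value: int) -> None:
        self.caches[core][address] = value
        self.memory[address] = value  # write-through keeps main memory current
        if self.coherent:
            # Invalidate every other core's copy so its next read misses.
            for other, cache in enumerate(self.caches):
                if other != core:
                    cache.pop(address, None)


if __name__ == "__main__":
    for coherent in (True, False):
        system = ToyCoherentSystem(num_cores=2, coherent=coherent)
        system.write(0, 0x10, 1)
        system.read(1, 0x10)      # core 1 caches the block
        system.write(0, 0x10, 2)  # core 0 updates the same location
        print(f"coherent={coherent}: core 1 reads {system.read(1, 0x10)}")
```

With invalidation enabled the second core misses and fetches the new value; with it disabled the second core keeps returning its stale copy.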

Transient faults
As described above, shrinking transistors and higher operating temperatures are expected to
raise both soft error rates and wear-out failures in many-core processors. Furthermore,
embedded multi-core processors in high-noise and high-radiation environments, such as
aircraft and machinery, require high reliability. For these reasons, future multi-core systems
must be designed to mitigate inevitable transient faults. As software becomes more complex
with many dependencies, a hardware fault on a single core can affect an entire system.
However, many-core designs offer the opportunity to isolate failures if risk is managed
effectively. We introduce an asymmetric multiprocessor and parallel software architecture
designed to manage transient fault risks and mitigate the effects efficiently.
Core failures can be divided into two types, permanent and transient. Permanent failures are
often caused by manufacturing defects, which are found and mitigated during production
testing. Transient faults are caused by noise and radiation and induce soft errors that are much
harder to manage and mitigate. There are hardware and software approaches to mitigating
these faults. Faults can be minimized in hardware by adding fault-tolerant logic and hardware
redundancy. However, since chip area is limited on single-chip multiprocessors, only limited
area is available for redundant circuits. A software approach is for programmers to write
fault-tolerant software that does not require highly reliable hardware. We use both hardware
and software techniques to improve system dependability.
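
One common way to make software tolerate transient faults is to run a computation redundantly and take a majority vote. This technique is not named in the text above; it is shown only as an illustration of the software approach. The helper names vote and make_flaky_adder, and the simulated bit flip, are assumptions made for the example.

```python
from collections import Counter
from typing import Callable, TypeVar

T = TypeVar("T")


def vote(run: Callable[[], T], copies: int = 3) -> T:
    """Run `run` several times and return the result that wins a strict majority.

    Results must be hashable. Raises RuntimeError when no result has a strict majority.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    results = [run() for _ in range(copies)]
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= copies:
        raise RuntimeError(f"no majority among results: {results!r}")
    return value


def make_flaky_adder(a: int, b: int, faulty_call: int) -> Callable[[], int]:
    """Return an adder whose `faulty_call`-th invocation suffers a simulated bit flip."""
    calls = {"n": 0}

    def run() -> int:
        calls["n"] += 1
        result = a + b
        if calls["n"] == faulty_call:
            result ^= 1 << 3  # simulate a soft error by flipping one bit
        return result

    return run


if __name__ == "__main__":
    print(vote(make_flaky_adder(2, 3, faulty_call=2)))
```

With three copies, a single corrupted run is outvoted by the two correct ones. The cost is that every protected computation runs three times, which is the software counterpart of the area cost of redundant circuits.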

Advantages of multi-Core Systems

The proximity of multiple CPU cores on the same die allows the cache coherency circuitry to
operate at a much higher clock-rate than is possible if the signals have to travel off-chip.
Combining equivalent CPUs on a single die significantly improves the performance of cache
snoop (bus snooping) operations. Put simply, this means that signals between different CPUs
travel shorter distances, and therefore those signals degrade less. These higher-quality signals
allow more data to be sent in a given time period, since individual signals can be
shorter and do not need to be repeated as often.
Assuming that the die can physically fit into the package, multi-core CPU designs require much
less printed circuit board (PCB) space than do multi-chip SMP designs. Also, a dual-core
processor uses slightly less power than two coupled single-core processors, principally because
of the decreased power required to drive signals external to the chip. Furthermore, the cores
share some circuitry, like the L2 cache and the interface to the front side bus (FSB). In terms of
competing technologies for the available silicon die area, multi-core design can make use of
proven CPU core library designs and produce a product with lower risk of design error than
devising a new wider core-design. Also, adding more cache suffers from diminishing returns.
Multi-core chips also allow higher performance at lower energy. This can be a big factor in
mobile devices that operate on batteries. Since each core in a multi-core chip is generally
more energy-efficient, the chip as a whole is more efficient than one built around a single large
monolithic core. The challenge of writing parallel code clearly offsets this benefit.

Disadvantages of multi-Core Systems

Maximizing the usage of the computing resources provided by multi-core processors requires
adjustments both to the operating system (OS) support and to existing application software.
Also, the ability of multi-core processors to increase application performance depends on the
use of multiple threads within applications. Integrating multiple cores on one chip drives chip
production yields down, and multi-core chips are more difficult to manage thermally than
lower-density single-core designs. Intel has partially countered the yield problem by building
its quad-core designs from two dual-core dies, each with a unified cache, in a single package;
hence any two working dual-core dies can be used, as opposed to producing four cores on a
single die and requiring all four to work to produce a quad-core. From an architectural point of view,
ultimately, single CPU designs may make better use of the silicon surface area than
multiprocessing cores, so a development commitment to this architecture may carry the risk of
obsolescence. Finally, raw processing power is not the only constraint on system performance.
Two processing cores sharing the same system bus and memory bandwidth limit the real-world
performance advantage. It has been claimed that if a single core is close to being
memory-bandwidth limited, then going to dual-core might give 30% to 70% improvement; if
memory bandwidth is not a problem, then a 90% improvement can be expected; however, Amdahl's law
makes this claim dubious. It would be possible for an application that used two CPUs to end up
running faster on one dual-core if communication between the CPUs was the limiting factor,
which would count as more than 100% improvement.
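
The reason Amdahl's law makes the 90% figure doubtful can be checked by inverting the law, speedup = 1 / ((1 - p) + p / N), to find the parallel fraction p that a given improvement would require. The sketch below is only an illustration; the function name required_parallel_fraction is an assumption, and a 90% improvement is read as a speedup of 1.9.

```python
def required_parallel_fraction(speedup: float, cores: int) -> float:
    """Parallel fraction p that Amdahl's law needs to reach `speedup` on `cores` cores.

    Solves speedup = 1 / ((1 - p) + p / cores) for p.
    """
    if cores < 2:
        raise ValueError("cores must be at least 2")
    if not 1.0 <= speedup <= cores:
        raise ValueError("speedup must lie between 1 and the number of cores")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    for improvement in (0.30, 0.70, 0.90):
        p = required_parallel_fraction(1.0 + improvement, cores=2)
        print(f"{improvement:.0%} improvement on 2 cores needs p = {p:.3f}")
```

On two cores, a 90% improvement requires roughly 95% of the execution time to be perfectly parallel, which few existing applications achieve. The model assumes a fixed workload and ignores communication effects, so it does not cover the case described above in which moving two CPUs onto one die removes a communication bottleneck.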

Conclusion
To address the challenges in performance, power and future technologies, innovations in
computer architecture and design are needed, and multi-core systems are among the most
promising technologies. As was also pointed out by a computer architecture group at the EECS
department of UC Berkeley, the shift toward increasing parallelism is not a triumphant stride
forward based on breakthroughs in novel software and architectures for parallelism; instead,
this plunge into parallelism is actually a retreat from even greater challenges that thwart
efficient silicon implementation of traditional uniprocessor architectures. Deep submicron
fabrication technologies enable very high levels of integration such as a dual-core chip with 1.7
billion transistors, thus reaching a key milestone in the level of circuit complexity possible on a
single chip. A highly promising approach to efficiently use these circuit resources is the
integration of multiple processors onto a single chip to achieve higher performance through
parallel processing, which is called a multi-core system or a chip multiprocessor. Multi-core
systems can provide high energy efficiency since they can allow the clock frequency and supply
voltage to be reduced together to dramatically reduce power dissipation during periods when
full rate computation is not needed.