
A Quick Look into Parallel Computer Architectures: Multicore, Manycore and GPGPU

By:
Dr. Uma B.
Malnad College of Engineering
Hassan, Karnataka

Take Home at the End of this Session:

What is a Parallel Computer?
Why and What is a Multi-core Processor?
Many-core Architecture
Why GPU?
Applications that benefit from these Processors
A Few Thoughts about Effective Teaching of a Technical Course

How to Run Applications Faster?


There are 3 ways to improve performance:
Work Harder
Work Smarter
Get Help

Computer Analogy
Using faster hardware
Optimized algorithms and techniques used to solve computational tasks
Multiple computers to solve a particular task

A Parallel Computer

Super Computer

Titan Supercomputer - Oak Ridge National Laboratory

18,688 nodes
Each node contains a 16-core AMD Opteron CPU and an
Nvidia Tesla K20X GPU with 6 GB GDDR5 ECC memory
A total of 299,008 processor cores
A total of 693.6 TiB of CPU and GPU RAM
Cray Linux operating system

Why Parallel Processing?

Computation requirements are ever increasing: visualization, distributed databases, simulations, scientific prediction (earthquakes), etc.

Sequential architectures are reaching physical limitations and hitting a wall.

Need for Multi-Core Processors


Power Wall + Memory Wall + ILP Wall = Brick Wall for Serial Computing!

Multi-Core Processors

Multi-core Processor: a processing system composed of two or more independent cores (or CPUs) on a single chip.

Two-core Processor

Quad-core Processor

Intel Core i7 Block Diagram
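
To make the definition concrete, here is a minimal host-side sketch of how software uses two independent cores (plain C++ threads, which nvcc also accepts as CUDA host code; the array size and the partial_sum helper are made up for illustration):

    #include <cstddef>
    #include <cstdio>
    #include <thread>
    #include <vector>

    // Hypothetical worker: each thread sums its own half of the array,
    // so the two cores of a two-core processor can work independently.
    void partial_sum(const std::vector<int>& data, std::size_t begin,
                     std::size_t end, long long* out) {
        long long s = 0;
        for (std::size_t i = begin; i < end; ++i) s += data[i];
        *out = s;
    }

    int main() {
        std::vector<int> data(1000000, 1);
        long long s0 = 0, s1 = 0;

        // One OS thread per half of the work.
        std::thread t0(partial_sum, std::cref(data), 0, data.size() / 2, &s0);
        std::thread t1(partial_sum, std::cref(data), data.size() / 2, data.size(), &s1);
        t0.join();
        t1.join();

        std::printf("sum = %lld\n", s0 + s1);
        return 0;
    }

The OS scheduler is free to place the two threads on separate cores, which is where the speedup of a multi-core processor comes from.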

Simultaneous Multithreading
(SMT)
Permits multiple independent threads to execute
SIMULTANEOUSLY on the SAME core
Weaving together multiple threads
on the same core
Example: if one thread is waiting for a floating point
operation to complete, another thread can use the
integer units.
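
From the software side, SMT simply shows up as extra logical processors. A tiny sketch, assuming a C++11 (or CUDA host) compiler:

    #include <cstdio>
    #include <thread>

    int main() {
        // hardware_concurrency() reports LOGICAL processors, i.e. hardware
        // threads. On an SMT-capable chip this is typically larger than the
        // physical core count, because each core presents more than one
        // hardware thread sharing its execution units.
        unsigned logical = std::thread::hardware_concurrency();
        std::printf("logical processors (hardware threads): %u\n", logical);
        return 0;
    }

On many desktop parts the reported count is roughly twice the number of physical cores.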

A Peep into Many-core Processors

What is Many-Core?
The terms many-core and massively multi-core are sometimes used to describe multi-core architectures with an especially high number of cores (tens or hundreds).
- Andras Vajda

The Old Challenge: CPU-on-a-chip

20 MIPS CPU in 1987

Few thousand gates

The Opportunity: Billions of Transistors

Old CPU:
What to do with all these transistors?

Replace Long Wires with Routed Interconnect [IEEE Computer 97]

(Figure: control logic, a bypass network, and an array of ALUs)

From a Centralized Clump of CPUs to Distributed ALUs and a Routed Bypass Network

(Figure: a register file and ALUs, first clustered together, then distributed)

Scalar Operand Network (SON) [TPDS 2005]

Distributed Everything + Routed Interconnect: Tiled Multicore

Each tile is a processor, so it is programmable.

Tilera 100-core Processor

SCC (Single-chip Cloud Computer) Feature Set

First silicon with 48 IA cores on a single die
6x4 mesh, 2 cores per tile
Power envelope 125 W, cores @ 1 GHz, mesh @ 2 GHz
45 nm process, 1.3 B transistors
16 to 64 GB DRAM using 4 DDR3 memory controllers
Message-passing architecture, no coherent shared memory (see the sketch after this list)
Proof of concept for a scalable many-core solution
Next-generation 2D mesh interconnect: bisection bandwidth 1.5 Tb/s to 2 Tb/s, average power 6 W to 12 W
Fine-grained dynamic power management
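
Because there is no coherent shared memory, SCC programs move data between cores with explicit messages. The SCC has its own native message-passing library; the fragment below uses standard MPI purely as a familiar stand-in for the style (ranks, tag, and the value sent are arbitrary):

    #include <cstdio>
    #include <mpi.h>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int value = 0;
        if (rank == 0) {
            // Core 0 owns the data and sends a copy; there is no shared
            // address space from which core 1 could simply read it.
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            std::printf("core 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }

The library differs on the real chip, but the programming model, explicit send and receive between cores, is the same idea.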

SCC System Overview

Voltage and Frequency Islands

A Glance at
GPGPU Processors

CPU vs. GPU


A CPU consists of a few cores optimized for sequential
serial processing while a GPU has a massively parallel
architecture consisting of thousands of smaller, more
efficient cores designed for handling multiple tasks
simultaneously.

GPU-accelerated computing offers unprecedented application performance by offloading compute-intensive portions of the application to the GPU, while the remainder of the code still runs on the CPU. From a user's perspective, applications simply run significantly faster.
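
A minimal CUDA sketch of this offload model follows: the compute-intensive loop (a SAXPY-style update, chosen only for illustration) runs on the GPU, while allocation, initialisation, and the rest of the program stay on the CPU. The array size and kernel name are made up for the example.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Compute-intensive portion, offloaded to the GPU: one thread per element.
    __global__ void saxpy(int n, float a, const float* x, float* y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);

        // The remainder of the code (setup, I/O, control flow) runs on the CPU.
        float *x, *y;
        cudaMallocManaged(&x, bytes);
        cudaMallocManaged(&y, bytes);
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        saxpy<<<blocks, threads>>>(n, 3.0f, x, y);   // offload to the GPU
        cudaDeviceSynchronize();                     // wait for the GPU to finish

        std::printf("y[0] = %f\n", y[0]);            // expect 5.0
        cudaFree(x);
        cudaFree(y);
        return 0;
    }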

CPU vs. GPU


(Video: "Adam and Jamie Paint the Mona Lisa in 80 Milliseconds!")

Central Idea in GPU

Add ALUs to a Single Core.

Amortize the cost/complexity of managing an instruction stream across many ALUs.

Interleave processing of many fragments on a single core to avoid stalls caused by high-latency operations (a minimal sketch follows below).

16 SMs, 24,000 CUDA threads
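
One common way to give each SM many threads to interleave is a grid-stride loop: launch far more threads than there are ALUs, so that while some threads wait on memory, others have work ready. A minimal sketch; the launch configuration and sizes are illustrative and not tied to the 16-SM figure above.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Grid-stride loop: each thread handles many elements, and the SMs keep
    // many threads resident so that long-latency memory accesses issued by
    // some threads are hidden by work from others.
    __global__ void scale(int n, float a, float* data) {
        for (int i = blockIdx.x * blockDim.x + threadIdx.x;
             i < n;
             i += blockDim.x * gridDim.x) {
            data[i] *= a;
        }
    }

    int main() {
        const int n = 1 << 22;
        float* d;
        cudaMallocManaged(&d, n * sizeof(float));
        for (int i = 0; i < n; ++i) d[i] = 1.0f;

        // Deliberately launch far more threads than ALUs; the hardware
        // scheduler interleaves them to keep the execution units busy.
        scale<<<1024, 256>>>(n, 2.0f, d);
        cudaDeviceSynchronize();

        std::printf("d[0] = %f\n", d[0]);   // expect 2.0
        cudaFree(d);
        return 0;
    }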

Power of GPU

The arithmetic power of the GPU is a result of its highly specialised architecture.

GPUs are great for data parallelism.

CUDA (Compute Unified Device Architecture) is NVIDIA's GPU architecture, featured in its GPU cards.

GPGPU

GPU functionality was traditionally limited to graphics.

GPGPU refers to using the GPU for computation in applications, for work other than graphical output, a shift driven by the GPU's compelling qualities.

GPGPU is the use of a GPU (graphics processing unit) to do general-purpose scientific and engineering computing.

Tunnel Vision by Experts

"I think there is a world market for maybe five computers."
- Thomas Watson, chairman of IBM, 1943

"There is no reason for any individual to have a computer in their home."
- Ken Olson, president and founder of Digital Equipment Corporation, 1977

"640K [of memory] ought to be enough for anybody."
- Bill Gates, chairman of Microsoft, 1981

"On several recent occasions, I have been asked whether parallel computing will soon be relegated to the trash heap reserved for promising technologies that never quite make it."
- Ken Kennedy, CRPC Director, 1994

Some Applications of Parallel Computing

Computational Structural Mechanics
Bio-Informatics and Life Sciences
Computational Electromagnetics and Electrodynamics
Computational Finance

Few More Applications


Computational Fluid Dynamics

Data Mining, Analytics, and Database

Imaging and Computer Vision

Medical Imaging
