Overview
- Flynn's taxonomy
- Classification based on the memory arrangement
- Classification based on communication
- Classification based on the kind of parallelism: data-parallel, function-parallel
Flynn's Taxonomy
The most universally accepted method of classifying computer systems. Published in the Proceedings of the IEEE in 1966. Any computer can be placed in one of four broad categories:
- SISD: single instruction stream, single data stream
- SIMD: single instruction stream, multiple data streams
- MIMD: multiple instruction streams, multiple data streams
- MISD: multiple instruction streams, single data stream
SISD
SIMD
Applications:
- Image processing
- Matrix manipulations
- Sorting
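The SIMD idea behind these applications can be sketched in a few lines: one instruction stream broadcasts the same operation to every processing element's local datum. A minimal pure-Python sketch (the `simd_step` helper and pixel values are illustrative, not from the original):

```python
# Each processing element (PE) holds one data item; a single instruction
# stream applies the same operation to all PEs in lockstep.
def simd_step(data, op):
    """Apply one broadcast instruction to every PE's local datum."""
    return [op(x) for x in data]

# Image-processing flavor: brighten every pixel by the same amount.
pixels = [0, 50, 100, 150, 200, 250]
brightened = simd_step(pixels, lambda p: min(p + 10, 255))
print(brightened)  # [10, 60, 110, 160, 210, 255]
```

On real SIMD hardware the elements are updated simultaneously rather than in a Python loop; the point is that control flow is shared while data differs per PE.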
SIMD Architectures
Fine-grained
- Image processing applications
- Large number of PEs
- Minimum-complexity PEs
- Programming language is a simple extension of a sequential language
Coarse-grained
- Each PE is of higher complexity and is usually built from commercial devices
- Each PE has local memory
MIMD
MISD
Applications:
- Classification
- Robot vision
Flynn's Taxonomy: Pros and Cons

Advantages:
- Universally accepted
- Compact notation
- Easy to classify a system (?)

Disadvantages:
- Very coarse-grain differentiation among machine systems
- Comparison of different systems is limited
- Interconnections, I/O, and memory are not considered in the scheme
[Figure: multiprocessor organizations — processing elements PE1…PEn (each with processor Pi and memory Mi) connected to shared memory and I/O modules through interconnection networks]
Shared-memory multiprocessors
Memory is common to all the processors; processors communicate easily by means of shared variables. Three organizations:
- Uniform Memory Access (UMA)
- Non-Uniform Memory Access (NUMA)
- Cache-Only Memory Architecture (COMA)
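Communication through shared variables can be illustrated with threads standing in for processors; a minimal sketch (thread counts and the `worker` helper are illustrative, not from the original):

```python
import threading

# Four "processors" (threads) communicate through one shared variable,
# as processors in a shared-memory multiprocessor would.
counter = 0
lock = threading.Lock()  # serializes access to the shared variable

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:          # without the lock, concurrent updates could be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000
```

The ease of communication is the appeal of shared memory; the lock hints at its cost, since access to shared data must be coordinated.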
[Figure: shared-memory organization — processors P1…Pn, each with a cache ($), access memory modules (Mem) through an interconnection network]
[Figures: distributed-memory organizations — memory modules (Mem) reached through an interconnection network, and PEs each holding local memory (M) communicating over an interconnection network]
Dynamic networks
Data-parallel architectures:
- Vector architecture
- Associative and neural architecture
- SIMDs
- Systolic architecture

Function-parallel architectures:
- Instruction-level PAs: pipelined processors, VLIWs, superscalar processors
- Thread-level PAs
- Process-level PAs: distributed-memory MIMD (multi-computer), shared-memory MIMD (multiprocessors)
References
1. Hesham El-Rewini and Mostafa Abd-El-Barr, Advanced Computer Architecture and Parallel Processing, John Wiley & Sons, 2005.
2. Kai Hwang, Advanced Computer Architecture: Parallelism, Scalability, Programmability, McGraw-Hill, 1993.
3. Dezső Sima, Terence Fountain, and Péter Kacsuk, Advanced Computer Architectures: A Design Space Approach, Pearson, 1997.
Speedup
S = Speed(new) / Speed(old)
  = [Work / Time(new)] / [Work / Time(old)]
  = Time(old) / Time(new)
  = Time(before improvement) / Time(after improvement)
Speedup
Time on one CPU: T(1)
Time on n CPUs: T(n)
Speedup: S = T(1) / T(n)
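The definition is a one-line computation; a minimal sketch (the timings are illustrative, not from the original):

```python
def speedup(t_old, t_new):
    """Speedup = Time(old) / Time(new); e.g. T(1) / T(n) for n CPUs."""
    return t_old / t_new

# A task taking 120 s on one CPU and 40 s on four CPUs:
print(speedup(120.0, 40.0))  # 3.0 -- below the ideal speedup of 4
```

A speedup below the processor count is typical; Amdahl's law, below, explains why.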
Amdahl's Law
The performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used
Example
A traveler must cover 200 miles; the first 20 hours are spent walking at 4 miles/hour (80 miles), and only the remaining 120 miles can use a faster mode:
- Walk: 4 miles/hour
- Bike: 10 miles/hour
- Car-1: 50 miles/hour
- Car-2: 120 miles/hour
- Car-3: 600 miles/hour
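Assuming the 20 hours of walking are the fixed (serial) part of the trip and only the remaining 120 miles benefit from a faster vehicle, the speedups over walking the whole way can be tabulated; a small sketch (the interpretation of the slide's numbers is assumed, as noted):

```python
# Worked trip example, assuming the first 20 hours are mandatory walking
# (80 miles at 4 mph) and only the remaining 120 miles can be sped up.
walk_time = 20.0          # hours of walking: the part that cannot be improved
remaining = 120.0         # miles that can use a faster mode
base_time = 200.0 / 4.0   # 50 hours to walk the entire 200 miles

for mode, mph in [("Bike", 10), ("Car-1", 50), ("Car-2", 120), ("Car-3", 600)]:
    total = walk_time + remaining / mph
    print(f"{mode}: total {total:.1f} h, speedup {base_time / total:.2f}")

# However fast the vehicle, total time never drops below 20 h,
# so speedup is bounded by 50/20 = 2.5 -- Amdahl's point.
```

Going from Car-2 to a five-times-faster Car-3 barely moves the speedup, because the walking portion dominates.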
α: the fraction of the program that is naturally serial
(1 − α): the fraction of the program that is naturally parallel
T(N) = T(1)·α + T(1)·(1 − α)/N

S = T(1) / T(N) = 1 / (α + (1 − α)/N) = N / (α·N + (1 − α))
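The formula is easy to explore numerically; a minimal sketch (function and variable names are illustrative):

```python
def amdahl_speedup(alpha, n):
    """S = 1 / (alpha + (1 - alpha)/n): speedup on n CPUs when a
    fraction alpha of the program is naturally serial."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

# With 10% serial code, even 1024 CPUs give less than 10x:
print(amdahl_speedup(0.1, 1024))   # ~9.91
print(1 / 0.1)                     # limit as N -> infinity: 10.0
```

As N grows, the (1 − α)/N term vanishes and S approaches 1/α, so the serial fraction alone caps the achievable speedup.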
Amdahl's Law