
COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
Kai Hwang
University of Southern California
Faye A. Briggs
Rice University
McGraw-Hill Book Company

New York  St. Louis  San Francisco  Auckland  Bogota  Hamburg
London  Madrid  Mexico  Montreal  New Delhi  Panama
Paris  Sao Paulo  Singapore  Sydney  Tokyo  Toronto
CONTENTS
Preface
Chapter 1 Introduction to Parallel Processing

1.1 Evolution of Computer Systems
    1.1.1 Generations of Computer Systems
    1.1.2 Trends towards Parallel Processing
1.2 Parallelism in Uniprocessor Systems
    1.2.1 Basic Uniprocessor Architecture
    1.2.2 Parallel Processing Mechanisms
    1.2.3 Balancing of Subsystem Bandwidth
    1.2.4 Multiprogramming and Time Sharing
1.3 Parallel Computer Structures
    1.3.1 Pipeline Computers
    1.3.2 Array Computers
    1.3.3 Multiprocessor Systems
    1.3.4 Performance of Parallel Computers
    1.3.5 Dataflow and New Concepts
1.4 Architectural Classification Schemes
    1.4.1 Multiplicity of Instruction-Data Streams
    1.4.2 Serial versus Parallel Processing
    1.4.3 Parallelism versus Pipelining
1.5 Parallel Processing Applications
    1.5.1 Predictive Modeling and Simulations
    1.5.2 Engineering Design and Automation
    1.5.3 Energy Resources Exploration
    1.5.4 Medical, Military, and Basic Research
1.6 Bibliographic Notes and Problems

Chapter 2 Memory and Input-Output Subsystems

2.1 Hierarchical Memory Structure
    2.1.1 Memory Hierarchy
    2.1.2 Optimization of Memory Hierarchy
    2.1.3 Addressing Schemes for Main Memory
2.2 Virtual Memory System
    2.2.1 The Concept of Virtual Memory
    2.2.2 Paged Memory System
    2.2.3 Segmented Memory System
    2.2.4 Memory with Paged Segments
2.3 Memory Allocation and Management
    2.3.1 Classification of Memory Policies
    2.3.2 Optimal Load Control
    2.3.3 Memory Management Policies
2.4 Cache Memories and Management
    2.4.1 Characteristics of Cache Memories
    2.4.2 Cache Memory Organizations
    2.4.3 Fetch and Main Memory Update Policies
    2.4.4 Block Replacement Policies
2.5 Input-Output Subsystems
    2.5.1 Characteristics of I/O Subsystem
    2.5.2 Interrupt Mechanisms and Special Hardware
    2.5.3 I/O Processors and I/O Channels
2.6 Bibliographic Notes and Problems

Chapter 3 Principles of Pipelining and Vector Processing

3.1 Pipelining: An Overlapped Parallelism
    3.1.1 Principles of Linear Pipelining
    3.1.2 Classification of Pipeline Processors
    3.1.3 General Pipelines and Reservation Tables
    3.1.4 Interleaved Memory Organizations
3.2 Instruction and Arithmetic Pipelines
    3.2.1 Design of Pipelined Instruction Units
    3.2.2 Arithmetic Pipelines Design Examples
    3.2.3 Multifunction and Array Pipelines
3.3 Principles of Designing Pipelined Processors
    3.3.1 Instruction Prefetch and Branch Handling
    3.3.2 Data Buffering and Busing Structures
    3.3.3 Internal Forwarding and Register Tagging
    3.3.4 Hazard Detection and Resolution
    3.3.5 Job Sequencing and Collision Prevention
    *3.3.6 Dynamic Pipelines and Reconfigurability
3.4 Vector Processing Requirements
    3.4.1 Characteristics of Vector Processing
    *3.4.2 Multiple Vector Task Dispatching
    3.4.3 Pipelined Vector Processing Methods
3.5 Bibliographic Notes and Problems

Chapter 4 Pipeline Computers and Vectorization Methods

4.1 The Space of Pipelined Computers
    4.1.1 Vector Supercomputers
    4.1.2 Scientific Attached Processors
4.2 Early Vector Processors
    4.2.1 Architectures of Star-100 and TI-ASC
    4.2.2 Vector Processing in Streaming Mode
4.3 Scientific Attached Processors
    4.3.1 The Architecture of AP-120B
    4.3.2 Back-end Vector Computations
    4.3.3 FPS-164, IBM 3838 and Datawest MATP
4.4 Recent Vector Processors
    4.4.1 The Architecture of Cray-1
    4.4.2 Pipeline Chaining and Vector Loops
    4.4.3 The Architecture of Cyber-205
    4.4.4 Vector Processing in Cyber-205 and CDC-NASF
    4.4.5 Fujitsu VP-200 and Special Features
4.5 Vectorization and Optimization Methods
    4.5.1 Language Features in Vector Processing
    4.5.2 Design of Vectorizing Compilers
    4.5.3 Optimization of Vector Operations
    *4.5.4 Performance Evaluation of Pipelined Computers
4.6 Bibliographic Notes and Problems
Chapter 5 Structures and Algorithms for Array Processors

5.1 SIMD Array Processors
    5.1.1 SIMD Computer Organizations
    5.1.2 Masking and Data Routing Mechanisms
    5.1.3 Inter-PE Communications
5.2 SIMD Interconnection Networks
    5.2.1 Static versus Dynamic Networks
    5.2.2 Mesh-Connected Illiac Network
    5.2.3 Cube Interconnection Networks
    5.2.4 Barrel Shifter and Data Manipulator
    5.2.5 Shuffle-Exchange and Omega Networks
5.3 Parallel Algorithms for Array Processors
    5.3.1 SIMD Matrix Multiplication
    5.3.2 Parallel Sorting on Array Processors
    5.3.3 SIMD Fast Fourier Transform
    5.3.4 Connection Issues for SIMD Processing
5.4 Associative Array Processing
    5.4.1 Associative Memory Organizations
    5.4.2 Associative Processors (PEPE and STARAN)
    *5.4.3 Associative Search Algorithms
5.5 Bibliographic Notes and Problems
Chapter 6 SIMD Computers and Performance Enhancement

6.1 The Space of SIMD Computers
    6.1.1 Array and Associative Processors
    6.1.2 SIMD Computer Perspectives
6.2 The Illiac-IV and the BSP Systems
    6.2.1 The Illiac-IV System Architecture
    6.2.2 Applications of the Illiac-IV
    6.2.3 The BSP System Architecture
    6.2.4 The Prime Memory System
    6.2.5 The BSP Fortran Vectorizer
6.3 The Massively Parallel Processor
    6.3.1 The MPP System Architecture
    6.3.2 Processing Array, Memory, and Control
    6.3.3 Image Processing on the MPP
6.4 Performance Enhancement Methods
    6.4.1 Parallel Memory Allocation
    6.4.2 Array Processing Languages
    *6.4.3 Performance Analysis of Array Processors
    *6.4.4 Multiple-SIMD Computer Organization
Chapter 7 Multiprocessor Architecture and Programming

7.1 Functional Structures
    7.1.1 Loosely Coupled Multiprocessors
    7.1.2 Tightly Coupled Multiprocessors
    7.1.3 Processor Characteristics for Multiprocessing
7.2 Interconnection Networks
    7.2.1 Time Shared or Common Buses
    7.2.2 Crossbar Switch and Multiport Memories
    7.2.3 Multistage Networks for Multiprocessors
    *7.2.4 Performance of Interconnection Networks
7.3 Parallel Memory Organizations
    7.3.1 Interleaved Memory Configurations
    *7.3.2 Performance Tradeoffs in Memory Organizations
    7.3.3 Multicache Problems and Solutions
7.4 Multiprocessor Operating Systems
    7.4.1 Classification of Multiprocessor Operating Systems
    7.4.2 Software Requirements for Multiprocessors
    7.4.3 Operating System Requirements
7.5 Exploiting Concurrency for Multiprocessing
    7.5.1 Language Features to Exploit Parallelism
    7.5.2 Detection of Parallelism in Programs
    *7.5.3 Program and Algorithm Restructuring
7.6 Bibliographic Notes and Problems

Chapter 8 Multiprocessing Control and Algorithms

8.1 Interprocess Communication Mechanisms
    8.1.1 Process Synchronization Mechanisms
    8.1.2 Synchronization with Semaphores
    8.1.3 Conditional Critical Sections and Monitors
8.2 System Deadlocks and Protection
    8.2.1 System Deadlock Problems
    8.2.2 Deadlock Prevention and Avoidance
    8.2.3 Deadlock Detection and Recovery
    8.2.4 Protection Schemes
8.3 Multiprocessor Scheduling Strategies
    8.3.1 Dimensions of Multiple Processor Management
    8.3.2 Deterministic Scheduling Models
    *8.3.3 Stochastic Scheduling Models
8.4 Parallel Algorithms for Multiprocessors
    8.4.1 Classification of Parallel Algorithms
    8.4.2 Synchronized Parallel Algorithms
    8.4.3 Asynchronous Parallel Algorithms
    *8.4.4 Performance of Parallel Algorithms
Chapter 9 Example Multiprocessor Systems

9.1 The Space of Multiprocessor Systems
    9.1.1 Exploratory Systems
    9.1.2 Commercial Multiprocessors
9.2 The C.mmp Multiprocessor System
    9.2.1 The C.mmp System Architecture
    9.2.2 The Hydra Operating System
    *9.2.3 Performance of the C.mmp
9.3 The S-1 Multiprocessor
    9.3.1 The S-1 System Architecture
    9.3.2 Multiprocessing Uniprocessors
    9.3.3 The S-1 Software Development
9.4 The HEP Multiprocessor
    9.4.1 The HEP System Architecture
    9.4.2 Process Execution Modules
    9.4.3 Parallel Processing on the HEP
9.5 Mainframe Multiprocessor Systems
    9.5.1 IBM 370/168 MP, 3033, and 3081
    9.5.2 Operating System for IBM Multiprocessors
    9.5.3 Univac 1100/80 and 1100/90 Series
    9.5.4 The Tandem Nonstop System
9.6 The Cray X-MP and Cray 2
    9.6.1 Cray X-MP System Architecture
    9.6.2 Multitasking on Cray X-MP
    *9.6.3 Performance of Cray X-MP
9.7 Bibliographic Notes and Problems
Chapter 10 Data Flow Computers and VLSI Computations

10.1 Data-Driven Computing and Languages
    10.1.1 Control-Flow versus Data Flow Computers
    10.1.2 Data Flow Graphs and Languages
    10.1.3 Advantages and Potential Problems
10.2 Data Flow Computer Architectures
    10.2.1 Static Data Flow Computers
    10.2.2 Dynamic Data Flow Computers
    10.2.3 Data-Flow Design Alternatives
10.3 VLSI Computing Structures
    10.3.1 The Systolic Array Architecture
    *10.3.2 Mapping Algorithms into Systolic Arrays
    10.3.3 Reconfigurable Processor Array
10.4 VLSI Matrix Arithmetic Processors
    10.4.1 VLSI Arithmetic Modules
    10.4.2 Partitioned Matrix Algorithms
    10.4.3 Matrix Arithmetic Pipelines
    *10.4.4 Real-Time Image Processing
Bibliography
Index
