Introduction
A common way to increase parallelism among instructions is to exploit data parallelism among independent iterations of a loop. SIMD architectures can exploit significant data-level parallelism for:
matrix-oriented scientific computing
media-oriented image and sound processors
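To make the idea concrete, here is a minimal sketch (in plain Python, not SIMD code) of a classic data-parallel loop, DAXPY (y = a·x + y). Every iteration is independent of the others, which is exactly what lets a SIMD machine execute many of them at once.

```python
def daxpy_scalar(a, x, y):
    """y[i] = a * x[i] + y[i], one element per step as a scalar CPU would.
    No iteration reads another iteration's result, so all of them could
    run in parallel on SIMD hardware."""
    return [a * xi + yi for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
print(daxpy_scalar(2.0, x, y))  # [12.0, 24.0, 36.0, 48.0]
```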
Vector Architectures
Read sets of data elements into vector registers
Operate on those registers
Disperse the results back into memory
Example: VMIPS
Improvements:
Multiple lanes
Gather-scatter memory addressing
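The load/operate/store pattern above can be sketched as follows. This is an illustrative simulation in Python, not real VMIPS assembly; MVL (maximum vector length) is an assumed value.

```python
MVL = 64  # assumed maximum vector length

def vector_add(memory, a_addr, b_addr, c_addr, n):
    """C[0:n] = A[0:n] + B[0:n] for n <= MVL: read data into vector
    registers, operate on the registers, disperse results to memory."""
    assert n <= MVL
    v1 = memory[a_addr:a_addr + n]          # LV    V1, A   (vector load)
    v2 = memory[b_addr:b_addr + n]          # LV    V2, B
    v3 = [x + y for x, y in zip(v1, v2)]    # ADDVV V3, V1, V2
    memory[c_addr:c_addr + n] = v3          # SV    V3, C   (vector store)

mem = [1, 2, 3, 4, 10, 20, 30, 40] + [0] * 4
vector_add(mem, 0, 4, 8, 4)
print(mem[8:12])  # [11, 22, 33, 44]
```

With multiple lanes, the element-wise additions inside the single ADDVV instruction would proceed several elements per clock instead of one.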
CSL718
05-Apr-12
SIMD Extensions
Media applications operate on data types narrower than the native word size. Limitations compared to vector instructions:
Number of data operands encoded into opcode
No sophisticated addressing modes (stride, scatter-gather)
No mask registers
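A hedged sketch of the first point: a multimedia SIMD "partitioned add" treats one native word as a fixed number of narrow lanes (four 8-bit lanes in a 32-bit word, widths assumed here for illustration). The lane count is baked into the opcode, unlike a vector machine's software-settable vector length.

```python
def packed_add8(word_a, word_b):
    """Add four unsigned 8-bit lanes packed into 32-bit words; carries do
    not propagate between lanes (saturation handling omitted)."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        a = (word_a >> shift) & 0xFF
        b = (word_b >> shift) & 0xFF
        result |= ((a + b) & 0xFF) << shift
    return result

print(hex(packed_add8(0x01020304, 0x10203040)))  # 0x11223344
```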
Graphics Processing Units (GPUs)
Offer higher potential performance than traditional multicore computers. Heterogeneous execution model:
the CPU is the host, the GPU is the device
Develop a C-like programming language for the GPU
Unify all forms of GPU parallelism as the CUDA (Compute Unified Device Architecture) thread
Programming model is Single Instruction, Multiple Thread (SIMT)
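The SIMT model can be sketched in plain Python (this is an emulation of the idea, not real CUDA): every thread runs the same kernel body and is distinguished only by its block and thread indices. The `launch` helper and its parameter names are hypothetical.

```python
def launch(kernel, grid_dim, block_dim, *args):
    """Serially emulate launching grid_dim blocks of block_dim threads,
    each executing the same kernel with its own indices."""
    for block_id in range(grid_dim):
        for thread_id in range(block_dim):
            kernel(block_id, thread_id, block_dim, *args)

def saxpy_kernel(block_id, thread_id, block_dim, a, x, y, out):
    i = block_id * block_dim + thread_id  # global thread index
    if i < len(x):                        # guard: more threads than elements
        out[i] = a * x[i] + y[i]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 1.0, 1.0, 1.0, 1.0]
out = [0.0] * 5
launch(saxpy_kernel, 2, 3, 2.0, x, y, out)  # 6 threads cover 5 elements
print(out)  # [3.0, 5.0, 7.0, 9.0, 11.0]
```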
Conditional branches are handled implicitly, using branch synchronization markers and an internal stack to save, complement, and restore masks.
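A simplified model of that mask discipline for an if/else over SIMD lanes, not actual GPU hardware behavior: save the active mask at the branch marker, execute the "then" path under the branch mask, complement it for the "else" path, and restore at the synchronization marker.

```python
def simd_if_else(values, cond, then_fn, else_fn):
    """Apply then_fn to lanes where cond holds and else_fn elsewhere,
    using an explicit mask stack as a GPU-style model would."""
    mask_stack = []
    mask = [True] * len(values)                 # all lanes active

    mask_stack.append(mask)                     # save at branch marker
    branch = [cond(v) for v in values]

    mask = [m and b for m, b in zip(mask_stack[-1], branch)]        # then-mask
    values = [then_fn(v) if m else v for m, v in zip(mask, values)]

    mask = [m and not b for m, b in zip(mask_stack[-1], branch)]    # complement
    values = [else_fn(v) if m else v for m, v in zip(mask, values)]

    mask = mask_stack.pop()                     # restore at sync marker
    return values

# Even lanes are doubled, odd lanes negated.
print(simd_if_else([1, 2, 3, 4], lambda v: v % 2 == 0,
                   lambda v: 2 * v, lambda v: -v))  # [-1, 4, -3, 8]
```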
In a vector architecture, memory latency is hidden by paying it once per vector load/store instruction; a GPU hides it using multithreading. The GPU's conditional-branch mechanism handles the strip-mining problem of vector architectures by iterating the loop until all the SIMD lanes reach the loop bound.
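For reference, strip-mining on a vector machine looks like the following sketch: n elements are processed MVL (maximum vector length, assumed 64 here) at a time, with one shorter strip when n is not a multiple of MVL.

```python
MVL = 64  # assumed maximum vector length

def strip_mine(n):
    """Yield (start, vector_length) strips covering n elements; the first
    strip carries the odd-sized remainder, the rest are full MVL strips."""
    vl = n % MVL or min(n, MVL)
    start = 0
    while start < n:
        yield start, vl
        start += vl
        vl = MVL

print(list(strip_mine(200)))  # [(0, 8), (8, 64), (72, 64), (136, 64)]
```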
Unlike multimedia SIMD instructions, which execute on the scalar processor itself, in GPUs the scalar processor and the SIMD processors are separated by an I/O bus and have separate main memories.
Also, multimedia SIMD instructions do not support gather-scatter memory accesses. In short, GPUs can be described as multithreaded SIMD processors with more lanes, more processors, and better hardware support for multithreading.
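A hedged sketch of the gather-scatter accesses mentioned above, which vector architectures and GPUs support but multimedia SIMD extensions do not: an index vector selects arbitrary, non-contiguous memory locations.

```python
def gather(memory, indices):
    """Read memory[i] for each index i (vector load with index vector)."""
    return [memory[i] for i in indices]

def scatter(memory, indices, values):
    """Write each value to memory[i] (vector store with index vector)."""
    for i, v in zip(indices, values):
        memory[i] = v

mem = [0, 10, 20, 30, 40]
print(gather(mem, [4, 0, 2]))   # [40, 0, 20]
scatter(mem, [1, 3], [99, 77])
print(mem)                      # [0, 99, 20, 77, 40]
```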
Thank You