Writing Final Myer

Jakob Myer ENGR1202 Writing Assignment Final Due 4/13/2017
Jmyer1@uncc.edu
Computer Hardware, CPUs, and GPUs

Personal Interest
I have an interest in computer hardware components, specifically CPUs and GPUs. This interest
developed from my history of playing video games on various systems, as well as a summer
camp held at Wake Tech Community College that taught the basics of a computer system and
how to assemble one. Aside from the components themselves, I am also curious as to how their
manufacturers design components to maintain compatibility with other parts, and the approaches
they use to improve their products. As components are designed to become smaller and more
powerful at the same time, their power consumption, heat production, and other properties need
to be addressed in some way so that they do not become serious issues in later generations of
computer hardware.
CPU Basics
A central processing unit (CPU) is a circuit that accepts and processes instructions from a
program or peripheral device, such as a keyboard [1, 2]. These instructions are assigned
addresses and isolated before being transferred to the computer's random access memory
(RAM), from which they are fetched into the CPU's control unit. The control unit
decodes the instructions and directs their execution, sending work to the arithmetic logic unit
(ALU), which performs calculations and comparisons. The ALU can perform mathematical
operations and conditional tests, storing data in the CPU's registers or cache as it processes the
calculations. The chain of data transfer is illustrated in Figure 1.
Figure 1: Diagram of a CPU
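The flow described above (instructions fetched from memory, decoded by the control unit, and
executed by the ALU with results held in registers) can be sketched as a toy simulation. This is
a minimal illustration and not a model of any real CPU: the small instruction set (LOAD, STORE,
ADD, SUB, LT), the register names, and the function names alu and run_program are all invented
for this example.

```python
# Toy fetch-decode-execute loop. The instruction set, register names,
# and memory layout are hypothetical and chosen only for illustration.

def alu(op, a, b):
    """Arithmetic logic unit: simple arithmetic and one comparison."""
    if op == "ADD":
        return a + b
    if op == "SUB":
        return a - b
    if op == "LT":  # conditional test: 1 if a < b, else 0
        return int(a < b)
    raise ValueError(f"unknown ALU operation: {op}")


def run_program(program, memory):
    """Control unit: fetch each instruction, decode it, and execute it."""
    registers = {"R0": 0, "R1": 0, "R2": 0}  # small, fast temporary storage
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        opcode, *operands = program[pc]  # fetch and decode
        pc += 1
        if opcode == "LOAD":     # LOAD reg, address: memory -> register
            reg, addr = operands
            registers[reg] = memory[addr]
        elif opcode == "STORE":  # STORE reg, address: register -> memory
            reg, addr = operands
            memory[addr] = registers[reg]
        else:                    # every other opcode is handed to the ALU
            dest, src_a, src_b = operands
            registers[dest] = alu(opcode, registers[src_a], registers[src_b])
    return registers, memory


program = [
    ("LOAD", "R0", 0),
    ("LOAD", "R1", 1),
    ("ADD", "R2", "R0", "R1"),
    ("STORE", "R2", 2),
]
registers, memory = run_program(program, [7, 5, 0])
print(memory[2])  # sum of memory[0] and memory[1]
```

The registers dictionary is created fresh on every run, which loosely mirrors the point that
register contents do not survive between power cycles, while the memory list is passed in and
handed back.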

The registers serve as temporary storage for the CPU, and while they are few in number
and small in capacity, they can be read faster than disk or RAM [3, 4]. However, their
contents are lost if the CPU loses power, making them unsuitable for long-term or important
information storage. Once calculations are complete, the resulting data is sent to RAM or
kept in a register for later use, and it is passed on to its respective component when the
instructions are carried out [3].
Some of the devices in a computer are linked by buses, bundles of data lines that facilitate data
transfer [3, 4]. Buses exist in both internal and external forms: internal buses transfer data to
and from the ALU, while external buses connect the CPU to memory and input/output controllers.
Devices attached to a bus that can initiate transfers are called masters; devices that
passively await requests are called slaves. Wider buses allow faster and larger data
transfers, but at the cost of taking up more space in both wires and connectors [2]. A
multiplexed bus reduces the number of lines by using the same lines for both data and
addressing, splitting each transfer into multiple parts, but overall bus performance is reduced.
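The trade-off between bus width, line count, and multiplexing can be put into rough numbers. The
sketch below is a simplified model under stated assumptions, not a real bus protocol: each cycle
carries one bus-width word, a multiplexed bus is assumed to spend one extra cycle per word sending
the address over the shared lines, and control lines are ignored. The function names bus_cycles
and line_count are hypothetical.

```python
import math


def bus_cycles(num_bytes, data_width_bits, multiplexed=False):
    """Estimate the bus cycles needed to move num_bytes.

    Assumed model: one bus-width word per cycle, plus one extra address
    cycle per word when address and data share the same lines.
    """
    bytes_per_word = data_width_bits // 8
    words = math.ceil(num_bytes / bytes_per_word)
    cycles_per_word = 2 if multiplexed else 1
    return words * cycles_per_word


def line_count(address_bits, data_bits, multiplexed=False):
    """Signal lines needed for address and data (control lines ignored)."""
    if multiplexed:
        return max(address_bits, data_bits)  # the same lines are reused
    return address_bits + data_bits          # separate lines for each


# Moving 64 bytes over buses of different widths.
for width in (8, 32, 64):
    print(width, bus_cycles(64, width), bus_cycles(64, width, multiplexed=True))

# Line cost of a 32-bit address and 32-bit data path, separate vs. shared.
print(line_count(32, 32), line_count(32, 32, multiplexed=True))
```

Under these assumptions a wider bus cuts the cycle count in proportion to its width, while
multiplexing halves the line count and doubles the cycles.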
Intel's CPU Design
Intel created the first microprocessor in 1971, and it presently operates as the world's
largest manufacturer of microprocessor chips; 99 percent of all chips used in servers are Intel's
[6]. Developing a new microprocessor can take five or more years and over $10
billion, and the improvements made in a processor are tied to the size of its
transistors, the tiny switches that each represent a single bit of information. Intel shrinks its
transistors approximately every three years, allowing more of them to fit within a
microprocessor with the same area as the previous model. This trend is known as Moore's
Law, which states that transistor count doubles roughly every two years, as seen in Figure 2 [10].
Figure 2: Plot of microprocessor transistor count from 1970 to 2010, in accordance with
Moore's Law
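The doubling rule can be written as N(t) = N0 * 2^((t - t0) / T), where T is the doubling period
in years. The sketch below evaluates that formula. The starting point is an assumption that does
not come from the sources cited here: it uses the commonly reported figure of about 2,300
transistors for Intel's 1971 microprocessor. The function name projected_transistors is
hypothetical, and the results are idealized projections, not measured chip data.

```python
def projected_transistors(year, base_year=1971, base_count=2300,
                          doubling_years=2.0):
    """Project a transistor count assuming one doubling every doubling_years.

    base_count is an assumed starting value (about 2,300 transistors in
    1971); the output is an idealized trend, not real product data.
    """
    doublings = (year - base_year) / doubling_years
    return base_count * 2 ** doublings


for year in (1971, 1981, 1991, 2001, 2011):
    print(year, f"{projected_transistors(year):,.0f}")
```

Changing doubling_years from 2.0 to 3.0 shows how sensitive the projection is to the assumed
doubling period: over the same 40 years the count grows by a factor of about ten thousand
instead of about one million.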
Presently, the continuous shrinking of transistors poses two problems: controlling
quantum tunneling, and designing increasingly dense chip layouts [6]. As transistors shrink
to just a few nanometers, the electrons in the chip run the risk of jumping through
transistors, even those that are switched off and should block their movement. Intel's Xeon
chips address this problem with tower-shaped transistors known as fin field-effect
transistors (FinFETs), which avoid the tunneling phenomenon at present transistor sizes.
The other issue is the increasingly complex layout of the transistors and their interconnects on
the chip. While transistors can be shrunk and their efficiency improved, the copper wires that
connect them cannot be shrunk without compromising their ability to carry current. As copper is
one of the most conductive materials available, improvements must be made in areas such as the
insulators, which dampen the current flow. Using air itself as an insulator improved the current
speed of two wire layers in the Xeon E5 chip by 10 percent, but the challenge of improving chip
power as transistors become smaller and interconnects denser remains.
The feature size of Intel's chips is approaching 5 nanometers, after which it is believed that
further downscaling will prove impossible and Moore's Law will collapse [6]. Intel is looking into
solutions ranging from extreme ultraviolet light to quantum computing (replacing transistors
with atomic particles), but it has only two generations of chips beyond the E5 series left to
develop before that threshold is reached. The future of transistors and microprocessors
depends on the success of entirely new materials and chip designs.
GPU Basics
Graphics processing units (GPUs) are collections of processing resources that prioritize parallel
processing instead of the single-task processes of a CPU, outperforming it in raw computational
power [5, 7]. As dedicated graphics systems, GPUs are employed in tasks requiring images
and/or 3D modeling, and tasks that require real-time graphics require it to render scenes 60 or so
times per second. The task of rendering graphics is organized into a graphics pipeline (shown in
Figure 3) that details the stages of image rendering from generating the points, lines, and
triangles known as primitives to controlling which pixels are seen by the virtual camera (the
perspective of the viewer).
The vertex, primitive, and fragment processing stages of the graphics pipeline are governed by
shader functions, which are written in languages that support rich control-flow constructs
and data types, but no primitives for explicit parallel execution [5]. Each invocation of a shader
function processes the data record of a single entity, independently of the processing of
other entities. This independence allows many details in an image to be rendered at the
same time.
Figure 3: Diagram of the graphics pipeline
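The independence of shader invocations can be illustrated with a small sketch. It is written in
ordinary Python, not a real shading language, and the function shade_fragment and the fragment
record layout are invented for this example. The point is only that each call reads a single
record and nothing else, so the calls could run in any order or all at once.

```python
# Each fragment record is processed by the same shader function, and the
# result depends only on that one record. All names here are illustrative.

def shade_fragment(fragment):
    """Hypothetical fragment shader: scale a base color by a light level."""
    r, g, b = fragment["color"]
    light = max(0.0, min(1.0, fragment["light"]))  # clamp to [0, 1]
    return (int(r * light), int(g * light), int(b * light))


fragments = [
    {"color": (255, 0, 0), "light": 1.0},
    {"color": (0, 255, 0), "light": 0.5},
    {"color": (0, 0, 255), "light": 0.0},
]

# Run sequentially here, but no call reads another fragment's data, so
# processing them in reverse order gives exactly the same image.
in_order = [shade_fragment(f) for f in fragments]
reverse_order = [shade_fragment(f) for f in reversed(fragments)][::-1]
print(in_order == reverse_order)
```

A GPU exploits the same property in hardware by handing different records to different cores,
without the shader code itself describing any parallelism.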
The GPU's ability to process so many shader functions simultaneously lies in its design,
which places an emphasis on processing cores [5]. A single-core processor can process one
stream of instructions, known as an execution context, at a time. A multicore processor
can run multiple execution contexts in parallel, and even greater processing power is achieved by
cores that contain multiple ALUs. Through a method known as single instruction, multiple data
(SIMD) processing, the ALUs within a core perform identical operations on different pieces
of data, improving the rate of data execution and the power and space efficiency of the core.
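As a rough sketch of the SIMD idea, the function below decodes one instruction and then applies
it to every lane of data. It is a conceptual model only: Python runs the lanes one after another,
whereas real SIMD hardware applies the operation to all lanes in the same cycle. The name
simd_execute and the three-operation instruction set are assumptions made for this example.

```python
import operator


def simd_execute(instruction, lanes_a, lanes_b):
    """Apply one instruction to every lane of data.

    Models a core whose ALUs share a single instruction stream: the
    operation is looked up once, then each lane applies it to its own data.
    """
    ops = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
    op = ops[instruction]  # decoded once for all lanes
    if len(lanes_a) != len(lanes_b):
        raise ValueError("each lane needs one operand from each input")
    return [op(a, b) for a, b in zip(lanes_a, lanes_b)]


# Four lanes (ALUs), one instruction, four different pieces of data.
print(simd_execute("mul", [1, 2, 3, 4], [10, 20, 30, 40]))
```

Because the instruction is fetched and decoded once for all lanes, the control logic is shared,
which is where the power and space savings described above come from.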
However, an instruction stream will stall if its next instruction depends on information that is
not yet available, and the stream stays on standby until that information arrives. To avoid
wasting cycles, GPUs accept more execution contexts than they can simultaneously process, and
the ALUs process instructions from other runnable streams while the stalled one waits. This
technique is known as hardware multithreading, and it masks the latency caused by
memory access and stalling. CPUs can also employ SIMD and multithreading, but not to the
extent of GPUs given their lower core count.
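The latency-hiding effect of hardware multithreading can be shown with a small cycle-counting
simulation. This is a sketch under simplifying assumptions, not a description of how any
particular GPU schedules work: one core issues at most one instruction per cycle, a "load" makes
its context wait a fixed number of cycles, and the core always picks the first context that is
able to run. The function name simulate and the instruction names are hypothetical.

```python
def simulate(contexts, memory_latency=4):
    """Run execution contexts on one core; return (total_cycles, idle_cycles).

    Assumed model: every instruction issues in one cycle, and after a
    "load" the context cannot issue again until memory_latency further
    cycles have passed.
    """
    pcs = [0] * len(contexts)       # next instruction index per context
    ready_at = [0] * len(contexts)  # first cycle each context may issue
    cycle = idle = 0
    while any(pc < len(ctx) for pc, ctx in zip(pcs, contexts)):
        for i, ctx in enumerate(contexts):
            if pcs[i] < len(ctx) and ready_at[i] <= cycle:
                if ctx[pcs[i]] == "load":
                    ready_at[i] = cycle + 1 + memory_latency
                pcs[i] += 1
                break  # one instruction issued this cycle
        else:
            idle += 1  # every unfinished context is stalled
        cycle += 1
    return cycle, idle


stream = ["load", "alu", "load", "alu"]
print(simulate([stream]))      # a single context: the core waits on each load
print(simulate([stream] * 4))  # four contexts: other work fills the waits
```

With a single context the core sits idle for most of the run, while with four contexts it
completes four times the work in less than twice the cycles, because instructions from the
other contexts fill most of the waiting time.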
Nvidia GPU Design

Nvidia is a designer and manufacturer of GPUs, and is currently making inroads into the
self-driving car market. Its latest GPU architecture is known as Pascal, the 2016
successor to the Maxwell architecture [8]. Pascal utilizes 150 billion 16-nanometer FinFET
transistors, making it the largest FinFET chip in the world. It is the first architecture to support
the NVLink high-speed bidirectional interconnect, which provides five times the interconnect
bandwidth between multiple GPUs compared with older solutions. Other advancements enable Pascal
to improve the learning speed of artificial intelligence systems, and it boasts 3 times the memory
bandwidth of the Maxwell architecture. The details of how Pascal's architecture
accomplishes these improvements are restricted to individual disclosure, and will likely not be
available for public access until it is succeeded by other GPU architectures, as happened with
Kepler (the predecessor of Maxwell and Pascal), whose whitepaper has since been released.
Nvidia's next generation of GPU architecture has been named Volta, and it is set to release in
2017 [9]. Volta is speculated to release with a redesigned streaming multiprocessor, improving
the computational engines of all Volta GPUs. Information on Volta is even scarcer than that
on Pascal, although more may be released during the GPU Technology
Conference held by Nvidia in May. What is known is that the Volta GPU architecture will be
implemented in Summit and Sierra, two U.S. Department of Energy supercomputers. Serving as
coprocessors alongside Power9 CPUs, Volta GPUs are expected to provide around 90% of the
supercomputers' floating-point performance.
Conclusion
The primary method of increasing the computational power of CPUs and GPUs appears to
depend on the continuous effort to reduce the size of transistors and pack more within the
same area, as well as on improvements in the flow of information through multi-core
processors, SIMD, and hardware multithreading. The parts that I upgraded my computer with
were similar in size to its older components, but they contained more transistors and cores within
that same size. Unfortunately, transistors may soon hit a size limit beyond which they cannot be
shrunk any further, and if computational power cannot be improved through other methods, the
development of better hardware may flatline. I am interested to see how hardware manufacturers
like Intel and Nvidia deal with the size issue, and it is my hope that fields such as biology and
nanotechnology benefit and learn from the nanometer-scale work that the computer hardware
industry is currently invested in.
References
[1] Basic CPU Tutorial, accessed March 10, 2017,
https://embeddedmicro.com/tutorials/lucid/basic-cpu.
[2] Chapter 4: Processors, accessed March 10, 2017,
http://www.aries.net/demos/Server/chapter04/chapter04_1.html.
[3] Inside the CPU, accessed March 10, 2017,
http://www.belpercomputing.com/year-10/gcse-edexcel-computer-science/block-4-how-computers-work/inside-the-cpu/.
[4] CPU Buses, accessed March 13, 2017,
https://www.doc.ic.ac.uk/~eedwards/compsys/.
[5] Fatahalian, Kayvon and Houston, Mike. A Closer Look at GPUs. In ACM Queue, vol.
51, pp. 50-57. ACM Queue, 2008.
[6] How Intel Makes a Chip, accessed March 11, 2017,
https://www.bloomberg.com/news/articles/2016-06-09/how-intel-makes-a-chip.
[7] Luebke, David and Humphreys, Greg. How GPUs Work. In IEEE Computer, vol. 40,
pp. 96-100. IEEE, 2007.
[8] Pascal GPU Architecture, accessed March 12, 2017,
http://www.nvidia.com/object/gpu-architecture.html.
[9] Speculation of NVIDIA Volta GPU Ramps Up in Anticipation of 2017 Debut,
accessed March 12, 2017,
https://www.top500.org/news/speculation-of-nvidia-volta-gpu-ramps-up-in-anticipation-of-2017-debut/.
[10] Moore's Law, accessed March 14, 2017,
http://pointsandfigures.com/2015/04/18/moores-law/.
