
Computer Science Honours: COMP712

(2009)
COMPUTER GRAPHICS

(Satya Baboolal, School of Computer Science, University of KwaZulu-Natal)

Although this course largely treats the theoretical aspects of Computer Graphics, students are
expected to implement important algorithms in C/C++ utilising the OpenGL API.

Some texts on the subject:


1. D. Hearn and M.P. Baker – Computer Graphics: C Version (2nd ed., Prentice-Hall, 1997).
2. D. Hearn and M.P. Baker – Computer Graphics with OpenGL (3rd ed., International
Edition, Pearson/Prentice-Hall, 2004).
3. D. Foley, A. van Dam, S.K. Feiner, J.F. Hughes – Computer Graphics (2nd ed., Addison-
Wesley, 1990).
4. L. Ammeraal – Computer Graphics for Java Programmers (John Wiley & Sons, 1998).
5. G.W. Rowe – Computer Graphics with Java (Palgrave Publishers, 2001).

Ref. 3 above is a classical and extensive text on the subject, but these notes have been
developed mainly from an earlier Pascal-based version of ref.1 with updates taken from ref. 2.
Page references here point to ref. 2.

Course web site


http://www.cs.ukzn.ac.za/~satya/COMP712/
(Consult this for supporting C++ programs and other material)

Other useful sites


http://www.cs.ukzn.ac.za/~satya/COMP312 (for DevCpp)
http://www.foosyerdoos.fsnet.co.uk/ (MS Windows programming)
http://nehe.gamedev.net/ (OpenGL and gaming)
http://www.opengl.org/ (Main OpenGL site)
http://www.cs.brown.edu/courses/cs123/links.htm (Courses, tutorials, games..)

Course outline
• General survey and overview of graphics systems (hardware and software)
• Fundamental algorithms for output primitives: lines, circles, ellipses, curves, area
filling, clipping
• Algorithms for controlling attributes of primitives: shading, pattern filling, colouring, anti-
aliasing
• 2D and 3D geometric transformations: translations, scalings, rotations, reflections,
shearing, inter-coordinate system transformations
• 2D and 3D viewing: World coordinate models to viewing/device coordinates
• 3D object representation/modelling: quadrics, splines and surfaces, octrees, fractal
geometry methods
• Visible surface detection, Lighting and surface rendering, Animation

Note: This document is best viewed and/or printed in colour, but students will get a b/w printed version + a pdf file
of it.

CHAPTER ONE: SURVEY OF COMPUTER GRAPHICS

1.1 Introduction
Prior to the 1980s, Computer Graphics (CG) was a small and specialised field. With the
advent of PCs and better graphics hardware, CG now pervades virtually all modern application
software and is commonly deployed in interactive windowed applications.

The basic task of CG is to form (synthesize) an image (on an output device e.g. CRT screen)
from a model of a real object.

The basic task of the related field of image processing is somewhat the reverse: analyse or
reconstruct a 2D or 3D object (i.e. an enhanced image or model) from a primitive image.

1.2 Applications of CG
i) Software design of objects (models of cars, planes etc) in science and engineering
= “computer aided design” (CAD) – usually combined with virtual reality or
animations.
ii) Presentation graphics – reports for science and business, 3D graphs
iii) Computer art – commercial advertising, “paintbrush” drawings
iv) Entertainment – movies and sound
v) Education and training
vi) Visualizations and simulations of complex processes e.g. computational fluid
dynamics (CFD) and scientific or high performance computing (HPC)
vii) Image processing – to enhance and modify existing images
viii) Graphical user interface (GUI) design – used in windows applications

1.3 Graphics standards and software


Several standards have been proposed and used in the implementation of software (and
hardware low-level functions) for computer graphics, including:-
• GKS (graphics kernel system) = a set of functions (defined independently of a
programming language) - accepted by ISO and ANSI.
• PHIGS, PHIGS+ (Programmer's Hierarchical Interactive Graphics System) – more
comprehensive than GKS – accepted by ISO and ANSI
• GL and OpenGL – Silicon Graphics Inc graphics library including lower (machine)
level routines. Available for various platforms/languages e.g. Windows/Linux
Fortran/C/Java
• DirectX – Microsoft library – consists of low level APIs for graphics and multimedia.
• Pixar RenderMan – a software interface for generating lighted scenes
• Postscript – interpreter for page descriptions containing images
• JAVA 3D – Sun Java’s answer to above (low level binding to either OpenGL or
DirectX available)
• JOGL - joint Sun and Silicon Graphics Java native binding to OpenGL allowing
seamless OpenGL function calls from Java.

Many languages provide libraries (“bindings”) implementing one or more of the above, e.g. the
early Turbo C++ for DOS uses the Borland Graphics Interface (BGI) implementing
GKS/PHIGS. In addition, standards or protocols for CPU-to-graphics-hardware communications
have been proposed and implemented, e.g. the Computer Graphics Interface (CGI).

So too have evolved various standards for storing graphical images as binary data on
peripheral devices, such as the Computer Graphics Metafile (CGM), Windows Bitmap (BMP),
Joint Photographic Experts Group File (JPEG), Tagged Image File Format (TIFF), Graphics
Interchange Format (GIF) and many others.

Our approach here will be to study methods for generating the fundamental building blocks on
which such libraries are constructed, namely, the “primitives” from first principles. In addition we
shall study some basic graphics algorithms employing primitives for such tasks as surface
generation, shading, geometric modelling, viewing and animation. First we review the currently
available hardware systems and software for rendering graphics.

CHAPTER TWO: OVERVIEW OF GRAPHICS HARDWARE AND SOFTWARE

2.1 Video display devices


Currently the most common display device is the video monitor based on the cathode ray tube
(CRT), although this is fast being overtaken by flat panel devices (LCD screens and plasma
screens).

A basic (mono-colour) CRT:

[Figure: cross-section of a basic CRT. A heated cathode (-) with its heating element emits
electrons; a control grid sets the beam intensity; a negatively biased focusing anode (or
magnetic coil) and an accelerating anode (+) (or magnetic coil) form and accelerate the
electron beam, which strikes the phosphor coating on the screen's inner surface
(persistence ~10-60 µs, hence requiring refresh).]

Electrons from a heated cathode are accelerated and focussed to a point beam, either with
electric fields due to +ve biased plates or with magnetic coils. When the beam hits the
phosphor coating on the screen’s inner surface, it excites its atoms to higher quantum levels,
emitting light when they fall to a lower level. Since this light generally lasts about 10-60 µs, the
screen has to be “refreshed” to maintain the displayed image. Typical refresh rates are about
60 frames per sec on standard monitors, independent of the picture complexity.

At any instant the beam hits a particular spot or point on the screen. The maximum number of
points that can be displayed without overlapping is called the resolution, i.e. the number of
points per cm that can be plotted horizontally or vertically (in one direction). The light spot
intensity generally has a Gaussian distribution:

[Figure: Gaussian intensity profile of a light spot; ∆ is the spot width measured at the 60% intensity level.]

Thus two spots will appear separate if their separation distance d > ∆, the width at the 60% level.
Typical resolutions for high quality systems are ~ 1280 x 1024. This is larger in “high definition”
systems.
The aspect ratio is the ratio of the number of vertical points to horizontal points required to
produce equal-length lines in the two directions. For example, an aspect ratio of 3/4 means
that a vertical line plotted with 3 points has the same length as a horizontal line plotted with
4 points.

Raster-scan displays (most commonly used type)


These are based on TV technology: the electron beam is swept across the screen from left to right
and from top to bottom, one row at a time. As the beam traverses each row, its intensity
is switched on or off over each spot/dot to create a pattern of illuminated spots.

[Figure: raster scan pattern - the beam traces the 1st scan line, then the 2nd, and so on down
to the last scan line, followed by a retrace back to the beginning.]
Each screen point is called a pixel or pel (“picture element”). The set of all the intensity values
for the entire set of pixels, called the picture definition is stored in a section of memory called
a refresh or frame buffer. The stored values are retrieved from memory and the screen is
painted according to the intensities, one row at a time.

Painting each frame takes about 1/60 - 1/80 sec; the corresponding 60-80 frames per second is
the "refresh rate" (or "vertical trace rate"). A variation involves interlacing, where every other
scan line is painted in one pass, followed by a vertical retrace and a pass over the remaining
lines, and so on. This effectively covers the screen in half the frame time, at the expense of
some picture definition, and is used with slower-refresh devices to reduce the effect of flicker.

Frame buffer sizes


For black/white systems, each point is either on or off => 1 bit per pixel required.
More bits/pixel are required for intensity variations and for colour. On high resolution colour
systems, typically 24 bits/pixel are used. For example,
a 1024 × 1024 system requires 1024 × 1024 × 24/8 bytes = 3 MB of frame buffer.
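This kind of sizing is easily mechanised; a minimal C++ sketch (our own illustration, not from the text):

//Frame buffer size for a given resolution and colour depth
#include <iostream>
int main()
{
    const long width = 1024, height = 1024;     //pixels
    const long bitsPerPixel = 24;               //full-colour system
    long bytes = width * height * bitsPerPixel / 8;
    std::cout << bytes << " bytes = " << bytes / (1024.0 * 1024.0) << " MB\n";
    return 0;
}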

On a b/w system with 1 bit/pixel, the frame buffer is called a bitmap. For systems with > 1
bit/pixel the FB is called a pixmap.

Random scan displays


In random scan monitors, the CRT’s electron beam is directed only to parts of the screen
where the picture is drawn. This is done by positioning the beam at a starting point and moving
it to an ending point, one line at a time, rather than scanning the entire screen. Such a display
is called a vector, stroke-writing or calligraphic display. The action is similar to that employed in
pen plotters and the picture (or lines composing it) can be drawn in any order required, with
refreshing done only on those lines requiring it.

The picture definition is stored as a set of line drawing commands in a section of memory
called the refresh display file or display program or display list. The system cycles through
the set of commands in the display file, drawing each line, not necessarily horizontally, one
after another. When all lines are drawn, then it cycles back to the first line command in the
buffer. All component lines of the picture are redrawn about 30-60 times per second. These
displays are not suited to fine shading or filled areas, but for line drawings they produce much
smoother pictures than raster-scan displays.

Colour CRT monitors


Colour is produced by using phosphor coatings with different emission colours on monitor
surfaces. Two basic colour generation schemes used are:-
i) beam penetration (with random scan monitors)

[Figure: beam-penetration tube - the screen glass carries two phosphor layers, a red layer struck
first by the electron beam and a green layer beneath it.]

Here the colour emitted depends on the depth penetrated by the beam: a slow beam
excites red only, whilst a fast beam reaches into and excites green. Intermediate
penetration gives orange and yellow – essentially 4 colours.

ii) shadow mask method (with raster scan monitors)


Here the inside surface of the screen is coated with a large collection of triple dots, with
each triple defining a pixel. Components of each triple are separate phosphor spots of
red (R), green (G) and blue (B):
[Figure: the phosphor pattern on the screen - repeating triads of red (R), green (G) and blue (B)
dots, each RGB triad defining one pixel.]

To achieve colour, three electron guns provide three separate beams, each exciting one
component colour of each pixel. A particular pixel colour is then made up of a combination of
the three primary RGB colours, whose individual component intensities can be controlled,
giving millions of possible colours.

However, aligning the composite beam so that the R-beam hits only the R-spot, the G-beam
only the G-spot, and so on, is difficult. To achieve this, a metal screen containing thousands of
tiny holes is placed just before the phosphor-coated surface; the holes allow the three beams
to pass through and strike only the correct component colour spots. This metal screen is called
a shadow mask. Because it sits close to the screen, it can be used to align the three electron
beams correctly, as shown in the figure below.

Shadow mask and RGB beam alignment


[Figure: a triple electron gun fires B, G and R beams that converge through a hole in the
shadow mask and strike the corresponding dots of one RGB triad on the screen.]

Better alignment is obtained when the pixel RGB dots are arranged in a line ("precision in-line"
CRT) rather than in the triangular triad format ("delta-delta" CRT) above.
For RGB systems the colour range is huge:-
Example:

If 24 bits/pixel are available then 8 bits can be used to store the intensity level of each colour
=> 2^8 = 256 intensity (voltage) levels per colour component
=> 256^3 = 2^24 ≈ 16.8 × 10^6 colours possible - called a "full-colour" or "true-colour" system.

Apart from the alignment problem mentioned above, the focusing of the beam by the
magnetic focusing lens (coils) presents a problem in that not all points on the screen
are in sharp focus:

[Figure: with a fixed focal length L, the beam is in focus at the screen centre but out of focus
towards the screen edge, which lies at a greater distance from the focusing lens.]

Making the screen very rounded to compensate gives poor images. Better CRTs therefore use
dynamic focusing, where the focal length is adjusted according to the screen position currently
being struck - the technology used in "flat-screen" CRTs.

Liquid crystal displays (LCDs) (a non-emissive flat panel display)


An LCD display device consists of a composite panel of 6 layers:

[Figure: the six LCD layers, numbered 1 (back, reflector) to 6 (front, nearest the observer), viewed edge-on.]

Here, ambient or internal background light is reflected from a reflecting panel (1) and reaches
the observer by passing through a horizontal polarizing panel (2), a horizontal wire-grid panel (3),
the liquid-crystal panel (4), a vertical wire-grid panel (5) and a vertical polarizing panel (6).
The wire grids are used to align the molecules in the liquid-crystal layer: by switching the
current on or off on the vertical and horizontal grids, a particular 'pixel' position in panel 4 can
be made to pass or block the light. Colour may be achieved by using special liquid-crystal
material for panel 4. The picture definition is usually held in a frame buffer in order to refresh the display,
since random fluctuations in the alignment of the liquid crystal molecules in panel 4 degrade
the images. For more information see the Wikipedia article:
http://en.wikipedia.org/wiki/Liquid_crystal_display

Plasma panel (an emissive flat panel display)



A plasma panel consists of a glass panel which encloses a gas that can be ionized, such as
neon or a combination of neon and others (for colour). The ionization is achieved by applying
firing voltages in the X- and Y-directions by conducting grids which are at right angles to each
other, thus ionizing a particular gas cell (i.e. a pixel), which then emits light. The picture
definition is held in a refresh buffer, and the refresh rate is typically 60 frames per sec.

[Figure: plasma panel construction - gas cells sandwiched between a vertical conducting grid
(x addresses) and a horizontal conducting grid (y addresses), with the observer viewing from the front.]

Again for additional information see:


http://en.wikipedia.org/wiki/Plasma_display

Other devices
Include thin-film electroluminescent displays, light-emitting diode (LED) panels.

3-D Viewing systems


Combinations of a CRT monitor together with a vibrating focusing mirror are used to simulate
3-D views; so are stereoscopic devices to simulate depth in scenes. Typically used in virtual
reality (VR) applications – see a text.

2.2 Raster scan system organisation


The simplest organisation involves only one processor (CPU) to execute the algorithm for your
program as well as to scan convert the graphics output primitives (e.g. line drawing) and put
the result into the frame buffer. The output device or monitor is connected to the system bus by
means of a video controller (VC), whose function is to access the video memory (frame buffer)
and then translate the pixel data into control signals for the monitor. Other systems use in
addition, a graphics display processor (graphics controller or display co-processor) specifically
for the scan conversions. In addition, whether or not the frame buffer is separate or simply a
section of main memory is also a point of differentiation.

2.2.1 Simple raster display system


A common example of a simple system with a video controller (VC) having direct access to a
dedicated section of memory for the frame buffer (FB) is shown below. Note that the frame
buffer is dual ported and lies within the address space of the CPU, so that a call to a graphics
routine (to scan convert) will place pixel values in the FB, whilst the VC can retrieve these
easily by direct access.

[Figure: simple raster system - the CPU and peripheral devices sit on the system bus together
with system memory, the frame buffer and the video controller; the video controller has direct
access to the frame buffer and drives the monitor.]

The video controller cycles through the frame buffer (at ± 60 frames per sec), extracting pixel
data one scan line at a time, generating memory addresses in synchrony with the scan.
Then the correct pixel intensity values are extracted via the FB port and fed via the VC to the
monitor as intensities and deflection voltages. Where there is no separate FB port for the VC,
the pixel data values are accessed via the system bus.

A typical VC organisation is given below:

[Figure: video controller organisation - a raster-scan generator produces the horizontal and
vertical deflection voltages and drives the X and Y registers; these form the memory address
into the frame buffer, whose contents pass through the pixel register to become colour/intensity values.]
Operation:
Assume FB addresses range in X: 0...xmax and in Y: 0...ymax and that the scan coordinate
system is as follows:

[Figure: screen coordinates with origin (0,0) at the bottom left and extending to (xmax, ymax);
scanning starts at the top left, i.e. at (0, ymax).]

Two registers in the VC (X reg, Y reg) holding screen coordinates are initially set to
X = 0, Y= ymax

Then for the pixel (x,y) the memory address for the corresponding data in the FB is formed and
the data values are extracted. These are then filled into the pixel register as intensity/colour
values which are fed to CRT as control voltages for the RGB guns, thus lighting the (x,y)
position on the screen. Then X is incremented by 1 and the FB accessed again etc until
X=xmax is done, after which,
X set to 0, Y decremented by 1,

and the FB is accessed for the next scan line. When done for the scan line Y=0, the first cycle
is complete and Y is reset to ymax, and the process repeats as the next cycle.

The above process is simplistic and generally inefficient, since it assumes that memory can be
accessed for each pixel without delay. With the screen refreshed about 60 times per second, a
640 × 480 display with 1-bit pixels requires, for example, an access time of 1/(640 × 480 × 60) s
≈ 54 ns. But typical RAM cycle times are much greater (~200 ns), so we cannot access memory
every 54 ns. Thus we require a VC-FB architecture that allows access to multiple pixels' data at
once - perhaps stored in a separate register that is shifted out to the CRT periodically. However,
tests show that sharing memory cycles between the VC and CPU slows down the CPU.
Hence, various other design solutions have been proposed and implemented:
i) double buffering - two FBs are used: the 1st fills while the 2nd is emptied by the VC, then
the 2nd fills while the 1st is emptied, and so on
ii) use of a video look-up (LU) table in the VC
iii) a separate graphics processor with its own FB
iv) an advanced integrated CPU + graphics co-processor
For i) see later, but let’s consider ii) and iii) below.

A look-up table has as many entries as there are pixel values available. For example, for a
3-colour (RGB) system with 8 bits per pixel of data there would be 2^8 = 256 entries. The FB at
any (x,y) address (corresponding to a coordinate position on the screen) holds just an index into
the table (index range 0...255), i.e. the FB content is not used to control the intensity/colour directly.

[Figure: look-up table operation - the FB memory cell at (x,y) holds the 8-bit value 01000011
(= 67), which selects entry 67 of the 256-entry LU table; that entry's contents (here
1001 1010 0001) supply the R, G and B intensities for the pixel.]

This mechanism is convenient and fast when a small colour range is required, since only the
colour index need be stored in the FB for each screen coordinate.
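A sketch of the addressing involved (our own illustration; the array names and sizes are assumptions) is:

//Video look-up table addressing: the FB stores an 8-bit index per pixel;
//the LU table maps each index to a packed RGB intensity value.
unsigned short lookUpTable[256];             //256 entries for 8 bits/pixel
unsigned char  frameBuffer[480][640];        //one index stored per (x,y)

unsigned short pixelIntensity(int x, int y)
{
    unsigned char index = frameBuffer[y][x]; //FB holds only the table index
    return lookUpTable[index];               //the table supplies the RGB intensities
}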

2.2.2 Raster system with a separate display processor


Here the secondary processor (DPU) is used to perform raster operations, namely scan
conversions. It has its own memory for raster functions/programs and the pixel data, i.e. has a
frame buffer. A typical organisation is:

[Figure: raster system with a separate display processor - the CPU, system memory and
peripheral devices sit on the system bus; the display processor, with its own display-processor
memory, frame buffer and video controller, performs the scan conversions and drives the monitor.]

Apart from doing simple raster operations (setting pixel positions, scan converting lines, filling
etc) a DPU can be designed to do complicated operations like windowing, coordinate
transformations, 3-D imagery etc. Additional functionality such as editing FB contents before
displaying, mixing video images with FB contents etc. may also be available. The trade-off is
the expense involved as opposed to the slower but inexpensive use of the CPU for some of
these operations.

2.3 Graphics software


There are broadly two classes of software:-
• GPPP – general purpose programming packages – usually comprise a set/library of
functions that are part of a high-level language or callable from one. Examples are
Java2D, Java3D (part of language), GL, OpenGL, VRML (latter being a computer-graphics
application programming interface or CG API).
• APP – stand alone application packages – designed for non-programmers such as Harvard
Graphics, Origin Plot, CAD packages.

2.4 Coordinate representations in graphics


In general, generating a picture on a screen or other output device requires a definite
sequence of steps.

• Construct the constituents in modelling or local coordinates.


• Place the components together to form a scene in world coordinates.
• Select a coordinate frame (viewing coordinates) to view the scene, defining how it is
to be projected or from which orientation/direction it is to be viewed.
• Perform a transformation of the selected view to a device independent normalized
coordinates (range 0...1 or -1 ...+1 in every direction).
• Finally map the result to actual device coordinates (VDU screen, plotter etc).

Thus the transformation sequence is


(xmc,ymc,zmc) → (xwc,ywc,zwc) → (xvc,yvc,zvc) → (xnc,ync,znc) → (xdc,ydc).

The sequence (xwc,ywc,zwc) → (xvc,yvc,zvc) → (xnc,ync,znc) → (xdc,ydc) is called the viewing pipeline.

[Figure: the viewing pipeline - each item is modelled in modelling coordinates (xmc,ymc,zmc);
the items are put together in world coordinates (xwc,ywc,zwc); viewing and projection
coordinates (xvc,yvc,zvc) are selected; these are mapped to normalized coordinates
(xnc,ync,znc) in the range 0...1; finally they are mapped to device coordinates (xdc,ydc) on the monitor.]

2.5 Graphics functions


In GPPP the basic building blocks for creating, manipulating and modifying geometric objects
are graphics functions. The functions used to create the basic objects are called graphics
primitives, typically used for drawing lines, circles, arcs, etc. Properties of the output primitives
such as colours, styles etc are their attributes which are set by separate function calls or by
setting parameters in the primitives, depending on the API used. In addition functions are given
for geometric transformations and/or modelling as well as viewing transformations. Additionally
one would find functions for dealing with interactive input devices such as a mouse, touch
screen etc.

2.6 Overview of OpenGL


This is a basic library of functions for generating primitives, setting attributes, geometric and
viewing transformations and so on. Moreover, it has been designed to be hardware
independent. Auxiliary libraries have been developed to handle I/O issues which are hardware
dependent.

2.6.1 Basic syntax


• Function names – prefixed by gl and each component word begins with a capital letter.
Examples are: glBegin, glEnd, glClear, glCopyPixels, glPolygonMode

• Some functions take symbolic constants as parameters – prefixed by GL followed by _ and
capital letters. Examples are: GL_2D, GL_RGB, GL_POLYGON, GL_AMBIENT_AND_DIFFUSE
• Special data types (for machine independence) required by some functions – Examples
are: GLbyte, GLshort, GLint, GLfloat, GLdouble, GLboolean
• Pointers to arrays may be passed as some function arguments
2.6.2 Auxiliary libraries
In addition to the core OpenGL library various auxiliary libraries exist for additional functionality:
• The OpenGL Utility (GLU) – for viewing and projection matrices, complex objects,
quadrics, splines, surface rendering etc. Function names are prefixed by glu.
• Open Inventor – OOP toolkit for interactive 3D applications
• Windowing functions – required to set up a display window and to interact with it. Can be
done with the MS WINAPI, GLX (X-windows extension), WGL (for MS windows-to-
OpenGL), GLUT (OpenGL Utility Toolkit). The GLUT functions are prefixed by glut and can
be used to set up a display window as well as to render curves and surfaces.

Typically, when using any of the above one would need to install them (if not already packaged with
the OpenGL suite) and then include the respective header files like:

#include <windows.h>
#include <GL/gl.h>
#include <GL/glu.h>
#include <GL/glut.h> //usually omit above two if this is used
#include <cmath> // or old <math.h> etc for usual C/C++ libs
....... C/C++/OGL code
.................................
For a guide on installing and running the DevC++ compiler suite with OpenGL see the course
website.
2.6.3 GLUT display window
Setting up a somewhat minimal display window using GLUT requires:
• call to initialization routine: glutInit(&argc, argv); //args not always used
• call to create window: glutCreateWindow(“ window title here”);
• call to set up its display content, which can be specified by a graphFunc which defines the
graphics to be created: glutDisplayFunc(graphFunc); //graphFunc is yours
• call to start process, putting display contents in window, puts window program into infinite
loop, looking to process window “events”: glutMainLoop();
This should be the last function to be called.
• The window will be located at some default position and size. To use your own, call
glutInitWindowPosition(50,100); //top left corner is at (50,100)
glutInitWindowSize(400,300); //400 pixels wide, 300 high
• Other attributes may be set by calling e.g.
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB);
This selects a single refresh buffer and RGB colour mode. Mode values are
combined with the bitwise or operator (|).

A typical GLUT window looks like:


[Figure: the display window, titled "window title here", is 400 pixels wide and 300 pixels high,
with its top-left corner at screen position (50, 100).]

2.6.4 Example of a C++/OpenGL/GLUT program


A complete working example which compiles with DevC++ under Windows XP is:

//lines.cpp
//---------
//Draw lines with OpenGL + GLUT
//Compiler DevC++ 4.9.9.2 + GLUT3.7 (Win32)
#include <windows.h>
#include <GL/glut.h>

void init(void)
{
glClearColor(1.0,1.0,1.0,0.0); //display window set to white
glMatrixMode(GL_PROJECTION); //projection parameters
gluOrtho2D(0.0,200.0,0.0,150.0); //sets up WC extent
}

void lineSegments(void)
{
glClear(GL_COLOR_BUFFER_BIT); //clears display window
glColor3f(0.0, 0.0, 1.0); //line colour set to blue
glBegin(GL_LINES);
glVertex2i(180,15); glVertex2i(10,145); //line seg1
glVertex2i(145,10); glVertex2i(15,180); //line seg2
glEnd();
glFlush(); //process all OGL functions immediately
}
int main(int argc, char** argv)
{
glutInit(&argc, argv); //initialise GLUT
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB); //sets display mode
glutInitWindowPosition(50, 100); //top left display window pos
glutInitWindowSize(400, 300); //display win width and height in pixel coords
glutCreateWindow("Example01 - lines"); //now create display win
init(); //run initialization procs
glutDisplayFunc(lineSegments); //call drawing func
glutMainLoop(); //display all and wait
return 0;
}
The output window produced by this application:

[Figure: screenshot of the resulting display window showing the two blue line segments.]

CHAPTER THREE: OUTPUT PRIMITIVES

Pictures in CG can be constructed in terms of basic geometric structures or "primitives". Each
such structure requires
i) input data (coordinates)
ii) attribute data (style, colour etc) for displaying the primitive
These are supplied as a set of (library) functions corresponding to the primitives (either as
composite functions for both above or with separate functions for the current attribute settings).
Before examining some basic primitives we consider the coordinate frames to be used and the
OpenGL implementations and conventions for them.

3.1 Coordinate frames


As mentioned before, objects in a scene are described by specifying their geometries as
positions in (Cartesian) world coordinates. These positions which constitute a scene
description, and additionally the objects’ colours and their min/max (x,y,z) ranges (called an
object’s extent or bounding box) are stored in memory (or on disk). The scene information is
then passed through the viewing pipeline to become scan-converted values at (x,y) positions in
screen (device) coordinates. These values are held in the frame buffer and used to refresh
the screen.

A video screen coordinate system is comprised of an integer array of pixel positions (x,y),
with typical pixels shown as centred dots, starting from (0,0) at the lower left to some (xmax,
ymax) at the top right where the y-coordinate is the scan line. In practical systems, scanning
starts from the top left taken as (0,0) down to the bottom right taken as (xmax, ymax), but software
function arguments can be set to convert to these trivially from the usual coordinate system
shown below:

[Figure: the screen coordinate grid - integer pixel positions marked as centred dots, with
x = 0, 1, 2, ... increasing to the right and y = 0, 1, 2, ... increasing upwards.]

Typically, frame buffer values can be set at the (x,y) coordinates by a call to a low-level function
like,
setPixel(x,y);
Conversely, frame buffer values for the pixel at (x,y) can be retrieved by a call to a function
such as,
getPixel(x,y,color);
where color is an integer (or other data type) corresponding to the RGB colour combination set
for the pixel (x,y).

For 3D scenes an additional third coordinate, a depth (or z) value relative to a viewing position,
may be specified.

We note that the coordinate values referred to above are called absolute coordinate values
(in the particular system used). Some packages employ relative coordinates, which are
offsets relative to a current position.

3.2 OpenGL world coordinates in 2D


In OGL a 2D Cartesian WC system is set up with a call to gluOrtho2D(xmin,xmax,ymin,ymax)
where the 4 arguments are the x-y range values of the coordinate frame (see example
lines.cpp). Any object parts defined outside this range will not be shown. These coordinates
are then “orthogonally projected” (corresponding to a “perpendicular onto” view) to view
coordinates and then onto screen coordinates. This sequence is done with the function calls:

glMatrixMode(GL_PROJECTION);      //select the projection matrix (one of several modes)
glLoadIdentity();                 //clears transformation matrix of old contents
gluOrtho2D(xmin,xmax,ymin,ymax);  //set up coord sys and transforms

The OGL WC coordinate system then looks like:

[Figure: within the general 2D space, the rectangle from (xmin, ymin) to (xmax, ymax) is the OGL WC extent/window.]

Note that when the picture constituents are defined, they must be given in absolute (world)
coordinates. Thereafter the OGL/GLUT display functions may be constructed and called. The
function call sequence above will map these to screen coordinates, as in the display-window
figure of section 2.6.3: thus xmin will be mapped to an integral screen (pixel) coordinate value, and so on.

3.3 Some OpenGL 2D modelling functions


3.3.1 Points
Defined by specifying a vertex in WCs as follows:
glBegin(GL_POINTS);
glVertex*(...);
glEnd();

Here glVertex*(..) can take many forms, such as:
• glVertex2i(..) for integer 2D coordinate arguments
• glVertex2s(..) for short integer 2D coordinate arguments
• glVertex2f(..) for float 2D coordinate arguments
• glVertex2d(..) for double 2D coordinate arguments
• glVertex2iv(..) for an integer 2D coordinate vector (array) argument, etc.

Examples:
(i) glBegin(GL_POINTS);
        glVertex2i(50,100);
        glVertex2i(75,150);
        glVertex2i(100,200);
    glEnd();

This constructs the following three points in WCs:

[Figure: the points (50,100), (75,150) and (100,200) plotted on WC axes running to about 350 in x and 300 in y.]

(ii) int point1[ ] = {50,100};
int point2[ ] = {75,150};
int point3[ ] = {100,200};
glBegin(GL_POINTS);
glVertex2iv(point1);
glVertex2iv(point2);
glVertex2iv(point3);
glEnd();

This is a vector form equivalent of (i).

(iii) glBegin(GL_POINTS);
glVertex3f(-51.25,101.56,10.34);
glVertex3f(234.61,-1213.75,170.25);
glEnd();

This will construct two float points in 3D WCs.

(iv) A C++ class or struct may also be used:


class wcPt2D
{
public:
GLfloat x,y;
};

wcPt2D pointPos;
pointPos.x = 65.75;
pointPos.y = 32.37;
glBegin(GL_POINTS);
glVertex2f(pointPos.x,pointPos.y);
glEnd();

(v) A C++ function can be constructed to implement a setPixel() routine using the
OGL point functions above; a minimal sketch follows and is reused in later algorithm sketches.
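For example (our own sketch, assuming an OpenGL context and WC setup as in lines.cpp):

void setPixel(GLint x, GLint y)
{
    glBegin(GL_POINTS);
        glVertex2i(x, y);   //light the pixel at integer position (x,y)
    glEnd();
}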

3.3.2 Lines
Many graphics packages provide a function for drawing a single straight line or many line
segments. In OGL, we use the constant GL_LINES in the glBegin() function segment
together with the vertex functions to define line end points. Line segments are then drawn by
taking successive pairs of vertices. Any vertex not paired with another is ignored. Other
primitive-constants are also available.

Examples (each gives a figure like the one described):

int p1[ ] = {-5,-100};
int p2[ ] = {75,50};
int p3[ ] = {10,100};
int p4[ ] = {55,-95};
int p5[ ] = {2,30};

//Line segments
glBegin(GL_LINES);
   glVertex2iv(p1);
   glVertex2iv(p2);
   glVertex2iv(p3);
   glVertex2iv(p4);
   glVertex2iv(p5);
glEnd();

[Figure: two separate segments p1-p2 and p3-p4; the unpaired vertex p5 is ignored.]

//A polyline
glBegin(GL_LINE_STRIP);
   glVertex2iv(p1);
   glVertex2iv(p2);
   glVertex2iv(p3);
   glVertex2iv(p4);
   glVertex2iv(p5);
glEnd();

[Figure: the connected polyline p1-p2-p3-p4-p5.]

//A closed polyline
glBegin(GL_LINE_LOOP);
   glVertex2iv(p1);
   glVertex2iv(p2);
   glVertex2iv(p3);
   glVertex2iv(p4);
   glVertex2iv(p5);
glEnd();

[Figure: the closed polyline p1-p2-p3-p4-p5-p1.]

3.4 Fundamental line drawing algorithms


A line drawing function typically takes the two endpoints as arguments and then calculates the
path between them. Since the final result is converted to integer-value pixel or device
coordinates, errors occur when rounding floating point numbers. For example a point (21.25,
67.81) may be rounded to (21,68) for display purposes. On raster systems the line then looks
jagged - the familiar "staircase" effect.

3.4.1 Line equations


Recall the slope-intercept equation of a straight line,
    y = m·x + b,   m = slope, b = y-intercept.                    (3.1)
For the endpoints (x0, y0) and (xend, yend) we have
    m = (yend − y0) / (xend − x0),                                (3.2)
    b = y0 − m·x0.                                                (3.3)

[Figure: the line segment from (x0, y0) to (xend, yend).]

The algorithms are then based on equations (3.1)-(3.3) by noting that:
for a given x-interval δx along OX, the corresponding interval along OY is
    δy = m·δx;                                                    (3.4)
similarly, corresponding to an interval δy along OY, the interval along OX is
    δx = (1/m)·δy.                                                (3.5)

In analogue systems, deflection voltages are applied corresponding to equations (3.4)-(3.5):
when |m| < 1, δx is set proportional to a small horizontal deflection voltage and δy is calculated from (3.4);
but when |m| ≥ 1, δy is set proportional to a small vertical deflection voltage and δx is calculated from (3.5).

In raster systems, lines are plotted with pixels by stepping in units of δx or δy, constrained by
the pixel (grain) separation. That is, we sample the line at discrete positions and find the
nearest pixels to these positions, as illustrated below:

[Figure: for |m| ≤ 1 we sample along OX and calculate the y-values of the nearest pixels;
for |m| > 1 we sample along OY and calculate the x-values.]

Quiz: Why should we not simply sample along just one coordinate (e.g. OX) and compute
along the other in both cases?

3.4.2 The digital differential analyzer (DDA) algorithm


This is a scan conversion algorithm based on calculating either δ x or δ y , from (3.4)/(3.5).
The line is sampled in unit (1-pixel at a time) (integer) positions and the other coordinate
positions are calculated by rounding to the nearest pixels.

Case A: Positive slope m > 0

(i) For m ≤ 1:
    Sample at unit x-intervals (δx = 1) and compute successive y-values from
        yk+1 = yk + m,                                            (3.6)
    where k = 0 (the 1st point), 1, 2, 3, ..., kmax (the last point). Then round the y-values to the nearest pixel.
(ii) For m > 1:
    Sample at unit y-intervals (δy = 1) and compute successive x-values from
        xk+1 = xk + 1/m,                                          (3.7)
    where k = 0, 1, 2, 3, ..., kmax. Then round the x-values to the nearest pixel.

Here (3.6)-(3.7) imply left-to-right processing. For right-to-left processing we use
    m ≤ 1:  yk+1 = yk − m,                                        (3.8)
    m > 1:  xk+1 = xk − 1/m.                                      (3.9)

Case B: Negative slope m < 0

(i) For |m| ≤ 1:
    Sample at unit x-intervals (δx = 1) and compute successive y-values from (3.6),
        yk+1 = yk + m,
    where k = 0, 1, 2, ..., kmax; then round the y-values to the nearest pixel (y now decreases as x increases).
(ii) For |m| > 1:
    Sample at unit y-intervals (δy = 1) and compute successive x-values from (3.7),
        xk+1 = xk + 1/m,
    where k = 0, 1, 2, ..., kmax; then round the x-values to the nearest pixel.

As before, these correspond to processing from one endpoint; to process from the other
endpoint the signs of the increments are reversed, as in (3.8)-(3.9):
    |m| ≤ 1:  yk+1 = yk − m,
    |m| > 1:  xk+1 = xk − 1/m.

A typical algorithm using these equations is given in Ref. 2 (p. 95); a sketch is given below.
This method is faster than using (3.1) directly, since it eliminates unnecessary multiplications,
but the repeated rounding lets round-off error accumulate, so the computed line can drift
noticeably from the true mathematical line.
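A possible rendering of the DDA method (our own sketch, using the setPixel() routine from section 3.3.1):

//DDA line: sample along the coordinate with the larger extent, per (3.6)-(3.9)
#include <cmath>
#include <cstdlib>

//round a float to the nearest integer pixel coordinate
inline int roundPix(float a) { return (int) std::floor(a + 0.5f); }

void lineDDA(int x0, int y0, int xEnd, int yEnd)
{
    int dx = xEnd - x0, dy = yEnd - y0;
    int steps = (std::abs(dx) >= std::abs(dy)) ? std::abs(dx) : std::abs(dy);
    float x = (float) x0, y = (float) y0;

    setPixel(roundPix(x), roundPix(y));
    if (steps == 0) return;                  //degenerate case: a single point
    float xInc = dx / (float) steps;         //step of at most one pixel in x
    float yInc = dy / (float) steps;         //step of at most one pixel in y
    for (int k = 0; k < steps; k++) {
        x += xInc;
        y += yInc;
        setPixel(roundPix(x), roundPix(y));  //nearest pixel to the sampled position
    }
}

A better (and more accurate) algorithm is the following: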

3.4.3 Bresenham’s line drawing algorithm


This is a scan conversion algorithm that is both more accurate and more efficient than the DDA
algorithm. It can also be adapted to handle other curves.
Consider the starting pixel coordinate positions ( ● = starting lit pixel) shown below:
[Figure: two pixel grids (x = 10...18, y = 10...19), each showing a line path starting from a lit
pixel (●); the next pixel to light must be chosen from the grid positions nearest the path.]

Case A: Positive slope 0 < m < 1


Taking the LH start point as ( x0 , y0 ) and moving in unit increments along the columns (x) we
will determine the corresponding pixels whose scan line y-values lie closest to the line path. For
example, suppose the k-th pixel is already done i.e. have found ( xk , yk ) and we next wish to
choose between ( xk +1 , yk ) and ( xk +1 , yk +1 ) :

[Figure: at column xk + 1 the mathematical line y = mx + b crosses between the candidate
pixels at yk and yk + 1; dL is the vertical distance from yk up to the line and dU the distance
from the line up to yk + 1.]

Now consider the y-coordinate value y at which the vertical line x = xk + 1 meets the
mathematical line, and let dU and dL (both > 0) be the vertical separations of yk+1 and yk from
this y. Then with xk+1 = xk + 1 we have
    y = m(xk + 1) + b,                                            (3.10)
so that
    dL = y − yk   = m(xk + 1) + b − yk,
    dU = yk+1 − y = yk + 1 − m(xk + 1) − b.
Hence the difference between the two separations is
    dL − dU = 2m(xk + 1) − 2yk + 2b − 1.                          (3.11)
To make the right choice conveniently, we introduce a decision parameter pk for the k-th step,
involving only integer calculations. Noting that
    m = ∆y/∆x = (vertical endpoint separation)/(horizontal endpoint separation) = integer/integer,
define pk = ∆x(dL − dU). Then
    pk = ∆x(dL − dU) = 2∆y·xk − 2∆x·yk + c,                       (3.12)
where c = 2∆y + ∆x(2b − 1) is an integer constant, and
    sign(pk) = sign(dL − dU),
since ∆x > 0 in this case.

Now the pixel at yk is closer to the line if dL < dU, i.e. sign(pk) < 0; otherwise the pixel at yk+1
is closer, i.e. when dU < dL, or sign(pk) > 0.

To advance the algorithm recursively we compute successive decision parameters as follows.
At step k + 1 we have
    pk+1 = 2∆y·xk+1 − 2∆x·yk+1 + c.
Subtracting (3.12),
    pk+1 − pk = 2∆y(xk+1 − xk) − 2∆x(yk+1 − yk),
i.e.
    pk+1 = pk + 2∆y − 2∆x(yk+1 − yk),                             (3.13)
in which
    yk+1 − yk = 0 when pk < 0 (the required point is (xk+1, yk)),
    yk+1 − yk = 1 when pk > 0 (the required point is (xk+1, yk+1)).
Thus we take
    pk+1 = pk + 2∆y − 2∆x·(0 if pk < 0; 1 if pk > 0).             (3.14)
Now if (x0, y0) is the starting point, then
    p0 = 2∆y·x0 − 2∆x·y0 + 2∆y + 2∆x·b − ∆x
       = 2∆y − ∆x + 2∆y·x0 − 2∆x(y0 − b)
       = 2∆y − ∆x,                                                (3.15)
since y0 − b = m·x0 = (∆y/∆x)·x0.

Thus a recursive algorithm based on the above is:


Bresenham's line algorithm for |m| < 1:

1. Input the line endpoints. Take the left-hand point as (x0, y0).
2. Load (x0, y0) into the FB, i.e. plot the first point.
3. Calculate the constants ∆x, ∆y, 2∆x, 2∆y, 2∆y − 2∆x and p0 = 2∆y − ∆x.
4. For k = 0, 1, 2, ..., i.e. repeat ∆x times:
   if pk < 0, take the next point as (xk+1, yk), i.e. yk+1 = yk,
       and set pk+1 = pk + 2∆y;
   else (pk ≥ 0) take the next point as (xk+1, yk+1), i.e. yk+1 = yk + 1,
       and set pk+1 = pk + 2∆y − 2∆x.

Example: Digitize the line from (20,10) to (30,18) in device coordinates.


Solution:
Here ∆x = 10, ∆y = 8, m = ∆y / ∆x = 0.8.
The initial decision parameter is p0 = 2∆y − ∆x = 6 > 0 and 2∆y = 16, 2∆y − 2∆x = −4.
Thus from the point (20,10) we tabulate the results:

  k    pk    (xk+1, yk+1)
  0     6     (21,11)
  1     2     (22,12)
  2    -2     (23,12)
  3    14     (24,13)
  4    10     (25,14)
  5     6     (26,15)
  6     2     (27,16)
  7    -2     (28,16)
  8    14     (29,17)
  9    10     (30,18)

[Figure: the selected pixels plotted on the grid x = 20...31, y = 10...18, with a red line
indicating the theoretical (mathematical) line.]

For an implementation of this algorithm for |m| < 1 see H&B (p. 98); a sketch is also given
after the notes below. To generalize it to lines of
arbitrary slope we note:
i) Use symmetry between octants and quadrants to economize on calculations
ii) For slopes > 1 (+ve) step along OY and calculate successive x-values
iii) To plot from R-to-L, both x and y decrease (i.e. steps ∆ = − ve ) when slope is +ve.
To get the same pixels when processing L-to-R and R-to-L choose, during the entire
process, one of the upper or lower of the two candidate pixels whenever d L = dU .
For slopes < 0 note that one coordinate decreases whilst the other increases.
iv) For the special cases ∆y = 0 (horizontal line) or ∆x = 0 (vertical) we can load the
FB without any processing.
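A sketch of the algorithm for positive slopes with |m| < 1 (our own rendering, using the setPixel() routine from section 3.3.1):

//Bresenham line for 0 < m < 1, following steps 1-4 of the algorithm above
#include <cstdlib>

void lineBres(int x0, int y0, int xEnd, int yEnd)
{
    int dx = std::abs(xEnd - x0), dy = std::abs(yEnd - y0);
    int p = 2 * dy - dx;                    //p0 = 2*dy - dx
    int twoDy = 2 * dy, twoDyMinusDx = 2 * (dy - dx);
    int x, y;

    if (x0 > xEnd) {                        //always process from left to right
        x = xEnd;  y = yEnd;  xEnd = x0;
    } else {
        x = x0;    y = y0;
    }
    setPixel(x, y);                         //plot the first point

    while (x < xEnd) {
        x++;
        if (p < 0)
            p += twoDy;                     //keep the same scan line y
        else {
            y++;                            //move up to the next scan line
            p += twoDyMinusDx;
        }
        setPixel(x, y);
    }
}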

3.5 Circle generation


3.5.1 Basic equations
Bresenham’s algorithm can be adapted to handle circles. First we review some basic
equations.

The equation of a circle with centre (xc, yc) and radius r is
    (x − xc)² + (y − yc)² = r².

[Figure: circle of radius r centred at (xc, yc), with a point (x, y) on its boundary.]

We could step along OX and calculate y-values from
    y = yc ± √(r² − (x − xc)²).
However, this is a poor method: it gives an uneven pixel distribution (dense where the circle is
nearly horizontal, sparse where it is nearly vertical) and involves too many computations per step.

[Figure: pixels plotted by stepping in x - dense near the top of the circle, sparse near the sides.]

The unevenness can be eliminated by using the parametric polar form
    x = xc + r·cos θ,  y = yc + r·sin θ,   0 ≤ θ ≤ 2π.            (3.16)

[Figure: the point (x, y) on the circle at angle θ from the horizontal through the centre (xc, yc).]

Then we step in θ and calculate boundary points as follows (a sketch of option (iii) follows the figure below):

i) use a fixed step ∆θ (size dependent on the device resolution) to give equally
spaced points along the circumference, or
ii) use a larger ∆θ and connect the points on the circumference by straight lines, or
iii) for better continuity, set the step to ∆θ = 1/r to give pixel positions ≈ 1 unit apart;
iv) calculate pixels for one octant only and apply symmetry to the other octants:
[Figure: pixels are calculated only for the octant from the top of the circle to the 45° line;
the other octants follow by symmetry.]

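A sketch of option (iii) (our own illustration, using the setPixel() routine from section 3.3.1):

//Parametric circle plot per (3.16) with dtheta = 1/r (about one pixel per step)
#include <cmath>

void circlePolar(int xc, int yc, double r)
{
    const double PI = 3.14159265358979;
    double dTheta = 1.0 / r;                       //step size in theta
    for (double theta = 0.0; theta < 2.0 * PI; theta += dTheta) {
        int x = xc + (int) std::floor(r * std::cos(theta) + 0.5);
        int y = yc + (int) std::floor(r * std::sin(theta) + 0.5);
        setPixel(x, y);                            //nearest pixel to the boundary point
    }
}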
However, employing (3.16) involves too many calculations (trigonometric functions). The
following adaptation of Bresenham’s method is faster and more efficient.

3.5.2 The midpoint circle algorithm


Basic idea: Sample in unit intervals along one coordinate and calculate the nearest pixels to the
circle path in the other coordinate. In fact we need only process one octant and use symmetry
for the other octant and quadrants:

[Figure: for a circle centred at (0,0), the octant from (0, r) round to the 45° line is calculated;
the remaining octants and quadrants follow by symmetry from each calculated point (x, y).]

Thus we take the standard octant centred at (0,0) and start at the point (0,r). With this point
assumed done, we shall take unit steps along OX and by constructing decision parameters as
we did for lines, we determine the nearest of the two pixels to the circle path according to the
calculated y-values.
First, define the circle function
    fc(x, y) = x² + y² − r².                                      (3.17)
If (x, y) lies on the circle boundary then fc(x, y) = 0; if it lies inside, fc(x, y) < 0; if outside,
fc(x, y) > 0. Thus for any (x, y),
    fc(x, y) = 0   if (x, y) is on the boundary,
             < 0   if (x, y) is inside,                           (3.18)
             > 0   if (x, y) is outside.
This test is made at each step, i.e. fc supplies our decision parameter.

Now assume that ( xk , yk ) has been done and we want the next pixel i.e. whether,
the pixel at position ( xk +1 , yk ) is closer or
the pixel at position ( xk +1 , yk −1 ) is closer:
[Figure: at column xk + 1 the two candidate pixels are U at scan line yk and L at yk − 1;
the midpoint lies halfway between them, at y = yk − 1/2.]

To answer this question, evaluate the circle function at the midpoint y-value, i.e. define at the k-th step
    pk = fc(xk + 1, yk − 1/2) = (xk + 1)² + (yk − 1/2)² − r².     (3.19)
Now, if
i)  pk < 0, the midpoint is inside the circle and the pixel on scan line yk is closer;
ii) pk ≥ 0, the midpoint is outside (or on) the circle and the pixel on scan line yk − 1 is closer.

To continue the process we obtain successive decision parameters recursively. Consider the
decision parameter at the (k+1)-st step:
    pk+1 = fc(xk+1 + 1, yk+1 − 1/2) = [(xk + 1) + 1]² + (yk+1 − 1/2)² − r²
         = pk + 2(xk + 1) + (yk+1² − yk²) − (yk+1 − yk) + 1,      (3.20)
where
    if pk < 0, we take yk+1 = yk,
    else (pk ≥ 0) we take yk+1 = yk − 1.
That is,
    pk+1 = pk + ∆pk,
    ∆pk = 2xk+1 + 1               if pk < 0,
        = 2xk+1 + 1 − 2yk+1       if pk ≥ 0,                      (3.21)
    with 2xk+1 = 2xk + 2 and 2yk+1 = 2yk − 2.

We start the process at (x0, y0) = (0, r) with the initial decision parameter
    p0 = fc(1, r − 1/2) = 1 + (r − 1/2)² − r² = 5/4 − r.   ...from (3.19)   (3.22)
If r is an integer we round p0 to p0 = 1 − r.

Remark: Since all increments are integers, and all calculations in the above are integer
calculations (i.e. no √ ‘s or trigonometric functions!), the method is fast and efficient. To recap
precisely we have the:

Midpoint circle algorithm


1. Input radius r and centre (xc, yc). For a circle centred at (0,0) take as the start point
       (x0, y0) = (0, r).
2. Calculate p0 = 5/4 − r (rounded to 1 − r for integer r).
3. Then for k = 0, 1, 2, 3, ...:
   if pk < 0, take the next point as (xk + 1, yk)
       and set pk+1 = pk + 2xk+1 + 1;
   else (pk ≥ 0) take the next point as (xk + 1, yk − 1)
       and set pk+1 = pk + 2xk+1 + 1 − 2yk+1,
   with 2xk+1 = 2xk + 2, 2yk+1 = 2yk − 2.
4. Determine by symmetry the points in the other 7 octants.
5. Transform the calculated pixels to those for a circle centred at (xc, yc) by
       x = x + xc,  y = y + yc, and plot (x, y).
6. Repeat steps 3-5 until x ≥ y, i.e. until the 45° line has been reached or passed.

Example: Obtain and plot the first quadrant pixels for the circle with centre (0, 0) and radius
r = 10 .
Solution:
With the starting point ( x0 , y0 ) = (0,10) we use the initial decision parameter
p0 = 1 − r = −9 and 2 x0 = 0, 2 y0 = 20 .
The results are tabulated below, with the pixels obtained on the right. The red curve is the arc
of the mathematical circle.

  k    pk    (xk+1, yk+1)   2xk+1   2yk+1
  0    -9     (1,10)           2      20
  1    -6     (2,10)           4      20
  2    -1     (3,10)           6      20
  3     6     (4,9)            8      18
  4    -3     (5,9)           10      18
  5     8     (6,8)           12      16
  6     5     (7,7)           14      14

[Figure: the calculated pixels (●) in the first octant up to the line y = x, and the pixels obtained
from them by symmetry (○), plotted on the grid 0...11 with the red arc of the mathematical circle.]

The calculation stops once the 45° line y = x is reached. For a sample code see H&B p. 108
(also circle.cpp); a condensed sketch is given below.
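A condensed sketch of the algorithm (our own rendering, with setPixel() as in section 3.3.1):

//Midpoint circle: integer arithmetic only; eight-way symmetry per octant point
void circlePlotPoints(int xc, int yc, int x, int y)
{
    setPixel(xc + x, yc + y);  setPixel(xc - x, yc + y);
    setPixel(xc + x, yc - y);  setPixel(xc - x, yc - y);
    setPixel(xc + y, yc + x);  setPixel(xc - y, yc + x);
    setPixel(xc + y, yc - x);  setPixel(xc - y, yc - x);
}

void circleMidpoint(int xc, int yc, int r)
{
    int x = 0, y = r;
    int p = 1 - r;                          //rounded p0 = 5/4 - r
    circlePlotPoints(xc, yc, x, y);
    while (x < y) {
        x++;
        if (p < 0)
            p += 2 * x + 1;                 //midpoint inside: keep y
        else {
            y--;
            p += 2 * (x - y) + 1;           //midpoint outside (or on): take y - 1
        }
        circlePlotPoints(xc, yc, x, y);
    }
}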

3.6 Generating ellipses


Ellipses can be generated by modifying the midpoint circle algorithm.

3.6.1 Basic equations


One definition: an ellipse is the set of all points (x, y) the sum of whose distances from two
fixed points F1 and F2, called the foci, is constant.

[Figure: a point P(x, y) on the ellipse, at distances d1 and d2 from the foci F1 and F2, with
d1 + d2 = constant.]

For given foci F1(x1, y1) and F2(x2, y2) a point P(x, y) lies on the ellipse boundary if and only if
    √((x − x1)² + (y − y1)²) + √((x − x2)² + (y − y2)²) = const,  (3.23)
so we can write the equation of an ellipse as
    Ax² + By² + Cxy + Dx + Ey + F = 0,                            (3.24)
where A, B, C, D, E, F are constants that depend on the foci and on two further constants, the
semi-major and semi-minor axis lengths.
For an ellipse in "standard form" with centre (xc, yc), the semi-major axis length rx and the
semi-minor axis length ry are measured along axes parallel to the coordinate axes, through (xc, yc):

[Figure: ellipse centred at (xc, yc) with semi-axis rx along the horizontal and ry along the
vertical; θ is the angular parameter used in (3.26).]

It can then be shown that the equation of the ellipse may be written as
    ((x − xc)/rx)² + ((y − yc)/ry)² = 1,                          (3.25)
or, in parametric polar form, as
    x = xc + rx·cos θ,  y = yc + ry·sin θ,   0 ≤ θ ≤ 2π.          (3.26)
Note here that symmetry exists between quadrants, but not between octants.

3.6.2 The midpoint ellipse algorithm


We first consider an ellipse in the standard position with ( xc , yc ) = (0,0) and rx , ry as shown:

[Figure: ellipse centred at the origin with semi-axes rx and ry. The first quadrant is split into
region 1 (near the top, |slope| < 1) and region 2 (near the x-axis, |slope| > 1), divided where
the tangent slope is -1. A calculated point (x, y) yields (-x, y), (x, -y) and (-x, -y) by symmetry.]

We start by considering the 1st quadrant split into two regions (region 1 and region 2),
separated by a dividing line that meets the ellipse where the tangent slope is -1.
Beginning with the point (0, ry), we take unit steps
along OX whenever |slope| < 1, or
along OY whenever |slope| > 1.
That is, as the figure indicates, we move clockwise from (0, ry), stepping along OX until we
reach the point where the slope = -1, then switch to steps along OY.

To obtain a decision parameter, define the ellipse function
    fe(x, y) = ry²x² + rx²y² − rx²ry².                            (3.27)
Then for any (x, y),
    fe(x, y) = 0   if (x, y) is on the boundary,
             < 0   if (x, y) is inside,                           (3.28)
             > 0   if (x, y) is outside.
As before, at each sampling position we select the next pixel along the path according to the
sign of fe evaluated at the midpoint between the two candidate pixels.

Starting from (0, ry) we proceed in unit steps along OX, checking at each step whether the
boundary between region 1 and region 2 has been reached by testing the slope condition
    dy/dx = −(2ry²x)/(2rx²y) = −1,
i.e. whether 2ry²x = 2rx²y. Thus we move out of region 1 whenever
    2ry²x ≥ 2rx²y.                                                (3.29)

We generalize the process, as we had done for a circle, by considering the midpoint values of
f e ( x, y ) as follows:
Assume that ( xk , yk ) has been done and we want the next pixel i.e. whether,
the pixel at position ( xk +1 , yk ) is closer or
the pixel at position ( xk +1 , yk −1 ) is closer:

[Figure: in region 1, the candidate pixels at column xk + 1 are U at scan line yk and L at
yk − 1, with the midpoint between them at y = yk − 1/2.]

To this end, evaluate the ellipse function at the midpoint y-value, i.e. define at the k-th step
    pk = fe(xk + 1, yk − 1/2) = ry²(xk + 1)² + rx²(yk − 1/2)² − rx²ry².   (3.30)
Now, if
i)  pk < 0, the midpoint is inside and the pixel on scan line yk is closer;
ii) pk ≥ 0, the midpoint is outside (or on) and the pixel on scan line yk − 1 is closer.

To continue the process we obtain successive decision parameters recursively; for the (k+1)-st step,
    pk+1 = fe(xk+1 + 1, yk+1 − 1/2) = ry²(xk+1 + 1)² + rx²(yk+1 − 1/2)² − rx²ry²
         = pk + 2ry²(xk + 1) + ry² + rx²[(yk+1 − 1/2)² − (yk − 1/2)²],    (3.31)
where
    if pk < 0, we take yk+1 = yk,
    else (pk ≥ 0) we take yk+1 = yk − 1.
That is,
    pk+1 = pk + ∆pk,
    ∆pk = 2ry²xk+1 + ry²                 if pk < 0,
        = 2ry²xk+1 + ry² − 2rx²yk+1      if pk ≥ 0,               (3.32)
    with 2ry²xk+1 = 2ry²xk + 2ry² and 2rx²yk+1 = 2rx²yk − 2rx².

We start the process at (x0, y0) = (0, ry) with the initial decision parameter


    p0 = fe(1, ry − 1/2) = ry² + rx²(ry − 1/2)² − rx²ry² = ry² − rx²ry + rx²/4.   ...from (3.30)
At the start point (x0, y0) = (0, ry) the terms 2ry²x and 2rx²y are
    2ry²x = 0,                                                    (3.33)
    2rx²y = 2rx²ry.                                               (3.34)
Their updated values as (x, y) is incremented are obtained by adding 2ry² to (3.33) and
subtracting 2rx² from (3.34). Also, at each update the condition
    2ry²x ≥ 2rx²y                                                 (3.35)
is checked to see whether region 2 has been reached.

For region 2 we start at (x0, y0) = the last position selected in region 1, sample in unit steps in
the negative y-direction, and take the midpoint x-value between the candidates at xk and xk + 1:

[Figure: in region 2, the candidate pixels on scan line yk − 1 are L at xk and R at xk + 1, with
the midpoint between them at x = xk + 1/2.]

Using a prime to denote the region 2 decision parameter, we have at the k-th step
    p′k = fe(xk + 1/2, yk − 1) = ry²(xk + 1/2)² + rx²(yk − 1)² − rx²ry².   (3.36)
Now, if
i)  p′k > 0, the midpoint is outside and the pixel at xk is closer;
ii) p′k ≤ 0, the midpoint is inside (or on) and the pixel at xk + 1 is closer.

Further, at the next sampling position, yk+1 − 1 = yk − 2, the decision parameter is
    p′k+1 = fe(xk+1 + 1/2, yk+1 − 1) = ry²(xk+1 + 1/2)² + rx²(yk − 1 − 1)² − rx²ry².   (3.37)
Thus
    p′k+1 = p′k + rx² + ry²[(xk+1 + 1/2)² − (xk + 1/2)²] − 2rx²(yk − 1),               (3.38)
in which we take
    xk+1 = xk        if p′k > 0,
    xk+1 = xk + 1    if p′k ≤ 0.

The initial value of the decision parameter is
    p′0 = fe(x0 + 1/2, y0 − 1),   (x0, y0) = last point selected in region 1
        = ry²(x0 + 1/2)² + rx²(y0 − 1)² − rx²ry².                                      (3.39)

For ellipses in non-standard positions we map the selected points by shifts (translations) and/or
rotations (see later).

Remark: Again all increments in the above are integer quantities (only the initial decision
parameters need rounding), so the method is fast and efficient.

In summary the precise algorithm is:

Midpoint ellipse algorithm


1. Input the semi-axes (radii) rx, ry and the centre (xc, yc). For an ellipse centred at (0,0)
   take as the start point
       (x0, y0) = (0, ry).
2. Calculate for region 1
       p0 = ry² − rx²ry + rx²/4.
3. Then for k = 0, 1, 2, 3, ... (in region 1):
   if pk < 0, take the next point as (xk + 1, yk)
       and set pk+1 = pk + 2ry²xk+1 + ry²;
   else (pk ≥ 0) take the next point as (xk + 1, yk − 1)
       and set pk+1 = pk + 2ry²xk+1 − 2rx²yk+1 + ry²,
   with 2ry²xk+1 = 2ry²xk + 2ry², 2rx²yk+1 = 2rx²yk − 2rx²,
   and continue until 2ry²x ≥ 2rx²y.
4. Determine the initial decision parameter for region 2 using the last point in region 1 as
   the new (x0, y0):
       p′0 = ry²(x0 + 1/2)² + rx²(y0 − 1)² − rx²ry².
5. Then for k = 0, 1, 2, 3, ... (in region 2):
   if p′k > 0, take the next point as (xk, yk − 1)
       and set p′k+1 = p′k − 2rx²yk+1 + rx²;
   else (p′k ≤ 0) take the next point as (xk + 1, yk − 1)
       and set p′k+1 = p′k + 2ry²xk+1 − 2rx²yk+1 + rx²,
   with 2ry²xk+1 = 2ry²xk + 2ry², 2rx²yk+1 = 2rx²yk − 2rx²,
   updating x and y as in region 1, until y = 0.
6. Use symmetry to find the points in the other 3 quadrants.
7. Transform the calculated pixels to those for an ellipse centred at (xc, yc) by
       x = x + xc,  y = y + yc, and plot (x, y).

Example: Obtain and plot the first-quadrant pixels for the ellipse with centre (0, 0) and
semi-axes rx = 8, ry = 6.
Solution:
With the starting point (x0, y0) = (0, 6) we obtain
    2ry²x = 0              (with increment 2ry² = 72),
    2rx²y = 2rx²ry = 768   (with increment −2rx² = −128).
For region 1 the initial decision parameter is
    p0 = ry² − rx²ry + rx²/4 = 36 − 384 + 16 = −332.
Successive values are tabulated below:

  k    pk     (xk+1, yk+1)   2ry²xk+1   2rx²yk+1
  0   -332    (1,6)             72        768
  1   -224    (2,6)            144        768
  2    -44    (3,6)            216        768
  3    208    (4,5)            288        640
  4   -108    (5,5)            360        640
  5    288    (6,4)            432        512
  6    244    (7,3)            504        384

Now, since 2ry²x ≥ 2rx²y (504 ≥ 384), we move out of region 1.

For region 2 the initial point is (x0, y0) = (7, 3), giving the initial decision parameter
    p′0 = ry²(x0 + 1/2)² + rx²(y0 − 1)² − rx²ry² = 2025 + 256 − 2304 = −23.
Tabulated values for this region are then

  k    p′k    (xk+1, yk+1)   2ry²xk+1   2rx²yk+1
  0    -23    (8,2)            576        256
  1    361    (8,1)            576        128
  2    297    (8,0)             -          -

The computed pixels nearest the ellipse boundary are shown in the figure below.
[Figure: the computed first-quadrant pixels for the ellipse rx = 8, ry = 6, plotted on the grid
0...11 in x and 0...10 in y.]

For a sample code see H&B p. 116 (or ellipse.cpp); a condensed sketch is also given below.
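A condensed sketch following steps 1-7 (our own rendering, with setPixel() as in section 3.3.1):

//Midpoint ellipse: region 1 steps in x, region 2 in y; four-way symmetry
#include <cmath>

void ellipsePlotPoints(int xc, int yc, int x, int y)
{
    setPixel(xc + x, yc + y);  setPixel(xc - x, yc + y);
    setPixel(xc + x, yc - y);  setPixel(xc - x, yc - y);
}

void ellipseMidpoint(int xc, int yc, int rx, int ry)
{
    long rx2 = (long) rx * rx, ry2 = (long) ry * ry;
    long x = 0, y = ry;
    long px = 0, py = 2 * rx2 * y;          //running values of 2*ry^2*x and 2*rx^2*y
    ellipsePlotPoints(xc, yc, (int) x, (int) y);

    //Region 1: |slope| < 1, step in x
    long p = (long) std::floor(ry2 - rx2 * ry + 0.25 * rx2 + 0.5);   //rounded p0
    while (px < py) {
        x++;  px += 2 * ry2;
        if (p < 0)
            p += ry2 + px;                  //delta = 2*ry^2*x(k+1) + ry^2
        else {
            y--;  py -= 2 * rx2;
            p += ry2 + px - py;             //delta also subtracts 2*rx^2*y(k+1)
        }
        ellipsePlotPoints(xc, yc, (int) x, (int) y);
    }

    //Region 2: |slope| > 1, step in y (start from the last region-1 point)
    p = (long) std::floor(ry2 * (x + 0.5) * (x + 0.5)
                          + rx2 * (double)((y - 1) * (y - 1)) - (double) rx2 * ry2 + 0.5);
    while (y > 0) {
        y--;  py -= 2 * rx2;
        if (p > 0)
            p += rx2 - py;                  //x unchanged
        else {
            x++;  px += 2 * ry2;
            p += rx2 - py + px;             //x incremented as well
        }
        ellipsePlotPoints(xc, yc, (int) x, (int) y);
    }
}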

3.7 Other curves


Other regular curves can be constructed either directly from their equations y = f(x), which of
course is computationally more expensive, or by using piecewise straight-line, circular or
elliptical portions. For conic sections such as hyperbolas and parabolas, methods similar to the
midpoint algorithms can be constructed.

3.8 Refining primitives – pixel addressing and object geometry


In going from exact mathematical screen coordinates to pixel screen coordinates the
geometries can change, since a mathematical point has to be mapped (inexactly) to a finite
sized pixel. For example, consider the mathematical line: (20,10) → (30,18) given in precise
screen coordinates, which correspond to the grid line positions. For this line we have,
∆x = 10, ∆y = 8 .
When scan converted by Bresenham’s algorithm to pixel screen coordinates, which correspond
to the half-way positions between the coordinate lines, we obtain one more pixel (indicated by
the open circle), giving
∆x = 11, ∆y = 9 .

[Figure: the scan-converted pixels of the line from (20,10) to (30,18) on the grid x = 20...31,
y = 10...18; the open circle marks the extra pixel, so that the plotted line spans 11 pixels
horizontally and 9 vertically.]

This can be corrected by plotting only those points inside the endpoints (20,10) to (30,18), i.e.,
in this case leave out the last (top) pixel.

Similar effects are obtained for other shapes. For example, for the enclosed rectangle with
corners (0,0), (4,0), (4,3), (0,3) we have:

[Figure: the rectangle shown as a mathematical outline, as scan converted (with an extra row and column of pixels), and as corrected.]

Below is an example for the circle (x − 10)² + (y − 10)² = 5²:

[Figure: the midpoint-circle scan conversion about the centre (10,10) (left), and the modified plot using a diameter of 10 (right).]

For an example of the kind of procedure to be followed see H&B p. 123.

3.9 Filled-area primitives


Quite often one needs to fill a bounded figure with a particular colour. Most packages provide
primitives for the filling of polygons only, with the implication that other shapes will be
constructed from polygons. Mathematically, a polygon is a plane figure specified by a set of
three or more coordinate positions, called vertices, which are connected by straight line
segments called edges or sides.

Polygons are useful for approximating curved surfaces (‘surface tessellation’). Figures can be
constructed with them, as wire-frame models, after which it is easy to fill or render the surfaces
so that they appear realistic.

Polygon filling algorithms on raster systems generally use two procedures:


i) Move along each scan line, find wherever it intersects with the boundaries, and fill
(set pixels) between the boundary points with the required colour – used for
regular figures.

ii) Start from a point inside and paint outward until boundaries are met – used for
complicated shapes and in interactive painting software.

Before studying such algorithms we first consider some properties of polygons.

[Figure: a convex polygon (all interior angles < 180°), a concave polygon (at least one interior angle > 180°), and a degenerate polygon (e.g. with collinear or repeated vertices).]

Convex polygons present no problems when designing filling algorithms, but concave ones
have to be recognised as such and then are typically split into two or more convex ones first.
Degenerate ones, however, require special treatment.

3.9.1 Splitting concave polygons


One method involves vector cross products:
For example consider,

[Figure: a six-sided polygon with vertices v1–v6 taken anti-clockwise and edge vectors E1–E6 between successive vertices.]

Now for each vertex pair define the edge vectors


E1 = v2 − v1 , E 2 = v3 − v2 ,..., E5 = v6 − v5 , E6 = v1 − v6
by considering, for example, traversal in the anti-clockwise direction. Next find for successive
pairs of these edge vectors the cross products, E1 × E2 , E2 × E3 ,..., E6 × E1 . Then if the z-
component of any cross product < 0, the polygon is concave.
Thus here,
( E1 × E2 ) z > 0, ( E2 × E3 ) z > 0, ( E3 × E4 ) z < 0,
( E4 × E5 ) z > 0, ( E5 × E6 ) z > 0,( E6 × E1 ) z > 0,
so the polygon is concave. Then for those pairs for which the z-component < 0, we split the
polygon, along the line of the first vector in the pairs. Hence here, we split along E3 .

Example:
[Figure: the example polygon with vertices (0,0), (1,0), (2,1), (3,0), (3,2), (0,2), traversed anti-clockwise with edge vectors E1–E6.]

Here
E1 = (1,0, 0), E2 = (2,1, 0) − (1,0, 0) = (1,1,0), E3 = (1, −1, 0),
E4 = (0, 2, 0), E5 = ( −3, 0,0), E6 = (0, −2, 0).
Then,
( E1 × E2 ) z = (0,0,1) z > 0,( E2 × E3 ) z = (0,0, −2) z < 0, ( E3 × E4 ) z = (0, 0, 2) z > 0,
( E4 × E5 ) z = (0,0, 6) z > 0,( E5 × E6 ) z = (0,0, 6) z > 0,( E6 × E1 ) z = (0, 0, 2) z > 0.
Thus, the polygon is concave and we split along the line E2 i.e. along the line with slope = +1
and y-intercept = -1 (confirm this). The polygon splits into 2 convex ones, but since there is no
other vector pair for which (...)z < 0, no further splits are necessary.
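As a sketch, the cross-product test above can be coded as follows (the pt2D struct, the function name and the anti-clockwise vertex ordering are assumptions made here for illustration):

struct pt2D { float x, y; };

/* Returns the index k of the first vertex pair whose edge vectors Ek+1, Ek+2 have a
   negative z cross-product component (i.e. the polygon is concave and can be split
   along the line of the first edge of the pair), or -1 if the polygon is convex. */
int firstConcaveEdge (const pt2D v[], int n)
{
    for (int k = 0; k < n; k++) {
        float e1x = v[(k + 1) % n].x - v[k].x;              /* edge vector Ek+1 */
        float e1y = v[(k + 1) % n].y - v[k].y;
        float e2x = v[(k + 2) % n].x - v[(k + 1) % n].x;    /* edge vector Ek+2 */
        float e2y = v[(k + 2) % n].y - v[(k + 1) % n].y;
        float crossZ = e1x * e2y - e1y * e2x;               /* z-component of the cross product */
        if (crossZ < 0.0f)
            return k;                                       /* split along this edge */
    }
    return -1;                                              /* no negative z-component: convex */
}

For the example above this returns k = 1, identifying E2 as the splitting edge.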

Other methods (e.g. the rotational method) are also used - see H&B p.126.

3.9.2 Splitting convex polygons into triangles


Most primitives for convex polygon filling employ a procedure to first reduce them into
triangles. With some care this procedure may also be applied to concave polygons. See H&B
p. 127.

3.9.3 Inside-outside testing


In order to correctly fill a closed figure, it is required to identify its interior or exterior. This is
easy to determine for standard polygons, but is non-trivial for more complicated ones, such as
those with self-intersections. Two methods in common use are:-
i) the odd-even rule (also even-odd or odd parity rule):
Draw an imaginary line from a point P (to be tested) to the outside, i.e. far away
from the object. Then
• if odd number of edges are crossed take P as an interior point
• else, take P as an exterior point
Note: Imaginary line must not pass through a vertex.

ii) the non-zero winding rule


Here we,
Initialize a winding number w = 0.
Again imagine a line from a point P to be tested to a distant point outside, without
passing through a vertex
Move from P along the line to the distant point
Note the direction in which each crossed edge passes this line, taking the edge's
direction from the order in which the polygon's vertices are specified.
If an edge crosses the line R-to-L (viewed looking from P towards the distant point), set w = w+1
If an edge crosses L-to-R, set w = w-1

Then
• if w ≠ 0, take P as an interior point
• else (w = 0) P is exterior

Examples:

[Figure: a self-intersecting polygon with its regions labelled interior/exterior according to the odd-even rule (left) and the non-zero winding rule (right); the two rules classify some of the regions differently.]
We see that these methods can give different results for non-standard polygons.

To determine the direction of the edge crossings (R-to-L or L-to-R) we use one of:-
i)

[Figure: test point P, distant point d, the vector U from P towards d, and an edge AB (edge vector EAB) crossing the line from P to d.]
Let EAB be the vector from A to B of the edge AB


i.e. EAB = VB - VA ... in terms of the point vectors (w.r.t. the origin)
and let
U = vector from test point P to the distant point d.
Form the cross product U × EAB and find its z-component (U × EAB)z.
Then
• if (U × EAB)z > 0 the edge crosses R-to-L so set w = w +1
• if (U × EAB)z ≤ 0 the edge crosses L-to-R so set w = w -1
ii)
[Figure: as above, with the additional vector Up drawn perpendicular to U, pointing R-to-L when looking along U towards d.]

Here, we let Up denote the vector from B traversing the line vector U perpendicularly
i.e. it is a vector ┴U in the direction R-to-L when looking along U towards d. For
example if,
U = (ux,uy) then Up = (-uy,ux).
Now form the dot product, Up • EAB.
Then
• if Up • EAB > 0 the edge EAB is R-to-L
• if Up • EAB ≤ 0 the edge EAB is L-to-R

In general, regions or objects can be multiply connected – so we can use the above rules to
determine the interior/exterior or simply define which is which.
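A sketch of a point-in-polygon test based on the non-zero winding rule is given below. The ray from the test point is taken horizontally to the right, the edge-crossing direction is decided with the cross-product sign described above, and the function and array names are illustrative only:

/* The k-th polygon edge runs from (vx[k],vy[k]) to (vx[(k+1)%n],vy[(k+1)%n]). */
bool insideNonzeroWinding (float px, float py, const float vx[], const float vy[], int n)
{
    int w = 0;                                        /* winding number */
    for (int k = 0; k < n; k++) {
        float ax = vx[k],           ay = vy[k];
        float bx = vx[(k + 1) % n], by = vy[(k + 1) % n];
        /* z-component of (B - A) x (P - A): > 0 means P lies to the left of edge AB */
        float crossZ = (bx - ax) * (py - ay) - (by - ay) * (px - ax);
        if (ay <= py && by > py && crossZ > 0)
            w++;                                      /* upward edge crosses the ray R-to-L */
        else if (ay > py && by <= py && crossZ < 0)
            w--;                                      /* downward edge crosses the ray L-to-R */
    }
    return (w != 0);                                  /* non-zero winding number => interior */
}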

3.9.4 Polygon tables


In many packages, objects in a scene are constructed by polygon faces. Their geometric
descriptions (coordinate vertices, edges, surfaces) and attributes (colour, transparencies, light-
reflectivity etc.) are more conveniently held in tables. An example of an object with its
geometric data table representation is:
[Figure: an object composed of a triangular facet S1 (edges E1, E2, E3) and a quadrilateral facet S2 (edges E3, E4, E5, E6) sharing the edge E3, with vertices V1–V5.]
Vertex Table:          V1: x1,y1,z1   V2: x2,y2,z2   V3: x3,y3,z3   V4: x4,y4,z4   V5: x5,y5,z5
Edge Table:            E1: V1,V2   E2: V2,V3   E3: V3,V1   E4: V3,V4   E5: V4,V5   E6: V5,V1
Surface Facet Table:   S1: E1,E2,E3   S2: E3,E4,E5,E6
Other forms of the table may be used, such as one with an edge table that includes forward
pointers into the table of surface facets. This makes data checking and retrieval more efficient.

3.9.5 Equations for handling polygon facets or planes


Polygon facets are commonly employed in the construction of 3D objects. Since these are
parts of an (infinite) plane, we review some basic mathematical equations relating to planes
here.
The equation for a plane is:
Ax + By + Cz + D = 0 (3.40)

where ( x, y , z ) is a point on the plane and the A, B, C , D are fixed parameters identifying a
particular plane.
The parameters A, B, C, D in (3.40) are determined only up to a common non-zero scale factor. It's
customary to take D as an arbitrary ≠ 0 constant and then solve for A/D, B/D, C/D from

( A / D ) xk + ( B / D ) yk + (C / D ) zk = −1; k = 1, 2, 3 (3.41)

using 3 non-colinear points, such as, for example, the vertices ( xk , yk , zk ), k = 1, 2, 3 . Thus
setting
D = − | x1 y1 z1 ; x2 y2 z2 ; x3 y3 z3 |   (a 3×3 determinant, with rows separated here by semicolons; it is non-zero provided the plane does not pass through the origin)   (3.42)
we obtain
A = | 1 y1 z1 ; 1 y2 z2 ; 1 y3 z3 | ,   B = | x1 1 z1 ; x2 1 z2 ; x3 1 z3 | ,   C = | x1 y1 1 ; x2 y2 1 ; x3 y3 1 |   (3.43)

A = y1 ( z2 − z3 ) + y2 ( z3 − z1 ) + y3 ( z1 − z2 ) 
B = z1 ( x2 − x3 ) + z2 ( x3 − x1 ) + z3 ( x1 − x2 ) 

 (3.44)
C = x1 ( y2 − y3 ) + x2 ( y3 − y1 ) + x3 ( y1 − y2 ) 
D = − x1 ( y2 z3 − y3 z2 ) − x2 ( y3 z1 − y1 z3 ) − x3 ( y1 z2 − y2 z1 ) 

These relations for the plane parameters are also valid for planes passing through (0,0,0) i.e.
for D = 0 . They get updated as vertex and other information are updated.

In addition to the above it is also required to determine for a polygon surface or plane the
orientation of a plane surface. In order to do this we employ a normal vector to it:
Note from (3.40), that if ( A, B, C ) is the vector of the plane’s parameters A, B, C and its 3
successive vertices are ( xk , yk , zk ), k = 1, 2, 3 then the following dot products hold
( A, B, C ) · ( x1 − x2 , y1 − y2 , z1 − z2 ) = 0
( A, B, C ) · ( x1 − x3 , y1 − y3 , z1 − z3 ) = 0        (3.45)
( A, B, C ) · ( x2 − x3 , y2 − y3 , z2 − z3 ) = 0

Hence we can take a normal to the plane as the vector

N = ( A, B, C ) , (3.46)

where A, B, C are given by (3.44).

In many situations we need to know the “inside” and “outside” of a surface. To establish this we
apply the right-hand rule:
If the vertices bounding the surface are specified in a counter-clockwise direction, then N
points out from the top (which is defined as the outside or frontface), whilst −N points away
from the bottom (taken as the inside or backface).

[Figure: a polygon facet with vertices V1–V4 listed anti-clockwise; the normal N points out of the front (outside) face and −N points out of the back (inside) face.]

Notice from the above that, another way to calculate a normal to the surface is to take the
cross-product of 2 vectors in the plane as follows (gives the correct sign for convex polygons).
Select 3 vertices along the boundary in the counter-clockwise direction e.g. V1,V2,V3 then

N = ( V2 − V1 ) × ( V3 − V1 ) (3.47)
Expanding the RHS into components then gives (check this!)

N = ( A, B, C ) (3.48)
namely, the plane constants! Then D can be found by substituting A, B, C from above and
one set of vertex coordinates into (3.40). Notice also, that using (3.48) the equation for the
plane (3.40) is simply,
N · P = −D (3.49)
where P = ( x, y, z ) is an arbitrary point on it.
Finally, to test any point ( x, y , z ) in space, in a RH coordinate system, we observe that
Ax + By + Cz + D = 0 ⇒ point on surface 

Ax + By + Cz + D < 0 ⇒ point on inside  (3.50)
Ax + By + Cz + D > 0 ⇒ point on outside 
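A short sketch of these calculations in C++ (the vec3 struct and function names are illustrative, not OpenGL types) is:

struct vec3 { double x, y, z; };

/* Plane coefficients from three non-collinear vertices v1, v2, v3 listed
   anti-clockwise when viewed from the front (outside) face, using (3.47)-(3.49). */
void planeCoefficients (const vec3 &v1, const vec3 &v2, const vec3 &v3,
                        double &A, double &B, double &C, double &D)
{
    /* N = (v2 - v1) x (v3 - v1) = (A, B, C) */
    A = (v2.y - v1.y) * (v3.z - v1.z) - (v2.z - v1.z) * (v3.y - v1.y);
    B = (v2.z - v1.z) * (v3.x - v1.x) - (v2.x - v1.x) * (v3.z - v1.z);
    C = (v2.x - v1.x) * (v3.y - v1.y) - (v2.y - v1.y) * (v3.x - v1.x);
    D = -(A * v1.x + B * v1.y + C * v1.z);    /* from N . P = -D, using P = v1 */
}

/* Evaluates Ax + By + Cz + D for a test point, as in (3.50):
   = 0 on the plane, < 0 behind (inside), > 0 in front (outside). */
double planeSide (double A, double B, double C, double D, const vec3 &p)
{
    return A * p.x + B * p.y + C * p.z + D;
}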

3.9.6 Scan line polygon fill algorithm


Required: Interior fill of a polygon such as:

[Figure: a concave polygon crossed by a scan line that intersects its boundary at x = 10, 14, 18 and 24.]

Strategy: For each scan line, find points of intersection with polygon edges, sort points L-to-R
and fill between each pair of points. For example for scan line above,
• fill between x=10 and x=14
• fill between x=18 and x=24

However, intersections with polygon vertices require special treatment. For example consider,

[Figure: a polygon crossed by two scan lines y and y′, each passing through a vertex; the rising (incr) and falling (decr) edges are marked, and the number of edge crossings (1 or 2) is shown at each intersection.]

Here,

• the scan line y' intersects 2+1+1 = 4 edges, passing through 1 vertex
• the scan line y intersects 2+1+1+1 = 5 edges, passing through 1 vertex.
Note that
• for the y' intersections the “interior” pixels are clear and unambiguous
• for the y intersections the “interior” pixels are not clear
and also that,
• for y', at common vertex 2 one edge is decreasing and the other increasing, i.e. 2
is a local minimum – so count this point as two vertices and fill between 2 pairs.
• for y, at common vertex 2 one edge is monotonically decreasing and so is the
other, i.e. 2 is a not a local min/max – so count this point as one vertex point
belonging to one edge only, and now can fill correctly between 2 pairs again.

To resolve polygon vertices as single or double ones, we follow the procedure below:
1. Process non-horizontal edges clockwise (or anti-clockwise)
2. For each intersection with a vertex check whether one edge has decreasing y-value at
its endpoint (wrt another before it) and the next edge has decreasing y-value at this
point (wrt another after it).
3. If true in 2, shorten the lower edge by 1 pixel y-value, ensuring only 1 point of
intersection with a scan line (~ y scan line case above).
If false in 2, leave vertex as 2 intersection points (~y' scan line case above).

Similarly, for anti-clockwise traversal, we shorten the upper edge.

In summary we have:
[Figure: shortening an edge by one pixel at a shared vertex, shown both for y increasing from e to e′ and for y decreasing from e′ to e, before and after the adjustment.]

The calculations can be reduced by taking into account coherence properties of a scene, i.e.
the relation between one part of it relative to the next part. For example, for the case,

[Figure: an edge crossing two successive scan lines yk and yk+1 at the points (xk, yk) and (xk+1, yk+1).]

In going from line yk to line yk+1, the slope of the edge is the same (= m) at both points, so we can use

m = (yk+1 − yk) / (xk+1 − xk) = 1 / (xk+1 − xk)

i.e. xk+1 = xk + 1/m to update the x-values. In fact, since m = ∆y/∆x with ∆y and ∆x integers, we can work entirely with integer values and employ

xk+1 = xk + ∆x/∆y.

Then to increment x-values along an edge we:-


i) initialize a counter to zero
ii) At each move up a scan line, increase the counter value by ∆x
iii) when counter ≥ ∆y then
a. increase x-intersection by 1
b. decrease counter by ∆y
iv) Stop when upper endpoint reached and processed

For example:
Take an edge with slope m =7/3.
At the initial scan line set counter = 0 and here counter increment is ∆x = 3
Move up to next 3 scan lines giving counter values 3,6,9
On 3rd scan line counter > 7 ⇒
x-intersection coordinate increases by 1
counter reset to 9-7 = 2 (since now are 2 scan lines up)
Continue as above until we reach upper endpoint of the edge
(For negative slopes the process is similar)
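A minimal sketch of this counter scheme for a single edge (with dx and dy integers, dy > 0; the recordIntersection routine is hypothetical):

/* Walk an edge from scan line yStart up to yEnd, keeping its x-intersection
   with each scan line using integer arithmetic only. */
void edgeIntersections (int xStart, int yStart, int yEnd, int dx, int dy)
{
    int x = xStart;
    int counter = 0;
    for (int y = yStart; y <= yEnd; y++) {
        /* recordIntersection (x, y);     hypothetical: (x, y) is this scan line's intersection */
        counter += dx;                    /* move up one scan line */
        while (counter >= dy) {           /* accumulated dx exceeds dy: step x */
            x++;
            counter -= dy;
        }
    }
}

For the slope 7/3 example above (dx = 3, dy = 7), the counter takes the values 3, 6, 9; on the third scan line it exceeds 7, so x is advanced by one and the counter reset to 2.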

In the following efficient algorithm implementing this scan line polygon fill we first store the
boundary as a sorted edge table. Then we proceed clockwise (or anti-clockwise) only, around
the edges, and use a bucket sort to store the edges sorted according to the smallest y-value of
each edge, in the correct scan line position. Only non-horizontal edges are stored in the table.
When the edges are processed, any intersection with a duplicate vertex point is resolved, by
shortening one edge as outlined above.

The data structure employed is shown below for a simple example. In the table, each entry for
a given scan line contains the maximum y-value for that edge and the x-intercept value (at its
lower vertex) and the inverse slope for that edge. For each scan line the edges are sorted in
the L-to-R order. The scan lines are processed from the bottom upwards, to obtain an active
edge list for each scan line intercepting the polygon boundaries. For more details see H&B p.
200.
[Figure: a polygon with vertices A–E and its sorted edge table. Each scan-line bucket holds, for every non-horizontal edge whose lower endpoint lies on that scan line, the edge's maximum y value, the x-intercept at its lower vertex and its inverse slope; e.g. the bucket for scan line yA holds (yE, xA, 1/mAE) and (yB, xA, 1/mAB), the bucket for scan line yD holds (yC, xD, 1/mDC) and (yE, xD, 1/mDE), and the bucket for scan line yC holds (yB, xC, 1/mCB).]

An example of a C code for scan filling a polygon is (from H&B):


/* A point in device coordinates */
typedef struct {
int x;
int y;
} dcPt;

/* Here we define the polygon vertices */


dcPt pts[] = {
.................
};

typedef struct tEdge {


int yUpper;
float xIntersect, dxPerScan;
struct tEdge * next;
} Edge;

/* Inserts edge into list in order of increasing xIntersect field. */


void insertEdge (Edge * list, Edge * edge)
{
Edge * p, * q = list;
p = q->next;
while (p != NULL) {
if (edge->xIntersect < p->xIntersect)
p = NULL;
else {
q = p;
p = p->next;
}
}
edge->next = q->next;
q->next = edge;
}

/* For an index, return y-coordinate of next nonhorizontal line */


int yNext (int k, int cnt, dcPt * pts)
{
int j;
if ((k+1) > (cnt-1))
j = 0;
else
j = k + 1;
while (pts[k].y == pts[j].y)
if ((j+1) > (cnt-1))
j = 0;
else
j++;
return (pts[j].y);
}

/* Store lower-y coordinate and inverse slope for each edge. Adjust
and store upper-y coordinate for edges that are the lower member
of a monotonically increasing or decreasing pair of edges */
void makeEdgeRec
(dcPt lower, dcPt upper, int yComp, Edge * edge, Edge * edges[])
{
edge->dxPerScan =
(float) (upper.x - lower.x) / (upper.y - lower.y);
edge->xIntersect = lower.x;
if (upper.y < yComp)
edge->yUpper = upper.y - 1;
else
edge->yUpper = upper.y;
insertEdge (edges[lower.y], edge);
}

void buildEdgeList (int cnt, dcPt * pts, Edge * edges[])


{
Edge * edge;
dcPt v1, v2;
int i, yPrev = pts[cnt - 2].y;

v1.x = pts[cnt-1].x; v1.y = pts[cnt-1].y;


for (i=0; i<cnt; i++) {
v2 = pts[i];
if (v1.y != v2.y) { /* nonhorizontal line */
edge = new Edge;   /* allocate one new edge record (C++ replacement for malloc) */
if (v1.y < v2.y) /* up-going edge */
makeEdgeRec (v1, v2, yNext (i, cnt, pts), edge, edges);
else /* down-going edge */
makeEdgeRec (v2, v1, yPrev, edge, edges);
}
yPrev = v1.y;
v1 = v2;
}
}

void buildActiveList (int scan, Edge * active, Edge * edges[])


{
Edge * p, * q;

p = edges[scan]->next;
while (p) {
q = p->next;
insertEdge (active, p);
p = q;
}
}
void fillScan (int scan, Edge * active)
{
Edge * p1, * p2;
int i;

p1 = active->next;
while (p1) {
p2 = p1->next;
for (i=(int)p1->xIntersect; i< (int)p2->xIntersect; i++)
setPixel(i, scan);
p1 = p2->next;
}
}

void deleteAfter (Edge * q)


{
Edge * p = q->next;

q->next = p->next;
delete p;
}

/* Delete completed edges. Update 'xIntersect' field for others */


void updateActiveList (int scan, Edge * active)
{
Edge * q = active, * p = active->next;

while (p)
if (scan >= p->yUpper) {
p = p->next;
deleteAfter (q);
}
else {
p->xIntersect = p->xIntersect + p->dxPerScan;
q = p;
p = p->next;
}
}

void resortActiveList (Edge * active)


{
Edge * q, * p = active->next;
active->next = NULL;
while (p) {
q = p->next;
insertEdge (active, p);
p = q;
}
}

void scanFill (int cnt, dcPt * pts)


{
Edge * edges[WINDOW_HEIGHT], * active;
int i, scan;

for (i=0; i<WINDOW_HEIGHT; i++) {


edges[i] = new Edge;   /* header node for this scan line's edge list */
edges[i]->next = NULL;
}
buildEdgeList (cnt, pts, edges);
active = new Edge;   /* header node for the active edge list */
active->next = NULL;

for (scan=0; scan<WINDOW_HEIGHT; scan++) {


buildActiveList (scan, active, edges);
if (active->next) {
fillScan (scan, active);
updateActiveList (scan, active);
resortActiveList (active);
}
}
/* Free edge records that have been malloc'ed ... */
}

For a complete code with a GLUT window see scanfill.cpp.
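A hypothetical call of the routine (the vertex values below are made up for illustration; the complete, working set-up is in scanfill.cpp):

dcPt triangleVerts[3] = { {120, 100}, {300, 100}, {210, 280} };
scanFill (3, triangleVerts);        /* fill the triangle with the current colour */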

3.9.7 Scan line filling of curved boundary areas


This is obviously more complicated than polygon filling. For instance, finding boundary
intersections requires the solution of nonlinear equations. For regular shapes like circles and
ellipses, we can generate the boundary points using the midpoint algorithms and fill between
them.

3.9.8 Filling Irregular shapes


Two approaches used here are:

The boundary fill algorithm (BFA)


If the boundary has a specified single colour, start with a point inside and proceed outward,
pixel by pixel until the boundary is encountered. This procedure is typically employed in
painting packages.
A primitive can be designed to accept an interior point (x,y), a fill colour and a boundary colour
as input data. Thereafter, we can employ schemes such as the:

a) 4-connect fill
Test 4 neighbouring pixels (U,D,L,R) and fill.
OK for simple shapes.

b) 8 connect fill
Test the above 4 pixels (U,D,L,R) +
4 diagonal pixels and fill.
OK for more complex shapes.

Note that the 4-connect scheme may fail to correctly fill certain shapes:

[Figure: the marked start position and the result after a 4-connect fill, which leaves part of the area unfilled.]
However, the 8-connect scheme would resolve the above.

A recursive algorithm for the 4-connect boundary fill

void boundaryFill4 (int x, int y, int fillColor, int borderColor)


{
int interiorColor;
/* Get the current color at (x,y) */
getPixel (x, y, interiorColor);
if ((interiorColor != borderColor) && (interiorColor != fillColor)) {
setPixel (x, y); // Set color of pixel to fillColor.
boundaryFill4 (x + 1, y , fillColor, borderColor);
boundaryFill4 (x - 1, y , fillColor, borderColor);
boundaryFill4 (x , y + 1, fillColor, borderColor);
boundaryFill4 (x , y - 1, fillColor, borderColor);
}
}

The extension of this algorithm to an 8-connect one is straightforward (exercise!).


For a sample program that draws and fills a circle see WinCircleBFill4.cpp. This code outputs
to a WINAPI window rather than an OpenGL or GLUT window, and illustrates the use of the
WINAPI SetPixel and GetPixel functions.
Remark:
A major problem with the above procedures is that the stack size used in managing the
recursive calls becomes excessive for any but the smallest figure. A more efficient process
involves stacking only along scan lines, and just one point above and one point below it. That is
put the recursive calls inside a scan line loop. See H&B for details. Try to rewrite
WinCircleBFill4.cpp in order to incorporate this improvement.

The flood fill algorithm (FFA)


When the area to be filled is not bounded by a single colour, then the BFA cannot be used, e.g.
consider

[Figure: a fill area whose boundary consists of segments drawn in several different colours.]

Then we can paint the interior by replacing a specified interior colour rather than searching for
a boundary colour. Again a 4-connect or 8-connect scheme can be used in an algorithm such
as:

A recursive algorithm for the 4-connect flood fill

void floodFill4 (int x, int y, int fillColor, int oldColor)


{
int interiorColor;
/* Get the current color at (x,y) */
getPixel (x, y, interiorColor);


if (interiorColor == oldColor) {
setPixel (x, y); // Set color of pixel to fillColor.
floodFill4 (x + 1, y , fillColor, oldColor);
floodFill4 (x - 1, y , fillColor, oldColor);
floodFill4 (x , y + 1, fillColor, oldColor);
floodFill4 (x , y - 1, fillColor, oldColor);
}
}

The full coding is similar to the BFA case. Once again, it is more efficient to employ scanline
stacking rather than using a straight 4-connect pattern.

3.10 Character generation primitives


Functions for producing letters of the alphabet, numerical digits, special symbols etc, of varying
styles and sizes are available in graphics packages. The characters are grouped into different
design styles or type-faces or “fonts”, which are broadly classified as serif fonts (characters with
small finishing strokes, or serifs, at the ends of the main strokes) and sans-serif fonts (characters without such strokes), e.g.,

This is a serif font and This is a sans-serif font.

Two representation schemes for storing character fonts are those that:-
i) Use rectangular grid patterns for characters resulting in bitmapped fonts
ii) Use straight and curved line segments to form characters resulting in outline
fonts (as in Postscript)

In i) the character grid is mapped to a frame buffer segment called the font cache – requires
more storage since must allow for each variation (size, format).
In ii), less storage is required but variations have to be generated by manipulating the curve
segment definitions and then scan converting to the frame buffer – more time consuming.

Example: Representations of a letter in the two systems.

[Figure: the character as an 8 × 8 bilevel bitmap (left) and as an outline built from straight and curved line segments (right).]

For more on this see the next chapter.



3.11 OpenGL functions

3.11.1 Fill-area functions


In OpenGL the routines that define fill areas display the specified polygons filled with the
current colour by default. The polygon must be convex, with every interior angle < 180°. Options are allowed for
by 6 symbolic constants supplied as parameters to the glBegin(..) function. An additional set of functions
glRect*(..) such as glRecti(x1,y1,x2,y2) is provided for filled rectangles. The following are
some examples:

i) glRecti(200, 100, 50, 250) gives

[Figure: the filled rectangle with opposite corners (200,100) and (50,250), i.e. spanning x = 50..200 and y = 100..250.]

ii) With int vertex1[ ] = {200,100};


int vertex2[ ] = {50,250};
glRectiv(vertex1, vertex2);

the result is the same as in i). Note that the vertices are traversed in the clockwise
direction (x1,y1), (x2,y1), (x2,y2), (x1,y2) → (x1,y1). For anti-clockwise traversal
we use the call glRectiv(vertex2, vertex1); This is required in procedures for
establishing the correct front/back face (see later).

iii) The OGL code with 6 (must be ≥ 3) vertex points, taken counter-clockwise,
will generate a closed polygon such as on RHS:
glBegin(GL_POLYGON);
glVertex2iv(p1); p6 p5
glVertex2iv(p2);
glVertex2iv(p3);
glVertex2iv(p4); p1 p4
glVertex2iv(p5);
glVertex2iv(p6);
glEnd(); p2 p3

iv) This code gives 2 unconnected triangles (note the vertices are now re-ordered!)
glBegin(GL_TRIANGLES);
p6 p5
glVertex2iv(p1);
glVertex2iv(p2);
glVertex2iv(p6); p1 p4
glVertex2iv(p3);
glVertex2iv(p4);
glVertex2iv(p5); p2 p3
glEnd();
v) But with this code we get the connected triangles.


glBegin(GL_TRIANGLE_STRIP); p6 p5
glVertex2iv(p1);
glVertex2iv(p2);
glVertex2iv(p6); p1 p4
glVertex2iv(p3);
glVertex2iv(p4);
glVertex2iv(p5); p2 p3
glEnd();

vi) This code generates a triangle fan.


glBegin(GL_TRIANGLE_FAN); p6 p5
glVertex2iv(p1);
glVertex2iv(p2);
glVertex2iv(p3); p1 p4
glVertex2iv(p4);
glVertex2iv(p5);
glVertex2iv(p6); p2 p3
glEnd();

(Note that in both of the above cases, N vertices produce N − 2 triangles,
provided no vertex position is repeated in the glBegin/glEnd block.)

vii) Here we obtain two quadrilaterals with the code (4 successive vertices form a
quad)
glBegin(GL_QUADS);
glVertex2iv(p1); p1 p4 p5 p8
glVertex2iv(p2);
glVertex2iv(p3);
glVertex2iv(p4);
glVertex2iv(p5);
glVertex2iv(p6); p2 p3 p6 p7
glVertex2iv(p7);
glVertex2iv(p8);
glEnd();

viii) This is how we obtain a set of connected quads:

glBegin(GL_QUAD_STRIP);
glVertex2iv(p1); p1 p4 p5 p8
glVertex2iv(p2);
glVertex2iv(p4);
glVertex2iv(p3);
glVertex2iv(p5);
glVertex2iv(p6); p2 p3 p6 p7
glVertex2iv(p8);
glVertex2iv(p7);
glEnd();
Remark:
In all the above, the OGL filled area primitives required convex polygons. However, in the GLU
library, concave polygons (with linear boundaries) are allowed. In addition GLU provides
tessellation routines for converting such shapes into triangles, triangle meshes, fans and
straight-line segments. The latter can then be processed by the OGL functions.

3.11.2 Open GL vertex arrays


Constructing a scene in general requires the specification of many component objects, each
with its own coordinate position and face definitions. OpenGL uses vertex arrays and other
data structures to handle these conveniently. To understand the process, consider as an
example, the unit cube with its vertices labelled as shown on the right:

[Figure: a unit cube drawn in x, y, z axes; on the right its eight vertices are labelled 0–7, matching the indices of the pt[ ] array below.]

The vertex coordinates can be specified by a double-subscripted array:


GLint points[8][3] = { {0,0,0}, {0,1,0}, {1,0,0}, {1,1,0}, {0,0,1}, {0,1,1}, {1,0,1}, {1,1,1} };

Alternatively, we can use a 3D vertex structure such as


typedef GLint vertex3[3];
vertex3 pt[8] = { {0,0,0}, {0,1,0}, {1,0,0}, {1,1,0}, {0,0,1}, {0,1,1}, {1,0,1}, {1,1,1} };

where each pt[ ] index corresponds to a particular vertex label, indicated above.
In addition, we define the 6 faces of the cube, with either glBegin(GL_POLYGON) or
glBegin(GL_QUADS), listing the vertices in counter-clockwise direction when “looking from the
outside”:
void quad(GLint n1, GLint n2, GLint n3, GLint n4)
{
glBegin(GL_QUADS);
glVertex3iv(pt[n1]);
glVertex3iv(pt[n2]);
glVertex3iv(pt[n3]);
glVertex3iv(pt[n4]);
glEnd();
}

void cube( )
{
quad(6,2,3,7);
quad(5,1,0,4);
quad(7,3,1,5);
quad(4,0,2,6);
quad(2,0,1,3);
quad(7,5,4,6);
}
Judging from the above, in realistic scenes many function calls would be needed (thousands is
typical). Many shared vertices would be repeatedly defined. To take care of such issues
efficiently, and to reduce the total number of function calls, vertex arrays are employed.
The procedure to follow is:-
1. Enable the vertex-array feature in OpenGL by calling:
glEnableClientState(GL_VERTEX_ARRAY)
2. Set up the location and data format for the vertex coordinates by calling:
glVertexPointer( )
3. Render the scene, processing multiple primitives by calling a function such as:
glDrawElements( )

A sample code for the cube above is (using the previous definition of pt[ ] ):

glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_INT, 0, pt);
GLubyte vertIndex[ ] = {6,2,3,7,5,1,0,4,7,3,1,5,4,0,2,6,2,0,1,3,7,5,4,6};
glDrawElements(GL_QUADS, 24, GL_UNSIGNED_BYTE, vertIndex);

Note here:
• The 1st call, sets up a state, the vertex-array state, on the client machine
• The 2nd call sets up the location and format of the coordinates of the object’s vertices:
3 = three coordinates for each vertex, GL_INT = data type for each, 0 = offset between
elements (>0 allows for more info to be packed into array, here only coordinates so 0),
pt = vertex array holding coordinates.
• The third statement defines the vertex index array vertIndex[ ], holding the vertex indices as
unsigned bytes, listed face by face in the order the faces are to be processed.
• The final call, glDrawElements, processes the cube faces, using 4 indices at a time: the parameters
GL_QUADS => quadrilaterals, 24 = size of vertIndex (the number of indices),
GL_UNSIGNED_BYTE = type of each index.

Considerably more information can be packed into a vertex-array.

3.11.3 Open GL pixel-array primitives


A pixel-colour array is a structure that contains a rectangular arrangement (matrix) of colours.
Such a matrix can arise from the scan of a photograph or can simply be generated by a
program. When an array element is either on (light) (~ 1) or off (dark) (~ 0) the array is called
a bitmap (also a mask). For an array with elements of general colour, we have a pixmap.
Typically, one would want to map a pixmap onto a region on the screen. For an example of a
bitmap, recall the character bitmap, discussed in the previous section. OGL provides two
functions to set up bitmaps and pixmaps.

OpenGL bitmap function


To set up a binary array, say bitShape, we use:

glBitmap(width, height, x0, y0, xOffset, yOffset, bitShape);

Here,
• width = no. of columns, height = no. of rows of bitShape
• each element of bitShape is either 0 or 1. 1 => corresponding pixel should be lit
• x0, y0 = floating-point values defining the bitmap “origin”, measured relative to the lower left
corner of the bitShape array
• Location in frame buffer where pattern is to be applied is called the current raster
position. The bitmap is placed with its origin (x0,y0) at this position.
• Values xOffset, yOffset (also floating point) used to update frame-buffer current raster
position after bitmap is displayed.

The current raster position may be set with a call to one of the functions
glRasterPos*(..);
similar to the glVertex*(..) functions. For more on this topic see H&B p. 144.
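As an illustration (the pattern and routine below are made up for this sketch, not taken from H&B), a 16 × 16 one-bit pattern of horizontal stripes can be displayed as follows:

GLubyte stripePattern[32] = {                 /* 16 rows x 2 bytes per row, bottom row first */
    0xff, 0xff,  0xff, 0xff,  0xff, 0xff,  0xff, 0xff,   /* 4 lit rows  */
    0x00, 0x00,  0x00, 0x00,  0x00, 0x00,  0x00, 0x00,   /* 4 dark rows */
    0xff, 0xff,  0xff, 0xff,  0xff, 0xff,  0xff, 0xff,
    0x00, 0x00,  0x00, 0x00,  0x00, 0x00,  0x00, 0x00
};

void drawStripes (void)
{
    glPixelStorei (GL_UNPACK_ALIGNMENT, 1);    /* rows packed on byte boundaries */
    glRasterPos2i (100, 100);                  /* set the current raster position */
    /* width=16, height=16, origin (0,0); raster position advanced 20 pixels to the right */
    glBitmap (16, 16, 0.0, 0.0, 20.0, 0.0, stripePattern);
}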

Similarly, a pattern defined in the colour array pixMap may be applied to a block in the frame
buffer with a function call like:

glDrawPixels(width, height, dataFormat, dataType, pixMap);

For example, the following displays a pixmap colorShape, of size 128 × 128 with elements
holding RGB colour values of type unsigned byte:

glDrawPixels(128, 128, GL_RGB, GL_UNSIGNED_BYTE, colorShape);

It’s possible to paste the pixmap array into a buffer, such as a depth buffer (stores object
distances from a viewing position) or a stencil buffer (stores boundary patterns for a scene).
To do this set the dataFormat parameter in the above to GL_DEPTH_COMPONENT or to
GL_STENCIL_INDEX.
Various buffers are available in OpenGL for such purposes as stereoscopic viewing (left/right
buffers), animation (double buffering) and combinations of these for use in screen refreshing.
Non-refresh buffers to hold user data can be defined too. A buffer may be selected for storing a
pixmap with a call such as:

glDrawBuffer(buffer);

etc (see H&B p. 146).

OpenGL functions for raster operations


A raster operation or raster op is a procedure for manipulating a pixel array. Some raster ops
are:
• block transfers - bit-block-transfers or bitblt for bilevel arrays and pixblt for
multilevel arrays

• To read a block of pixels from the lower left screen coordinates (xmin,ymin) and put
into array with size width and height and with dataFormat and dataType as before
we use
glReadPixels(xmin, ymin, width, height, dataFormat, dataType, array);

• A buffer may be selected before reading by


glReadBuffer(buffer);
Symbolic constants may also be used to specify a buffer in place of the variable
buffer.
• A block of pixels may be copied from a source buffer to a destination buffer by:
glCopyPixels(xmin, ymin, width, height, pixelValues);
where the lower left screen coordinates are (xmin,ymin), the size is width × height
and pixelValues = GL_COLOR, GL_DEPTH or GL_STENCIL to indicate the type of
buffer.

• In the above we must previously select buffers by glReadBuffer( ) for the source and
glDrawBuffer( ) for the destination.

• Buffer data processed by glDrawPixels( ) and glCopyPixels( ) may be operated on


before storage in various ways, such as logical AND, OR, EXCLUSIVE OR –ing them
by for example,
glEnable(GL_COLOR_LOGIC_OP);
glLogicOp(logicOp);
where logicOp = one of GL_AND, GL_OR, GL_XOR, GL_INVERT, GL_CLEAR,
GL_SET,...

Other routines are available for pixel data processing (see H&B p. 147).

3.11.4 Open GL character functions


As shown earlier, a character may be defined as a bitmap and then we can store the set of
bitmap characters as a font list. Any text string can be mapped from the list to consecutive
positions in the frame buffer.
The GLUT library has predefined character sets – so the need to define your own is rare.
To display a bitmap character we use:

glutBitmapCharacter(font, character);
where font takes a symbolic GLUT constant such as, for example, GLUT_BITMAP_8_BY_13 (a
fixed-width font) or GLUT_BITMAP_TIMES_ROMAN_10 (a proportionally spaced font), and
character may be specified as an ASCII code (e.g. 65) or, in this instance, as ‘A’.
With this function each character is displayed with its lower left bitmap corner at the current
raster position. After loading the character into the frame buffer, an offset, the size of a
character width advances the raster position. Thus a text string text[36] can have its characters
consecutively displayed by code such as:

glRasterPos2i(x,y);
for (int k=0; k < 36; k++)
glutBitmapCharacter(GLUT_BITMAP_9_BY_15, text[k]);

Outline characters are generated as polyline (GL_LINE_STRIP) boundaries and displayed
by:
glutStrokeCharacter(font, character);

where font can take values such as GLUT_STROKE_ROMAN (a proportionally spaced font) or
GLUT_STROKE_MONO_ROMAN (a font with constant spacing).

3.11.5 Open GL display lists


An efficient means to store an object description or a sequence of commands for generating
one is a display list. Such a structure can be accessed many times, and even over a network
wherein the list may reside on a server. Complex scenes are more easily constructed when
object descriptions are held in display lists.
Creating an OpenGL display list
The command syntax used is:
glNewList(listID, listMode);
....
....
glEndList( );

Here, a +ve integer value is assigned to listID, and listMode takes the value GL_COMPILE
(saves list for later execution) or GL_COMPILE_AND_EXECUTE (to execute the commands as
they are entered in the list, as well as keep for later execution).
Note that when the list is compiled, parameters such as coordinates and colours become set
and cannot be subsequently altered. Thus, commands such as OpenGL vertex-list pointers
should not be included in the list.
The listID is used in the call to execute the list, so to avoid accidental duplication it’s better to
generate an unused ID with the call
listID = glGenLists(1); //assigns an unused +ve integer to listID.

To query whether, a particular integer listID has been used as an ID we can use
glIsList(listID); //returns GL_TRUE or GL_FALSE

Executing an OpenGL display list


Done with the call,
glCallList(listID);

Example:
The following code segment sets up a display list to generate a regular hexagon around the
circumference of a circle and then executes the list with glCallList().

const double TWO_PI = 6.2831853;


class scrPt { public: GLint x, y; };  // simple screen-point type (assumed here)
scrPt hexVertex, circCtr;             // a hexagon vertex and the circle centre
GLuint regHex;                        // display-list identifier
GLdouble theta;
GLint k;
....
regHex = glGenLists(1); // Get an identifier for the display list.
glNewList(regHex, GL_COMPILE);
glColor3f(1.0, 0.0, 0.0); // Set fill color for hexagon to red.
glBegin(GL_POLYGON);
for (k = 0; k < 6; k++) {
theta = TWO_PI * k / 6.0;
hexVertex.x =(GLint)(circCtr.x + 150 * cos (theta));
hexVertex.y = (GLint)(circCtr.y + 150 * sin (theta));
glVertex2i (hexVertex.x, hexVertex.y);
}
glEnd();
glEndList();
....
glCallList(regHex);

For a complete program implementing this code see DispLstHexagon.cpp. Multiple display
lists may also be processed – see H&B p. 152.
CHAPTER FOUR: ATTRIBUTES OF PRIMITIVES

Any parameter affecting the way in which a primitive is to be displayed is called an “attribute”
parameter. For example, colour and size are fundamental attributes. Attributes are handled in
two ways:
i) We extend the parameter list of the primitive function to include attribute data
ii) We maintain a system list of current attributes via separate routines, which is
checked whenever any primitive function is called, before displaying the primitive.
OpenGL uses the latter approach. Such a system, which maintains its own attribute list, is
referred to as a state system or state machine. The attribute parameters and other related
information such as the current frame-buffer position are called state variables or parameters.

4.1 OpenGL state variables


Amongst these are: colour, the current matrix mode, the model-view matrix, the current
frame-buffer position and scene lighting parameters. They have default values which
remain fixed for any primitive call, but can be set by attribute function calls. Thereafter only the
subsequent primitive calls are affected.

4.2 Colour and gray scale levels


The colours available depend on a particular system – large range for raster systems, small for
random scan systems. Generally, colour values are coded from 0... a positive integer. The
codes are then converted to intensity level settings for the electron beams.

Colour codes can be stored directly in the frame buffer, or put into a separate table with the
corresponding binary pattern (index) for each pixel put into the frame buffer. Using, for example, 3
bits per pixel, the number of colours is 8, with colours coded as follows:

Colour code    Stored in FB (R G B)    Displayed colour
0              0 0 0                   black
1              0 0 1                   blue
2              0 1 0                   green
3              0 1 1                   cyan
4              1 0 0                   red
5              1 0 1                   magenta
6              1 1 0                   yellow
7              1 1 1                   white

Here each bit puts a particular colour gun either on (1) or off (0) in an RGB system. Adding
more bits for each colour, of course increases the colour range. For example, for 6-bits per
pixel, we can use 4 intensity levels for each gun (00, 01, 10, 11), i.e there are 4×4×4 = 64
possible colours for a pixel. For a resolution of 1024 × 1024 a full-colour (24 bits-per-pixel)
RGB system requires 3 megabytes storage for the FB.
When FB space is limited, selected colours from the full-colour set may be held in a look-up
table where each colour is accessed by an index and where only such indices are held in the
FB. Recall section 2.2.1 (p. 11) for an example.

When a monitor is not capable of colour output we can use a table corresponding to shades of
gray. For example, intensity codes for a 4-level gray scale system would be organised as:
Intensity level    Stored in FB (int / binary code)    Displayed gray shade
0.00               0  (00)                              black
0.33               1  (01)                              dark gray
0.67               2  (10)                              light gray
1.00               3  (11)                              white

Note that any intensity near 0.33 would be mapped to int 1 (or binary 01) and so on.
Alternatively, each intensity level may be converted to a voltage level on the output device
directly.

4.3 OpenGL colour functions


The color display mode in GLUT is set as follows:

glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB);

where the 1st parameter specifies a single buffer for the FB, and the 2nd specifies the RGB
mode. Using GLUT_INDEX will allow choice from an index table and GLUT_RGBA the RGBA
(enhanced) mode.
RGB and RGBA modes
The RGBA mode allows for an (optional) additional alpha coefficient to control color blending.
This is useful when primitives overlap over a pixel area, and one primitive can be set to allow
for different transparency (or opacity) levels. The current colour components are set by calling
glColor*(colorComponents);
where * is 3 (for RGB) or 4 (RGBA) followed by f or i or s or d and/or by v as in glVertex*().
Examples:
glColor3f(1.0, 0.0, 0.0); //red
glColor3fv(colorArray); //depends on vector colorArray
glColor4f(1.0, 1.0, 1.0, 1.0); //white with alpha = 1.0. OGL default
glColor3i(0, 255, 255); //integer components are scaled over the full int range (so 255 is nearly 0); use glColor3ub for 8-bit 0-255 values

Color-index mode
OpenGL color-index mode is set with
glIndex*(colorIndex); //colorIndex = +ve no index to table
Example: glIndexi(196); //196 held in current (x,y) FB position

OpenGL itself provides no routine for creating a colour lookup table; it accesses the table supplied by
the window system. GLUT, however, allows the user to set entries in that table:
glutSetColor(index, red, green, blue);
loads the floating-point RGB values (0.0 - 1.0) into the table at position index.
OGL and its extension libraries have routines, comprising its Imaging Subset, for setting up
color tables. These can be enabled by the glEnable( ) function. See H&B p. 179.

Color blending in OpenGL


Instead of replacing a current colour in a FB position (called the destination colour), with a
second (called the source) we sometimes want to combine or blend them. This process is used
to simulate paintbrush effects and in modelling transparency. In OGL, blending is available only
in RGB and RGBA modes and is enabled or disabled by:
glEnable(GL_BLEND); //switch it on
glDisable(GL_BLEND); //switch it off
The new blended colour is computed in RGBA from the expression


( S r Rs + Dr Rd , S g Gs + Dg Gd , Sb Bs + Db Bd , Sa As + Da Ad ) (4.1)
where ( Rs , Gs , Bs , As ) is the RGBA source colour vector, ( Rd , Gd , Bd , Ad ) is the destination
component vector and ( S r , S g , Sb , Sa ) are source blending factors and ( Dr , Dg , Db , Da )
destination blending factors. The latter are selected with
glBlendFunc(sFactor, dFactor);
See H&B p. 180 for more details.
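For example, the commonly used factor pair below weights the incoming (source) colour by its alpha value and the colour already in the frame buffer by (1 − alpha); the coordinates are illustrative only:

glEnable (GL_BLEND);
glBlendFunc (GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

glColor4f (1.0, 0.0, 0.0, 0.5);      /* half-transparent red */
glRecti (50, 50, 150, 150);          /* blends with whatever was drawn underneath */

glDisable (GL_BLEND);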

Colour arrays in OpenGL


Colour values may also be combined with the coordinate values in a vertex array for a scene.
First the colour-array features are enabled by:
glEnableClientState(GL_COLOR_ARRAY);
Then, for RGB mode we use,
glColorPointer(nColorComponents, dataType, offset, colorArray);
where nColorComponents = 3 for RGB or 4 for RGBA. dataType = GL_INT or GL_FLOAT
depending on the type of values in colorArray. For a separate color array, offset = 0, but when
combined with vertex data, offset = difference in bytes between each set of color components
in the array.
For examples see H&B p. 181.
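A sketch of a combined (interleaved) colour and vertex array, with illustrative values, is shown below. Each group of six floats holds r, g, b followed by x, y, z, so the offset (stride) between successive sets is 6*sizeof(GLfloat) bytes:

GLfloat interleaved[ ] = {
    1.0, 0.0, 0.0,   100.0, 100.0, 0.0,      /* red vertex   */
    0.0, 1.0, 0.0,   300.0, 100.0, 0.0,      /* green vertex */
    0.0, 0.0, 1.0,   200.0, 300.0, 0.0       /* blue vertex  */
};

glEnableClientState (GL_COLOR_ARRAY);
glEnableClientState (GL_VERTEX_ARRAY);
glColorPointer  (3, GL_FLOAT, 6 * sizeof(GLfloat), interleaved);
glVertexPointer (3, GL_FLOAT, 6 * sizeof(GLfloat), &interleaved[3]);
glDrawArrays (GL_TRIANGLES, 0, 3);           /* one smooth-shaded triangle */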

Other functions in OpenGL


The function call that selects the RGBA components for a display window is:
glClearColor(red, green, blue, alpha);
where each parameter is a float: 0.0 ...1.0. The alpha allows for blending with a previous
colour, if blending has been enabled.
Since there are several colour buffers, we apply the above selection to them by
glClear(GL_COLOR_BUFFER_BIT);
For index-color mode we first use
glClearIndex(index);
to select the colour in the colour table at position index. We then use glClear( ) as above.
More functions are discussed in the later Chapters.

4.4 Line attributes


The basic attributes here are type, width and colour.
Line type
Possible types here could be solid lines, dashed lines, dotted lines etc.
Can modify the line drawing algorithm to set
• length of solid sections
• spacing between solid sections
Then subsequent calls to the line primitive will produce lines of the selected type.
For example, a dashed line can be displayed on raster systems, incorporating this line type
attribute, by plotting pixel spans such as:
[Figure: a dashed line drawn as solid pixel spans separated by skipped spaces.]

We can specify the span length and inter-span spacing by setting a pixel mask = a string of 0’s
and 1’s e.g.
1111000 => dashed line with spans of 4 pixels and spacing of 3 pixels. On a
bilevel system each bit = a bit value loaded into the frame buffer, and the computed bits are
then and-ed with these. Producing dashed lines with a fixed number of pixels can give unequal
lengths depending on the slopes:
[Figure: a horizontal (normal) dash and a 45° diagonal dash containing the same number of pixels; the diagonal dash is longer by a factor of √2.]

Thus, lengths would have to be adjusted according to the slopes.

Line width
The implementation of the line-width attribute depends on the capabilities of the output device.
For example, a heavy line on a video monitor could be adjacent parallel lines, but on a pen
plotter it would require a change of pens.
• In raster systems, for slopes |m| < 1: plot pixels at (x,y) and (x,y+1) for line width = 2, i.e. extend the line vertically.
• For slopes |m| > 1: plot pixels at (x-2,y), (x-1,y), (x,y) and (x+1,y) for line width = 4, i.e. extend the line horizontally.

This technique however gives lines of unequal thickness for different slopes.
Line endpoints
Endpoint (cap) styles can also be used to distinguish thick lines. Some common endpoint styles in graphics
packages are:-

butt caps round caps projected square caps

For thick line or polygon joints we can also employ styles such as:-

mitre join round join bevel join

Another approach to line thickening, for wide lines, is to use filled rectangles for them.

Pen and brush options


Some packages provide drawing and painting capabilities. In these pen and brush options
(shape, size, pattern) are made available. Pixel masks can be designed to implement these
(see H&B p. 187).

Curve attributes
Methods similar to those for straight lines are employed here. See H&B p. 189.

4.5 OpenGL attribute functions


Points
The point size may be set with the function call
glPointSize(size); //size = float such as 1.0, 2.0, 3.0, ...
where size=1.0 will display 1 pixel, size = 2.0 will display 2×2 pixels etc. After this call all points
will be displayed with the selected size.
Example:
glColor3f(1.0, 0.0, 0.0);
glBegin(GL_POINTS);
glVertex2i(50, 100); //standard size red dot
glEnd();
glPointSize(2.0); //point size must be set outside a glBegin/glEnd block
glColor3f(0.0, 1.0, 0.0);
glBegin(GL_POINTS);
glVertex2i(150, 200); //double size green dot
glEnd();

Line width
Set with the call
glLineWidth(width); //width = 1.0, 2.0, 3.0 etc

Line style
Set with the call
glLineStipple(repeatFactor, pattern);

where,
• pattern = 16-bit integer ~ display pattern (default 0xFFFF = all bits 1 => solid line,
0x00FF => 8 bits OFF and 8 bits ON => dashed line).
• repeatFactor = number of times each bit in pattern is repeated

For example, glLineStipple(1, 0x00FF) will give dashed lines with solid segment length = 8
pixels and spacing 8 pixels. However, before this function takes effect the line-style feature
must have been enabled:
glEnable(GL_LINE_STIPPLE);
glLineStipple(repeatFactor, pattern);
......
glDisable(GL_LINE_STIPPLE); //reduce to default

Example:
The following code can be used to output 3 line graphs, each of a different style.

/* Define a two-dimensional world-coordinate data type. */


typedef struct { float x, y; } wcPt2D;
wcPt2D dataPts [5];
void linePlot (wcPt2D dataPts [5])
{
int k;
glBegin (GL_LINE_STRIP);
for (k = 0; k < 5; k++)
glVertex2f (dataPts [k].x, dataPts [k].y);
glEnd ( );
glFlush ( );
}

/* Invoke a procedure here to draw coordinate axes. */


glEnable (GL_LINE_STIPPLE);
/* Input first set of (x, y) data values. */
glLineStipple (1, 0x1C47); // Plot a dash-dot, standard-width polyline.
linePlot (dataPts);
/* Input second set of (x, y) data values. */
glLineStipple (1, 0x00FF); // Plot a dashed, double-width polyline.
glLineWidth (2.0);
linePlot (dataPts);
/* Input third set of (x, y) data values. */
glLineStipple (1, 0x0101); // Plot a dotted, triple-width polyline.
glLineWidth (3.0);
linePlot (dataPts);
glDisable (GL_LINE_STIPPLE);
Exercise: Put this into a GLUT program and test it.

Other OpenGL line effects


It’s possible to vary the colour along a line segement, by linear-interpolating the colour of its
two endpoints:
glShadeModel(GL_SMOOTH); //param is default anyway
glBegin(GL_LINES);
glColor3f(0.0,0.0,1.0);
glVertexi(50,50);
glColor3f(1.0,0.0,0.0);
glVertexi(250,250);
glEnd();
The call glShadeModel(GL_FLAT) will result in a single colour, namely the second endpoint’s
(red). Other effects can be achieved – see H&B p. 193.

4.6 Fill-area attributes


Fill styles
The following basic styles are typical of those available in most graphics packages:

hollow solid patterned hatch fill

A fill pattern is usually defined in a rectangular array of colours, one colour for a position in the
array. Another way involves the use of a bit array which acts as a mask to be applied to the
display area. The process of filling an area with a rectangular pattern is known as tiling. For
more details see H&B p. 194.

Colour blended fills


Fill patterns may be combined with background patterns in essentially two ways:
• logical and-ing, or-ing, xor-ing or replacing
• blending of fill colour, F = (FR,FG,FB) with background colour, B = (BR,BG,BB): typically,
the current RGB colour P = (PR,PG,PB) is computed as the convex linear combination

P = tF + (1 − t)B, where 0 ≤ t ≤ 1 is a transparency parameter (4.2)

Clearly for t < 0.5, the background colour predominates. This process is referred to as
soft filling. Other variations of (4.2) are also employed.

4.7 OpenGL fill-area attribute functions


These work on convex polygons; however, regions of general shapes can be broken down to
convex polygons, namely quads and triangles. The procedure for filling areas follows the
steps:-
• define a fill pattern
• enable the polygon fill routine to make this pattern current
• enable the polygon fill feature of OpenGL
• call the routines to construct the polygons to be filled
The polygon fill pattern will be shown up to and including the boundaries, unless the latter is
specifically set otherwise. Also, by default, the drawn polygon would be filled in solid colour.

The pattern-fill function


First we define a 32 × 32 bit fill mask, where bit 1 => replace with the current colour and bit 0 =>
leave as is. The fill pattern is then defined with hex elements, for example,

GLubyte fillPattern[ ] = {0xff, 0x00, 0xff, 0x00, ....};

The bit order starts from the bottom row and runs up to the top-most row of the pattern array (the same
ordering as for the bitShape array used with glBitmap). This pattern is then applied across the entire display window, from its
lower left corner to the top right corner. Wherever a specified polygon intersects it, the polygon
would be shown with this fill pattern.
The mask is used to set the current fill pattern with,
glPolygonStipple(fillPattern);
and then the fill routine is enabled with
glEnable(GL_POLYGON_STIPPLE);
and only then the polygon to be filled is constructed.
The pattern filling is disabled with
glDisable(GL_POLYGON_STIPPLE);
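A sketch of the whole sequence, using a simple 8 × 8 checkerboard mask generated in code (the pattern and function names are made up for illustration):

GLubyte fillPattern[128];                      /* 32 rows x 4 bytes per row */

void makeCheckerMask (void)
{
    for (int row = 0; row < 32; row++)
        for (int byt = 0; byt < 4; byt++)      /* each byte covers 8 horizontal pixels */
            fillPattern[row * 4 + byt] =
                (((row >> 3) + byt) & 1) ? 0x00 : 0xff;
}

void drawPatternedPolygon (void)
{
    makeCheckerMask ( );
    glPolygonStipple (fillPattern);
    glEnable (GL_POLYGON_STIPPLE);
    /* ... construct the polygon(s) to be filled here ... */
    glDisable (GL_POLYGON_STIPPLE);
}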

Texture and interpolation patterns


Texture patterns, which give the appearance of realistic surfaces such as wood, brick, stone or
steel, may also be used to fill polygons – see a later chapter.

In addition, an interpolation fill, like we had for lines may be used. For example, consider the
code:
glShadeModel(GL_SMOOTH);
glBegin(GL_TRIANGLES);
glColor3f(0.0,0.0,1.0);
glVertex2i(50,50);
glColor3f(1.0,0.0,0.0);
glVertex2i(250,250);
glColor3f(0.0,1.0,0.0);
glVertex2i(350,50);
glEnd();

This will produce the shaded triangle (run TriangleShaded.cpp) where the colour is a linear
interpolation of the 3 vertex point colours:

[Figure: the smooth-shaded triangle with blue, red and green vertex colours.]
Note that the option glShadeModel(GL_FLAT) will result in the single last colour (green).

OpenGL wire-frame functions


Only polygon edges (wire-frame displays) or even just polygon vertex positions may be
displayed with:
glPolygonMode(face, displayMode);
where,
• face = one of GL_FRONT, GL_BACK, GL_FRONT_AND_BACK
• displayMode = GL_LINE (edges only), GL_POINT (vertices only), GL_FILL
(default anyway)

To obtain different fill and edge/boundary colours we proceed as in the stub example:

glColor3f(0.0, 1.0, 0.0);


//call polygon generating routine here
glColor3f(1.0, 0.0, 0.0);
glPolygonMode(GL_FRONT, GL_LINE );
//call polygon generating routine again
.....

For 3-D polygons, the above filling methods may produce gaps along edges. This effect is
known as stitching and arises because the scan-line fill and the edge-drawing algorithms can
compute slightly different depth values (distances from the X-Y plane) at common pixels. One
way to eliminate it is to offset the depth values computed by the fill algorithm so that they do
not overlap the edge depth values for the polygon. This is done by:-

glEnable(GL_POLYGON_OFFSET_FILL);
glPolygonOffset(factor1, factor2);

where, the 1st function invokes the offset routine for the scan-line filling, and, the 2nd sets the
float parameters factor1 and factor2 used to compute the amount of depth offset by the
formula:
depthOffset = factor1 × maxSlope + factor2 × const (4.3)

Here maxSlope = the maximum slope in the polygon and typically factor* = 0.75 or 1.0 (some
experimentation may be needed here).
The previous code segment would then take the form:

glColor3f(0.0, 1.0, 0.0);


glEnable(GL_POLYGON_OFFSET_FILL);
glPolygonOffset(1.0, 1.0);
//call polygon generating routine here
glDisable(GL_POLYGON_OFFSET_FILL);

glColor3f(1.0, 0.0, 0.0);


glPolygonMode(GL_FRONT, GL_LINE );
//call polygon generating routine again
.....

Then the interior fill of the polygon is pushed a little back in depth, so that there is no
interference with the depth values of the edges. For other methods to handle stitching see H&B
p. 209.
To display a concave polygon we first split it into convex ones, in particular triangles. The
triangles can then be filled, but for a wire-frame display we must remove the common (internal) edges.
For example, here,

[Figure: a concave polygon split into triangles; the internal (dashed) edges are not wanted in the wire-frame display.]

we can set a bit flag to GL_FALSE (to indicate a vertex is not connected to another by a
boundary edge) for a particular triangle. Thus for the triangle shown below, we obtain the
situation on its RHS,

[Figure: the triangle v1 v2 v3 drawn with all three edges (left), and with one edge suppressed via the edge flag (right).]

with the code segment:

glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);
glBegin(GL_POLYGON);
glVertex3fv(v1);
glEdgeFlag(GL_FALSE);
glVertex3fv(v2);
glEdgeFlag(GL_TRUE);
glVertex3fv(v3);
glEnd();

Polygon edge flags can also be incorporated into vertex arrays (see H&B p. 210).

OpenGL front-face function


The default ordering of vertices is counter-clockwise (GL_CCW), making the face a front face.
But, any face may be defined to be a front face for a clockwise ordering by

glFrontFace(vertexOrder); //with vertexOrder = GL_CW

4.8 Character attributes


For a discussion of these see H&B p. 211. In OpenGL characters may be designed using the
bitmap functions or we can make use of the GLUT character generating routines. The latter can
set attributes either for bitmap characters or stroke character sets. See H&B p. 214.

4.9 Anti-aliasing
Displayed images on raster systems are generally jagged since the final coordinates are
discrete (integers). This can cause distortion due to low-frequency or under-sampling, resulting
in the so-called aliasing error. For example consider the periodic shape or signal:
[Figure: a periodic signal and the discrete positions at which it is sampled.]

After sampling on a grid with wider separation (lower-frequency) we obtain the waveform:

[Figure: the lower-frequency waveform that results from the under-sampling.]

To avoid losing information from periodic objects we must set:


sampling frequency ≥ 2 × highest frequency occurring in the object
≡ Nyquist sampling frequency
Thus we require,
fs = 2fmax, at least. (4.4)
Equivalently,
sampling interval ≤ ½ cycle interval
≡ Nyquist sampling interval
For x-interval sampling this criterion can be stated as:
∆xs = ½ ∆xcycle, at most. (4.5)
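For instance, if the finest repeating detail in an object spans 4 pixels (∆xcycle = 4), then samples must be taken at least every 2 pixels (∆xs ≤ 2) to capture it without aliasing.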

One possible way to increase the sampling rate is to use a higher-resolution display. But, the
jaggedness is not eliminated entirely. Also, this requires a larger frame buffer – too large a
frame buffer means that the required 30-60 frames/sec refresh rate cannot be achieved.

Other ways involve varying pixel intensities along the boundaries, which is suitable for raster
systems with > 2 intensity levels (colour or grayscale). Methods used to implement this idea
are:-
• Increase the sampling rate by treating the screen as if it is covered with a finer grid
than the actual one. Then use multiple sampling points across the finer grid to
determine the actual screen pixel intensities i.e. sample at higher resolution but display
at lower resolution – called super-sampling or post-filtering
• Find the pixel intensities by calculating areas of overlap of each pixel with objects to be
displayed, by determining where object boundaries intersect individual pixel
boundaries – called area sampling or pre-filtering
• Shifting the display location of a pixel areas by micro-positioning the electron beam
in relation to the object geometry – called pixel phasing – depends on device’s
hardware capabilities.

For more technical detail on each of the above see H&B p. 215 – 221.
4.10 OpenGL anti-aliasing functions


These are activated by calling:-
glEnable(primitiveType);
where primitiveType is one of:
GL_POINT_SMOOTH, GL_LINE_SMOOTH, GL_POLYGON_SMOOTH.
Then for RGBA colour blending we call:
glEnable(GL_BLEND);
Next we apply the previous colour blending theory by calling:
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

Another way involves setting up a colour ramp table, in gradations of the object colour to the
background colour (see H&B p. 222).

4.11 OpenGL query functions


The current state parameter values and attribute settings may be obtained by calling query
functions. The latter copy the values into an array for later use.
These functions take the forms
glGetBooleanv(..), glGetIntegerv(..), glGetFloatv(..), glGetDoublev(..)
where (..) allows for 2 parameters.
For example,
glGetFloatv(GL_CURRENT_COLOR, colorValues);
obtains the current RGBA colour settings and puts them into the array colorValues of type
float.
Other values for the 1st parameter include:
GL_POINT_SIZE, GL_LINE_WIDTH, GL_CURRENT_RASTER_POSITION.

The function call,


glGetIntegerv(GL_RED_BITS, redBitSize);
returns in the array the number of red bits in each of the frame buffer, depth buffer,
accumulation buffer and stencil buffer.
Many other attributes may be queried.
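
For example (a small sketch with our own variable names), the current point size and line width can be retrieved as follows:

GLfloat pointSize, lineWidth;
glGetFloatv (GL_POINT_SIZE, &pointSize);   // current point size, in pixels
glGetFloatv (GL_LINE_WIDTH, &lineWidth);   // current line width, in pixels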

4.12 OpenGL attribute groups


The collection of attributes and other OpenGL state parameters are put into attribute groups,
with related parameters forming a group. For example the point-attribute group contains the
point size and point-smooth (anti-aliasing) parameters. The polygon-attribute group contains
eleven parameters (fill-pattern, front-face flag, polygon-smooth status, ...).

All parameters in a group may be saved onto an attribute stack with,


glPushAttrib(attrGroup);
where attrGroup = a symbolic constant identifying the group, e.g. GL_POLYGON_BIT.

Combined group parameters may be put onto the stack with a logical or ( | ) combined parameter
e.g.,
glPushAttrib(GL_POINT_BIT | GL_POLYGON_BIT);

All parameters may be reinstated from the stack by



glPopAttrib( );
with no parameters, since all values put onto the stack are returned to the state machine.
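
A short usage sketch (our own example): save the point and polygon attribute groups, change some attributes temporarily, then restore the saved values:

glPushAttrib (GL_POINT_BIT | GL_POLYGON_BIT);   // save both groups on the stack
glPointSize (4.0);                              // temporary attribute changes
glPolygonMode (GL_FRONT_AND_BACK, GL_LINE);
/* ... display something using the temporary settings ... */
glPopAttrib ( );                                // restore the saved attribute values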

CHAPTER FIVE: GEOMETRIC TRANSFORMATIONS

In any graphics package we need to have routines that allow us to move objects from one
position to another or to view scenes from different angles or vantage points. These issues are
resolved by applying mathematical geometric transformations.

5.1 Two-dimensional geometric transformations


Here we consider procedures for
i) translating ii) rotating iii) scaling (resizing)
two-dimensional objects.

Translation
A 2D point ( x, y ) is translated to the point ( x′, y ′) by applying a shift (or translation) vector
(t x , t y ) to it as follows:
x′ = x + t x , y′ = y + t y (5.1)

In column vector notation with


P = [x  y]ᵀ,   P′ = [x′  y′]ᵀ,   T = [tx  ty]ᵀ        (5.2)

we can write this as P′ = P + T . (5.3)

Note that translation is a rigid body transformation i.e. objects are moved without
deformation or without altering the relative positions of points within the body.

To translate an entire straight line, we can translate the endpoints and then re-draw the line
between the new points.

To translate a polygon we
• translate each vertex
• re-draw its new line segments between the vertex pairs

[Figure: a polygon at its old position P and at its translated (new) position P′]

To translate a circle/ellipse we translate the centre, and re-draw the figure. Similar processes
can be used for other figures.

An OpenGL code for translating a polygon is:

class wcPt2D {
public:
GLfloat x, y;
};

void translatePolygon (wcPt2D * verts, GLint nVerts, GLfloat tx, GLfloat ty)
{
GLint k;

for (k = 0; k < nVerts; k++) {


verts [k].x = verts [k].x + tx;
verts [k].y = verts [k].y + ty;
}
glBegin (GL_POLYGON);
for (k = 0; k < nVerts; k++)
glVertex2f (verts [k].x, verts [k].y);
glEnd ( );
}

To delete the original polygon we can display it in the background colour before translating it.
Can use other means as well.

Rotation
Here a 2D point P( x, y ) is repositioned to the point P′( x′, y ′) by moving it along a circular
path:

new

P′ old

θ P
yr

xr

The angle of rotation is θ (taken +ve in the counter-clockwise direction) and ( xr , yr ) is a


fixed point of rotation (pivot).
To find the transformation equations we first consider the situation where ( xr , yr ) = (0,0) :

[Figure: rotation about the origin – the point (x, y), at distance r and angle φ from the X-axis, moves through θ to (x′, y′)]

From the figure above it is clear that,


x′ = r cos(φ + θ ) = r cos φ cosθ − r sin φ sin θ 
 (5.4)
y ′ = r sin(φ + θ ) = r cos φ sin θ + r sin φ cosθ 

Thus, using the original coordinates,


x = r cos φ , y = r sin φ (5.5)
gives the transformation equations for the point ( x, y ) rotated through an angle θ about
(0,0) to the point ( x′, y ′) as
x′ = x cosθ − y sin θ 
 (5.6)
y ′ = x sin θ + y cosθ 
With the rotation matrix,
cosθ − sin θ 
R=  (5.7)
 sin θ cosθ 
we can write,
 x′   cos θ − sin θ   x 
.  =    , or P′ = R ⋅ P (5.8)
 y ′  sin θ cos θ   y 

For rotation about an arbitrary, but fixed point ( xr , yr ) , i.e. the situation

new

P′( x ′, y ′) old

θ P( x , y )
yr
φ

xr

the transformation equations follow from (5.6) as an easy generalization:

x′ = xr + ( x − xr )cosθ − ( y − yr )sin θ 
 (5.9)
y ′ = yr + ( x − xr )sin θ + ( y − yr )cosθ 
Thus rotation about a general point is effectively a translation (of the pivot point) plus a rotation
about it.
Note:
• Rotations, in general, are rigid body transformations i.e. they move objects without
deformation.
• To rotate straight lines or polygons, rotate the vertices and re-draw.
• Similar processes may be applied for rotating other objects.

The following code shows how to implement a general pivot-point rotation of a polygon. The
polygon vertices, the pivot point and the angle of rotation must be supplied as input.

class wcPt2D {
public:
GLfloat x, y;
};

void rotatePolygon (wcPt2D * verts, GLint nVerts, wcPt2D pivPt,


GLdouble theta)
{
wcPt2D * vertsRot = new wcPt2D [nVerts];
GLint k;

for (k = 0; k < nVerts; k++) {


vertsRot [k].x = pivPt.x + (verts [k].x - pivPt.x) * cos (theta)
- (verts [k].y - pivPt.y) * sin (theta);
vertsRot [k].y = pivPt.y + (verts [k].x - pivPt.x) * sin (theta)
+ (verts [k].y - pivPt.y) * cos (theta);
}
glBegin (GL_POLYGON);
for (k = 0; k < nVerts; k++)
glVertex2f (vertsRot [k].x, vertsRot [k].y);
glEnd ( );
delete [ ] vertsRot;
}

Scaling
A scaling transformation alters the size of an object. Here the new coordinates ( x′, y ′) are
given by
x ′ = s x x, y ′ = s y y (5.10)
where sx , s y > 0 are scaling factors in the X , Y directions respectively. In matrix form,
 x′   sx 0  x  sx 0
 y ′ =  0  , or P′ = S ⋅ P with S =  . (5.11)
  
 
sy   y  0 s y 

Note that
• for values sx , s y < 1 we have a reduction in a particular direction and for values > 1, a
magnification.
• when sx = s y the scaling is said to be uniform, otherwise (for sx ≠ s y ) the scaling is
differential
Examples:
For sx = 2, sy = 1 a square is stretched into a rectangle; for sx = sy = 0.5 a line segment is moved closer to the origin and uniformly reduced in size.
[Figure: the two scaling examples]

Sometimes it is required to scale distances w.r.t. a fixed point, say, ( x f , y f ) . Then the scaled
coordinates ( x′, y ′) are computed from the original ( x, y ) by means of the relations
x′ − x f = s x ( x − x f ), y ′ − y f = s y ( y − y f ) (5.12)
which are usually written as
x′ = sx x + (1 − sx ) x f 
. (5.13)
y ′ = s y y + (1 − s y ) y f 
Observe that terms (1 − s x ) x f and (1 − s y ) y f are constants for all points in the object and
that the matrix form of (5.13) is :

 x′   sx 0   x   (1 − sx ) x f   (1 − sx ) x f 
 =    +  , or P′ = S ⋅ P +  . (5.14)
 y ′  0 s y   y   (1 − s y ) y f   (1 − s y ) y f 

To scale polygons, we apply (5.13) to each vertex and then redraw the line segments. For
circles with uniform scaling, we apply (5.13) to the centre and scale the radius and re-draw. For
other figures, we need to apply (5.13) to each defining point and then re-draw.
Sample code for scaling
class wcPt2D {
public:
GLfloat x, y;
};

void scalePolygon (wcPt2D * verts, GLint nVerts, wcPt2D fixedPt,


GLfloat sx, GLfloat sy)
{
wcPt2D * vertsNew = new wcPt2D [nVerts];
GLint k;
for (k = 0; k < nVerts; k++) {
vertsNew [k].x = verts [k].x * sx + fixedPt.x * (1 - sx);
vertsNew [k].y = verts [k].y * sy + fixedPt.y * (1 - sy);
}
glBegin (GL_POLYGON);
for (k = 0; k < nVerts; k++)
glVertex2f (vertsNew [k].x, vertsNew [k].y);
glEnd ( );
delete [ ] vertsNew;
}

5.2 Matrix representations and homogeneous coordinates


In many applications we require a sequence of geometric transformations to be performed. For
example, in scenes involving animations we may need all of translations, rotations and scaling.
To process such combined operations more efficiently we re-formulate the previous
transformation equations.
Recall from section 5.1 that each basic transformation took the form

P′ = M 1 ⋅ P + M 2 , (5.15)
where, P′, P, M 2 are 2-element column vectors and M 1 is a 2 × 2 matrix.
For a sequence of operations, we may first perform scaling, followed by a translation and then
a rotation, each time calculating newer coordinates from the old. This sequence can be

performed in “one go” by a composite matrix multiplication, without the additive term M 2 in
(5.15), if we employ special coordinates known as homogeneous coordinates.

Homogeneous coordinates
First, represent each ( x, y ) with the homogeneous coordinate triple ( xh , yh , h ) where
x y
x = h , y = h , with h ≠ 0 a real parameter (5.16)
h h

Thus the homogeneous coordinate representation of the 2D point ( x, y ) is ( x ⋅ h, y ⋅ h, h ) .


Since any h ≠ 0 can be used as the homogeneous parameter, we conveniently choose
h = 1 (will require other values in 3D), so that:
( x, y ) has as its homogeneous coordinate counterpart ( x, y,1) .

We can now show that all 2D geometric transformations are just matrix multiplications in
homogeneous coordinates.
Thus,
i) for translation we write (and recover (5.2) and (5.3)):

 x′  1 0 t x   x 
 y ′ =  0 1 t   y  (5.17)
   y 

 1  0 0 1   1 
14243
T (t , t )
x y
i.e. we have the form
P′ = T ( t x , t y ) ⋅ P (5.18)
where, P′, P are 3-vectors and T is a 3 × 3 matrix.
Note that:
                 | 1   0   −tx |
[T(tx, ty)]⁻¹ =  | 0   1   −ty |        (Prove this!)
                 | 0   0    1  |
ii) for rotations, about the origin, we similarly write (and recover (5.6)-(5.8)):

 x′   cosθ − sin θ 0   x 
 y ′ =  sin θ cosθ 0   y  (5.19)
    
 1  144
 0 42444 0 1   1 
3
R(θ )
i.e. we have the form
P′ = R(θ ) ⋅ P (5.20)

Here we can show that (see tutorial)


                 |  cosθ   sinθ   0 |
R(θ)⁻¹ = R(−θ) = | −sinθ   cosθ   0 | = R(θ)ᵀ        (5.21)
                 |   0      0     1 |

iii) for scaling about the origin as fixed point we write and recover (5.11)):

 x′   sx 0 0   x 
 y ′ =  0 s 0   y  (5.22)
   y  
 1   0 0 1   1 
14243
S (s , s )
x y
i.e. we have the form
P′ = S ( s x , s y ) ⋅ P (5.23)

Here we can show that (see tutorial)


1 
s 0 0
 x 
 1 
S ( sx , s y )−1 =  0 0 . (5.24)
sy
 
0 0 1
 
 

Further, to rotate about a general pivot or scale w.r.t. a general fixed point, we can use a
succession of transformations about the origin. However, it’s better to use composite
transformation matrices to effect these.

5.3 Composite transformations in 2D (special cases)


In effecting a sequence of transformations, we shall find that the result is equivalent to matrix
multiplication with a single composite (concatenated) matrix. We thus consider:-
i) Composite translations
Let (t x1 , t y1 ) and (t x 2 , t y 2 ) be 2 successive translation vectors applied to P. Then
the final (in homogeneous coordinates)
P′ = T (t x 2 , t y 2 ) o [T (t x1 , t y1 ) ⋅ P]
(5.25)
=[T (t x 2 , t y 2 ) o T (t x1 , t y1 )] ⋅ P
In matrix form we have

                          | 1   0   tx2 | | 1   0   tx1 |   | 1   0   tx1 + tx2 |
T(tx2, ty2) T(tx1, ty1) = | 0   1   ty2 | | 0   1   ty1 | = | 0   1   ty1 + ty2 |        (5.26)
                          | 0   0    1  | | 0   0    1  |   | 0   0      1      |
                        = T(tx1 + tx2, ty1 + ty2).
Thus, 2 successive translations are additive (add their matrix arguments for
the translation parts).

ii) Composite rotations


For two successive rotations we similarly obtain,

P′ = R(θ 2 ) o [ R(θ1 ) ⋅ P]
(5.27)
=[R(θ 2 ) o R(θ1 )] ⋅ P
which can be verified by multiplying out the matrices giving (verify this)
R(θ 2 ) ⋅ R(θ1 ) = R(θ 2 + θ1 ) (5.28)
or, P′ = R(θ 2 + θ1 ) ⋅ P (5.29)

Thus two successive rotations are additive (add the rotation angles in the
corresponding matrix arguments).

iii) Composite scalings


For two successive scalings we similarly obtain for their matrices,

| sx2   0    0 | | sx1   0    0 |   | sx1 ⋅ sx2      0        0 |
| 0    sy2   0 | | 0    sy1   0 | = |    0       sy1 ⋅ sy2    0 |        (5.30)
| 0     0    1 | | 0     0    1 |   |    0           0        1 |

or, S ( s x 2 , s y 2 ) ⋅ S ( s x1 , s y1 ) = S ( s x1 ⋅ s x 2 , s y1 ⋅ s y 2 ) . (5.31)

Thus two successive scalings are multiplicative. For example, if we triple the
size of an object twice => final scaling is 9 × original.

5.4 Composite transformations in 2D (general cases)


In effecting a sequence of transformations, we shall find that the result is equivalent to matrix
multiplications.
i) General pivot-point rotation
If we have a function for rotation about the origin we can obtain rotation about any
pivot point ( xr , yr ) by means of the operations:
• Translate object so that ( xr , yr ) → (0,0)
• Rotate about (0,0)
• Translate object so that pivot point (0,0) → ( xr , y r ) , the original pivot

Diagrammatically we have:
[Figure: original position of the pivot (xr, yr) → translate so the pivot goes to (0,0) → rotate about (0,0) → translate so (0,0) returns to (xr, yr)]
The composite transformation matrix is then (multiplying in reverse order)


| 1   0   xr | | cosθ   −sinθ   0 | | 1   0   −xr |
| 0   1   yr | | sinθ    cosθ   0 | | 0   1   −yr |
| 0   0   1  | |  0       0     1 | | 0   0    1  |
                                                                (5.32)
   | cosθ   −sinθ   xr (1 − cosθ) + yr sinθ |
 = | sinθ    cosθ   yr (1 − cosθ) − xr sinθ |
   |  0       0                 1           |

That is we obtain the form


T ( xr , yr ) ⋅ R(θ ) ⋅ T ( − xr , − yr ) ≡ R( xr , yr ,θ ) (5.33)
We can use (5.32) to write a function that accepts a general pivot point ( xr , yr ) .
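
A possible sketch of such a function (our own helper, using a hypothetical 3 × 3 array type; <cmath> is assumed for cos and sin), filling in the composite matrix (5.32) directly:

typedef GLfloat Matrix3x3 [3][3];

/* Build the composite pivot-point rotation matrix R(xr, yr, theta) of (5.32)/(5.33). */
void buildPivotRotation (Matrix3x3 m, GLfloat xr, GLfloat yr, GLdouble theta)
{
   GLfloat c = cos (theta), s = sin (theta);

   m[0][0] = c;    m[0][1] = -s;   m[0][2] = xr * (1 - c) + yr * s;
   m[1][0] = s;    m[1][1] = c;    m[1][2] = yr * (1 - c) - xr * s;
   m[2][0] = 0.0;  m[2][1] = 0.0;  m[2][2] = 1.0;
}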

ii) General fixed point scaling


Similarly, if we have a function that scales only w.r.t. (0,0) , we can scale w.r.t.
any fixed point ( x f , y f ) by means of the steps:
• Translate object so that ( x f , y f ) → (0,0)
• Scale w.r.t. the origin (0,0)
• Inverse translate object so that fixed point (0,0) → ( x f , y f ) , the original
fixed point

Diagrammatically we have:
[Figure: original position of the fixed point (xf, yf) → translate so the fixed point goes to (0,0) → scale about (0,0) → translate so (0,0) returns to (xf, yf)]

The concatenated matrix is then

| 1   0   xf | | sx   0    0 | | 1   0   −xf |   | sx   0    xf (1 − sx) |
| 0   1   yf | | 0    sy   0 | | 0   1   −yf | = | 0    sy   yf (1 − sy) |        (5.34)
| 0   0   1  | | 0    0    1 | | 0   0    1  |   | 0    0         1      |
or,
T ( x f , y f ) ⋅ S ( sx , s y ) ⋅ T (− x f , − y f ) ≡ S ( x f , y f , s x , s y ) (5.35)

We can use (5.34) to write a function that accepts a general fixed point ( x f , y f ) .
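
A corresponding sketch for (5.34), again using the hypothetical Matrix3x3 helper type introduced above:

/* Build the composite fixed-point scaling matrix S(xf, yf, sx, sy) of (5.34)/(5.35). */
void buildFixedPointScaling (Matrix3x3 m, GLfloat xf, GLfloat yf, GLfloat sx, GLfloat sy)
{
   m[0][0] = sx;   m[0][1] = 0.0;  m[0][2] = xf * (1 - sx);
   m[1][0] = 0.0;  m[1][1] = sy;   m[1][2] = yf * (1 - sy);
   m[2][0] = 0.0;  m[2][1] = 0.0;  m[2][2] = 1.0;
}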

iii) Scaling in different directions


Thus far the scaling factors ( sx , s y ) used apply stretching along the usual OX-OY
directions. But, when stretching is required along some other OS1-OS2 directions
with scale factors ( s1 , s2 ) where
[Figure: scaling directions OS1 and OS2, with OS1 making an angle θ with the OX axis]

we proceed as follows:
• Apply a rotation so that the OS1-OS2 axes coincide the OX-OY axes
• Now scale as before
• Apply reverse rotation to the original directions

The composite matrix is

                        | s1 cos²θ + s2 sin²θ    (s2 − s1) cosθ sinθ    0 |
R(θ)⁻¹ S(s1, s2) R(θ) = | (s2 − s1) cosθ sinθ    s1 sin²θ + s2 cos²θ    0 |        (5.36)
                        |          0                      0             1 |

Example: Using (5.36) with s1 = 1, s2 = 2, θ = 45° sends the unit square with vertices (0,0), (1,0), (1,1), (0,1) to the parallelogram with vertices (0,0), (1½,½), (2,2), (½,1½).
[Figure: the unit square and the resulting parallelogram]

Note that if the scaling above was performed w.r.t. an arbitrary fixed point rather
than the origin, then an additional translation matrix would have to be incorporated
into (5.36).

5.5 Concatenation properties of matrices


Recall the following properties for matrices (A, B, C, ..) whenever the operations are defined:
i) A•B•C = ( A•B) •C = A•( B•C) {associativity w.r.t. •}
ii) A•B ≠ B•A {not commutative in general}
Thus, e.g. translation followed by rotation ≠ rotation followed by translation
but is nevertheless true for
• 2 successive rotations
• 2 successive translations
• 2 successive scalings

5.6 General composite transformations and computational efficiency


Summarising the 2D transformations studied above for combinations of translations, rotations
and scalings, we observe that the general composite transformation takes the form:

 x′   rsxx rs xy trsx   x 
 y ′ =  rs rs yy trs y   y  (5.37)
   yx  
 1   0 0 1   1 

Here the terms


rsij = multiplicative rotation • scaling terms involving rotation angle θ and scaling factors
trsx , trs y = combination of translation, pivot point, fixed point, scaling and θ terms.
For example, if an object is scaled, then rotated about its centroid ( xc , yc ) and then translated
the composite matrix is

T(tx, ty) ⋅ R(xc, yc, θ) ⋅ S(xc, yc, sx, sy)
   | sx cosθ   −sy sinθ   xc (1 − sx cosθ) + yc sy sinθ + tx |
 = | sx sinθ    sy cosθ   yc (1 − sy cosθ) − xc sx sinθ + ty |        (5.38)
   |    0          0                        1                |

We note that although (5.37) ⇒ that after the matrix terms are found, 9 mults. + 6 adds are
needed for each point, after concatenation the new coordinates need only be calculated from
x′ = x ⋅ rsxx + y ⋅ rsxy + trsx 
 (5.39)
y ′ = x ⋅ rs yx + y ⋅ rs yy + trs y 
which actually involves only 4 mults. + 4 adds. In fact, without matrix concatenation, individual
matrix operations on every point would increase the overall operations count considerably.
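
As a small sketch (our own helper), applying a concatenated matrix of the form (5.37) to one point then needs only the four multiplications and four additions of (5.39):

/* Apply the concatenated transformation (5.39) to one point (sketch only). */
void transformPoint (GLfloat m[3][3], GLfloat & x, GLfloat & y)
{
   GLfloat xNew = x * m[0][0] + y * m[0][1] + m[0][2];
   GLfloat yNew = x * m[1][0] + y * m[1][1] + m[1][2];
   x = xNew;
   y = yNew;
}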

Special case: Rigid-Body Transformation


For the case of translations and rotations only (“rigid-body” transformations), the coordinates
are changed such that relative distances between points (and angles between lines/vertices) in
the object remain unchanged. The transformation matrix then takes the form

| rxx   rxy   trx |
| ryx   ryy   try |        (5.40)
|  0     0     1  |
where rij = multiplicative rotation terms and trx, try = translational terms.
Note that the sub-matrix of (5.40) corresponding only to the rotation is, say,
A = | rxx   rxy |        (5.40)
    | ryx   ryy |

It is easily seen (by consulting (5.38)) that the following properties hold for A:

i) Aᵀ = A⁻¹   (orthogonality)        (5.41)
ii) |(rxx, rxy)|² = rxx² + rxy² = 1 = |(ryx, ryy)|²        (5.42)
iii) (rxx, rxy) ⋅ (ryx, ryy) = rxx ryx + rxy ryy = 0        (5.43)

and further, from the above two,

iv) | rxx   rxy   0 | | rxx |   | 1 |       | rxx   rxy   0 | | ryx |   | 0 |
    | ryx   ryy   0 | | rxy | = | 0 | ,     | ryx   ryy   0 | | ryy | = | 1 |        (5.44)
    |  0     0    1 | |  1  |   | 1 |       |  0     0    1 | |  1  |   | 1 |

i.e. the vectors ( rxx , rxy ) and ( ryx , ryy ) map to the unit vectors (1,0) and (0,1) along the OX
and OY axes respectively.
All these properties follow from the special case of (5.38) corresponding to a rotation about
some pivot ( xr , yr ) through an angle θ followed by a translation written in the form:

                          | cosθ   −sinθ   xr (1 − cosθ) + yr sinθ + tx |
T(tx, ty) ⋅ R(xr, yr, θ) = | sinθ    cosθ   yr (1 − cosθ) − xr sinθ + ty |        (5.44)
                          |  0       0                   1              |

Example: For pure rotation about the origin, (5.44) reduces to

       | cosθ   −sinθ   0 |
R(θ) = | sinθ    cosθ   0 |        (5.44a)
       |  0       0     1 |

Here, the sub-matrix
| cosθ   −sinθ |
| sinθ    cosθ |
is its orthogonal sub-matrix which maps the unit vectors
(cosθ , − sin θ ) and (sin θ ,cosθ ) to the unit vectors (1,0) and (0,1) in the coordinate X and
Y directions respectively (prove this!). We can turn this argument around to determine the
rotation matrix R(θ ) from the unit vectors in the old and new coordinate axes positions as
follows:
[Figure: the initial coordinate-axis directions and, after rotation through θ, the final directions given by the unit vectors u and v]

Suppose that after rotation thro’ θ the new position is determined by the unit vectors
u = (u x , u y ) and v = ( v x , v y ) . We now claim that the first row of the rotation sub-matrix can
be taken as u = (u x , u y ) and the second row is v = ( v x , v y ) . A proof of this can be obtained
by noting that properties (i) – (iii) above must be satisfied and u ⊥ v. Then if we have found
v = (vx, vy), say, we must take u = (ux, uy) ≡ (vy, −vx) and then the above properties are
satisfied. Moreover, check that the respective unit vectors here map to the correct (1,0) and
(0,1) of the original coordinate system.

Note that rotations involve calculations of sin θ and cosθ terms, which can be very
expensive in some applications. Many algorithms are thus designed so that a full angle θ of
rotation is achieved incrementally in steps of a small ∆θ . Then we can use one or two terms
of a power series for sin ∆θ and cos ∆θ or simply the approximations,
sin ∆θ ≈ ∆θ,   cos ∆θ ≈ 1        (acceptable for ∆θ ≤ 10°)        (5.45)

5.7 A general 2D composite-transformation example program


A C++/OpenGL implementation of a composite scale + rotate + translate sequence is
constructed and applied to transform a triangle as follows:

[Figure: BEFORE and AFTER views of the triangle, which is scaled, rotated and translated with respect to its centroid]

Here the triangle is first scaled, then rotated and finally translated. The composite matrix
(compMatrix) is concatenated in that order, starting with compMatrix = identity. The
complete code (from H&B p. 249) is CompositeTransformation2D.cpp and employs OpenGL
only to display the final results whilst the working part is done fully in C++.

5.8 Other transformations in 2D


Reflection
Produces a mirror image of an object by rotating it 180o about an axis of rotation:
For reflection about the X-axis the x-values remain unchanged, but the y-values are flipped.
The path of rotation is ⊥ XY plane for all points in the body.

[Figure: a triangle with vertices 1, 2, 3 above the X-axis and its reflected image 1′, 2′, 3′ below]
Transformation matrix for reflection about the X-axis (y = 0):
| 1    0   0 |
| 0   −1   0 |        (5.46)
| 0    0   1 |

For reflection about the Y-axis the y-values remain unchanged, but the x-values are flipped.
The path of rotation is ⊥ XY plane for all points in the body.
[Figure: a triangle with vertices 1, 2, 3 and its reflected image 1′, 2′, 3′ on the other side of the Y-axis]
Transformation matrix for reflection about the Y-axis (x = 0):
| −1   0   0 |
|  0   1   0 |        (5.47)
|  0   0   1 |

For reflection about origin: Both x-values and y-values are flipped. The rotation is about axis
thro’ (0,0) ⊥ XY plane for all points in the body.
[Figure: a triangle with vertices 1, 2, 3 and its image 1′, 2′, 3′ reflected through the origin]
Transformation matrix for reflection about the origin (0,0):
| −1    0   0 |
|  0   −1   0 |        (5.48)
|  0    0   1 |

Note that the above matrix is the same as the rotation matrix R(θ) with θ = 180°, so these two
operations are equivalent.

For reflection about any other point: Same as rotation about an axis through the fixed reflection
point Pref =(xref, yref) and ⊥ XY plane for all points in the body.

[Figure: a triangle and its image reflected about an arbitrary fixed point Pref = (xref, yref)]
What is the transformation matrix for this case?

For reflection about the diagonal line y=x:

[Figure: a triangle and its image reflected in the diagonal line y = x]
Transformation matrix for reflection about the line y = x:
| 0   1   0 |
| 1   0   0 |        (5.49)
| 0   0   1 |

Prove that (5.49) is the transformation matrix by concatenating the matrices for:
• clockwise rotation thro’ 45° about (0,0), rotating the line y = x onto the X-axis
• reflection about the X-axis
• rotating X-axis back to the line y = x
Note that this process is also equivalent to reflection about the X-axis + rotation thro’ +90°.

For reflection about the diagonal line y = -x:


[Figure: a triangle and its image reflected in the diagonal line y = −x]
Transformation matrix for reflection about the line y = −x:
|  0   −1   0 |
| −1    0   0 |        (5.50)
|  0    0   1 |

To derive the above transformation matrix, we:-


• concatenate the identity with the matrix for rotation thro’ −45°
• with one for reflection about the Y-axis
• with one for counter-clockwise rotation thro’ +45°.
For reflection about any line y=mx+b:
We employ the combination of translate-rotate-reflect transformations:
• first translate so line passes thro’ (0,0)

• rotate line onto one of X or Y axis


• reflect about this axis
• inverse rotate, and
• inverse translate to restore to original line position

Variations
• Reflections about the coordinate axes or (0,0) can also be implemented as scaling with
negative scaling factors
• The non-zero elements of a reflection matrix may also be given magnitudes other than 1:
o for magnitude > 1 the mirror image is shifted further from the reflection axis
o for magnitude < 1 the mirror image is shifted nearer to the reflection axis

Shear
A shear transformation distorts the shape of an object, causing “internal layers to slide over”.
Two types here are i) a shift in x-values and ii) a shift in y-values.
x-shear:
Given by   x′ = x + shx ⋅ y,   y′ = y    with matrix
| 1   shx   0 |
| 0    1    0 |        (5.51)
| 0    0    1 |

Here shx is any real number. For example, shx = 2 changes the unit square below into a parallelogram:

[Figure: the unit square with vertices (0,0), (1,0), (1,1), (0,1) sheared into the parallelogram with vertices (0,0), (1,0), (3,1), (2,1)]
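
A minimal sketch (our own helper, reusing the wcPt2D class from earlier) applying the x-shear (5.51) to an array of points:

/* Apply the x-shear of (5.51): x' = x + shx * y, y' = y. */
void xShearPoints (wcPt2D * pts, GLint nPts, GLfloat shx)
{
   for (GLint k = 0; k < nPts; k++)
      pts [k].x = pts [k].x + shx * pts [k].y;   // y-values are unchanged
}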

x-shear relative to a reference line y=yref :


Given by   x′ = x + shx ⋅ (y − yref),   y′ = y    with matrix
| 1   shx   −shx ⋅ yref |
| 0    1         0      |        (5.52)
| 0    0         1      |

For example, with shx = ½ relative to the line y = yref = −1 we have:
[Figure: the unit square with vertices (0,0), (1,0), (1,1), (0,1) sheared into the quadrilateral with vertices (½,0), (1½,0), (2,1), (1,1)]

y-shear relative to a reference line x=xref :

 1 0 0 
x′ = x   sh
Given by  with matrix 1 − shy ⋅ xref  (5.53)
y ′ = y + sh y ( x − xref )   y 
 0 0 1 

This shifts coordinate positions vertically by an amount proportional to the distance from the reference line x = xref. For example, with shy = ½ relative to the line x = xref = −1 we have:
[Figure: the unit square with vertices (0,0), (1,0), (1,1), (0,1) sheared into the quadrilateral with vertices (0,½), (1,1), (1,2), (0,1½)]

Remark: Shears may also be expressed as compositions of the basic transformations. For
example, (5.51) may be written as a rotation + a scaling. Can you confirm this?.

5.9 Transformations between coordinate systems


Apart from applying transformations on an object in a particular coordinate system, it is
necessary in many situations (e.g. animation scenes) to be able to transform a description from
one coordinate system to another. Transforming from a Cartesian system to a non-Cartesian
one is complicated and is rarely required in CG. We thus concentrate on transformations
between Cartesian systems, here from XY→ X′Y′ with origin (x0,y0) for the latter:
[Figure: the X′Y′ system, with origin at (x0, y0) and axes rotated through θ relative to the XY system]

To find the transformation matrix we proceed as follows:


1. Translate so that the origin (x0,y0) of (X′, Y′) → (0,0).
2. Rotate the X′ axis onto the X axis (i.e thro’ −θ ).

For translation we use the matrix:


1 0 − x0 
T ( − x0 , − y0 ) = 0 1 − y0  (5.54)
 
0 0 1 
giving the set up:
[Figure: after the translation the X′Y′ axes pass through the XY origin, still rotated through θ; a point P has coordinates (x, y) in XY and (x′, y′) in X′Y′]
Then we do a clockwise rotation via
        |  cosθ   sinθ   0 |
R(−θ) = | −sinθ   cosθ   0 |        (5.55)
        |   0      0     1 |

The complete transformation matrix from XY→ X′Y′ is


                                 |  cosθ   sinθ   0 |   | 1   0   −x0 |
Mxy,x′y′ ≡ R(−θ) ⋅ T(−x0, −y0) = | −sinθ   cosθ   0 | ⋅ | 0   1   −y0 |        (5.56)
                                 |   0      0     1 |   | 0   0    1  |

This result can also be obtained directly by deriving the relations between the respective
coordinate-distance pairs in the two systems from the figure above.
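
A small sketch (our own helper) applying the composite matrix (5.56) directly, converting a point’s XY coordinates to X′Y′ coordinates:

/* Convert (x, y) in the XY system to (xNew, yNew) in the X'Y' system,
 * using (5.56): translation by (-x0, -y0) followed by a rotation through -theta.
 */
void xyToXPrimeYPrime (GLfloat x0, GLfloat y0, GLdouble theta,
                       GLfloat x, GLfloat y, GLfloat & xNew, GLfloat & yNew)
{
   GLfloat c = cos (theta), s = sin (theta);
   xNew =  c * (x - x0) + s * (y - y0);
   yNew = -s * (x - x0) + c * (y - y0);
}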

As an alternative to giving the orientation of X′Y′ relative to XY as an angle is to use unit


vectors in the Y′ and X′ directions:
[Figure: the X′Y′ system with origin at P0 = (x0, y0); V is a vector given in XY coordinates that points along the positive Y′ axis]

Thus, suppose V is a point vector in the XY system and is in the same direction as the +ve Y′
coordinate axis. Then if a unit vector along the +ve Y′ coordinate axis is, say,
v = V / |V| ≡ (vx, vy)        (5.57)
we take the unit vector along the +ve X′ coordinate axis as
u = (ux , u y ) ≡ ( v y , − v x ) (5.58)

so that the rotation matrix is (see section following equation (5.44) for the reason):

u x uy 0
R =  vx vy 0 (5.59)
 
 0 0 1 

As an example if v = (−1,0) i.e. Y ′ ~ − X axis and X ′ ~ +Y axis then


    |  0   1   0 |
R = | −1   0   0 | .        (5.60)
    |  0   0   1 |

This result also follows from (5.55) by setting θ = 90°.

Typically, in interactive applications, it is more convenient to choose a direction V relative to a


position P0 which is the origin of the X′Y′ system rather than relative to the XY origin:
Y′
Y
P1
X′
V
y0
P0

x0 X

Then we can use the unit vector   v = (P1 − P0) / |P1 − P0| = V / |V| ≡ (vx, vy)        (5.61)
with u = (ux , u y ) ≡ ( v y , − v x ) (5.62)

5.10 Raster methods for geometric transformations


On raster systems faster alternate methods can be used. Since pictures are held as pixel
patterns in a frame buffer, we can perform simple transforms by moving rectangular arrays of
pixel values from one location to another with little or no arithmetic operations. Functions for
doing this are known as raster operators. These can do bit block transfers or bitBlts on a bi-
level system or pixel block transfers (pixBlts) on a multi-level system.
For example, the following indicates how an entire block of pixels containing an object may be
translated:
[Figure: before and after views of a rectangular pixel block, bounded by Pmin and Pmax, moved as a whole to a new position with reference point P0]

This process may be achieved with the following steps:-


• read pixel intensities from rectangle area of a raster buffer into an array.
• copy this array into the raster at the new location (ref. pt. P0).
• erase original object by filling rectangle area with the background intensity (if object
does not overlap other objects in scene).
For rotations in 90° increments we

• take the pixel array ~ the block and reverse each row
• then interchange the rows and columns
e.g.
1 2 3 3 2 1
 4 5 6   6 5 4   3 6 9 12 
 →  →  2 5 8 11 ~ 900 rotation
7 8 9 9 8 7  
      1 4 7 10 
10 11 12  12 11 10 

For other angles more processing is required.
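
A sketch of the row-reversal plus row/column interchange for a 90° (counter-clockwise) rotation of an nRows × nCols block (our own helper; one value per pixel for simplicity — a real implementation would work on the colour values read back with glReadPixels):

/* Rotate a pixel-value array 90 degrees counter-clockwise:
 * reverse each row, then interchange rows and columns.
 * src is nRows x nCols; dst must be nCols x nRows.
 */
void rotateBlock90 (const GLubyte * src, GLubyte * dst, GLint nRows, GLint nCols)
{
   for (GLint r = 0; r < nRows; r++)
      for (GLint c = 0; c < nCols; c++)
         dst [(nCols - 1 - c) * nRows + r] = src [r * nCols + c];
}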

For scalings and reflections similar methods (or combinations of the above) can be devised.
See H&B p.257.

OpenGL raster op functions


These are (some already discussed in Ch 3):
i) glCopyPixels(xmin,ymin,width,height,GL_COLOR);
Here the 1st 4 params = location and size of pixel block and GL_COLOR => colour
values to be copied: if current color mode set RGBA then such values copied.
When doing translations (or rotations) the source buffer and destination buffers
can be any of the available buffers selected with glReadBuffer(..) or
glDrawBuffer(..) respectively, chosen before this call.
The source and destination block dimensions must be within range of the screen
coordinates.

ii) glReadPixels(xmin,ymin,width,height,GL_RGB,GL_UNSIGNED_BYTE,
colorArray);
Allows for the saving of a pixel block in array colorArray. If color-table indices are
stored at pixel positions, then replace GL_RGB with GL_COLOR_INDEX. To
rotate the pixel block we re-arrange the rows and columns of colorArray.

iii) glDrawPixels(width,height,GL_RGB,GL_UNSIGNED_BYTE,colorArray);
Used to put back the rotated array into the buffer, with lower-left corner at current
selected raster position.
The source buffer must have been selected with glReadBuffer(..) and the
destination buffer with glDrawBuffer(..).

iv) glPixelZoom(sx,sy);
Used to do 2D scaling with scaling factors sx and sy. These are floating-point
values: 0...< 1.0 => decrease size, > 1.0 => increase, < 0.0 => reflection of source
w.r.t. current raster position.
First call above and then follow with glCopyPixels(..) or glDrawPixels(..).

Raster ops may also be combined with logical operations (and, or, not, xor) in OpenGL.

5.11 3-D geometric transformations

5.11.1 Translation

As before we employ homogeneous coordinates (x,y,z,1) to represent (x,y,z) => translation


from P( x, y , z,1) → P′( x′, y ′, z′,1) is given by
 x′  1 0 0 t x   x 
 y ′ 0 1 0 t   y 
 = y 
or P′ = T ⋅ P (5.63)
 z ′  0 0 1 t z   z 
    
 1  140 4 0 0 1  1 
244 3
T

where t x , t y , t z = translation displacements along each of X,Y,Z axes respectively. Note that
(5.63) expands to the usual component forms:

x′ = x + t x , y′ = y + t y , z′ = z + tz . (5.64)

[Figure: the point P = (x, y, z) translated by T = (tx, ty, tz) to P′ = (x′, y′, z′)]

To translate a 3D object, we translate each of its defining points and then re-draw it. For
example, for polygons we translate each vertex and re-draw.
As before the inverse translation is given by the operator T = ( −t x , − t y , −t z ) with its
corresponding matrix similarly amended.

A code fragment for constructing a 4 × 4 translation matrix is:

typedef GLfloat Matrix4x4 [4][4];

/* Construct the 4 by 4 identity matrix. */


void matrix4x4SetIdentity (Matrix4x4 matIdent4x4)
{
GLint row, col;
for (row = 0; row < 4; row++)
for (col = 0; col < 4 ; col++)
matIdent4x4 [row][col] = (row == col);
}

void translate3D (GLfloat tx, GLfloat ty, GLfloat tz)


{
Matrix4x4 matTransl3D;
/* Initialize translation matrix to identity. */
matrix4x4SetIdentity (matTransl3D);

matTransl3D [0][3] = tx;


matTransl3D [1][3] = ty;

matTransl3D [2][3] = tz;


}

5.11.2 Rotation
Here we need to specify an axis of rotation which can be any 3D line in space and the angle of
rotation. By convention we take the counter-clockwise direction as positive when looking from
far on the X (or Y or Z or line) axis → origin O:
[Figure: positive θ is counter-clockwise as seen by an observer looking along the axis towards the origin O]

Before considering an arbitrary 3D axis of rotation we first consider rotations about the
coordinate axes.

[Figure: rotations through θ about the Z-axis, the X-axis and the Y-axis respectively]

For rotation thro’ θ about the Z-axis recall from the 2D case, the equations:
x′ = x cosθ − y sin θ 

y ′ = x sin θ + y cosθ  (5.65)
z′ = z 

This easily is extended to 3D homogeneous coordinates by writing

 x′   cosθ − sin θ 0 0   x 
 y ′  sin θ cosθ 0 0   y 
 =    or P′ = Rz (θ ) ⋅ P (5.66)
 z′   0 0 1 0  z 
    
 1  1444
 0 0
424444
0 1  1 
3
R (θ )
z

For the transformation equations for rotations about the X-axis and Y-axes respectively, we
can similarly obtain them by cyclic permutation of the coordinates ( x → y → z → x ) in (5.66)
and consultation of the figures above as follows:-

For rotation thro’ θ about the X-axis :



y ′ = y cosθ − z sin θ 

z′ = y sin θ + z cosθ  (5.67)
x′ = x 

i.e.
 x′  1 0 0 0  x 
 y ′ 0 cosθ − sin θ 0   y 
 =   . or P′ = Rx (θ ) ⋅ P (5.68)
 z′  0 sin θ cosθ 0   z 
    
 1  1444
0 0
424444
0 1 1
3  
R (θ )
x

For rotation thro’ θ about the Y-axis :


z′ = z cosθ − x sin θ 

x′ = z sin θ + x cosθ  (5.69)
y′ = y 

i.e.
 x′   cosθ 0 sin θ 0   x 
 y ′  0 1 0 0  y 
 =   . or P′ = Ry (θ ) ⋅ P (5.70)
 z′   − sin θ 0 cosθ 0   z 
    
 1  1444
 0 4024444 0 1 1
3  
R (θ )
y

Remark:
For inverse rotations, as in 2D, we replace θ by −θ in the above. These then correspond to
clockwise rotations. Since the only changes are sin θ → − sin θ and vice versa, we find that
the inverse matrix is the same as the original but with the rows and columns interchanged, i.e.
in all the above,
R −1 = R T . (orthogonality property) (5.71)
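
As a sketch, a coordinate-axis rotation matrix such as Rz(θ) of (5.66) can be filled into the 4 × 4 matrix type used later in these notes (our own helper; θ in radians, <cmath> assumed):

typedef GLfloat Matrix4x4 [4][4];

/* Build the z-axis rotation matrix Rz(theta) of (5.66). */
void buildRotationZ (Matrix4x4 m, GLdouble theta)
{
   GLint row, col;
   for (row = 0; row < 4; row++)            // start from the identity matrix
      for (col = 0; col < 4; col++)
         m [row][col] = (row == col);
   m[0][0] = cos (theta);   m[0][1] = -sin (theta);
   m[1][0] = sin (theta);   m[1][1] = cos (theta);
}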

General 3D rotations
For rotation about any other axis, we can set up the transformation matrix as a composition of
rotations about the coordinate axes.
i) As a special case for rotation about an axis || a coordinate axis we follow the
steps:
1. Translate object so rotation axis coincides the || coordinate axis
2. Do rotation about that coordinate axis
3. Translate so that rotation axis → original rotation axis position

Example:
Suppose the axis of rotation || X-axis, then we have:

[Figure: the object and its rotation axis (parallel to the X-axis) are translated so that the axis lies on the X-axis, the rotation is performed, and the object is then translated back to the original axis position]

Thus, the composite operation is given by


P′ = T −1 ⋅ Rx (θ ) ⋅ T ⋅ P (5.72)
resulting in the composite rotation matrix,
R(θ ) = T −1 ⋅ Rx (θ ) ⋅ T (5.73)

ii) In the general case for rotation about an axis not || a coordinate axis we
follow the steps:
1. Translate object so rotation axis passes through (0,0,0)
2. Rotate object so rotation axis is || (i.e. coincides) one of the coordinate axes
3. Perform rotation about this coordinate axis
4. Apply inverse rotation to bring axis to original orientation but thro’ (0,0,0)
5. Apply inverse translation on object so axis returns to original position

This sequence of operations may be applied with any one of the coordinate axes. For
convenience let’s choose the Z-axis.
First, we can specify a rotation axis by 2 coordinate position vectors (P1 , P2 ) or one
position vector and direction angles (or direction cosines). Consider the case of 2 points
and a counter-clockwise angle, when looking from P2 to P1 :

[Figure: the rotation axis defined by the points P1 and P2, the rotation angle θ, and the unit axis vector u after P1 has been translated to the origin]

Call the rotation axis vector


V = P2 - P1 = ( x2 − x1 , y2 − y1 , z2 − z1 ) (5.74)
Then a unit vector along it is
V  x − x y − y1 z2 − z1 
u= ≡ ( a, b, c ) ≡  2 1 , 2 ,  (5.75)
V  V V V 
Now, to reposition the axis so that it passes through (0,0,0) we translate the point P1 to the origin,
so that u is positioned as in the third figure above. If the rotation is to be clockwise (negative) when
viewed from P2 towards P1, we either reverse the vectors V and u (so that they point from P2 towards P1)
or, equivalently, take θ as a negative angle.
Now:
1. We move P1 to (0,0,0) via
    | 1   0   0   −x1 |
T = | 0   1   0   −y1 |        (5.76)
    | 0   0   1   −z1 |
    | 0   0   0    1  |
2. Next put u onto the Z-axis. We use the rotations
• u → XZ plane giving u′′ by rotating around the X-axis thro’ some angle α
• u′′ → Z -axis giving u z = (0,0,1) by rotating around the Y-axis thro’ some
angle β

[Figure: u is first rotated about the X-axis through the angle α onto the XZ plane, giving u″ = (a, 0, d); u″ is then rotated about the Y-axis through the angle β onto the Z-axis, giving uz = (0, 0, 1)]

To find the angles α and β and hence the rotation matrices we proceed as follows:
Let u′ = (0, b, c ) be the projection of u onto the YZ plane. Then α is the angle between u′
and the Z-axis.
Now, cos α = (u′ ⋅ uz) / (|u′| |uz|) = c/d,   where d = √(b² + c²)        (5.77)
Also, u′ × uz = ux |u′| |uz| sin α,   where ux = (1, 0, 0)        (5.78)

Further the Cartesian form of this cross-product is


u′ × u z = (0, b, c ) × (0,0,1) = b ⋅ u x (5.79)
Thus from (5.78) and (5.79), with |u′| = d and |uz| = 1, we obtain
d sin α = b,   or   sin α = b/d        (5.80)
Now that we have sin α and cos α in terms of u the matrix for rotating u about OX is

1 0 0 0
 c −b 
0 0
Rx (α ) =  
d d
(5.81)
 b c 
0 0
 d d 
 0 0 0 1 

Suppose that after this rotation the result is u′′ . Then its

• x-component = a since rotation about OX leaves the x-component invariant


• y-component = 0 since it lies in the XZ plane
• z-component = d since d = u′ = (0, b, c ) and u′ is rotated onto OZ
i.e. u′′ = ( a, 0, d ) (5.82)
Then, cos β = (u″ ⋅ uz) / (|u″| |uz|) = d,   since |u″| = |uz| = 1        (5.83)
Comparing   u″ × uz = uy |u″| |uz| sin β,   where uy = (0, 1, 0)        (5.84)
with the Cartesian form of this cross-product
u′′ × u z = (a ,0, d ) × (0, 0,1) = −a ⋅ u y (5.85)
we obtain sin β = − a (5.86)
Thus the transformation matrix for rotating u′′ about OY is

        | d   0   −a   0 |
Ry(β) = | 0   1    0   0 |        (5.87)
        | a   0    d   0 |
        | 0   0    0   1 |

3. Now after applying (5.76), (5.81) and (5.87) we have aligned the axis of rotation along
the +ve Z-axis.
4. Next we rotate through the specified angle θ about the Z-axis via the transformation
matrix
        | cosθ   −sinθ   0   0 |
Rz(θ) = | sinθ    cosθ   0   0 |        (5.88)
        |  0       0     1   0 |
        |  0       0     0   1 |

5. To complete the transformation, we must transform the final rotation axis back to the
original position by applying inverses to the above transformations, giving the final
transformation matrix as

R(θ ) = T −1 ⋅ Rx−1 (α ) ⋅ R y−1 ( β ) ⋅ Rz (θ ) ⋅ R y ( β ) ⋅ Rx (α ) ⋅ T (5.89)

It is also possible to formulate this composite transformation by considering vectors in a local


coordinate system. See H&B p. 271.

5.11.3 Quaternions and rotation


A mathematically more elegant (and more efficient) way to handle rotations is to make use of
quaternions.
A quaternion q may be regarded as an extension of a complex number to higher dimensions
and is defined as:
q = s + ia + jb + kc        (5.90)
(s is the real part; ia + jb + kc is the imaginary part)

where s, a , b, c are real numbers and i , j , k satisfy the relations



i 2 = j 2 = k 2 = −1, ij = − ji = k , jk = −kj = i, ki = −ik = j (5.91)

and with the following algebraic operations defined on the set of such numbers:
i) scalar multiplication –
uq = us + i (ua ) + j (ub) + k (uc ) for any real (scalar) u
ii) quaternion addition (“+”) is associative and is given by, for quats q1 , q2 –
q1 + q2 = ( s1 + s2 ) + i (a1 + a2 ) + j (b1 + b2 ) + k ( c1 + c2 )
iii) quaternion multiplication (“.”) is associative and is given by –
q1 ⋅ q2 = ( s1 + ia1 + jb1 + kc1 ) ⋅ ( s2 + ia2 + jb2 + kc2 )
= expansion with (5.91)

With the ordered pair notation


q = ( s, v ); v = (a , b, c ) (5.92)
we can show: q1 + q2 = ( s1 + s2 , v1 + v 2 ) (5.93)
q1 ⋅ q2 = ( s1 ⋅ s2 − v1 ⋅ v 2 , s1v 2 + s2 v1 + v1 × v 2 ) (5.94)
|q|² = s² + v ⋅ v        (5.95)
q⁻¹ = (1 / |q|²) (s, −v)        (5.96)
q ⋅ q −1 = q −1 ⋅ q = (1, 0) (5.97)
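
A small sketch (our own types) of the ordered-pair representation and the product rule (5.94):

struct Vec3 { GLfloat x, y, z; };
struct Quat { GLfloat s; Vec3 v; };        // q = (s, v)

GLfloat dot (Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
Vec3 cross (Vec3 a, Vec3 b) {
   Vec3 c = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
   return c;
}

/* Quaternion product (5.94): q1.q2 = (s1 s2 - v1.v2, s1 v2 + s2 v1 + v1 x v2). */
Quat multiply (Quat q1, Quat q2)
{
   Quat q;
   Vec3 c = cross (q1.v, q2.v);
   q.s   = q1.s * q2.s - dot (q1.v, q2.v);
   q.v.x = q1.s * q2.v.x + q2.s * q1.v.x + c.x;
   q.v.y = q1.s * q2.v.y + q2.s * q1.v.y + c.y;
   q.v.z = q1.s * q2.v.z + q2.s * q1.v.z + c.z;
   return q;
}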

Rotating with quaternions


To perform a rotation about an axis thro’ the origin, with unit vector u along the axis and angle
of rotation θ , we set up the quaternion q = ( s, v ) where
s = cos(θ/2),   v = u sin(θ/2).        (5.98)
Now if p = ( x, y , z ) is any point to be rotated we use the quaternion
P = (0, p); p = ( x, y , z ) (5.99)
and rotate by P′ = qPq −1 (5.100)
where P′ = (0, p′) (5.101)
and p′ = s²p + v(p ⋅ v) + 2s(v × p) + v × (v × p)        (5.102)

The operation (5.100) can be shown to be equivalent to the rotation part of (5.89), i.e. to the matrix
Rx⁻¹(α) ⋅ Ry⁻¹(β) ⋅ Rz(θ) ⋅ Ry(β) ⋅ Rx(α).        (5.103)
As an outline of the process to be followed, write the vector part of q as v = ( a, b, c ) and
expand (5.102) to a 3 × 3 matrix equation involving p = ( x, y , z ) and p′ = ( x′, y ′, z′) , namely
p′ = M R (θ ) ⋅ p
where,
        | 1 − 2b² − 2c²     2ab − 2sc         2ac + 2sb      |
MR(θ) = | 2ab + 2sc         1 − 2a² − 2c²     2bc − 2sa      |        (5.104)
        | 2ac − 2sb         2bc + 2sa         1 − 2a² − 2b²  |

With values for a,b,c from (5.98), u = (u x , u y , u z ) and some trigonometric identities we find
MR(θ) =
| ux²(1 − cosθ) + cosθ          ux uy (1 − cosθ) − uz sinθ     ux uz (1 − cosθ) + uy sinθ |
| uy ux (1 − cosθ) + uz sinθ    uy²(1 − cosθ) + cosθ           uy uz (1 − cosθ) − ux sinθ |        (5.105)
| uz ux (1 − cosθ) − uy sinθ    uz uy (1 − cosθ) + ux sinθ     uz²(1 − cosθ) + cosθ       |

Finally, when the form (5.103) is fully expanded it can be shown to be just the matrix (5.105).
For more detail consult H&B p. 273.
The complete rotation about an arbitrary axis must then include translation as before to give the
form
R(θ ) = T −1 ⋅ M R (θ ) ⋅ T (5.106)
which corresponds to (5.89).

The following code shows how to construct a 3D rotation matrix (H&B p. 275).

class wcPt3D {
public:
GLfloat x, y, z;
};
typedef float Matrix4x4 [4][4];
Matrix4x4 matRot;
/* Construct the 4 by 4 identity matrix. */
void matrix4x4SetIdentity (Matrix4x4 matIdent4x4)
{
GLint row, col;
for (row = 0; row < 4; row++)
for (col = 0; col < 4 ; col++)
matIdent4x4 [row][col] = (row == col);
}
/* Premultiply matrix m1 times matrix m2, store result in m2. */
void matrix4x4PreMultiply (Matrix4x4 m1, Matrix4x4 m2)
{
GLint row, col;
Matrix4x4 matTemp;
for (row = 0; row < 4; row++)
for (col = 0; col < 4 ; col++)
matTemp [row][col] = m1 [row][0] * m2 [0][col] + m1 [row][1] *
m2 [1][col] + m1 [row][2] * m2 [2][col] +
m1 [row][3] * m2 [3][col];
for (row = 0; row < 4; row++)
for (col = 0; col < 4; col++)
m2 [row][col] = matTemp [row][col];
}

void translate3D (GLfloat tx, GLfloat ty, GLfloat tz)


{
Matrix4x4 matTransl3D;
/* Initialize translation matrix to identity. */
matrix4x4SetIdentity (matTransl3D);
matTransl3D [0][3] = tx;
matTransl3D [1][3] = ty;
matTransl3D [2][3] = tz;
/* Concatenate translation matrix with matRot. */
matrix4x4PreMultiply (matTransl3D, matRot);
}

void rotate3D (wcPt3D p1, wcPt3D p2, GLfloat radianAngle)


{
Matrix4x4 matQuaternionRot;
GLfloat axisVectLength = sqrt ((p2.x - p1.x) * (p2.x - p1.x) +
(p2.y - p1.y) * (p2.y - p1.y) +
(p2.z - p1.z) * (p2.z - p1.z));
GLfloat cosA = cos (radianAngle);
GLfloat oneC = 1 - cosA;
GLfloat sinA = sin (radianAngle);
GLfloat ux = (p2.x - p1.x) / axisVectLength;
GLfloat uy = (p2.y - p1.y) / axisVectLength;
GLfloat uz = (p2.z - p1.z) / axisVectLength;
/* Set up translation matrix for moving p1 to origin. */
translate3D (-p1.x, -p1.y, -p1.z);
/* Initialize matQuaternionRot to identity matrix. */
matrix4x4SetIdentity (matQuaternionRot);
matQuaternionRot [0][0] = ux*ux*oneC + cosA;
matQuaternionRot [0][1] = ux*uy*oneC - uz*sinA;
matQuaternionRot [0][2] = ux*uz*oneC + uy*sinA;
matQuaternionRot [1][0] = uy*ux*oneC + uz*sinA;
matQuaternionRot [1][1] = uy*uy*oneC + cosA;
matQuaternionRot [1][2] = uy*uz*oneC - ux*sinA;
matQuaternionRot [2][0] = uz*ux*oneC - uy*sinA;
matQuaternionRot [2][1] = uz*uy*oneC + ux*sinA;
matQuaternionRot [2][2] = uz*uz*oneC + cosA;
/* Combine matQuaternionRot with translation matrix. */
matrix4x4PreMultiply (matQuaternionRot, matRot);
/* Set up inverse matTransl3D and concatenate with
* product of previous two matrices.
*/
translate3D (p1.x, p1.y, p1.z);
}

void displayFcn (void)


{
/* Input rotation parameters. */
/* Initialize matRot to identity matrix: */
matrix4x4SetIdentity (matRot);
/* Pass rotation parameters to procedure rotate3D. */
/* Display rotated object. */
}

5.11.4 3D scaling
To scale p = ( x, y , z ) relative to the coordinate origin we use,
 x′   sx 0 0 0   x  x
 y ′  0 s 0 0 y    y
P′ ≡   =  y    or, P′ = S ⋅ P ≡ S ⋅   (5.107)
 z′   0 0 sz 0  z  z
      
 1  1442443
 0 0 0 1  1  1
S
which expands to x ′ = s x ⋅ x , y ′ = s y ⋅ y , z ′ = sz ⋅ z . (5.108)

Note that scaling changes the size of an object as well its position relative to the origin.
As before, to preserve the original shape we must have sx = s y = sz .

In order to scale w.r.t. an arbitrary fixed point ( x f , y f , z f ) we proceed as in 2D:


1. Translate ( x f , y f , z f ) to the origin (0,0,0).
2. Scale relative to (0,0,0) via (5.107/8).
3. Translate so that (0,0,0) returns to the original position ( x f , y f , z f ) .
Thus the scaling matrix is a concatenation of the above 3:

T(xf, yf, zf) ⋅ S(sx, sy, sz) ⋅ T(−xf, −yf, −zf)
   | sx   0    0    (1 − sx) xf |
 = | 0    sy   0    (1 − sy) yf |        (5.109)
   | 0    0    sz   (1 − sz) zf |
   | 0    0    0         1      |

The inverse of the scaling (5.109) is obtained by the replacements
sx → 1/sx,   sy → 1/sy,   sz → 1/sz   (undefined if any factor = 0!).

The following code shows how to construct a 3D scaling matrix (H&B p. 277).

class wcPt3D
{
private:
GLfloat x, y, z;
public:
/* Default Constructor:
* Initialize position as (0.0, 0.0, 0.0).
*/
wcPt3D ( ) {
x = y = z = 0.0;
}

void setCoords (GLfloat xCoord, GLfloat yCoord, GLfloat zCoord) {


x = xCoord;
y = yCoord;
z = zCoord;
}

GLfloat getx ( ) const {


return x;
}

GLfloat gety ( ) const {


return y;
}

GLfloat getz ( ) const {


return z;
}
};

typedef float Matrix4x4 [4][4];

void scale3D (GLfloat sx, GLfloat sy, GLfloat sz, wcPt3D fixedPt)
{

Matrix4x4 matScale3D;
/* Initialize scaling matrix to identity. */
matrix4x4SetIdentity (matScale3D);
matScale3D [0][0] = sx;
matScale3D [0][3] = (1 - sx) * fixedPt.getx ( );
matScale3D [1][1] = sy;
matScale3D [1][3] = (1 - sy) * fixedPt.gety ( );
matScale3D [2][2] = sz;
matScale3D [2][3] = (1 - sz) * fixedPt.getz ( );
}

5.11.5 Composite 3D transformation


As before a composite transformation may be built up by concatenating the matrices for the
individual transformations. As an example, the following code constructs a rotate + scale +
translate composite matrix in the left-to-right multiplication order i.e. starting with the identity
matrix on the left, we concatenate with the rotation (on the right), the result is then
concatenated by the scaling matrix (on the right) and finally with the translation matrix.

class wcPt3D {
public:
GLfloat x, y, z;
};
typedef GLfloat Matrix4x4 [4][4];
Matrix4x4 matComposite;
/* Construct the 4 by 4 identity matrix. */
void matrix4x4SetIdentity (Matrix4x4 matIdent4x4)
{
GLint row, col;
for (row = 0; row < 4; row++)
for (col = 0; col < 4 ; col++)
matIdent4x4 [row][col] = (row == col);
}

/* Premultiply matrix m1 times matrix m2, store result in m2. */


void matrix4x4PreMultiply (Matrix4x4 m1, Matrix4x4 m2)
{
GLint row, col;
Matrix4x4 matTemp;
for (row = 0; row < 4; row++)
for (col = 0; col < 4 ; col++)
matTemp [row][col] = m1 [row][0] * m2 [0][col] + m1 [row][1] *
m2 [1][col] + m1 [row][2] * m2 [2][col] +
m1 [row][3] * m2 [3][col];
for (row = 0; row < 4; row++)
for (col = 0; col < 4; col++)
m2 [row][col] = matTemp [row][col];
}

/* Procedure for generating 3D translation matrix. */


void translate3D (GLfloat tx, GLfloat ty, GLfloat tz)
{
Matrix4x4 matTransl3D;
/* Initialize translation matrix to identity. */
matrix4x4SetIdentity (matTransl3D);

matTransl3D [0][3] = tx;


matTransl3D [1][3] = ty;
matTransl3D [2][3] = tz;

/* Concatenate matTransl3D with composite matrix. */


matrix4x4PreMultiply (matTransl3D, matComposite);
}

/* Procedure for generating a quaternion rotation matrix. */


void rotate3D (wcPt3D p1, wcPt3D p2, GLfloat radianAngle)
{
Matrix4x4 matQuatRot;
float axisVectLength = sqrt ((p2.x - p1.x) * (p2.x - p1.x) +
(p2.y - p1.y) * (p2.y - p1.y) +
(p2.z - p1.z) * (p2.z - p1.z));
float cosA = cosf (radianAngle);
float oneC = 1 - cosA;
float sinA = sinf (radianAngle);
float ux = (p2.x - p1.x) / axisVectLength;
float uy = (p2.y - p1.y) / axisVectLength;
float uz = (p2.z - p1.z) / axisVectLength;
/* Set up translation matrix for moving p1 to origin,
* and concatenate translation matrix with matComposite.
*/
translate3D (-p1.x, -p1.y, -p1.z);
/* Initialize matQuatRot to identity matrix. */
matrix4x4SetIdentity (matQuatRot);
matQuatRot [0][0] = ux*ux*oneC + cosA;
matQuatRot [0][1] = ux*uy*oneC - uz*sinA;
matQuatRot [0][2] = ux*uz*oneC + uy*sinA;
matQuatRot [1][0] = uy*ux*oneC + uz*sinA;
matQuatRot [1][1] = uy*uy*oneC + cosA;
matQuatRot [1][2] = uy*uz*oneC - ux*sinA;
matQuatRot [2][0] = uz*ux*oneC - uy*sinA;
matQuatRot [2][1] = uz*uy*oneC + ux*sinA;
matQuatRot [2][2] = uz*uz*oneC + cosA;
/* Concatenate matQuatRot with composite matrix. */
matrix4x4PreMultiply (matQuatRot, matComposite);
/* Construct inverse translation matrix for p1 and
* concatenate with composite matrix.
*/
translate3D (p1.x, p1.y, p1.z);
}

/* Procedure for generating a 3D scaling matrix. */


void scale3D (GLfloat sx, GLfloat sy, GLfloat sz, wcPt3D fixedPt)
{
Matrix4x4 matScale3D;
/* Initialize scaling matrix to identity. */
matrix4x4SetIdentity (matScale3D);
matScale3D [0][0] = sx;
matScale3D [0][3] = (1 - sx) * fixedPt.x;
matScale3D [1][1] = sy;
matScale3D [1][3] = (1 - sy) * fixedPt.y;
matScale3D [2][2] = sz;
matScale3D [2][3] = (1 - sz) * fixedPt.z;
/* Concatenate matScale3D with composite matrix. */
matrix4x4PreMultiply (matScale3D, matComposite);
}

void displayFcn (void)


{
/* Input object description. */
/* Input translation, rotation, and scaling parameters. */

/* Set up 3D viewing-transformation routines. */


/* Initialize matComposite to identity matrix: */
matrix4x4SetIdentity (matComposite);
/* Invoke transformation routines in the order they
* are to be applied:
*/
rotate3D (p1, p2, radianAngle); // First transformation: Rotate.
scale3D (sx, sy, sz, fixedPt); // Second transformation: Scale.
translate3D (tx, ty, tz); // Final transformation: Translate.
/* Call routines for displaying transformed objects. */
}

5.11.6 Other 3D transformations


Reflections
A 3D reflection is done relative to a reflection axis or a reflection plane. Similarly to the 2D
case:-
• reflections relative to an axis = 180° rotation about the axis
• reflections relative to a plane = 180° rotation in 4D space
• for reflection plane = XY or XZ or YZ the coordinate system changes from a RH to LH
one

Example: The following reflection about the XY plane results in the RH → LH system (or vice
versa).

y y 1 0 0
0
z
0 1 0 0
x x M zrefl =  (5.110)
0 0 −1 0 
z  
0 0 0 1

The above effects an inversion of z-values with x = x, y = y, z = - z.


Similarly, to invert x-values we reflect about the YZ plane and for y-values about XZ. For
reflections about any other plane, we employ a combination of rotations, translations and these
reflections as in the 2D case.

Shears
A 3D shear is used to modify shapes, for example a Z-axis shear relative to a reference
position zref is effected by

           | 1   0   shzx   −shzx ⋅ zref |
Mzshear =  | 0   1   shzy   −shzy ⋅ zref |  ;   shzx, shzy = real numbers        (5.111)
           | 0   0    1           0      |
           | 0   0    0           1      |

The effect of this transformation is to change the x and y values by an amount proportional to
the distance from the plane z = zref .

5.12 Transforming between 3D coordinate systems



Recall that in the 2D case when we wanted to transform a scene from one Cartesian system to
another, we obtained the required transformation matrix by concatenating the matrices required
to back-transfer the new coordinate to the old position.

[Figure: the X′Y′Z′ system with origin (x0, y0, z0) and unit axis vectors u′x, u′y, u′z, shown relative to the XYZ system with origin (0,0,0)]

Thus, if XYZ is one system with origin (0,0,0) and X ′Y ′Z ′ is the other where the latter has
origin ( x0 , y0 , z0 ) relative to (0,0,0) and in addition unit vectors u′x , u′y , u′z along its coordinate
axes, then the rotation matrix
    | u′x1   u′x2   u′x3   0 |
R = | u′y1   u′y2   u′y3   0 |        (5.112)
    | u′z1   u′z2   u′z3   0 |
    |  0      0      0     1 |
will transform the vectors u′x , u′y , u′z onto the X, Y and Z axes directions respectively. The final
transformation is obtained by translating ( x0 , y0 , z0 ) to (0,0,0) so that the composite matrix is

        | u′x1   u′x2   u′x3   0 |   | 1   0   0   −x0 |
R ⋅ T = | u′y1   u′y2   u′y3   0 | ⋅ | 0   1   0   −y0 |        (5.113)
        | u′z1   u′z2   u′z3   0 |   | 0   0   1   −z0 |
        |  0      0      0     1 |   | 0   0   0    1  |

5.13 Affine transformations


Recall that in all the transformations studied so far (translation, rotation, scaling, reflection and
shear) we obtained the new coordinates ( x′, y ′, z′) as linear combinations of the old
coordinates ( x, y, z ) through equations of the form
x′ = a xx x + a xy y + a xz z + bx 

y ′ = a yx x + a yy y + a yz z + by  (5.114)
z′ = a zx x + a zy y + a zz z + bz 
where the a ' s and b ' s are constants.
We call such a transformation an affine transformation. Their properties include:-
• transforming parallel lines to parallel lines
• transforming finite points to finite points

The special affine transformation, involving only translation, rotation and reflection also
• preserves angles
• preserves lengths of lines
in addition to the above properties.
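
A brief sketch applying the general affine form (5.114) to a point (the coefficient-array layout is our own choice):

/* Apply the affine transformation (5.114), given the 3 x 3 coefficient
 * matrix a[][] and the translation vector b[], to the point (x, y, z).
 */
void affineTransform (const GLfloat a[3][3], const GLfloat b[3],
                      GLfloat & x, GLfloat & y, GLfloat & z)
{
   GLfloat xNew = a[0][0]*x + a[0][1]*y + a[0][2]*z + b[0];
   GLfloat yNew = a[1][0]*x + a[1][1]*y + a[1][2]*z + b[1];
   GLfloat zNew = a[2][0]*x + a[2][1]*y + a[2][2]*z + b[2];
   x = xNew;  y = yNew;  z = zNew;
}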

5.14 OpenGL transformation and related-operation functions


OpenGL supplies a separate function for each basic geometric transformation. These are
designed for 3D manipulations but set up a 4×4 transformation matrix as in the theory above.
For performing geometric transformations and related operations, the important functions are:-

1. Translation: glTranslate*(tx, ty, tz) – sets up a 4×4 translation matrix


Operates on 4-element column vector for a point. For 2D also on 4-element vector but
put tz = 0.0 and z-component = 0.0. Is applied to positions in an object defined after
this call.
Example: glTranslatef(30.5, -5.0, 0.0);

2. Rotation: glRotate*(theta, vx, vy, vz) – sets up a 4×4 rotation matrix


Uses the quaternion form (5.105). The float vector v = (vx, vy, vz) gives the orientation of
the rotation axis through the origin and theta = rotation angle in degrees. Is applied to
positions in an object defined after this call.
Example: glRotatef(90.0, 0.0, 0.0, 1.0); //does a 90° rotation about the z-axis.

3. Scaling: glScale*(sx, sy, sz) – sets up a 4×4 scaling matrix


Scaling factors sx, sy, sz can be +ve or −ve but not zero. For −ve values reflections are
produced. Is applied to positions in an object defined after this call.
Example: glScalef(2.0, -3.0, 1.0); //stretches by factor 2 in x, by factor 3 in y
//and reflects about x-axis.

4. MatrixMode: glMatrixMode(mode) – sets up a 4×4 matrix for each of various


modes possible:-
mode = GL_PROJECTION sets up a matrix used for projection transformation –
determines how a scene is projected onto the screen
mode = GL_MODELVIEW sets up viewing matrix to concatenate basic geometric
transformations – this sets up a 4×4 current matrix which gets modified as each basic
transformation is performed and is then used to operate on all coordinate positions in a
scene. The model view mode is the default.
mode = others (texture mode – for texture mapping matrix, color mode – for colour
model conversion) – see later

5. Modifying current mode matrices:


a. glLoadIdentity( ); //assigns identity matrix to current matrix of a mode.
b. glLoadMatrix*(elements16); //assigns 16 values in column-major order from
//single-subscripted array elements16 to current matrix
Example code segment:
glMatrixMode(GL_MODELVIEW);
GLfloat elems[16];
GLint k;
for (k=0; k < 16; k++)
elems[k] = float(k);
glLoadMatrixf(elems);

will produce the current matrix


     | 0.0   4.0    8.0   12.0 |
M =  | 1.0   5.0    9.0   13.0 | .
     | 2.0   6.0   10.0   14.0 |
     | 3.0   7.0   11.0   15.0 |

c. glMultMatrix*(otherElements16);
//post-multiplies current matrix by matrix formed into column-major form from
//16-element array otherElements16.
Example: The code segment

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glMultMatrixf(elemsM2);
glMultMatrixf(elemsM1);

will produce the following current modelview matrix


M = I ⋅ M 2 ⋅ M1
where I is the 4×4 identity, M 2 is formed from elemsM2 and M 1 is formed
from elemsM1.
Note that in OpenGL
• the first transformation applied in the sequence will be last one
specified (i.e. M 1 first, then M 2 , then I which does nothing). This is
like forming a matrix stack and popping them off when applying each
operation.
• since matrix elements are stored in column-major order, an element
such as m jk is the element in the j-th column and k-th row (opposite
to usual maths notation). Thus it’s safer to fill matrices by using a 16-
element single-subscripted array with elements arranged one column
after another, as in the above examples.

6. Matrix stack operations:


With the glMatrixMode(..) function, for each of its 4-modes (projection, modelview,
texture, color) OpenGL keeps a separate matrix stack. Initially, the top stack position in
each holds the 4×4 identity matrix. As particular mode operations are invoked, the
current matrix as a composition of these is put on the top of the particular stack.
Thus, the modelview matrix stack will hold the current matrix composed of all the
viewing and geometric transformations invoked so far. To allow for different viewing
combinations, typically a stack depth of 32 matrices is used. The following operations
are useful in handling matrix stacks (highly efficient since hardware implemented):
a. glGetIntegerv(glConst, array);
e.g.
glGetIntegerv(GL_MAX_MODELVIEW_STACK_DEPTH, stackSize);
//returns single integer value to array stackSize. Similar symbolic GL_
constants are available for querying the other mode stacks.

glGetIntegerv(GL_MODELVIEW_STACK_DEPTH, numMats);
//returns single integer value to array numMats, giving the current number of
matrices loaded onto the modelview stack.

b. glPushMatrix( ); //copies the current matrix to the top of the active stack. The
//second position now contains a copy of the current.
c. glPopMatrix( ); //will destroy matrix at the top of the active stack, and makes
//the second matrix the current matrix.

5.15 OpenGL code examples


In the following we examine 3 outline-code versions, each of which is used to (a) translate, (b)
rotate, and (c) scale-reflect a blue rectangle into a resultant red one:

[Figures: the original blue rectangle together with its translated, rotated, and scale-reflected red images, plotted on world-coordinate axes running from -200 to 200.]

//--------------------------------
//Version 1 - simple case
//--------------------------------
glMatrixMode (GL_MODELVIEW);
glColor3f (0.0, 0.0, 1.0);
glRecti (50, 100, 200, 150); // Display blue rectangle.
glColor3f (1.0, 0.0, 0.0);
glTranslatef (-200.0, -50.0, 0.0); // Set translation parameters.
glRecti (50, 100, 200, 150); // Display red, translated rectangle.
glLoadIdentity ( ); // Reset current matrix to identity.
glRotatef (90.0, 0.0, 0.0, 1.0); // Set 90-deg. rotation about z axis.
glRecti (50, 100, 200, 150); // Display red, rotated rectangle.
glLoadIdentity ( ); // Reset current matrix to identity.
glScalef (-0.5, 1.0, 1.0); // Set scale-reflection parameters.
glRecti (50, 100, 200, 150); // Display red, transformed rectangle.

//-------------------------------------------
//Version 2 – using matrix stacks
//------------------------------------------
glMatrixMode (GL_MODELVIEW);
glColor3f (0.0, 0.0, 1.0); // Set current color to blue.
glRecti (50, 100, 200, 150); // Display blue rectangle.
glPushMatrix ( ); // Make copy of identity (top) matrix.
glColor3f (1.0, 0.0, 0.0); // Set current color to red.
glTranslatef (-200.0, -50.0, 0.0); // Set translation parameters.
glRecti (50, 100, 200, 150); // Display red, translated rectangle.
glPopMatrix ( ); // Throw away the translation matrix.
glPushMatrix ( ); // Make copy of identity (top) matrix.
glRotatef (90.0, 0.0, 0.0, 1.0); // Set 90-deg. rotation about z axis.
glRecti (50, 100, 200, 150); // Display red, rotated rectangle.
glPopMatrix ( ); // Throw away the rotation matrix.
glScalef (-0.5, 1.0, 1.0); // Set scale-reflection parameters.
glRecti (50, 100, 200, 150); // Display red, transformed rectangle.

//---------------------------------------------------------
//Version 3 using composite transformations
//---------------------------------------------------------
class wcPt3D {
public:
GLfloat x, y, z;
};
/* Procedure for generating a matrix for rotation about
 * an axis defined by points p1 and p2.
 */
void rotate3D (wcPt3D p1, wcPt3D p2, GLfloat thetaDegrees)
{
/* Set up components for rotation-axis vector. */
float vx = (p2.x - p1.x);
float vy = (p2.y - p1.y);
float vz = (p2.z - p1.z);
/* Specify translate-rotate-translate sequence in reverse order: */
glTranslatef (p1.x, p1.y, p1.z); // Move p1 back to original position.
/* Rotate about axis through origin: */
glRotatef (thetaDegrees, vx, vy, vz);
glTranslatef (-p1.x, -p1.y, -p1.z); // Translate p1 to origin.
}
/* Procedure for generating a matrix for a scaling
* transformation with respect to an arbitrary fixed point.
*/

void scale3D (GLfloat sx, GLfloat sy, GLfloat sz, wcPt3D fixedPt)
{
/* Specify translate-scale-translate sequence in reverse order: */
/* (3) Translate fixed point back to original position: */
glTranslatef (fixedPt.x, fixedPt.y, fixedPt.z);
glScalef (sx, sy, sz); // (2) Scale with respect to origin.
/* (1) Translate fixed point to coordinate origin: */
glTranslatef (-fixedPt.x, -fixedPt.y, -fixedPt.z);
}
void displayFcn (void)
{
/* Input object description. */
/* Set up 3D viewing-transformation routines. */
/* Display object. */
glMatrixMode (GL_MODELVIEW);
/* Input translation parameters tx, ty, tz. */
/* Input the defining points, p1 and p2, for the rotation axis. */
/* Input rotation angle in degrees. */
/* Input scaling parameters: sx, sy, sz, and fixedPt. */
/* Invoke geometric transformations in reverse order: */
glTranslatef (tx, ty, tz); // Final transformation: Translate.
scale3D (sx, sy, sz, fixedPt); // Second transformation: Scale.
rotate3D (p1, p2, thetaDegrees); // First transformation: Rotate.
/* Call routines for displaying transformed objects. */
}

CHAPTER SIX: TWO-DIMENSIONAL VIEWING

Here we study the processes involved in depicting a 2D scene or model onto a 2D screen. We
thus consider the viewing pipeline from world coordinates to device coordinates.

6.1 The 2D viewing pipeline


The section of a scene that is to be shown on the screen is usually referred to as a clipping
window, since we require the unwanted portions to be discarded or clipped off. After this
section is mapped to device coordinates, its placement can be controlled within the display
window on the screen, by putting the mapped image of the clipping window into another
window known as the viewport. What is to be seen is selected by the clipping window, and
where on the output device it is shown, is the function of the viewport. In fact, several
clipping windows can be defined, with each one mapping to a separate viewport, either on one
device or distributed amongst many. For simplicity consider the case:

[Figure: a rectangular clipping window (xwmin...xwmax, ywmin...ywmax) in world coordinates and the viewport (xvmin...xvmax, yvmin...yvmax) it is mapped to in viewport coordinates.]

Recall from Chapter One, that the steps involved in the complete wc → dc 2D viewing
transformation can be stated as:
i) Construct the scene in world coordinates, using modelling coordinates for each
part.
ii) Set up a 2D viewing system with an oriented window
iii) Transform to viewing coordinates
iv) Define a viewport in normalized (0..1 or -1...1) coordinates and map to it from the
view coordinates
v) Clip all parts outside the viewport
vi) Transform to device coordinates

When all the transformations are done, clipping can be done in normalized coordinates or
device coordinates. The clipping process is fundamental in computer graphics.

6.2 Defining the 2D clipping window


Various shapes of clipping windows may be employed, but we consider only rectangular ones.
Even here, we may want to create one with a particular orientation w.r.t. the WC system in
order to capture a particular view of the scene. To handle this we typically define a viewing
coordinate (VC) system containing the clipping window all within the WC system:
[Figure: a viewing-coordinate (xview, yview) frame with origin at (x0, y0) and axis directions u and v, embedded at an angle in the world-coordinate system; T and R denote the translation and rotation that bring it into the standard position.]

We then transform this to a WC clipping window in the normal configuration before


transforming it to a viewport in device coordinates. To do this choose in WC a VC origin as
some convenient point P0 = ( x0 , y0 ) and a direction vector V for the yview direction. As an
alternative, we can specify a rotation angle w.r.t. the WC X (or Y axis). With the first method
the 2×2 rotation matrix R , say, is obtained by using unit vectors as follows:

v = V / |V| = (vx , vy) = 2nd row of R
u = (ux , uy) = (vy , −vx) = 1st row of R

However, we first need to translate the point P0 = (x0 , y0) to the WC origin by means of a


translation matrix T. Thus the composite transformation from WC → VC is given by the matrix
M WC →VC = R ⋅ T . (6.1)
(For the full details see Chapter Five)

After this transformation, the following example illustrates the final outcome in WC:

[Figure: the oriented clipping window with its u and v axes before the transformation, and the same window aligned with the world-coordinate axes after applying M_{WC→VC}.]

6.3 Mapping the clipping window to a normalized viewport


Graphics packages use differing schemes in the mapping from window to normalized viewport
coordinates. In some this is one composite transformation resulting in a viewport coordinate
range of 0...1 i.e. the viewport lies in a unit square. Then after clipping the unit square is
mapped to the device coordinates. In other systems, normalization and clipping is done first
and then the viewport transformation is applied. In the latter the viewport is specified in screen
coordinates relative to the display window position.

Here we consider a viewport defined with normalized coordinates in the range 0...1:

[Figure: a point (xw, yw) in the world-coordinate clipping window and its image (xv, yv) in the normalized viewport, which lies inside the unit square.]

The window-to-viewport transformation is obtained by maintaining the relative displacements of


the point ( xw, yw) and its mapped image ( xv, yv ) in their respective windows, as follows:

(xv − xvmin)/(xvmax − xvmin) = (xw − xwmin)/(xwmax − xwmin)
(yv − yvmin)/(yvmax − yvmin) = (yw − ywmin)/(ywmax − ywmin)          (6.2)

Re-arranging these relations gives the equations in terms of scaling factors and translation terms

xv = sx · xw + tx
yv = sy · yw + ty                                                    (6.3)

where
sx = (xvmax − xvmin)/(xwmax − xwmin)
sy = (yvmax − yvmin)/(ywmax − ywmin)                                 (6.4)

and
tx = (xwmax · xvmin − xwmin · xvmax)/(xwmax − xwmin)
ty = (ywmax · yvmin − ywmin · yvmax)/(ywmax − ywmin).                (6.5)

Note that when sx = s y the relative proportions (x vs. y) of individual objects are maintained,
otherwise stretching or contraction would occur in a particular direction.
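
As a small illustration of (6.3)–(6.5) (the routine and its parameter names are purely illustrative, not part of any package), a C function mapping a single WC point to normalized viewport coordinates might look like this:

/* Sketch: window-to-viewport mapping of one point, using (6.3)-(6.5).
   The clipping-window and viewport limits are assumed to be supplied. */
void windowToViewport (float xw, float yw,
                       float xwMin, float xwMax, float ywMin, float ywMax,
                       float xvMin, float xvMax, float yvMin, float yvMax,
                       float *xv, float *yv)
{
    float sx = (xvMax - xvMin) / (xwMax - xwMin);                     /* (6.4) */
    float sy = (yvMax - yvMin) / (ywMax - ywMin);
    float tx = (xwMax * xvMin - xwMin * xvMax) / (xwMax - xwMin);     /* (6.5) */
    float ty = (ywMax * yvMin - ywMin * yvMax) / (ywMax - ywMin);
    *xv = sx * xw + tx;                                               /* (6.3) */
    *yv = sy * yw + ty;
}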

Another way to obtain the window-to-viewport transformation is to use the transformation


sequence (consult Chap. Five):-
1. Scale the clipping window to the viewport window size using the fixed point
( xwmin , ywmin ).
2. Translate ( xwmin , ywmin ) to ( xvmin , yvmin ) .

Thus (consult Chap. 5) for step 1 we use the scaling matrix

S = \begin{pmatrix} s_x & 0 & xw_{min}(1 - s_x) \\ 0 & s_y & yw_{min}(1 - s_y) \\ 0 & 0 & 1 \end{pmatrix}    (6.6)

and in step 2 we employ

T = \begin{pmatrix} 1 & 0 & xv_{min} - xw_{min} \\ 0 & 1 & yv_{min} - yw_{min} \\ 0 & 0 & 1 \end{pmatrix}    (6.7)

resulting in the composite wc → normalized viewport transformation matrix

M_{wc.window→norm.viewport} = T · S = \begin{pmatrix} s_x & 0 & t_x \\ 0 & s_y & t_y \\ 0 & 0 & 1 \end{pmatrix},    (6.8)

which reproduces (6.3).


Clipping may be employed on either on the clipping window boundaries or on the viewport
boundaries. After this the unit square containing the viewport may be mapped to device
coordinates.

6.4 Mapping the clipping window to a normalized square


In this second approach, we transfer the clipping window to a normalized square, clip in
normalized coordinates, and then map the square to a viewport in screen coordinates.

[Figure: a point (xw, yw) in the world-coordinate clipping window, its image (xnorm, ynorm) in the normalized square spanning -1...1, and the final image (xv, yv) in the screen viewport.]

For this process we simply have to adjust the transformation equations of section 6.3 as
follows:

For wc → nc we put xvmin = yvmin = −1 and xvmax = yvmax = +1 into (6.6) – (6.8) to obtain

M_{wc.window→norm.square} = \begin{pmatrix} \dfrac{2}{xw_{max} - xw_{min}} & 0 & -\dfrac{xw_{max} + xw_{min}}{xw_{max} - xw_{min}} \\ 0 & \dfrac{2}{yw_{max} - yw_{min}} & -\dfrac{yw_{max} + yw_{min}}{yw_{max} - yw_{min}} \\ 0 & 0 & 1 \end{pmatrix}.    (6.9)

For nc → vc, after applying clipping, we transform the normalized square of side length 2 to the
screen viewport by putting xwmin = ywmin = −1 and xwmax = ywmax = +1 into (6.6) – (6.8)
to obtain

M_{norm.square→viewport} = \begin{pmatrix} \dfrac{xv_{max} - xv_{min}}{2} & 0 & \dfrac{xv_{max} + xv_{min}}{2} \\ 0 & \dfrac{yv_{max} - yv_{min}}{2} & \dfrac{yv_{max} + yv_{min}}{2} \\ 0 & 0 & 1 \end{pmatrix}.    (6.10)

In choosing viewport sizes, it is important to keep the aspect ratio the same as that of the
original clipping window, to avoid coordinate stretching and hence distortion in shapes. In the
final step we position the viewport in the display window. The usual technique is to set its
lower-left corner, at some position ( xs , ys ) relative to the lower-left position of the display
window, e.g.

[Figure: a display window on the screen containing a viewport whose lower-left corner is at (xs, ys) relative to the lower-left corner of the display window.]

Remarks:
• Character strings can be mapped by sending them through as bitmaps (for maintaining
constant character sizes), or by transforming the defining positions of outline fonts,
similar to any other primitive.
• Split-screen effects and multiple output device viewports can be achieved by


extensions of the above processes.

6.5 OpenGL functions for 2D viewing


OpenGL functions are generally designed for 3D transformation and viewing. Here a function is
available for defining viewports, and in GLU we can specify a 2D clipping window, whilst in
GLUT we can create display windows in screen coordinates.

The major steps and functions used are:-


Since there is no separate function for setting up a 2D viewing coordinate system in OGL, we
first need to set the mode for transforming from world coordinates to screen coordinates by:
glMatrixMode(GL_PROJECTION); //creates & makes current trans mat
//& set it to identity
with also glLoadIdentity(); //required if looping back to define new matrices

Create a 2D clipping window with the GLU function


gluOrtho2D(xwmin, xwmax, ywmin, ywmax); //params are doubles in WCs
Specifies an orthogonal projection for mapping from WCs to screen coords (parallel projection
lines from scene points meet the XY screen plane at right angles). The projection is onto a normalized square (-1...+1 in
x & y) on which the clipping routines act, discarding all parts of the scene outside it.

A viewport is created with


glViewport(xvmin, yvmin, vpWidth, vpHeight); //all params in int screen
//coords with xvmin,yvmin=lower left pos. rel. to display window
When not invoked the default viewport is taken as the display window. Multiple viewports may
also be defined.

The currently active viewport’s parameters are obtained with


glGetIntegerv(GL_VIEWPORT, vpArray);
where vpArray is a single-subscripted array holding the return values of xvmin, yvmin,
vpWidth, vpHeight.

Creating a GLUT display window is done by


glutInit(&argc, argv); //must call to initialize GLUT–can pass command line args
//like in C’s main(..)
glutInitWindowPosition(xTopLeft, yTopLeft); //sets position rel. to top left screen
//coordinates; default is -1,-1
glutInitWindowSize(dwWidth, dwHeight); //sets its size, default is 300 x 300

glutCreateWindow(“Title goes here”); //create it – but it will not show until glutMainLoop() is entered

To show the window we must first set it up by


glutInitDisplayMode(mode); //set up for different modes, e.g.
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB); //single display buffer + RGB
//color scheme
glClearColor(red, green, blue, alpha); //clear to background RGB color or by
glClearIndex(index); //same for index mode, index→ position in table ~ color

A +ve integer identifier for a display window is obtained at creation by


windowID = glutCreateWindow(“Title goes here”); //system starts with value 1

Can destroy a GLUT display window by


glutDestroyWindow(windowID);

To make a display window the current one (on which all subsequent ops performed)
glutSetWindow(windowID); //default is last created

To find which window is current, get its ID by


currentWindowID = glutGetWindow();

To reposition and re-size a current window


glutPositionWindow(xNewTopLeft, yNewTopLeft);
glutReshapeWindow(dwNewWidth, dwNewHeight);
glutFullScreen(); //makes the current expand to full screen size

To adjust for aspect ratio distortion in shapes on resizing can call


glutReshapeFunc(winReshapeFcn);
Activated when current window is resized, and new width and height values are passed as
parameters to void winReshapeFcn(int newWidth, int newHeight) = “call-back” function for
“resize event”
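
A minimal sketch of such a call-back (assuming, as in the example program later in this section, a square clipping window spanning -100...100 in both x and y) keeps the viewport square so that the aspect ratio of the clipping window is preserved:

/* Sketch of a reshape call-back: use the largest square viewport that fits
   the resized display window, so a square clipping window is not stretched. */
void winReshapeFcn (int newWidth, int newHeight)
{
    GLint side = (newWidth < newHeight) ? newWidth : newHeight;
    glViewport (0, 0, side, side);               // square viewport
    glMatrixMode (GL_PROJECTION);
    glLoadIdentity ( );
    gluOrtho2D (-100.0, 100.0, -100.0, 100.0);   // re-specify the clipping window
    glMatrixMode (GL_MODELVIEW);
}

It is registered with glutReshapeFunc(winReshapeFcn) as above.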

To reduce a current window to an icon on lower task bar call


glutIconifyWindow();

To change name for window icon use


glutSetIconTitle(“new window name”);

To change the name of a current window call


glutSetWindowTitle(“New Window Name”);

To make a particular window the top window (of many) call


glutSetWindow(windowID); //make the reqd id-ed one current
glutPopWindow(); //brings it on top

To push a particular window to the back (behind all others) call


glutSetWindow(windowID); //make the reqd id-ed one current
glutPushWindow(); //send it to the back

To hide a current window (take it off-screen) call


glutHideWindow();

To bring back a hidden or iconified window to a current one call


glutShowWindow();

To create a sub-window of a window with windowID call


glutCreateSubWindow(windowID, xBottomLeft, yBottomLeft, width, height);
This will get its own id, and can have its own mode (color etc) and parameters

To set a screen cursor shape over a current window call


glutSetCursor(shape); //only a request, since system dependent
Possibilities: shape = GLUT_CURSOR_UP_DOWN (up-down arrow),
GLUT_CURSOR_CYCLE (rotating arrow) etc see H&B p. 311.

To display something in current window use


glutDisplayFunc(pictureDescrip);
Here pictureDescrip(..) is a function that describes a scene or what is to be shown, and is the
“callback function”, which is automatically called by GLUT when the window contents are to be
renewed. It normally calls the OpenGL primitives and attributes required to construct the scene
and other required functions, such as menu constructors etc.
To display in other windows or icons we repeat the above.

To redisplay a window (if it may have been damaged after a glutPopWindow(.) for example)
call
glutPostRedisplay(); //redisplays a current window

To begin execution of a GLUT program use


glutMainLoop();
This will put the program into an infinite GLUT processing loop, “catching events” such as mouse
clicks, window-close requests, key presses etc. In the process GLUT will
determine when any window is to be redrawn.

Other GLUT function calls may be made


Various functions are available in this library such as
glutIdleFunc(function); //will call function() when no other processing being done
//used for doing background updates etc

glutGet(stateParam); //returns a system parameter ~ symbolic const stateParam

e.g. stateParam = GLUT_WINDOW_WIDTH will return the current window width


glut functions for 3D shapes – see later

Example program:
This program will construct and display two triangles, each in its own viewport, with the second
one rotated through +90°.

//OGLViewingProg2D.cpp
//--------------------------------
//Split-screen (dual viewport) example (H & B Chap 6)
#include <windows.h>
#include <GL/glut.h>
class wcPt2D {
public:
GLfloat x, y;
};

void init (void)


{
/* Set color of display window to white. */
glClearColor (1.0, 1.0, 1.0, 0.0);
/* Set parameters for world-coordinate clipping window. */
glMatrixMode (GL_PROJECTION);
gluOrtho2D (-100.0, 100.0, -100.0, 100.0);
/* Set mode for constructing geometric transformation matrix. */
glMatrixMode (GL_MODELVIEW);
}

void triangle (wcPt2D *verts)


{
GLint k;
glBegin (GL_TRIANGLES);
for (k = 0; k < 3; k++)
glVertex2f (verts [k].x, verts [k].y);
glEnd ( );
}

void displayFcn (void)


{
/* Define initial position for triangle. */
wcPt2D verts [3] = { {-50.0, -25.0}, {50.0, -25.0}, {0.0, 50.0} };
glClear (GL_COLOR_BUFFER_BIT); // Clear display window.
glColor3f (0.0, 0.0, 1.0); // Set fill color to blue.
glViewport (0, 0, 300, 300); // Set left viewport.
triangle (verts); // Display triangle.
/* Rotate triangle and display in right half of display window. */
glColor3f (1.0, 0.0, 0.0); // Set fill color to red.
glViewport (300, 0, 300, 300); // Set right viewport.
glRotatef (90.0, 0.0, 0.0, 1.0); // Rotate about z axis.
triangle (verts); // Display red rotated triangle.
glFlush ( );
}

int main (int argc, char ** argv)


{
glutInit (&argc, argv);
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
glutInitWindowPosition (50, 50);
glutInitWindowSize (600, 300);
glutCreateWindow ("Split-Screen Example");
init ( );
glutDisplayFunc (displayFcn);
glutMainLoop ( );
}

6.6 2D clipping
Clipping algorithms are used to remove unwanted portions of a picture – we either remove the
outside or the inside of a clipping window, which can be rectangular (most common),
polygonal or curved (complex). A study of the various algorithms for point clipping, line
clipping, polygon clipping, curved area clipping and text clipping is being deferred for later
study, if time permits, and in order to study other aspects of CG. If you are interested in this
subject you can obtain a set of (hand written) notes from me, or see H&B p. 315.

CHAPTER SEVEN: THREE-DIMENSIONAL VIEWING

Viewing a scene in 3D is much more complicated than 2D viewing: in the 2D case the
viewing plane onto which a scene is projected from WCs is essentially the screen itself, apart from its
dimensions. In 3D, we can choose different viewing planes, directions to view from and
positions to view from. We also have a choice in how we project from the WC scene onto the
viewing plane.

7.1 Basic concepts


In the process of viewing a 3D scene we set up a coordinate system for viewing, which holds
the viewing or “camera” parameters: position and orientation of a viewing or projection plane
(~ camera “film”). Then objects in the scene are projected onto the plane in two general ways:
• parallel projection - where lines from points in the scene hit the plane along parallel
paths (used in engineering drawings and drafting), or,
• perspective projection - where lines from points in the scene converge to a focus
point behind the plane and in the process intersect the plane forming the projected
image. This gives the realistic effect of making more distant objects look smaller.

In addition, depth cueing which involves colouring different surfaces or parts of them to give
the impression of different position depths, may be employed, and surface rendering,
lighting, visible surface detection and clipping algorithms come into play as well.

7.2 The 3D viewing pipeline


Generating a view of a 3D scene on an output device is similar to taking a photograph of it,
except that many more possibilities are open to us in the way the “camera” is positioned, its
aperture (view volume) is chosen, the orientation and position of the view plane is selected etc.
The following summarises the steps involved from the actual construction of a 3D scene to its
ultimate depiction on a device:
1. Construct objects in modelling coordinates (MCs)
2. Pass object description through the modelling transformation to a WC scene.
3. Pass scene description through the viewing transformation to view coordinates (VCs)
4. Pass through the projection transformation to projection coordinates (PCs)
5. Pass through the normalizing transformation and clipping algorithms to normalized
coordinates (NCs)
6. Pass through the viewport transformation to device coordinates (DCs)

7.3 The 3D viewing coordinate system


As in 2D we choose in WCs, an origin P0 = ( x0 , y0 , z0 ) for it, called the view point or viewing
position (also called the eye position or camera position in some packages). Then we
choose a view up vector V which defines its y-direction, yv and in addition a vector giving the
direction along which viewing is done defining its zv direction.

The view plane or projection plane is usually taken as a plane that is ⊥ zv -axis and is set at
a position zvp from the origin. Its orientation is specified by choosing a view-plane normal
vector N (w.r.t. WC system) which also specifies the direction of the positive zv direction. The
following right-handed systems are indicative of the set up typically employed.
[Figure: the world-coordinate (xw, yw, zw) system and a viewing-coordinate system with origin P0 = (x0, y0, z0), axes xv, yv, zv, view-plane normal N, and a reference point Pref in the scene.]

The direction of viewing is usually taken as the − N (or − zv ) direction, for RH coordinate
systems (or in the opposite direction corresponding to LH coordinate systems).
Choosing the view-plane normal N :
• can take as “out from object” by taking N = P0 − OriginWC
• or, from a reference point Pref (“look at point”) in scene to P0 i.e. N = P0 − Pref
• or, define direction cosines for it using angles θ ,ϕ ,φ w.r.t. the WC X,Y,Z axes

Choosing the view-up vector V :


Require it to be ⊥ N, but since this is not easy to specify exactly, we usually take
• V = (0,1,0) = WC Y direction and
• adjust or let code/package adjust

Forming the viewing coordinate frame:


Having chosen N we form the unit normal vector n, for the zv direction, form the unit vector u
for the xv direction, and then adjust V to get a new unit vector v for the yv direction, using
cross-products to obtain each one orthogonal to the plane of the other two:

n = N / |N| = (nx , ny , nz)
u = (V × n) / |V × n| = (ux , uy , uz)                               (7.1)
v = n × u = (vx , vy , vz)

We then call this system a uvn viewing-coordinate reference frame.

Setting up the view-plane:


Finally, the view plane is chosen as a plane ⊥ n (i.e. ⊥ the zv-axis), located at some point on that axis at
some distance from the view-frame origin.
[Figure: the uvn viewing reference frame with origin P0 = (x0, y0, z0) and unit vectors u, v, n; possible view planes perpendicular to the zv-axis; the viewing direction is along −zv, towards the scene.]

7.4 Transforming from world to viewing coordinates


To project the scene description onto the view plane we must firstly transform from world
coordinates ( xw , yw , zw ) to view coordinates ( xv , yv , zv ) as follows:-
• translate the viewing reference origin P0 = ( x0 , y0 , z0 ) to the WC origin (0,0,0) via
the translation matrix
 1 0 0 − x0 
0 1 0 − y  xv′ = xw − x0 
 0 
T= , giving intermediate Pv′ = T ⋅ Pw , or yv′ = yw − y0  (7.2)
0 0 1 − z0  
  zv′ = zw − z0 
0 0 0 1 

• rotate the translated frame (intermediate coordinates Pv′ ), possibly requiring up to 3


rotations, about the coordinate axes with a composite rotation matrix, R = Rz ⋅ Ry ⋅ Rx
[Figure: the translation T takes the viewing origin P0 to the world origin; successive rotations Rx, Ry, Rz then align the viewing axes with the world axes, giving the composite rotation R = Rz · Ry · Rx.]

• Since we know the coordinate unit vectors for the VC system from (7.1), we can obtain
the final rotation matrix R by placing these vector components into its rows, by
extending the arguments of Chap. 5. Thus we can write

R = \begin{pmatrix} u_x & u_y & u_z & 0 \\ v_x & v_y & v_z & 0 \\ n_x & n_y & n_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}    (7.3)

• Then the final WC → VC transformation is given from (7.2) and (7.3) by

M_{WC→VC} = R · T = \begin{pmatrix} u_x & u_y & u_z & -u \cdot P_0 \\ v_x & v_y & v_z & -v \cdot P_0 \\ n_x & n_y & n_z & -n \cdot P_0 \\ 0 & 0 & 0 & 1 \end{pmatrix}    (7.4)

where
−u · P0 = −x0 ux − y0 uy − z0 uz
−v · P0 = −x0 vx − y0 vy − z0 vz                                     (7.5)
−n · P0 = −x0 nx − y0 ny − z0 nz

This transformation may be obtained by more direct calculations by considering direction


cosines of the VC coordinate unit vectors relative to the WC coordinate axes (see tutorial).
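
The following sketch (all type and function names are illustrative) builds the matrix (7.4) directly from P0, Pref and V using the unit vectors of (7.1); for readability it stores the result as m[row][col] rather than in OpenGL's column-major layout.

/* Sketch: construct M(WC -> VC) of (7.4) from P0, Pref and the view-up vector V. */
#include <math.h>

typedef struct { float x, y, z; } Vec3;

static Vec3  vSub  (Vec3 a, Vec3 b) { Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }
static Vec3  vCross(Vec3 a, Vec3 b) { Vec3 r = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; return r; }
static float vDot  (Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3  vUnit (Vec3 a) { float L = (float) sqrt (vDot (a, a)); Vec3 r = { a.x/L, a.y/L, a.z/L }; return r; }

void buildWCtoVC (Vec3 P0, Vec3 Pref, Vec3 V, float m[4][4])
{
    int k;
    Vec3 n = vUnit (vSub (P0, Pref));       /* (7.1): n along N = P0 - Pref   */
    Vec3 u = vUnit (vCross (V, n));         /*        u = V x n / |V x n|     */
    Vec3 v = vCross (n, u);                 /*        v = n x u               */
    float rows[4][4] = {                    /* rows of (7.4), using (7.5)     */
        { u.x, u.y, u.z, -vDot (u, P0) },
        { v.x, v.y, v.z, -vDot (v, P0) },
        { n.x, n.y, n.z, -vDot (n, P0) },
        { 0.0f, 0.0f, 0.0f, 1.0f }
    };
    for (k = 0; k < 4; k++) {
        m[k][0] = rows[k][0]; m[k][1] = rows[k][1];
        m[k][2] = rows[k][2]; m[k][3] = rows[k][3];
    }
}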

7.5 Projecting from viewing coordinates to the view plane


After the WC → VC transformation the points from the transformed 3D scene/object are then
projected onto the 2D view plane. Two basic methods used for this operation are:-
• parallel projection

Here coordinate positions are transformed along parallel lines, which meet the view
plane, thus forming the projected image. Used in drafting and accurate engineering
drawings.
• perspective projection
Here coordinate positions are transformed along lines that converge to a point behind
the view plane, called the projection reference point or centre of projection. The
image on the view plane is formed by the intersections of the lines with it. Thus distant
objects look smaller than nearer ones of the same sizes, adding realism in the view.

Examples of transformed lines in these two schemes are:-

[Figure: under parallel projection, points P1 and P2 map to P1′ and P2′ along parallel lines meeting the view plane; under perspective projection, the projection lines converge to a single point behind the view plane.]

7.6 Orthogonal parallel projection


When the projection lines (direction) are parallel to the viewing axis zv and they meet the view
plane at right angles, then any point ( x, y , z ) in X vYv Z v coordinates → ( x p , y p ) on the view
screen by
x p = x, y p = y (7.6)

where now the z coordinate may be used for depth cueing, i.e. can colour each pixel in
proportion to z.
This transformation is also known by the term orthographic parallel projection.

An orthogonal-projection view volume is used to limit the volume extent of a scene that will
be orthogonally projected onto the view plane. This is formed by defining a rectangular clipping
window in the view plane and extending this to a rectangular parallelepiped in the zv direction by
enclosing it between the planes zv = znear and zv = zfar, which are parallel to the view plane.
[Figure: the orthogonal-projection view volume: a rectangular clipping window in the view plane, extended between the near and far planes to form a rectangular parallelepiped; a point (x, y, z) projects to (x, y) in the clipping window.]

Mapping to a normalized view volume is then done in most packages, where the normalized
view volume has coordinate extents the unit cube (coordinate ranges 0...1) or a symmetric
cube (coordinate ranges -1...+1). After this, the normalized volume is mapped to screen
coordinates, which generally are LH coordinate systems.

[Figure: the view volume with corners (xwmin, ywmin, znear) and (xwmax, ywmax, zfar) is mapped to the symmetric normalized cube with corners (−1, −1, −1) and (1, 1, 1), and then to a viewport in the (left-handed) screen-coordinate system.]

Here, for the coordinate systems shown position ( xwmin , ywmin , znear ) → (−1, −1, −1) and
position ( xwmax , ywmax , z far ) → (1,1,1) . The procedure for obtaining the transformation
matrix from ( xv , yv , zv ) → ( xnorm , ynorm , znorm ) follows the argument for (6.9), with the
additional requirement that we transform the range ( znear , z far ) → (−1,1). The matrix that
results is

M_{ortho→norm} = \begin{pmatrix} \dfrac{2}{xw_{max} - xw_{min}} & 0 & 0 & -\dfrac{xw_{max} + xw_{min}}{xw_{max} - xw_{min}} \\ 0 & \dfrac{2}{yw_{max} - yw_{min}} & 0 & -\dfrac{yw_{max} + yw_{min}}{yw_{max} - yw_{min}} \\ 0 & 0 & \dfrac{-2}{z_{near} - z_{far}} & \dfrac{z_{near} + z_{far}}{z_{near} - z_{far}} \\ 0 & 0 & 0 & 1 \end{pmatrix}.    (7.7)

The complete transformation matrix from WC → normalized orthogonal-projection


coordinates then takes the form
M WC→ ortho.norm = M ortho→norm ⋅ R ⋅ T
where R ⋅ T is the composite viewing transformation matrix (7.4).
Finally, clipping processes, visibility testing, surface rendering etc are performed followed by
the viewport transformation to screen coordinates.
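
As a sketch (the routine name is illustrative and this is not a library function), the normalization matrix (7.7) can be packed into a 16-element column-major array, for example for loading with glLoadMatrixf:

/* Sketch: fill (7.7) into a column-major array m[16]; the element in row r,
   column c of the matrix goes into m[c*4 + r]. */
void orthoNormMatrix (float xwMin, float xwMax, float ywMin, float ywMax,
                      float zNear, float zFar, GLfloat m[16])
{
    int k;
    for (k = 0; k < 16; k++) m[k] = 0.0f;
    m[0]  =  2.0f / (xwMax - xwMin);                  /* row 1, column 1 */
    m[12] = -(xwMax + xwMin) / (xwMax - xwMin);       /* row 1, column 4 */
    m[5]  =  2.0f / (ywMax - ywMin);                  /* row 2, column 2 */
    m[13] = -(ywMax + ywMin) / (ywMax - ywMin);       /* row 2, column 4 */
    m[10] = -2.0f / (zNear - zFar);                   /* row 3, column 3 */
    m[14] =  (zNear + zFar) / (zNear - zFar);         /* row 3, column 4 */
    m[15] =  1.0f;                                    /* row 4, column 4 */
}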

7.7 Oblique parallel projection


In this case parallel projection lines emanating from points in the object intersect the view plane
at some angle α relative to the plane surface, where the direction is usually specified by a
parallel-projection view vector Vp :

[Figure: parallel projection lines along the vector Vp carry points of the object onto the view plane, forming its image.]

To obtain the transformation equations consider:

[Figure: a point (x, y, z) in viewing coordinates, its orthographic projection (x, y) on the view plane, and its oblique projection (xp, yp); the oblique projection line makes an angle α with the plane, and the in-plane line of length L from (x, y) to (xp, yp) makes an angle φ with the xv-axis.]

Suppose that the point ( x, y , z ) → ( x p , y p ) on the view plane (at z = 0 ) and that ( x, y ) is its
orthographic projection on the plane.
Let α = angle between the line joining ( x, y ) and ( x p , y p ) on the plane and the line
( x, y , z ) → ( x p , y p ) .
Let φ = angle between the line joining ( x, y ) and ( x p , y p ) and the xv - axis. Then
xp = x + L cos φ
yp = y + L sin φ                                          (7.8)

tan α = z / L.                                            (7.9)

Thus L = z / tan α ≡ z L1, where L1 = 1/tan α = cot α     (7.10)

and

xp = x + L1 z cos φ
yp = y + L1 z sin φ.                                      (7.11)

If the view plane was located at some position zvp on the zv-axis, then the above
transformation equations from VC to projected coordinates are simply affected by a translation,
so with zvp − z replacing z,

xp = x + L1 (zvp − z) cos φ
yp = y + L1 (zvp − z) sin φ                               (7.12)
zp = zvp

and the transformation matrix is

1 0 − L1 cos φ zvp L1 cos φ


0 1 − L1 sin φ zvp L1 sin φ
M oblique = ; L1 = cot α . (7.13)
0 0 0 zvp
0 0 0 1

Remarks:
1. From (7.13) we recover the orthogonal projection (orthographic) case by setting
L1 = 0 or α = 90°.
2. The transformation (7.13) corresponds to the matrices for z-shears: planes of constant
z are sheared and projected onto the view plane. See H&B p. 365.
3. Typical values used in applications for the angle φ are φ = 30° or 45°, giving a
combination of front, side, and top views (or front, side and bottom views).
4. Common values for α are α = 45° or tan α = 1 (cavalier projection) and α ≈ 63.4° or
tan α = 2 (cabinet projection). See H&B p. 365.

Some graphics packages supply a WC→ VC transformation matrix in terms of the parallel-
projection view vector Vp . Then all WC points are projected onto the view plane with their
projection lines parallel to this vector. The matrix (7.13) can be written in terms of components
of Vp w.r.t. the view coordinate system. See H&B p.366. Finally, as in the case of orthogonal
projections we need to perform the VC → normalized → viewport transformations to obtain the
complete (composite) transformation from WC → screen viewport.

7.8 Perspective projection

[Figure: left, a general perspective projection: a point (x, y, z) projects to (xp, yp, zvp) on the view plane along the line to the projection reference point (xprp, yprp, zprp); right, the same construction seen from the xv-axis, looking onto the yv–zv plane.]

Consider the general point P = ( x, y , z ) being projected onto the view plane as ( x p , y p , zvp ) ,
along the line converging to the projection reference point ( x prp , y prp , z prp ) . With the view
plane ⊥ the zv axis of the VC system and located at zvp on it, the RH figure shows the situation
when viewed from the xv axis.
Now, along the perspective projection line, the equations determining any general point
P′ = ( x′, y ′, z ′) on it may be written in parametric form as
x′ = x − (x − xprp) u
y′ = y − (y − yprp) u ;   0 ≤ u ≤ 1                       (7.14)
z′ = z − (z − zprp) u
where for the parameter value u = 0 we recover the starting point P = ( x, y , z ) and for u = 1
the projection reference point ( x prp , y prp , z prp ).
Now, suppose we consider a P′ = ( x′, y ′, z ′) = ( x p , y p , zvp ) to be a point on the view plane,
i.e. z′ = zvp , then from the last of (7.14) we obtain at this position the parameter value
u = (zvp − z) / (zprp − z)                                (7.15)

Substituting this into (7.14) gives the equations of the perspective projection from viewing
coordinates onto the view plane:

x_p = x \left( \frac{z_{prp} - z_{vp}}{z_{prp} - z} \right) + x_{prp} \left( \frac{z_{vp} - z}{z_{prp} - z} \right)

y_p = y \left( \frac{z_{prp} - z_{vp}}{z_{prp} - z} \right) + y_{prp} \left( \frac{z_{vp} - z}{z_{prp} - z} \right)          (7.16)

z_p = z_{vp} = z \left( \frac{z_{prp} - z_{vp}}{z_{prp} - z} \right) + z_{prp} \left( \frac{z_{vp} - z}{z_{prp} - z} \right)
Notice that if we write this equation in matrix form, then the matrix elements will contain
functions of the original z-coordinate, which makes the transformation matrix unsuitable for
the purposes of concatenation with other matrices. We resolve this problem by invoking
homogeneous coordinates, setting:

Homogeneous parameter:   h = zprp − z                                        (7.17)

Projection–homogeneous coordinate relations:   xp = xh / h,  yp = yh / h,  zp = zh / h        (7.18)
Then (7.16) becomes

xh = x (zprp − zvp) + xprp (zvp − z)
yh = y (zprp − zvp) + yprp (zvp − z)                                 (7.19)
zh = z (zprp − zvp) + zprp (zvp − z)
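
A small sketch (names are illustrative) of applying (7.19) and the homogeneous divide (7.18) to project a single viewing-coordinate point:

/* Sketch: project one VC point using (7.19) and the divide (7.18);
   (xprp, yprp, zprp) is the projection reference point, zvp the view-plane position. */
void perspPoint (float x, float y, float z,
                 float xprp, float yprp, float zprp, float zvp,
                 float *xp, float *yp, float *zp)
{
    float xh = x * (zprp - zvp) + xprp * (zvp - z);     /* (7.19) */
    float yh = y * (zprp - zvp) + yprp * (zvp - z);
    float zh = z * (zprp - zvp) + zprp * (zvp - z);
    float h  = zprp - z;                                /* (7.17) */
    *xp = xh / h;                                       /* (7.18): zp comes out as zvp */
    *yp = yh / h;
    *zp = zh / h;
}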
In matrix form this is

P_h \equiv \begin{pmatrix} x_h \\ y_h \\ z_h \\ h \end{pmatrix} = \begin{pmatrix} z_{prp} - z_{vp} & 0 & -x_{prp} & x_{prp} z_{vp} \\ 0 & z_{prp} - z_{vp} & -y_{prp} & y_{prp} z_{vp} \\ 0 & 0 & -z_{vp} & z_{vp} z_{prp} \\ 0 & 0 & -1 & z_{prp} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix},   or   P_h = M_{persp} · P    (7.20)

where the VC perspective-projection transformation matrix is

M_{persp} = \begin{pmatrix} z_{prp} - z_{vp} & 0 & -x_{prp} & x_{prp} z_{vp} \\ 0 & z_{prp} - z_{vp} & -y_{prp} & y_{prp} z_{vp} \\ 0 & 0 & -z_{vp} & z_{vp} z_{prp} \\ 0 & 0 & -1 & z_{prp} \end{pmatrix}.    (7.21)

Special cases:
1. When the view plane = xv − yv (i.e. u - v ) plane, then put zvp = 0 in the above.
2. When the perspective projection reference point is on the zv axis, then
x prp = y prp = 0 .
3. When the perspective projection reference point is the VC origin then
x prp = y prp = z prp = 0 .
4. A very common choice is to take the view plane as the xv − yv (i.e. u - v ) plane with
the projection reference point on the zv axis, so that xprp = yprp = zvp = 0, giving:
xp = [ zprp / (zprp − z) ] x ,   yp = [ zprp / (zprp − z) ] y        (7.22)
5. If the projection reference point is on the view plane, z prp = 0 , the whole scene maps
to a single point (show this), so in general we take the view plane to lie between the
scene and the projection reference point.
6. It is also possible to obtain the transformation WC → VC matrix (7.4) in terms of other
parameters, such as spherical coordinates ( r, φ ,θ ) for positioning the viewing frame:
(See tutorial)
[Figure: the viewing frame positioned using spherical coordinates (r, φ, θ): the VC origin P0 = (x0, y0, z0) lies at distance r from the WC origin, with the viewing direction along −N towards the scene.]

Some interesting consequences of perspective viewing of 3D scenes are:-


1. Any set of parallel lines not parallel to the view plane maps to converging lines, which
converge to “a vanishing point”. See H&B p. 370. For example, for a cube we have 3 “principal
vanishing points” corresponding to its three principal axes.
2. Any set of parallel lines parallel to the view plane maps to parallel lines.

The perspective projection view volume


Corresponding to a camera lens aperture, a 3D view volume can be defined by means of a
rectangular clipping window, but with lines from the scene to the projection reference point now
forming an infinite pyramid which is cut by the clipping window on the view plane. Points in
the scene outside this pyramid or cone of vision cannot be seen (or displayed). A variation of
this volume involves adding near and far plane cut-offs, resulting in a truncated pyramid or
frustum view volume.

[Figure: a frustum view volume: the infinite pyramid from the projection reference point through the clipping window, truncated by the near and far clipping planes.]

The frustum view volume is termed symmetric when the projection line from the projection
reference point through the centre of the clipping window is ⊥ the view plane; otherwise it is called
an oblique frustum. See H&B pp. 374-380 for more detail, where in particular they obtain the
transformation matrix to such a volume by concatenating M persp above with a shearing matrix
M zshear i.e. M oblique.persp = M persp ⋅ M zshear . Thereafter, the frustum volume is mapped to a
normalized view volume and a viewport in screen coordinates, giving a concatenated matrix
M_persp→norm for this normalization step.
The complete transformation matrix from WC → normalized perspective-projection
coordinates then takes the form
M_WC→norm.persp = M_persp→norm · R · T
where R ⋅ T is the composite viewing transformation matrix (7.4).
Finally, clipping processes, visibility testing, surface rendering etc are performed followed by
the viewport transformation to screen coordinates, as in the case of oblique parallel projection.
See H&B p. 380-382. The rectangular viewport on the screen is set up as in the 2D case.

Note also that in (2D) screen coordinates, the z-coordinate value which is not used in any
position plotting, can be used for depth processing (variation of colour/intensities according to a
point’s depth in a scene etc). Typically, the screen buffer holds the (x,y) pixel values
corresponding to the 2D (x,y) position on the screen and additionally, a depth buffer holds the
depth cueing information for that position.

7.9 OpenGL 3D viewing functions


The 3D viewing functions are available in the GLU library, whilst functions for orthogonal
projections, oblique perspective projection, and viewport transformations are given in basic
OpenGL. GLUT, of course, provides functions for manipulating windows and some 3D
geometric modelling.

The major steps and functions used are:-


Geometric transformations on objects are set up with
glMatrixMode(GL_MODELVIEW); //initialises matrix to identity
Subsequent geometric operations (rotations, translations etc) will concatenate with it to give a
composite transformation matrix.

Viewing parameters (for the viewing transformation matrix) are supplied with:

gluLookAt(x0,y0,z0,xref,yref,zref,Vx,Vy,Vz); //all double parameters


where
P0 = (x0,y0,z0) = VC origin in world coordinate values
Pref = (xref,yref,zref) = reference position (“look-at point”) in scene given in WCs
= convenient “camera aiming” position in scene
V = (Vx,Vy,Vz) = view-up vector for Y-axis of VC system
N = P0 - Pref = vector for the +ve Zview direction with viewing in direction - Zview.
P0 = (0,0,0), Pref = (0,0,-1), V = (0,1,0) = defaults when gluLookAt(.) not invoked.

These parameters are used in the subsequent setting up of the viewing transformation matrix.

A projection transformation matrix is set up with:

glMatrixMode(GL_PROJECTION); //sets up identity matrix


Subsequent transformation command matrices will be concatenated with the current projection
matrix.

An orthogonal projection is chosen with


glOrtho(xwmin, xwmax, ywmin, ywmax, dnear, dfar); //all double params
where
xwmin,...ywmax = clipping window extents in WCs
dnear, dfar = near and far plane positions from the VC origin in WCs = distances in
- Zview direction, e.g. dfar = 40.0 ⇒ far clipping plane is at zfar = -40.0.
near clipping plane = also view plane, i.e. clipping window on near plane of view
volume!
Default values, when not called are the same as with
glOrtho(-1.0, 1.0, -1.0, 1.0, -1.0, 1.0); //a symmetric normalized cube

The glOrtho(.) sets up a parallel projection perpendicular to the view plane (=near clipping
plane).
The call glOrtho(.,.,.,dnear = -1.0, dfar = 1.0) in a 2D situation is equivalent to calling
gluOrtho2D(.).

An oblique parallel projection function is not available in OpenGL (yet!). Can write your own by
using (7.13) – see sect. 7.7 above.
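
A minimal sketch of such a home-made routine (the name is illustrative) multiplies the current matrix by (7.13) using glMultMatrixf. Note that (7.13) maps every z to zvp, so depth information is lost; a practical version would instead combine a z-shear with glOrtho.

/* Sketch: concatenate the oblique-projection matrix (7.13) onto the current
   matrix; alpha and phi are in degrees, the array is in column-major order. */
#include <math.h>
void obliqueProjection (float alphaDeg, float phiDeg, float zvp)
{
    const float DEG2RAD = 3.14159265f / 180.0f;
    float L1 = 1.0f / tanf (alphaDeg * DEG2RAD);        /* L1 = cot(alpha), (7.10) */
    float c  = cosf (phiDeg * DEG2RAD);
    float s  = sinf (phiDeg * DEG2RAD);
    GLfloat m[16] = {
        1.0f,          0.0f,          0.0f,  0.0f,      /* column 1 */
        0.0f,          1.0f,          0.0f,  0.0f,      /* column 2 */
       -L1 * c,       -L1 * s,        0.0f,  0.0f,      /* column 3 */
        zvp * L1 * c,  zvp * L1 * s,  zvp,   1.0f       /* column 4 */
    };
    glMultMatrixf (m);
}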

A symmetric perspective-projection frustum view volume is set up with

gluPerspective(theta, aspect, dnear, dfar); //all params are doubles


where
projection ref. point (Pref ) = VC origin (P0)
near clipping plane = view plane
theta (0...180°) = the angle between the top and bottom clipping planes
= “field-of-view” angle
aspect = width/height aspect ratio for the clipping window
dnear, dfar = +ve doubles giving positions znear = -dnear and zfar = -dfar
respectively

The frustum volume is symmetric about the Zview axis and the scene description is transformed
to normalized, homogeneous projection coordinates.
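
For reference, a symmetric gluPerspective(.) call corresponds to a symmetric glFrustum(.) call. A sketch of the relationship (standard trigonometry; the numerical values are arbitrary examples) placed in the projection set-up is:

/* Sketch: gluPerspective (theta, aspect, dnear, dfar) is equivalent to a
   symmetric glFrustum call, since the clipping window on the near plane has
   half-height dnear*tan(theta/2) and half-width aspect times that.
   (Needs <math.h> for tan.) */
GLdouble theta = 60.0, aspect = 1.0, dnear = 1.0, dfar = 100.0;     /* example values */
GLdouble top   = dnear * tan (theta * 3.14159265358979 / 360.0);    /* theta/2 in radians */
GLdouble right = top * aspect;
glFrustum (-right, right, -top, top, dnear, dfar);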

A general perspective-projection frustum volume (symmetric or oblique) is set up with


glFrustum(xwmin, xwmax, ywmin, ywmax, dnear, dfar); //all doubles

where
projection ref. point (Pref ) = VC origin (P0) – same as above
near clipping plane = view plane – same as above
xwmin,...ywmax = clipping window extents on near plane.
If xwmin = -xwmax, ywmin = - ywmax then have symmetric frustum with - Zview axis
as its centre line.
Default is orthogonal projection with symmetric cube as the view volume.

After the composite transformation (geometric + viewing), the clipping routines
are applied in normalized coordinates, and the result is then mapped to 3D screen coordinates (with the Z-
coordinate used for depth information) by a viewport transformation.

A rectangular viewport in screen coordinates is set up with


glViewport(xvmin, yvmin, vpWidth, vpHeight); //all params in int screen
//coords with xvmin,yvmin=lower left pos. rel. to display window

To retain the same proportions of objects as in the WC scene, we set its aspect ratio as that of
the clipping window. When not invoked the default viewport is taken as the display window.
Multiple viewports may also be defined – see the 2D case.

7.10 3D clipping algorithms


Will not be covered here: see H&B p. 389 – 398. Note that some of the 2D clipping algorithms
can be extended to 3D.
In addition to the 6 rectangular clipping planes of the view volume described above, OpenGL allows one to define
additional clipping planes, each of arbitrary orientation, by:
glClipPlane(id, planeParameters); //define it with some id, planeParameters =
//4 el. array of consts. A,B,C,D ~ equation for a plane
glEnable(id); //makes it active
glDisable(id); //switches it off
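
For example, a sketch (the plane coefficients are chosen arbitrarily) that retains only the half-space x ≥ 0:

/* Sketch: keep only points with A*x + B*y + C*z + D >= 0, here x >= 0. */
GLdouble xPlane[4] = { 1.0, 0.0, 0.0, 0.0 };       // A, B, C, D
glClipPlane (GL_CLIP_PLANE0, xPlane);              // GL_CLIP_PLANE0, GL_CLIP_PLANE1, ... are the ids
glEnable (GL_CLIP_PLANE0);
/* ... draw the scene ... */
glDisable (GL_CLIP_PLANE0);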

7.11 Code example: Perspective view of a square


//PerspViewSquare.cpp
//-----------------------------
#include <windows.h>
#include <GL/glut.h>

GLint winWidth = 600, winHeight = 600; // Initial display-window size.


GLfloat x0 = 100.0, y0 = 50.0, z0 = 50; // Viewing-coordinate origin.
GLfloat xref = 50.0, yref = 50.0, zref = 0.0; // Look-at point.
GLfloat Vx = 0.0, Vy = 1.0, Vz = 0.0; // View up vector.
/* Set coordinate limits for the clipping window: */
GLfloat xwMin = -40.0, ywMin = -60.0, xwMax = 40.0, ywMax = 60.0;
/* Set positions for near and far clipping planes: */
GLfloat dnear = 25.0, dfar = 125.0;

void init (void)


{
glClearColor (1.0, 1.0, 1.0, 0.0);
glMatrixMode (GL_MODELVIEW);
gluLookAt (x0, y0, z0, xref, yref, zref, Vx, Vy, Vz);
glMatrixMode (GL_PROJECTION);
glFrustum (xwMin, xwMax, ywMin, ywMax, dnear, dfar);
}

void displayFcn (void)


{
glClear (GL_COLOR_BUFFER_BIT);
/* Set parameters for a square fill area. */
glColor3f (0.0, 1.0, 0.0); // Set fill color to green.
glPolygonMode (GL_FRONT, GL_FILL);
glPolygonMode (GL_BACK, GL_LINE); // Wire-frame back face.

glBegin (GL_QUADS);
glVertex3f (0.0, 0.0, 0.0);
glVertex3f (100.0, 0.0, 0.0);
glVertex3f (100.0, 100.0, 0.0);
glVertex3f (0.0, 100.0, 0.0);
glEnd ( );
glFlush ( );
}

void reshapeFcn (GLint newWidth, GLint newHeight)


{
glViewport (0, 0, newWidth, newHeight);
winWidth = newWidth;
winHeight = newHeight;
}

int main (int argc, char** argv)


{
glutInit (&argc, argv);
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
glutInitWindowPosition (50, 50);
glutInitWindowSize (winWidth, winHeight);
glutCreateWindow ("Perspective View of A Square");
init ( );
glutDisplayFunc (displayFcn);
glutReshapeFunc (reshapeFcn);
glutMainLoop ( );
}

Code outputs:

7.12 Code example: Perspective view of a cube


//CubePerspViewing3D.cpp
//----------------------------------
#include <windows.h>
#include <GL/glut.h>
#define SCREEN_WIDTH 800 // We want our screen width 800 pixels
#define SCREEN_HEIGHT 600 // We want our screen height 600 pixels
#define SCREEN_DEPTH 16 // We want 16 bits per pixel
int width = 800;
int height = 600;

void CreateCube(float x, float y, float z, int radius)


{
// Here we create 6 QUADS (Rectangles) to form a cube
// With the passed in radius, we determine the width and height of the cube
// some color is added at each vertex to make it more interesting
glBegin(GL_QUADS);
// These vertices create the Back Side
glColor3ub(0, 0, 255); glVertex3f(x, y, z);
glColor3ub(255, 0, 255); glVertex3f(x, y + radius, z);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y + radius, z);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y, z);

// These vertices create the Front Side


glColor3ub(0, 0, 255); glVertex3f(x, y, z + radius);
glColor3ub(255, 0, 255); glVertex3f(x, y + radius, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y + radius, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y, z + radius);

// These vertices create the Bottom Face


glColor3ub(0, 0, 255); glVertex3f(x, y, z);
glColor3ub(255, 0, 255); glVertex3f(x, y, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y, z);

// These vertices create the Top Face


glColor3ub(0, 0, 255); glVertex3f(x, y + radius, z);
glColor3ub(255, 0, 255); glVertex3f(x, y + radius, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y + radius, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y + radius, z);

// These vertices create the Left Face


glColor3ub(0, 0, 255); glVertex3f(x, y, z);
glColor3ub(255, 0, 255); glVertex3f(x, y, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x, y + radius, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x, y + radius, z);

// These vertices create the Right Face


glColor3ub(0, 0, 255); glVertex3f(x + radius, y, z);
glColor3ub(255, 0, 255); glVertex3f(x + radius, y, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y + radius, z + radius);
glColor3ub(0, 255, 255); glVertex3f(x + radius, y + radius, z);
glEnd();
}

void init(void)
{
glViewport(0,0,width,height); // Make our viewport the whole window
glMatrixMode(GL_PROJECTION); // Select The Projection Matrix
glLoadIdentity(); // Reset The Projection Matrix
gluPerspective(45.0f,(GLfloat)width/(GLfloat)height, .5f ,150.0f);
glMatrixMode(GL_MODELVIEW); // Select The Modelview Matrix
glLoadIdentity(); // Reset The Modelview Matrix
glClearColor (1.0, 1.0, 1.0, 0.0);
}

void displayFcn (void)


{
glClear(GL_COLOR_BUFFER_BIT);
glLoadIdentity(); // Reset The matrix
// VC org look at pt Vup
gluLookAt(10, 10, 5, 0, 0, 0, 0, 1, 0); // This determines where the camera's position and view is
CreateCube(-1, -1, -1, 4);
glFlush( );
}

void reshapeFcn (GLint newWidth, GLint newHeight)


{
glMatrixMode(GL_PROJECTION); // Select The Projection Matrix
glLoadIdentity(); // Reset The Projection Matrix
gluPerspective(45.0f,(GLfloat)newWidth/(GLfloat)newHeight, .5f ,150.0f);
glMatrixMode(GL_MODELVIEW); // Select The Modelview Matrix
glLoadIdentity();
glViewport(0, 0, newWidth, newHeight);
}

int main (int argc, char** argv)


{
glutInit (&argc, argv);
glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB);
glutInitWindowPosition(50, 50);
glutInitWindowSize(width, height);
glutCreateWindow("Perspective View of A Cube");
init( );
glutDisplayFunc(displayFcn);
glutReshapeFunc(reshapeFcn);
glutMainLoop( );
}

Code outputs:

CHAPTER EIGHT: REPRESENTATION OF 3D OBJECTS

Various tools or components are used in modelling objects, such as:


a. polygon and quadric surfaces
- used for simple Euclidean shapes e.g. polyhedrons, ellipsoids etc
b. spline surfaces
- used for other regular shapes like aircraft, gears, curved engineering
structures etc
c. procedural methods
- use of mathematical functions to generate points e.g. fractal construction
of particle systems for gas, fluids, clouds, leaves, trees etc
d. physically based modelling
- use physical laws for simulating behaviour of non-rigid objects like cloth,
glob of jelly etc

For the representation of solid objects, the methods used may be put into two classes
i) boundary representations (B-reps) – give a 3D object as a set of surfaces,
separating the object interior from its exterior
ii) space partitioning representations – used to describe interior properties by
partitioning the spatial region of an object with small, non-overlapping continuous
solids (cubes)

8.1 Using polygon surfaces


The most common approach to constructing 3D objects is through the use of polygon surfaces
(B-reps). For example, a cylinder may be constructed with a polygon mesh approximation by
“tiling”:

The polygon surface descriptions may be held in polygon tables, which may be a composition
of geometric data tables (containing vertices and other properties such as their orientation)
and attribute data tables (containing parameters for setting colour transparencies, surface
reflectivity, texture etc). Such a table was discussed in sect 3.9.4. Recall also, how to represent
the polygon surfaces by plane equations, in addition to which normal vectors may be used to
define plane surface orientations (sect 3.9.5).

Typically packages will provide mesh functions for tiling. Two common function types employ
triangles and quadrilaterals.

[Figures: given n vertices, a triangle-mesh function will construct n−2 connected triangles; given an n×m array of vertices, a quad-mesh function will construct (n−1)×(m−1) quadrilaterals.]

8.2 OpenGL & GLUT polyhedron functions


The OpenGL polygon primitives of Chapter 3 may be used to generate polyhedra and surface
meshes. Additionally, GLUT functions are available for certain regular polyhedra.
OpenGL functions:
Recall the symbolic primitive constants GL_POLYGON, GL_TRIANGLES,
GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, GL_QUADS, GL_QUAD_STRIP, which can be
used to generate the required primitives. For example, GL_QUAD_STRIP can be used to tile
the axial surface of a cylinder, as sketched below.
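
As an illustration (a sketch only: the routine name is ours, and the radius, height and facet count are arbitrary; the usual GL/GLUT headers from the other examples are assumed), the side of a cylinder can be tiled with a single quad strip:

/* Sketch: tile the side of a cylinder (radius r, height h along z) with one quad strip. */
#include <math.h>
void cylinderSide (GLfloat r, GLfloat h, GLint nFacets)
{
    const GLfloat TWO_PI = 6.2831853f;
    GLint k;
    glBegin (GL_QUAD_STRIP);
    for (k = 0; k <= nFacets; k++) {
        GLfloat theta = TWO_PI * k / nFacets;
        GLfloat x = r * cosf (theta);
        GLfloat y = r * sinf (theta);
        glVertex3f (x, y, 0.0f);      // vertex on the bottom rim
        glVertex3f (x, y, h);         // vertex on the top rim
    }
    glEnd ( );
}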

GLUT functions for regular polyhedra:


1. glutWireTetrahedron(); or glutSolidTetrahedron();
- constructs a 4-sided regular triangular pyramid in wire-frame and solid forms resp. Centre is
at WC origin, and radius = distance from centre to any vertex = √3.
2. glutWireCube(edgeLength); or glutSolidCube(edgeLength);
- constructs a 6-sided regular polyhedron or hexahedron or cube. Centre is at WC origin,
and edgeLength = +ve cube length of type double.
3. glutWireOctahedron(); or glutSolidOctahedron();
- constructs an 8-sided regular polyhedron. Centre is at WC origin, and distance from centre
to a vertex = 1.
4. glutWireDodecahedron(); or glutSolidDodecahedron();
- constructs a 12-sided regular polyhedron. Centre is at WC origin, and distance from centre
to a vertex = 1.
5. glutWireIcosahedron(); or glutSolidIcosahedron();
- constructs a 20-sided regular polyhedron. Centre is at WC origin, and distance from centre
to a vertex = 1.

GLUT example program

//Polyhedra.cpp
//-------------------
#include <windows.h>
#include <GL/glut.h>
GLsizei winWidth = 500, winHeight = 500; // Initial display-window size.

void init (void)


{
glClearColor (1.0, 1.0, 1.0, 0.0); // White display window.
}

void displayWirePolyhedra (void)


{
glClear (GL_COLOR_BUFFER_BIT); // Clear display window.
glColor3f (0.0, 0.0, 1.0); // Set line color to blue.
/* Set viewing transformation. */
gluLookAt (5.0, 5.0, 5.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0);
/* Scale cube and display as wire-frame parallelepiped. */
glScalef (1.5, 2.0, 1.0);
glutWireCube (1.0);
/* Scale, translate, and display wire-frame dodecahedron. */
glScalef (0.8, 0.5, 0.8);
glTranslatef (-6.0, -5.0, 0.0);
glutWireDodecahedron ( );

/* Translate and display wire-frame tetrahedron. */


glTranslatef (8.6, 8.6, 2.0);
glutWireTetrahedron ( );

/* Translate and display wire-frame octahedron. */


glTranslatef (-3.0, -1.0, 0.0);
glutWireOctahedron ( );

/* Scale, translate, and display wire-frame icosahedron. */


glScalef (0.8, 0.8, 1.0);
glTranslatef (4.3, -2.0, 0.5);
glutWireIcosahedron ( );
glFlush ( );
}

void winReshapeFcn (GLint newWidth, GLint newHeight)


{
glViewport (0, 0, newWidth, newHeight);
glMatrixMode (GL_PROJECTION);
glFrustum (-1.0, 1.0, -1.0, 1.0, 2.0, 20.0);
glMatrixMode (GL_MODELVIEW);
glClear (GL_COLOR_BUFFER_BIT);
}

int main (int argc, char** argv)


{
glutInit (&argc, argv);
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
glutInitWindowPosition (100, 100);
glutInitWindowSize (winWidth, winHeight);
glutCreateWindow ("Wire-Frame Polyhedra");
init ( );
glutDisplayFunc (displayWirePolyhedra);
glutReshapeFunc (winReshapeFcn);
glutMainLoop ( );
}

Output:

8.3 Curved surfaces


Curved shaped objects can be generated by mathematical functions or from a user supplied
set of data points. Packages usually supply a set of functions to choose from. If only data
points are given, then the package may supply a function to “fit” the data in various ways (e.g. a
least squares fit). Below we consider some mathematical methods for curved objects.

8.4 Quadric surfaces


Quadric surfaces are surfaces described by 2nd order (quadratic) equations such as spheres,
ellipsoids, tori, paraboloids and hyperboloids. They are used as common elements in the
construction of 3D scenes.
Sphere
A spherical surface with radius r and centred at the origin is the set of points ( x, y , z )
satisfying the equation
x² + y² + z² = r².    (8.1)

A parametric form of the surface equations in terms of (r, θ, φ) is

x = r cos φ cos θ
y = r cos φ sin θ ;   −π/2 ≤ φ ≤ π/2,  −π ≤ θ ≤ π          (8.2)
z = r sin φ

corresponding to the first figure below, or they may be given in terms of the co-latitude angle φ
chosen in the second as

x = r sin φ cos θ
y = r sin φ sin θ ;   0 ≤ φ ≤ π,  0 ≤ θ ≤ 2π               (8.3)
z = r cos φ

[Figures: the two parameterizations of a point P = (x, y, z) on the sphere: on the left the latitude angle φ is measured from the equatorial plane, on the right the co-latitude angle φ is measured from the z-axis.]
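
As a sketch of using (8.2) directly (the routine is illustrative, not a library function, and the facet counts are arbitrary), a crude wire-frame sphere can be drawn as a set of latitude line loops:

/* Sketch: draw the latitude circles of a sphere of radius r from (8.2). */
#include <math.h>
void wireSphereLatitudes (GLfloat r, GLint nLats, GLint nLongs)
{
    const GLfloat PI = 3.1415926f;
    GLint i, j;
    for (i = 1; i < nLats; i++) {                      // omit the two poles
        GLfloat phi = -PI / 2.0f + PI * i / nLats;     // latitude angle
        glBegin (GL_LINE_LOOP);
        for (j = 0; j < nLongs; j++) {
            GLfloat theta = -PI + 2.0f * PI * j / nLongs;
            glVertex3f (r * cosf (phi) * cosf (theta),     /* (8.2) */
                        r * cosf (phi) * sinf (theta),
                        r * sinf (phi));
        }
        glEnd ( );
    }
}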

Ellipsoid
[Figure: an ellipsoid centred at the origin with semi-axes rx, ry, rz along the X, Y and Z axes.]

This may be regarded as an extension of a sphere, with 3 radii, rx , ry , rz along each of the XYZ
axes and centred at the origin. Any point ( x, y , z ) on its surface satisfies the defining equation
(x/rx)² + (y/ry)² + (z/rz)² = 1.                          (8.4)

A parametric form of the surface equations in terms of (r, θ, φ), as in (8.2), is

x = rx cos φ cos θ
y = ry cos φ sin θ ;   −π/2 ≤ φ ≤ π/2,  −π ≤ θ ≤ π         (8.5)
z = rz sin φ

Torus
This is a doughnut shaped object generated by rotating a circle (or other closed conic) about a
fixed axis. For example, a torus with circular cross-section centred at (0,0,0) in the XY plane
looks like:
[Figure: a torus with circular cross-section of radius r, whose centre line is a circle of radius raxial in the XY plane; shown in side view (from the X-axis) and top view (from the Z-axis).]

From the side view we see that the equation of the cross-section circle is
(y − raxial)² + z² = r².
When rotated about the Z-axis, the surface of a torus is generated, where a point (x, y, z) on it
satisfies the defining equation
(√(x² + y²) − raxial)² + z² = r².                         (8.6)
In parametric form the equation for a torus with the above circular cross-section is

x = ( raxial + r cos φ ) cosθ 



y = ( raxial + r cos φ )sin θ  ; − π ≤ φ ≤ π , − π ≤ θ ≤ π . (8.7)
z = r sin φ 

Other forms of tori are possible: for instance, we can rotate an elliptical cross-section instead of
a circular one.

8.5 Superquadrics
Surfaces may be generated by generalizations of quadratics with additional parameters,
resulting in “superquadrics”.
Superellipse
Based on the equation of an ellipse but in the generalized form:
    (x/rx)^(2/s) + (y/ry)^(2/s) = 1                                           (8.8)
where s is a real number. For s = 1 we have an ordinary ellipse. With rx = ry the following are
some shapes generated with the indicated s-values
(Figure: superellipse shapes with rx = ry for s = 0.5, 1.0, 1.5, 2.0 and 3.0.)

Note also that the parametric form of the equation for a superellipse is

    x = rx cos^s θ
    y = ry sin^s θ    ;   −π ≤ θ ≤ π                                          (8.9)

Superellipsoid
This is a generalization of the ellipsoid and is defined by the equation
    [ (x/rx)^(2/s2) + (y/ry)^(2/s2) ]^(s2/s1) + (z/rz)^(2/s1) = 1;   s1, s2 real   (8.10)

with parametric form

    x = rx cos^(s1) φ cos^(s2) θ
    y = ry cos^(s1) φ sin^(s2) θ    ;   −π/2 ≤ φ ≤ π/2,   −π ≤ θ ≤ π           (8.11)
    z = rz sin^(s1) φ

For examples of shapes generated by the above see H&B p. 412. Superquadrics are useful
modelling components for composing more complex structures.

8.6 OpenGL, GLU and GLUT functions for quadric and cubic surfaces
1. glutWireSphere(r, nLongitudes, nLatitudes); or glutSolidSphere(...);
- constructs a sphere with radius r; nLatitudes,nLongitudes = number of latitude and
longitude lines. Surface is a quad mesh approximation. Defined in modelling coords with
centre at WC origin.
2. glutWireCone(rbase, height, nLongitudes, nLatitudes); or glutSolidCone(...);
- constructs a cone with axis along Z. Defined in modelling coords with centre at WC origin.
3. glutWireTorus(rCrossSection, rAxial, nConcentrics, nRadialSlices);
glutSolidTorus(...);
- constructs a torus by sweeping a circle in the XY plane about the Z axis. nConcentrics =
number of concentric circles (with centres on the Z axis) forming the torus surface, nRadialSlices =
number of radial slices through the torus surface. Centre is at the WC origin and its axis is along Z.
4. glutWireTeapot(size); or glutSolidTeapot(...); ****NOT AVAILABLE IN OUR GLUT***
- constructs a teapot as a mesh of > 1000 bicubic surface patches (generated by Bezier
curve functions – see later). size = max radius for teapot bowl. Centre is at WC origin, and
vertical axis is along Y.
Historical note: The data set for the original teapot is due to Martin Newell (1975) developed
at the University of Utah. This and other well known 3D shapes are used to test many
proposed and published techniques for surface rendering in Computer Graphics, and now
form a classical set of test problems.

5. GLU quadric surface functions


- called through a sequence of steps such as:
GLUquadricObj *sphere1; //define a name (obj ref) for it
sphere1 = gluNewQuadric( ); //activate quadric renderer for it
gluQuadricDrawStyle(sphere1, GLU_LINE); //display mode wire-frame
// (GLU_LINE)
gluSphere(sphere1, r, nLongitudes, nLatitudes); //create it with polygon facets

Other display modes are set with GLU_POINT (surface is a point plot), GLU_SILHOUETTE
(removes shared edges), GLU_FILL (patches are shaded filled areas).
Other shape drawing functions in place of gluSphere(...) can be:
gluCylinder(quadricName, rBase, rTop, height, nLongitudes, nLatitudes);
gluDisk(ringName, rInner, rOuter, nRadii, nRings);
gluPartialDisk(ringName, rInner, rOuter, nRadii, nRings, startAngle, sweepAngle);

In addition we have:
gluDeleteQuadric(quadricName); //frees memory when not required
gluQuadricOrientation(quadricName, normalVectorDirection);
//sets normal vector direction, with
2nd param = GLU_OUTSIDE (sets normal out, to front-face direction)
2nd param = GLU_INSIDE (sets normal in, to back-face direction)
gluQuadricNormals(quadricName, generationMode); //generates surface normals
generationMode = GLU_NONE = no normals – typically no lighting applied
= GLU_FLAT = same normal for each face – same colour
= GLU_SMOOTH = normal vector for each surface vertex
gluQuadricCallback(quadricName, GLU_ERROR, function);
// a callback routine, calls function if error in creation etc

8.7 Example programs

//QuadricSurfs.cpp
//-----------------------
#include <windows.h>
#include <GL/glut.h>
GLsizei winWidth = 500, winHeight = 500; // Initial display-window size.

void init (void)


{
glClearColor (1.0, 1.0, 1.0, 0.0); // Set display-window color.
}

void wireQuadSurfs (void)


{
glClear (GL_COLOR_BUFFER_BIT); // Clear display window.
glColor3f (0.0, 0.0, 1.0); // Set line-color to blue.
/* Set viewing parameters with world z axis as view-up direction. */
gluLookAt (2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0);
/* Position and display GLUT wire-frame sphere. */
glPushMatrix ( );
glTranslatef (1.0, 1.0, 0.0);
glutWireSphere (0.75, 8, 6);
glPopMatrix ( );

/* Position and display GLUT wire-frame cone. */


glPushMatrix ( );
glTranslatef (1.0, -0.5, 0.5);
glutWireCone (0.7, 2.0, 7, 6);
glPopMatrix ( );
/* Position and display GLU wire-frame cylinder. */
GLUquadricObj *cylinder; // Set name for GLU quadric object.
glPushMatrix ( );
glTranslatef (0.0, 1.2, 0.8);
cylinder = gluNewQuadric ( );
gluQuadricDrawStyle (cylinder, GLU_LINE);
gluCylinder (cylinder, 0.6, 0.6, 1.5, 6, 4);
glPopMatrix ( );
glFlush ( );
}

void winReshapeFcn (GLint newWidth, GLint newHeight)


{
glViewport (0, 0, newWidth, newHeight);
glMatrixMode (GL_PROJECTION);
glOrtho (-2.0, 2.0, -2.0, 2.0, 0.0, 5.0);
glMatrixMode (GL_MODELVIEW);
glClear (GL_COLOR_BUFFER_BIT);
}

int main (int argc, char** argv)


{
glutInit (&argc, argv);
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
glutInitWindowPosition (100, 100);
glutInitWindowSize (winWidth, winHeight);
glutCreateWindow ("Wire-Frame Quadric Surfaces");
init ( );
glutDisplayFunc (wireQuadSurfs);
glutReshapeFunc (winReshapeFcn);
glutMainLoop ( );
}

This is the type of output from a glutWireTeapot(size) function call:



8.8 Blobby objects


“Blobby” objects are those that exhibit non-rigidity or degrees of fluidity. They can be built from
Gaussian shapes

    f(x, y, z) ≡ Σ_k b_k e^(−a_k r_k²) − T = 0                                (8.12)

where r_k² = x_k² + y_k² + z_k², and a_k, b_k are constants that adjust the amount of blobbiness (e.g.
b_k < 0 gives dents instead of bumps) and T is a threshold parameter. A 3D bump, centred at the origin
with standard deviation a and height b, looks like a Gaussian curve over the interval [−a, a].

A composite blobby object can be constructed by merging such basic Gaussian bumps into a single
composite shape.

Other methods use density functions that fall-off to zero on a finite interval rather than decay
exponentially. For example, the metaball model is a combination of quadratic density
functions of the form:

    f(r) = b (1 − 3r²/d²),             0 < r ≤ d/3
         = (3b/2) (1 − r/d)²,          d/3 < r ≤ d                            (8.13)
         = 0,                           r > d


A soft object model uses

    f(r) = 1 − (22/9)(r²/d²) + (17/9)(r⁴/d⁴) − (4/9)(r⁶/d⁶),    0 < r ≤ d     (8.14)
         = 0,                                                    r > d
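As an illustrative sketch (not taken from H&B), the piecewise metaball density (8.13) can be coded
directly for a single ball of strength b and radius of influence d; the function name and parameters
below are assumptions:

//Sketch: metaball density function of equation (8.13) for one ball with strength b
//and radius of influence d; r is the distance from the ball centre.
double metaballDensity (double r, double b, double d)
{
    if (r <= 0.0 || d <= 0.0) return 0.0;
    if (r <= d / 3.0)
        return b * (1.0 - 3.0 * r * r / (d * d));
    if (r <= d) {
        double t = 1.0 - r / d;
        return 1.5 * b * t * t;
    }
    return 0.0;                       // outside the radius of influence
}
//A blobby surface is then the level set where the sum of the densities equals the threshold T.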

8.9 Spline representations


A spline is a curve composed of polynomial pieces which join at positions called knots in such
a way that there is a specified overall continuity in the curve at all points up to and including
some order in its derivatives. Typical pieces are cubic polynomials, which give the spline
continuity up to and including 2nd order in the derivatives. Cubic splines are commonly used in
Computer Graphics.

A spline surface can be generated by two sets of spline curves, where each set intersects the
other orthogonally.

Cubic splines in Computer Graphics are generated by specifying a set of control points, and
choosing whether the curve passes through all the points (interpolation spline) or the curve
passes approximately through the points in a smooth sense (approximating spline):

(Figure: left, an interpolating spline passing through every control point; right, an approximating spline that only follows the control points.)

The set of control points may be imagined to enclose a region of space which is a convex
polygon or convex hull, where every point is either on its boundary or lies inside.

Parametric continuity conditions


An overall curve may be obtained by joining separate spline pieces, each specified by
parametric equations of the form
x = x (u ), y = y (u ), z = z (u ); u1 ≤ u ≤ u2 (8.15)

(Figure: two parametric sections q1(u) and q2(u) joining at the parameter value u*.)

Various degrees of smoothness can be obtained by setting continuity conditions for every pair
of curves q1 (u ) , q2 (u ) at the join u ∗ as follows:

i)   C(0) continuity (zero-order parametric continuity):
         q1(u*) = q2(u*)

ii)  C(1) continuity (1st-order parametric continuity):
         q1(u*) = q2(u*),
         q1′(u*) = q2′(u*)

iii) C(2) continuity (2nd-order parametric continuity):
         q1(u*) = q2(u*),
         q1′(u*) = q2′(u*),
         q1″(u*) = q2″(u*)

Typically C (2) is required for animation paths.

Geometric continuity conditions


These are generally weaker conditions to impose – we make the parametric derivatives of the
two sections at a knot, proportional to each other: Thus,

i)   G(0) continuity (zero-order geometric continuity):
         q1(u*) = q2(u*)            (same as parametric continuity)

ii)  G(1) continuity (1st-order geometric continuity):
         q1(u*) = q2(u*),
         q1′(u*) ∝ q2′(u*)

iii) G(2) continuity (2nd-order geometric continuity):
         q1(u*) = q2(u*),
         q1′(u*) ∝ q2′(u*),
         q1″(u*) ∝ q2″(u*)

To understand the difference between these two types of continuity conditions, consider the
two-section splines on 3 control points, joined at the point P1 . The effect of imposing geometric
continuity is to pull the curve towards the slope of larger magnitude:

(Figure: two-section splines q1, q2 on control points P0, P1, P2 joined at P1 – left, parametric
continuity at P1; right, geometric continuity at P1, where the curve is pulled towards the slope of
larger magnitude.)

Specifying splines
A spline can be specified by
i) setting boundary conditions on each section, and/or
ii) incorporating boundary conditions into a characteristic matrix and specifying this
matrix, or
iii) specifying a set of basis functions (blending functions) determining geometric
constraints
For example, for a cubic spline each section has parametric equation (for the x-coordinate)
    x(u) = a_x u³ + b_x u² + c_x u + d_x,    0 ≤ u ≤ 1                        (8.16)
where a_x, ..., d_x are constants to be determined.
Typical boundary conditions to impose are:
    x(0)  = α
    x(1)  = β
    x′(0) = γ        (all fixed/chosen values)                                (8.17)
    x′(1) = δ
These conditions are sufficient to find a_x, ..., d_x.
To find the spline matrix for this section, write (8.16) as

    x(u) = [u³  u²  u  1] · [a_x  b_x  c_x  d_x]ᵀ ≡ U · C                     (8.18)

where U is the indicated row matrix of powers of u and C the column matrix of the
coefficients in the cubic form.
From the boundary conditions (8.16) and (8.17) we can write

    [α]   [0  0  0  1] [a_x]
    [β] = [1  1  1  1] [b_x]                                                  (8.19)
    [γ]   [0  0  1  0] [c_x]
    [δ]   [3  2  1  0] [d_x]

or,
          [a_x]   [0  0  0  1]⁻¹ [α]
    C  =  [b_x] = [1  1  1  1]   [β]    ≡   M_spline · M_geom                 (8.20)
          [c_x]   [0  0  1  0]   [γ]
          [d_x]   [3  2  1  0]   [δ]

where M_spline denotes the inverse matrix and M_geom the column of boundary values (α, β, γ, δ).

This form is also true for more general boundary conditions. We call M_spline the basis matrix.
Finally, we can write the spline section (8.18) as

    x(u) = U · M_spline · M_geom
         = α·BF_0(u) + β·BF_1(u) + γ·BF_2(u) + δ·BF_3(u)    (by expansion)    (8.21)
         = Σ_{k=0}^{3} g_k · BF_k(u)

where g_0 = α, g_1 = β, g_2 = γ, g_3 = δ are constants and the forms BF_0(u), ..., BF_3(u) are
polynomial blending functions.

8.10 Cubic spline interpolating methods


Cubic interpolation splines have proved to be an efficient, accurate and flexible tool to model
object shapes as well as to model animation paths. Here, given a set of n+1 control points
    p_k = (x_k, y_k, z_k);    k = 0, 1, 2, ..., n                             (8.22)
we wish to fit piece-wise cubic polynomial sections on each pair of "intervals"
[p_0, p_1], [p_1, p_2], ..., [p_{n−1}, p_n], where the form of each piece is given by
    x(u) = a_x u³ + b_x u² + c_x u + d_x
    y(u) = a_y u³ + b_y u² + c_y u + d_y    ,    0 ≤ u ≤ 1.                   (8.23)
    z(u) = a_z u³ + b_z u² + c_z u + d_z

(Figure: a chain of control points p_0, p_1, ..., p_n, with the left and right cubic sections
P_kL(u) and P_kR(u) meeting at an interior point p_k.)

Depending on the form of the boundary conditions chosen, different cubic splines can be
obtained:-

Natural cubic splines


Here at each internal control point (p_k, k = 1, 2, ..., n−1), we set for the successive
left-hand (P_kL(u)) and right-hand (P_kR(u)) cubic pieces
    P_kL(p_k)  = P_kR(p_k),      continuity in the curve
    P_kL′(p_k) = P_kR′(p_k),     continuity in the 1st derivative             (8.24)
    P_kL″(p_k) = P_kR″(p_k),     continuity in the 2nd derivative

These give (4n − 4) equations, but we require 4n conditions to find all the a_x, ..., d_x in (8.23).
We can fit the extreme endpoints
    P_1L(p_0) = a given value,    P_{n−1,R}(p_n) = a given value    (2 more equations)   (8.25)
and set the endpoint derivatives to zero:
    P_1L′(p_0) = 0,    P_{n−1,R}′(p_n) = 0.                         (2 more equations)   (8.26)
We thus have (4n − 4) + 2 + 2 = 4n equations to find all the constants.
In a different approach, we can fit two more dummy points p -1 , p n+1 instead of setting
derivatives.

However, the natural spline obtained here, has a serious drawback in CG, in that, whenever a
local change is made (any control point is altered), the entire curve is affected; so other spline
types are generally preferred.

Hermite interpolation
Here, for each spline section (e.g. cubic piece) we specify
• function values at each control point
• tangents/derivatives at each control point
This allows us to make adjustments without affecting the other sections of the overall spline.
For more details see H&B p. 426.

Cardinal splines
Quite often, it’s not easy to supply or obtain slope information at the control points, so that
Hermite interpolation becomes infeasible. Then cardinal splines are useful, with the following
defining properties:
• This is a piece-wise interpolatory cubic with tangent values specified at the endpoints
of each curve section, where these values are not given but calculated from
neighbouring position values:
(Figure: a cardinal spline section P(u) running from p_k to p_{k+1}, with the neighbouring control
points p_{k−1} and p_{k+2} used to determine the endpoint slopes.)

• The cardinal spline section P(u) is completely specified by 4 control points:
      P(0)  = p_k
      P(1)  = p_{k+1}
      P′(0) = ½ (1 − t) (p_{k+1} − p_{k−1})                                   (8.27)
      P′(1) = ½ (1 − t) (p_{k+2} − p_k)
  i.e. the slopes are chosen proportional to the slopes of the chords p_{k−1}p_{k+1}
  and p_k p_{k+2}. The parameter t is a tension parameter which controls how
  loosely or tightly the curve section fits the control points:

(Figure: the chord slopes at p_k and p_{k+1}; t < 0 gives a looser fit, t > 0 a tighter fit of the
section P(u) to the control points.)

When t = 0 the class of curves are called Catmull-Rom splines or Overhauser splines.

As before, we can convert the boundary conditions to matrix form and write the spline section as

    P(u) = [u³  u²  u  1] · M_C · [p_{k−1}  p_k  p_{k+1}  p_{k+2}]ᵀ           (8.28)

where the cardinal spline matrix is

           [ −s   2−s    s−2     s ]
    M_C =  [ 2s   s−3    3−2s   −s ] ;      s ≡ (1 − t)/2.                    (8.29)
           [ −s    0      s      0 ]
           [  0    1      0      0 ]

Further, expanding (8.28) into polynomial form gives

    P(u) = p_{k−1} [−su³ + 2su² − su] + p_k [(2−s)u³ + (s−3)u² + 1]
           + p_{k+1} [(s−2)u³ + (3−2s)u² + su] + p_{k+2} [su³ − su²]          (8.30)
         ≡ p_{k−1}·CAR_0(u) + p_k·CAR_1(u) + p_{k+1}·CAR_2(u) + p_{k+2}·CAR_3(u)

in which the functions in square brackets are, in order, CAR_0(u), ..., CAR_3(u), the cardinal
spline blending (or basis) functions.
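The blending-function form (8.30) can be evaluated directly. The following is only a sketch (the 2D
point type and names are assumptions, not code from H&B):

//Sketch: evaluate one cardinal spline section from equation (8.30).
//p[0..3] = p_{k-1}, p_k, p_{k+1}, p_{k+2};  t = tension;  0 <= u <= 1.
struct Pt2 { double x, y; };

Pt2 cardinalPoint (const Pt2 p[4], double t, double u)
{
    double s  = 0.5 * (1.0 - t);
    double u2 = u * u, u3 = u2 * u;

    // Cardinal blending functions CAR0..CAR3 of (8.30)
    double car0 = -s * u3 + 2.0 * s * u2 - s * u;
    double car1 = (2.0 - s) * u3 + (s - 3.0) * u2 + 1.0;
    double car2 = (s - 2.0) * u3 + (3.0 - 2.0 * s) * u2 + s * u;
    double car3 = s * u3 - s * u2;

    Pt2 q;
    q.x = car0 * p[0].x + car1 * p[1].x + car2 * p[2].x + car3 * p[3].x;
    q.y = car0 * p[0].y + car1 * p[1].y + car2 * p[2].y + car3 * p[3].y;
    return q;
}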

Examples of what they look like graphically are depicted below for t = 0 or s = 0.5:

(Figure: graphs of the four cardinal blending functions CAR_0(u), ..., CAR_3(u) over 0 ≤ u ≤ 1
for t = 0, i.e. s = 0.5.)

Kochanek-Bartels (KB) splines


These are extensions of cardinal splines by adding two more control parameters b and c in
addition to the tension parameter t. The additional parameters allow us to relax slope
(derivative) continuity across section boundaries (via c) and control how much the curve bends
at each section endpoint (via b). Such splines are useful in modelling animation paths. See
H&B p. 432 for more detail.

8.11 Bezier spline curves


Bezier curves were developed by the French engineer Pierre Bezier for use in the design of
Renault cars. They are easy to use and have many advantageous properties making them
popular in CAD and graphics packages.

A Bezier curve section can be fitted to any number of control points and can be specified by

• boundary conditions
• a characteristic matrix, or
• blending functions

In terms of the last option, given n+1 control points


    p_k = (x_k, y_k, z_k);    k = 0, 1, 2, ..., n                             (8.31)
and the set of Bernstein polynomials, BEZ k ,n (u ), k = 0,1,2,...n then a Bezier curve is
specified as the path between p0 and p n given by the point-position vector
    P(u) = p_0 BEZ_{0,n}(u) + p_1 BEZ_{1,n}(u) + p_2 BEZ_{2,n}(u) + ... + p_n BEZ_{n,n}(u)
         = Σ_{k=0}^{n} p_k BEZ_{k,n}(u);    0 ≤ u ≤ 1.                        (8.32)

The Bernstein polynomials are defined by

    BEZ_{k,n}(u) = C(n,k) u^k (1 − u)^(n−k),    with  C(n,k) = n! / (k!(n − k)!).   (8.33)

They may also be generated recursively by
    BEZ_{k,n}(u) = (1 − u) BEZ_{k,n−1}(u) + u BEZ_{k−1,n−1}(u);    n > k ≥ 1        (8.34)
with
    BEZ_{k,k}(u) = u^k,    BEZ_{0,k}(u) = (1 − u)^k                                 (8.35)

Equation (8.32) expands into 3 parametric equations

    x(u) = Σ_{k=0}^{n} x_k BEZ_{k,n}(u)
    y(u) = Σ_{k=0}^{n} y_k BEZ_{k,n}(u)                                       (8.36)
    z(u) = Σ_{k=0}^{n} z_k BEZ_{k,n}(u)

As a rule, a Bezier curve = a polynomial of degree 1 less than the number of control
points.
For example,
3 control points ⇒ Bezier curve = a parabola
4 control points ⇒ Bezier curve = a cubic

Examples of 2D Bezier curves ( zk = 0 ) are indicated by the solid lines below:


(Figure: examples of 2D Bezier curves (z_k = 0) on three, four and five control points; each curve
starts at the first and ends at the last control point.)

Important properties of Bezier curves


i)   A Bezier curve always passes through both end control points, i.e.
         P(0) = p_0,    P(1) = p_n                                            (8.37)
ii)  Its endpoint derivatives are
         P′(0) = −n·p_0 + n·p_1
         P′(1) = −n·p_{n−1} + n·p_n                                           (8.38)
     ⇒ the slope at the beginning is the same as that of the line joining the first two
       control points, and the slope at the end is the same as that of the line joining the
       last two control points.
iii) Its parametric 2nd derivatives are
         P″(0) = n(n − 1) [ (p_2 − p_1) − (p_1 − p_0) ]
         P″(1) = n(n − 1) [ (p_{n−2} − p_{n−1}) − (p_{n−1} − p_n) ]           (8.39)
iv)  A Bezier curve lies within the convex hull (convex polygon boundary) of its
     control points. To see this we note that BEZ_{k,n}(u), k = 0, 1, 2, ..., n are all non-
     negative, and it can be shown that
         Σ_{k=0}^{n} BEZ_{k,n}(u) = 1    for all n and 0 ≤ u ≤ 1.             (8.40)

Then any position on the curve is a weighted sum of its control points. This
property ensures that the polynomial tends to smoothly follow the control points
without erratic oscillations.

Design techniques using Bezier curves


i) Closed Bezier curves may be generated by making the first and last control points
equal, as in the 1st figure below:
(Figure: left, a closed Bezier curve obtained by setting p_0 = p_5; right, a curve pulled towards a
location by specifying a double control point p_1 = p_2 there.)

or, a curve may be pulled towards a point by choosing a double control point there (2nd figure
above).

ii) When there is a large number of control points to be fitted, it’s unwise to use a
single high-degree Bezier curve, since this will introduce too many (unwanted)
oscillations, apart from the computational effort required to solve the polynomial
coefficients. Rather, we fit lower order piece-wise Bezier curves and join them
appropriately to obtain the required order of overall continuity.
iii) C 0 continuity automatically follows at the common control endpoint since both
Bezier sections pass through that point:

(Figure: a Bezier section C1 on control points p_0, p_1, p_2 joined to a section C2 on control
points p_0′, p_1′, p_2′, p_3′ at the common point p_2 = p_0′.)

For example, in the above one Bezier section ( C1 ) with control points p 0 ,p1 , p 2
joins another ( C2 ) with control points p′0 , p′1 ,p′2 ,p′3 at p 2 = p′0 giving continuity
at this common endpoint.

iv)  To obtain C(1) continuity we select the 2nd point of the second section (C2) in such
     a way that the p_1–p_2 slope is the same as the p_0′–p_1′ slope, by setting:
         p_0′ = p_2
         p_1′ = p_2 + (p_2 − p_1)                                             (8.41)
     (a short code sketch of this follows after this list).

This process must of course, be implemented on all the common control points, as
well as the composite endpoints, if required. Overall C (1) continuity also requires
that both sections must each have the same number of control points.

v) By similar means we can obtain C ( 2) continuity, but the latter is too restrictive
especially on cubic Bezier sections, so it is not usually administered.
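A minimal sketch of applying condition (8.41) when chaining cubic Bezier sections is given below;
the vector type and function name are assumptions, not code from H&B:

//Sketch: enforce C(1) continuity between two Bezier sections, eq. (8.41).
//prevP1, prevP2 = the last two control points of the first section;
//nextP0, nextP1 = the first two control points of the following section.
struct Vec3 { double x, y, z; };

void joinBezierC1 (const Vec3& prevP1, const Vec3& prevP2,
                   Vec3& nextP0, Vec3& nextP1)
{
    nextP0 = prevP2;                           // p0' = p2  (C0 continuity)
    nextP1.x = 2.0 * prevP2.x - prevP1.x;      // p1' = p2 + (p2 - p1)
    nextP1.y = 2.0 * prevP2.y - prevP1.y;
    nextP1.z = 2.0 * prevP2.z - prevP1.z;
}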

Using cubic Bezier curves


Since cubic Bezier curves are very commonly employed in design packages, we consider them
in some detail. First note that each cubic Bezier section is generated by 4 control points.
Further, setting n = 3 in (8.33) gives the four blending functions:

    BEZ_{0,3}(u) = (1 − u)³
    BEZ_{1,3}(u) = 3u(1 − u)²
    BEZ_{2,3}(u) = 3u²(1 − u)                                                 (8.42)
    BEZ_{3,3}(u) = u³

These look like:

(Figure: graphs of the four cubic Bezier blending functions BEZ_{0,3}(u), ..., BEZ_{3,3}(u)
over 0 ≤ u ≤ 1.)

Notice that
• at u = 0 only BEZ 0,3 (0) ≠ 0 , whilst all the others = 0.
• at u = 1 only BEZ 3,3 (1) ≠ 0 , whilst all the others = 0.
• consequently, from (8.37) P(0) = p 0 , P(1) = p3 i.e. Bezier curve passes thro’
endpoints

At the endpoints the slopes are the parametric derivatives


    P′(0) = 3(p_1 − p_0);    P′(1) = 3(p_3 − p_2)                             (8.43)

and there the parametric 2nd derivatives are


    P″(0) = 6(p_0 − 2p_1 + p_2);    P″(1) = 6(p_1 − 2p_2 + p_3)               (8.44)

Also, expanding the Bezier polynomial (8.32) with (8.42) gives

                               [ −1   3  −3   1 ] [p_0]
    P(u) = [u³  u²  u  1]  ·   [  3  −6   3   0 ] [p_1]                       (8.45)
                               [ −3   3   0   0 ] [p_2]
                               [  1   0   0   0 ] [p_3]

where the 4×4 matrix is the Bezier matrix M_BEZ.

OpenGL sample code for a cubic Bezier curve (H&B p. 435)


Here 4 control points are used and 1000 pixel positions are plotted and a pixel width of 4 is
used. The function binomialCoeffs(.) computes the binomial coefficients and
computeBezPt(.) finds the points along the curved path. The function bezier(.) then plots
these pixel positions using OpenGL.

//BezierCurve.cpp
//---------------
//H&B p. 435
//Computes and draws the Bezier curve thro' 4 control pts in WCs
#include <windows.h>
#include <GL/glut.h>
#include <stdlib.h>
#include <math.h>

/* Set initial size of the display window. */


GLsizei winWidth = 600, winHeight = 600;
/* Set size of world-coordinate clipping window. */
GLfloat xwcMin = -50.0, xwcMax = 50.0;
GLfloat ywcMin = -50.0, ywcMax = 50.0;

class wcPt3D {
public:
GLfloat x, y, z;
};

void init (void)


{
/* Set color of display window to white. */
glClearColor (1.0, 1.0, 1.0, 0.0);
}

void plotPoint (wcPt3D bezCurvePt)


{
glBegin (GL_POINTS);
glVertex2f (bezCurvePt.x, bezCurvePt.y);
glEnd ( );
}

/* Compute binomial coefficients C for given value of n. */


void binomialCoeffs (GLint n, GLint * C)
{

GLint k, j;
for (k = 0; k <= n; k++) {
/* Compute n!/(k!(n - k)!). */
C [k] = 1;
for (j = n; j >= k + 1; j--)
C [k] *= j;
for (j = n - k; j >= 2; j--)
C [k] /= j;
}
}

void computeBezPt (GLfloat u, wcPt3D * bezPt, GLint nCtrlPts,


wcPt3D * ctrlPts, GLint * C)
{
GLint k, n = nCtrlPts - 1;
GLfloat bezBlendFcn;
bezPt->x = bezPt->y = bezPt->z = 0.0;

/* Compute blending functions and blend control points. */


for (k = 0; k < nCtrlPts; k++) {
bezBlendFcn = C [k] * pow (u, k) * pow (1 - u, n - k);
bezPt->x += ctrlPts [k].x * bezBlendFcn;
bezPt->y += ctrlPts [k].y * bezBlendFcn;
bezPt->z += ctrlPts [k].z * bezBlendFcn;
}
}

void bezier (wcPt3D * ctrlPts, GLint nCtrlPts, GLint nBezCurvePts)


{
wcPt3D bezCurvePt;
GLfloat u;
GLint *C, k;

/* Allocate space for binomial coefficients */


C = new GLint [nCtrlPts];

binomialCoeffs (nCtrlPts - 1, C);


for (k = 0; k <= nBezCurvePts; k++) {
u = GLfloat (k) / GLfloat (nBezCurvePts);
computeBezPt (u, &bezCurvePt, nCtrlPts, ctrlPts, C);
plotPoint (bezCurvePt);
}
delete [ ] C;
}

void displayFcn (void)


{
/* Set example number of control points and number of
* curve positions to be plotted along the Bezier curve.
*/
GLint nCtrlPts = 4, nBezCurvePts = 1000;

wcPt3D ctrlPts [4] = { {-40.0, -40.0, 0.0}, {-10.0, 200.0, 0.0},


{10.0, -200.0, 0.0}, {40.0, 40.0, 0.0} };

glClear (GL_COLOR_BUFFER_BIT); // Clear display window.

glPointSize (4);
glColor3f (1.0, 0.0, 0.0); // Set point color to red.

bezier (ctrlPts, nCtrlPts, nBezCurvePts);


glFlush ( );
}
void winReshapeFcn (GLint newWidth, GLint newHeight)
{
/* Maintain an aspect ratio of 1.0. */
glViewport (0, 0, newHeight, newHeight);
glMatrixMode (GL_PROJECTION);
glLoadIdentity ( );
gluOrtho2D (xwcMin, xwcMax, ywcMin, ywcMax);
glClear (GL_COLOR_BUFFER_BIT);
}
int main (int argc, char** argv)
{
glutInit (&argc, argv);
glutInitDisplayMode (GLUT_SINGLE | GLUT_RGB);
glutInitWindowPosition (50, 50);
glutInitWindowSize (winWidth, winHeight);
glutCreateWindow ("Bezier Curve");
init ( );
glutDisplayFunc (displayFcn);
glutReshapeFunc (winReshapeFcn);
glutMainLoop ( );
}

Code outputs:

For a Java program that allows the user to insert the control points with mouse clicks see
Ammeraal’s Bezier.java.

8.12 Bezier surfaces


Bezier surfaces may be constructed with two orthogonal sets of Bezier curves: we supply an
input mesh of control ( m + 1) × (n + 1) points p jk and generate points on the surface by
parametric vector function (Cartesian product of Bezier blending functions)
    P(u, v) = Σ_{j=0}^{m} Σ_{k=0}^{n} p_{jk} BEZ_{j,m}(v) · BEZ_{k,n}(u)      (8.46)

An outline of such a surface is shown below.

(Figure: a cubic Bezier surface.)

8.13 B-spline curves


B-splines are widely used in graphics packages and in contrast to Bezier curves:
• can set the degree of the polynomial independently of the number of control points
• allow local control over the shape of the spline
• are more complex mathematically than Bezier curves
B-spline surfaces are generated using a procedure similar to that for Bezier surfaces.
For a detailed treatment see H&B pp. 442-451.

8.14 Beta-spline curves


These are generalizations of B-splines and are based on imposing geometric continuity
conditions on the 1st and 2nd parametric derivatives. See H&B p. 452-453.

8.15 Rational splines


These are obtained by forming a quotient of one spline polynomial with another (rational
form). They offer more generality (and flexibility) than ordinary splines, so that for example, with
one family of rational splines we can generate all conic sections (circles, ellipses, parabolas,
hyperbolas) exactly without need for separate functions for each. See H&B p. 454-456.

8.16 OpenGL spline approximation functions


OpenGL and GLU have Bezier and B-spline functions together with functions for trimming
such curves and surfaces (clipping curves, making cuts and holes in surfaces etc). Some of
these are:

1. glMap1*(GL_MAP1_VERTEX_3, uMin, uMax, stride, nPts, *ctrlPts);


glEnable(GL_MAP1_VERTEX_3);
and the above is disabled with

glDisable(GL_MAP1_VERTEX_3);
Here
• use suffix (*) f or d for type of values, uMin/uMax usually 0/1,
• nPts = number of control pts (> 0) in coordinate position array ctrlPts
• stride = integer offset = number of data values between beginning of one coordinate
position in ctrlPts and the beginning of the next. For just 3D coordinate positions set
stride=3. For 4D homogeneous coords set stride=4 and set symbolic const to
GL_MAP1_VERTEX_4.

2. glEvalCoord1*(uValue);
- evaluates positions along the spline path where uValue is a value in range uMin...uMax.
This function maps uValue to a u in the range 0...1.0 by means of the mapping
    u = (uValue − uMin) / (uMax − uMin)                                       (8.47)

and then computes points on the curve using (8.32). It also generates a glVertex3 function
as u is processed. Repeated calls of glEvalCoord1*(.) are required to produce points along
the curve, which can be joined together by straight-line segments, approximating the curve.
Non-uniformly spaced points can be generated with this function as well.

The following code segment shows how to employ this process (for a full program see
OGLBezierCurve.cpp):

GLfloat ctrlPts [4][3] = { {-40.0, 40.0, 0.0}, {-10.0, 200.0, 0.0},


{10.0, -200.0, 0.0}, {40.0, 40.0, 0.0} };
glMap1f (GL_MAP1_VERTEX_3, 0.0, 1.0, 3, 4, *ctrlPts);
glEnable (GL_MAP1_VERTEX_3);
GLint k;
glColor3f (0.0, 0.0, 1.0); // Set line color to blue.
//Generate Bezier “curve”
//-------------------------------
glBegin (GL_LINE_STRIP);
for (k = 0; k <= 50; k++)
glEvalCoord1f (GLfloat (k) / 50.0);
glEnd ( );

glColor3f (1.0, 0.0, 0.0); // Set point color to red.


glPointSize (5.0); // Set point size to 5.0.
glBegin (GL_POINTS); // Plot control points.
for (k = 0; k < 4; k++)
glVertex3fv (&ctrlPts [k][0]);
glEnd ( );

3. glMapGrid1*(n, u1, u2);


glEvalMesh1(mode, n1, n2);
Can be used to generate uniformly spaced points along a Bezier curve, with n =
number of equally spaced u-values from u1 to u2 and n1 and n2 are their
corresponding n-values.
mode = GL_POINT (for display of just points on curve), GL_LINE (str line segments)

With mode GL_LINE, the above 2 statements are equivalent to


glBegin (GL_LINE_STRIP); // Generate Bezier "curve".
for (k = 0; k <= 50; k++)

glEvalCoord1f (GLfloat (k) / 50.0);


glEnd ( );
in the code of 2 above, i.e we can replace this block in the code by
glMapGrid1f(50, 0.0, 1.0);
glEvalMesh1(GL_LINE, 0, 50);

For more options with glMap1*(.) see H&B p. 464.

4. glMap2*(GL_MAP2_VERTEX_3, uMin, uMax, uStride, nuPts,


vMin, vMax, vStride, nvPts, *ctrlPts);
glEnable(GL_MAP2_VERTEX_3);
Will generate a Bezier surface with obviously extended parameters to glMap1*(.)
Array ctrlPts is of size nuPts × nvPts
uStride, vStride = integer offsets similar to stride (for glMap1*)

5. glEvalCoord2*(uValue, vValue); //or


glEvalCoord2*v(uvArray); //for array param
Finds uValue and vValue along Bezier curves of surface mesh and maps to intervals
0...1 and 0...1 with
    u = (uValue − uMin)/(uMax − uMin);    v = (vValue − vMin)/(vMax − vMin)   (8.48)

Then repeated calls to glEvalCoord2*(.) will display the Bezier surface with
glVertex3*(.) being generated in the process.

A sample code segment for generating a cubic Bezier surface is (from H&B):

GLfloat ctrlPts [4][4][3] = {


{ {-1.5, -1.5, 4.0}, {-0.5, -1.5, 2.0},
{-0.5, -1.5, -1.0}, { 1.5, -1.5, 2.0} },
{ {-1.5, -0.5, 1.0}, {-0.5, -0.5, 3.0},
{ 0.5, -0.5, 0.0}, { 1.5, -0.5, -1.0} },
{ {-1.5, 0.5, 4.0}, {-0.5, 0.5, 0.0},
{ 0.5, 0.5, 3.0}, { 1.5, 0.5, 4.0} },
{ {-1.5, 1.5, -2.0}, {-0.5, 1.5, -2.0},
{ 0.5, 1.5, 0.0}, { 1.5, 1.5, -1.0} }
};

glMap2f (GL_MAP2_VERTEX_3, 0.0, 1.0, 3, 4,


0.0, 1.0, 12, 4, &ctrlPts[0][0][0]);
glEnable (GL_MAP2_VERTEX_3);
GLint k, j;
glColor3f (0.0, 0.0, 1.0);
for (k = 0; k <= 8; k++)
{
glBegin (GL_LINE_STRIP); // Generate Bezier surface lines.
for (j = 0; j <= 40; j++)
glEvalCoord2f (GLfloat (j) / 40.0, GLfloat (k) / 8.0);
glEnd ( );
glBegin (GL_LINE_STRIP);
for (j = 0; j <= 40; j++)
glEvalCoord2f (GLfloat (k) / 8.0, GLfloat (j) / 40.0);
glEnd ( );
}

6. glMapGrid2*(nu, u1, u2, nv, v1, v2);


glEvalMesh2(mode, nu1, nu2, nv1, nv2);
Can be used to generate evenly spaced parameter values. We can employ these to
replace the Bezier surface generating block using glEvalCoord2f(.) in the above code
(see H&B p. 467).

7. GLU B-spline functions for curves and surfaces are also available. See H&B p. 467 –
473.

8.17 Return of the teapot


The program GlutTeapot.cpp (from Georgia Inst. Tech.) shows how patch data for a
teapot may be defined and employed. It also includes the glutKeyboardFunc(.) for
adding key controls to rotate the object and OpenGL lighting functions as well (to be
studied later). Three views generated by the program are:-

8.18 Sweep representations


Objects that have translational, rotational or other such symmetries may be constructed by a
sweep representation process. For example, starting on the LHS below with a closed spline
(determined by 4 control points) we translate it along the line perpendicular to its cross-section
area, and draw connecting lines from the first position to the second (in the sweep direction).
Repeating the process many times generates the cylinder on the RHS.

(Figure: an initial closed spline cross-section defined by four control points, and the cylinder
generated by sweeping it along the direction perpendicular to its plane.)

8.19 Constructive solid geometry (CSG) methods


Here a composite object is constructed by putting together other more basic ones using
operations akin to the basic set operations: union, intersection and difference. See H&B p.
474-476.

8.20 Octrees
In this approach, a hierarchical tree structure (octree) is used to represent a solid object. See
H&B p. 476-479.

8.21 Binary space-partitioning (BSP) trees


Similar to octrees. See H&B p. 479.

8.22 Fractal-geometry methods


Thus far, all object descriptions were given by equations defining their shapes (Euclidean
geometry methods). Such methods are accurate for modelling regular shapes with smooth
surfaces. Fractal geometry methods are based on procedures for generating points (or lines)
rather than on shape equations. They are used to model realistic looking irregular shapes
such as shorelines, trees, leaves, fluid particles, clouds, random textures etc. The term fractal
is derived from the mathematical notion of a “fractional dimension” object.

A fractal object has two important features:


i) infinite detail at any point – e.g. repeated zooming-in provides as much detail as
before the zoom, unlike Euclidean shapes which tend to smoothen out.
ii) self-similarity between object parts and the overall features of the object.

The amount of detail variation in the object is described by a number, its fractal dimension,
not necessarily an integer (can have fractional dimension).

Fractal generating procedures


A fractal object is generated by repeated application of a specified transformation to points
within a region of space. If p 0 = ( x0 , y0 , z0 ) is a selected initial point, and F (.) a rule or
transformation function, then successive levels of detail are given by
    p_1 = F(p_0),    p_2 = F(p_1),    p_3 = F(p_2),  ...                      (8.49)
The function F (.) may be applied on a point set or on an initial set of primitives e.g. lines.
Although more and more detail is provided as F (.) is applied, the true fractal is reached after
an infinite number of applications. In practice however, a finite number of applications is
sufficient to give the required amount of detail. In any case, the level of detail is limited by the
pixel resolution of the display device. Nevertheless, repeated zooming-in on a portion of the
object will reveal more and more detail to within the pixel resolution.

Classification of fractals
a. self-similar fractals – here parts of the whole are scaled down versions of the
whole. We can use the same scale factor s for each sub-part or different ones.
If randomly different s-values are used then the fractal is said to be
statistically self-similar. For example, the latter is used to model trees,
shrubs etc.
b. self-affine fractals – here parts are formed with different scaling factors
   s_x, s_y, s_z in each of the coordinate directions. Again, random variations of
   these result in statistically self-affine fractals – used to model terrain, water,
   clouds etc.
c. Invariant fractal sets – formed with nonlinear transformations including
self-squaring (gives e.g. the Mandelbrot set) in the complex plane, self-
inverse transformations and so on.

Fractal dimension
Detail variation in a fractal is described by a number D = the fractal dimension = a measure of
roughness or fragmentation.

Sometimes we can set a D-value and determine procedures for generating a fractal. At other
times we can only find D from properties of the constructed object.

For self-similar fractals we can obtain expressions for the fractal dimension in terms of the
scale factor s by analogy with Euclidean subdivision:
For example, choose s = ½ as a scale factor and perform repeated subdivisions of the following
basic shapes
• Line segment (Euclidean dimension D_E = 1), 1 unit at the start: one subdivision gives n = 2 parts,
  each with scale factor s = 1/n; this satisfies n·s¹ = 1, i.e. n·s^(D_E) = 1.
• Square (D_E = 2): one subdivision gives n = 4 parts with scale factor s = 1/n^(1/2) and
  sub-area A′ = A/n; this satisfies n·s² = 1, i.e. n·s^(D_E) = 1.
• Cube (D_E = 3): one subdivision gives n = 8 parts with scale factor s = 1/n^(1/3) and
  sub-volume V′ = V/n; this satisfies n·s³ = 1, i.e. n·s^(D_E) = 1.

Now, in analogy with the dimension for Euclidean objects, for a self-similar object we take the
fractal dimension D to satisfy
    n·s^D = 1                                                                 (8.50)

or

    D = ln n / ln(1/s)                                                        (8.51)

For self-similar fractals constructed with different scaling factors, D is obtained from the implicit
relation

    Σ_{k=1}^{n} s_k^D = 1;    s_k = scale factor of the kth sub-part          (8.52)

In general it is much more difficult to determine D. Approximate values of it may be obtained by


“topological covering” of complicated shapes by simpler ones.

Note also, that the fractal dimension is always > the Euclidean dimension. For instance:
• Euclidean curves have DE = 1 , but a fractal curve generally has dimension
1 < D ≤ 2 , with D > 2 also possible. The closer D is to 1 the smoother it is. Other
values of D give more interesting possibilities: D = 2 gives a Peano curve that
completely fills a finite 2D region. For 2 < D < 3 the curve self-intersects, infinite
number of times – can be used to model natural boundaries e.g. shorelines.
• Fractal surfaces generally have dimension in the range 2 < D ≤ 3 with D > 3 also
possible, giving overlapping coverage – useful for terrain, clouds, water modelling.
• Fractal solids generally have dimension 3 < D ≤ 4 with D > 4 also possible, giving
self-overlapping solids – useful to depict vapour density, temperature in a region etc.

Geometric construction of self-similar fractals


For these deterministic (non-random) shapes we start with a given basic shape, the initiator,
and then we replace each subpart of it with another shape or pattern called a generator, to
give a resulting shape, completing one iteration in the process. We may repeat the process
with the same generator, resulting in a 2nd iteration of the basic shape and so on.
Example:
Each straight line segment in the initiator is replaced by 4 equal-length line segments
comprising a generator and a scaling factor of s = 1/3 is applied to each segment in the
generator resulting in its overall length of 4/3.

(Figure: the initiator (a straight segment), the generator (four equal-length segments), and the
resulting snow-flake or Koch curve.)

The fractal dimension of the snow-flake is then D = ln 4 / ln 3 ≈ 1.2619. Further iterations produce
more complicated shapes – see H&B p. 485.
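As an illustrative sketch (not from H&B), one possible recursive implementation of the Koch
generator replaces each segment by the four generator segments scaled by s = 1/3; the output routine
here simply prints the segments, but in OpenGL one would issue glVertex2d calls instead:

//Sketch: recursive Koch-curve subdivision.  Each call replaces the segment
//(x1,y1)-(x2,y2) by the four generator segments, until 'level' reaches 0.
#include <cmath>
#include <cstdio>

void koch (double x1, double y1, double x2, double y2, int level)
{
    if (level == 0) {
        std::printf ("segment (%g,%g)-(%g,%g)\n", x1, y1, x2, y2);
        return;
    }
    double dx = (x2 - x1) / 3.0, dy = (y2 - y1) / 3.0;
    double ax = x1 + dx,       ay = y1 + dy;        // 1/3 point
    double bx = x1 + 2.0 * dx, by = y1 + 2.0 * dy;  // 2/3 point
    // apex of the equilateral bump raised on the middle third
    double cx = ax + 0.5 * dx - 0.5 * std::sqrt (3.0) * dy;
    double cy = ay + 0.5 * dy + 0.5 * std::sqrt (3.0) * dx;

    koch (x1, y1, ax, ay, level - 1);
    koch (ax, ay, cx, cy, level - 1);
    koch (cx, cy, bx, by, level - 1);
    koch (bx, by, x2, y2, level - 1);
}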

Geometric construction of statistically self-similar fractals



In the above we can use random choices of generators or calculate coordinate positions
randomly, so that the result is an irregular and more realistic shape – used to model trees,
plants etc.

Random midpoint-displacement methods


These are faster than the above, but appear less realistic. Here starting with a line for example,
we displace its midpoint to
    y_mid = ½ [ y(a) + y(b) ] + r                                             (8.53)
where r is a random offset chosen from a Gaussian distribution with mean 0 and variance
∝ |b − a|^(2(2 − D)), with D > 1 the fractal dimension. Other choices are also used, but r basically
controls the “roughness” of a surface or curve.

(Figure: a segment from a to b with its midpoint (a+b)/2 displaced to y_mid, and the jagged curve
obtained after several iterations.)

For generating terrain features we can start with a ground plane and randomly elevate the
midpoint, and continue in this way for each sub-part until several iterations are completed.
Such models can be used to depict small variations in a surface or even large variations
(mountainous terrain). See H&B p. 492-4.

(Figure: a ground-plane square with corner elevations z_a, z_b, z_c, z_d; its midpoint m is raised
to an elevation computed from the corners plus a random offset, and the process is repeated on
each sub-square.)

Here the midpoint elevation z_m may be computed by using the corner elevations plus a
random offset:
    z_m = ¼ (z_a + z_b + z_c + z_d) + r_m                                     (8.54)
Again, after several iterations a granular topography results.
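A minimal 1D sketch of the idea (not code from H&B): each recursion displaces the midpoint of an
interval by a Gaussian offset whose standard deviation shrinks with the interval length as in (8.53).
The crude Gaussian sampler and array layout below are assumptions:

//Sketch: 1D random midpoint displacement into y[0..n], n a power of two, with
//y[0] and y[n] set by the caller.  D is the desired fractal dimension (> 1).
#include <cstdlib>
#include <cmath>

double gauss (double sigma)                        // crude Gaussian via the CLT
{
    double s = 0.0;
    for (int i = 0; i < 12; i++) s += rand () / (double) RAND_MAX;
    return sigma * (s - 6.0);                      // mean 0, std. dev. sigma
}

void midpointDisplace (double y[], int lo, int hi, double D)
{
    if (hi - lo < 2) return;
    int    mid   = (lo + hi) / 2;
    double len   = hi - lo;                        // interval length |b - a|
    double sigma = std::pow (len, 2.0 - D);        // std. dev. ~ |b-a|^(2-D), cf. (8.53)
    y[mid] = 0.5 * (y[lo] + y[hi]) + gauss (sigma);
    midpointDisplace (y, lo, mid, D);
    midpointDisplace (y, mid, hi, D);
}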

Self-squaring fractals

This method generates fractals by repeated application of a mathematical transformation on a


region of the complex plane. For example the set of points z = x + iy, i = √−1, may be points
in a circle, and depending on a starting point, say, z0 = x0 + iy0 the transformation may be
repeatedly applied on it so that transformed points may either:-
• diverge to infinity (or 'escape')
• converge towards some finite point, called an attractor
• or, remain confined to the boundary (i.e. stay bounded)

Three interesting transformation functions F ( z ) and their fractals are:


i)  The Julia set generated by
        F(z) = z² − μ = (x² − y² − a) + i(2xy − b);                           (8.55)
    where z = x + iy and μ = a + ib is a constant.

ii) The Mandelbrot set, also based on the above function.

iii) The self-squaring fractals based on
        F(z) = λ z (1 − z);    z = x + iy,    λ = a constant                  (8.56)

The Julia set fractal


Here we apply the transformation (8.55) on a rectangular region R of the complex plane
contained within a “large” circle with centre (0,0) and radius r (an “effective infinity”) as follows:

(Figure: the complex plane with a "large" circle of radius r (r = 10 is usually adequate) enclosing
a rectangle such as −2 ≤ x ≤ 2, −2 ≤ y ≤ 2; an escaped iteration sequence eventually leaves the
circle, while a trapped sequence remains bounded.)

Choose a starting point z_0 = x_0 + iy_0 within the rectangle and some complex constant
μ = a + ib, and then generate the iterated function sequence (IFS) of (8.55) by

    z_1 = z_0² − μ
    z_2 = z_1² − μ
    ...                                                                       (8.57)
    z_n = z_{n−1}² − μ

Now if for some n we find |z_n| > r, we say the point z_0 has escaped. The set of all starting
points z_0 (e.g. the z_0′ shown above) which do not escape is commonly called the Julia set of the
complex polynomial F(z) in (8.55). Actually, the exact Julia set is the boundary of this
collection of points, but we'll speak loosely of it.

Coding it:
In our program Julia.cpp we have used a rectangle of size [-2,2] × [-2,2] and a radius of just 4,
with a=1.0 and b=0.0 in WCs. The starting points are chosen by sweeping through the window-
coordinate range in pixels values ( x p , y p ) and transforming to the WC ( x, y ) range above.
The transformation can follow from our previous theory or we can apply first principles based
on:

(Figure: the real WC X axis from xmin = −2 to xmax = 2 and the pixel Xp axis from 0 to maxxPix,
with x0 the world-coordinate image of the pixel value xp.)

Then a straightforward linear transformation sending xp → x0 is given by the proportionate
lengths expression:

    (xp − 0)/(maxxPix − 0) = (x0 − xmin)/(xmax − xmin)
        ⇒  x0 = (xmax − xmin)·xp/maxxPix + xmin                               (8.58)

Similarly, for the y-values:

    (yp − 0)/(maxyPix − 0) = (y0 − ymin)/(ymax − ymin)
        ⇒  y0 = (ymax − ymin)·yp/maxyPix + ymin                               (8.59)
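A small sketch of this mapping (the names are illustrative and may differ from those used in
Julia.cpp):

//Sketch: map a pixel coordinate (xp, yp) to world coordinates (x0, y0) using
//(8.58) and (8.59).  maxxPix, maxyPix are the window extents in pixels.
void pixelToWC (int xp, int yp, int maxxPix, int maxyPix,
                double xmin, double xmax, double ymin, double ymax,
                double& x0, double& y0)
{
    x0 = (xmax - xmin) * xp / (double) maxxPix + xmin;
    y0 = (ymax - ymin) * yp / (double) maxyPix + ymin;
}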

In our code we have also set up a 15-colour-value array of RGB values, which we use to colour
the escaping pixel.
Thus, for each pixel coordinate (xp,yp) in the pixel coordinate range, we map to the complex
value (x,y) in WCs and apply the transformation repeatedly, so that if any |(xn,yn)| > r we light up
the escaping pixel (xp,yp). When this is done for all pixel coordinates in the window range, the
non-lit pixels (in the background colour) comprise the Julia set. Symmetry is used to avoid
duplicating calculations.
The output from Julia.cpp looks like:

The Mandelbrot set


Consider again the complex polynomial (8.55) together with a rectangular region R of the
complex plane contained in a circle of some radius r as above. We now regard µ = a + ib as
a variable (test point) and, starting with the origin, we generate the IFS

    z_n = z_{n−1}² − μ;    n = 1, 2, ...;    z_0 = (0, 0) = fixed             (8.60)

Then the set of all μ = a + ib for which the origin does not escape (i.e. |z_n| ≤ r for all n) is
called the Mandelbrot set.

This program may be coded by using real arithmetic, as in Julia.cpp. Do it as an exercise. Note
that H&B construct and employ a complex number class with overloaded operators for the
usual complex operations. You must avoid their approach and use ours in this exercise.

The output from your code should look like:

Self-squaring fractals
An example of generating fractals with (8.56) above is H&B’s SelfSquareFractal.cpp which
first obtains the inverse of this transformation. See H&B p. 496-499.

Other fractal methods


Many other forms of fractals can be generated to model various objects in nature, which give
surprisingly realistic renditions. Included amongst these are methods based on shape
grammars and other procedural methods. See H&B p. 506-510.

8.23 Particle system modelling


In many applications, we require to construct scenes which contain objects that have no
definite geometric shape: fluids, gases, spluttering particles (from an explosion). These are
typically modelled by means of a collection of elementary particle-objects which may move
collectively (under some physical laws, or prescribed kinematics). Such usage is common in
animation scenes. No time for this.

8.24 Physically based modelling


Here objects follow physical laws for motion or shape transformations. No time for this.

8.25 Data set visualization


Typically, required in processing and visualizing the results of scientific experiments, economic
modelling, engineering and scientific simulations etc. Special graphics techniques required may
be modified from existing graphics theory already studied. No time for this.

CHAPTER NINE: VISIBLE SURFACE DETECTION

How to determine what is visible in a scene from a particular viewing position is of concern in
this chapter. For instance, which faces of an object can we see from a particular vantage
position, and what should appear obscured from this position is answered here. The techniques
involved are variously called visible surface detection methods or hidden surface removal
methods, although they may not mean exactly the same thing.
Typically, such methods work either on the object descriptions (object-space methods) or on
the pixel data (image-space methods), although a combination of both have also been
devised.

9.1 Back-face detection


Object surfaces that are orientated away from the viewer are called back faces. Let a polygon
surface be given by the functional equation
φ ( x, y, z ) ≡ Ax + By + Cz + D = 0 (9.1)
where A,B,C,D are the surface plane parameters. Then an outward normal to it is
N = ∇φ = ( A, B, C )
Thus, if the view direction vector is Vview in view coordinates, the face (9.1) is a back (or
hidden) face if it satisfies
Vview ⋅ N ≥ 0 (9.2)

If V_view is taken as a vector in the −z_v direction, with unit vector ẑ_v in a right-handed
system, and object descriptions have been transformed to view-projection coordinates, then the
face (9.1) is a back face if
    ẑ_v · N = C ≤ 0.                                                          (9.3)
This object-space test may be used to eliminate (or cull) several but not all hidden surfaces or
parts of them, since for example some may only be partially obscured. Thus additional tests
beyond this process are usually required.
(Figure: in viewing coordinates (x_v, y_v, z_v) the camera looks along −z_v; polygons whose outward
normals N point away from the viewer are back faces, while other surfaces may be only partly hidden.)
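A minimal sketch of the test (9.2)–(9.3); the vector type and function name are assumptions, not
code from H&B:

//Sketch: back-face test.  N = (A, B, C) is the polygon's outward normal (from its
//plane equation) and Vview the viewing direction in viewing coordinates.
struct Vec3 { double x, y, z; };

bool isBackFace (const Vec3& N, const Vec3& Vview)
{
    // Equation (9.2): the face is a back face if Vview . N >= 0.
    double dot = Vview.x * N.x + Vview.y * N.y + Vview.z * N.z;
    return dot >= 0.0;
}
//Special case (9.3): with Vview along -z_v this reduces to testing C <= 0.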

9.2 The depth-buffer (or Z-buffer) method


This is an image space method which checks the depth (z) value at each pixel position on the
projection plane. The method is usually applied to polygonal components in a scene, and each
surface is processed separately. Two buffers, a frame buffer (FB) and a depth buffer (DB)
are required.
Then we employ the following for depth values normalized to 0...1.0 and the view plane located
at depth = 0:

The depth-buffer algorithm

a. Initialize the FB and DB for all positions (x,y) by


DB(x,y) = 1.0; FB(x,y) = backgroundColor
b. For each polygon in the scene, one after the other, do:
i. If the depth z of each projected pixel position (x,y) is unknown, find it
ii. If z < DB(x,y), find surfColor(x,y) = surface color at (x,y) and set
DB(x,y) = z; FB(x,y) = surfColor(x,y);
c. When step b is done for all surfaces, then
DB has depth values for all the visible surfaces
FB has corresponding colour values for these surfaces
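A compact sketch of the buffer initialisation and the per-pixel depth test (illustrative names and
sizes, not code from H&B):

//Sketch of the depth-buffer method; depths normalized to 0..1, view plane at depth 0.
const int W = 500, H = 500;
const int BACKGROUND = 0x00FFFFFF;       // packed background colour (assumed)

float depthBuf[W][H];                    // DB
int   frameBuf[W][H];                    // FB

void zBufferInit (void)
{
    // Step a: initialise DB to the maximum depth and FB to the background colour.
    for (int x = 0; x < W; x++)
        for (int y = 0; y < H; y++) {
            depthBuf[x][y] = 1.0f;
            frameBuf[x][y] = BACKGROUND;
        }
}

// Step b, applied to every projected pixel (x, y) of every polygon; z may be
// obtained incrementally as in the coherence relations that follow, and
// surfColor is the polygon's colour at that pixel.
void zBufferTest (int x, int y, float z, int surfColor)
{
    if (z < depthBuf[x][y]) {            // nearer than anything drawn so far
        depthBuf[x][y] = z;
        frameBuf[x][y] = surfColor;
    }
}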

In practice calculations can be minimized by employing coherence properties:


For a given polygon in a scene, if the depth values at the vertices are known, then the depth
values at any other position in the plane containing the polygon may be calculated as follows:

Using the plane equation (9.1), we obtain

    z = (−Ax − By − D) / C                                                    (9.4)
Then moving along pixels in scan lines and across adjacent scan lines the adjacent x-values
and adjacent y-values respectively differ by ±1.

Thus, if z is the depth value at (x, y), then the depth value z′ at (x+1, y) along the scan line y
is given from (9.4) by

    z′ = [ −A(x + 1) − By − D ] / C                                           (9.5)

i.e. as

    z′ = z − A/C                                                              (9.6)

Note that since –A/C is constant for the whole surface, succeeding z-values are found with just
one addition.
If we start with a depth value at a top vertex, we can calculate the depth values down an edge
as follows:

The beginning x-value on the next scan line is obtained from the starting x-value of the
previous scan line by (see fig. below)

    x′ = x − 1/m;    m = edge slope

and the depth value down this edge is then given by

    z′ = z + (A/m + B)/C                                                      (9.7)

which can be applied recursively.
(Figure: a polygon edge processed from the top scan line down to the bottom scan line; on adjacent
scan lines y and y−1 the edge intersections are at x and x′.)

If the edge is vertical (infinite slope) we use (9.7) in the form

    z′ = z + B/C                                                              (9.8)

It turns out that even with the above coherence based approach, many unnecessary
calculations are performed in this method. Thus other, better techniques have been devised.

9.3 The A-buffer method


This is an extension of the Z-buffer method in which, for each position (x,y), a depth value is
stored together with data for multiple surfaces (or a pointer to it). The latter can be used to
compute combined colour values, allowing one to set transparency levels and do anti-aliasing.

9.4 Scan-line method


Also an image space method, as each scan line is processed the projections of every surface
at a pixel position are checked to find the one nearest the view plane. The pixel value for that
surface is then set in the FB. See H&B p. 535 for more details.

9.5 The depth-sorting method (“painter’s algorithm”)


This method uses both image-space and object-space data. The main steps are:-
1. Sort surfaces in order of decreasing depth (in image and object space)

2. Scan convert the surfaces in order starting with the surface of greatest depth (in image
space)
The process is similar to the way in which an artist will paint on a canvas – first the background
and then the foreground over it. Here the colour values for the furthest surface are entered into
the frame buffer. Then for each succeeding surface we overwrite the FB values.
With the above simplistic description, the method can fail when two or more surfaces overlap in
the viewing direction ( zv ). Thus further, more refined tests are necessary to resolve such
issues. See H&B p. 538-540. A variation of this algorithm which sub-divides the surfaces into
constitutive triangles, and which then conducts depth tests on the triangles is given in
Ammeraal p. 145. The latter can be applied to non-polygonal surfaces which would then be
broken down into polygons (triangles).

9.6 Other methods


Various other methods have been devised and tested, such as the BSP-tree method, area-
subdivision method, octree methods, ray casting and so on. Each may have an advantage over
the others in specific situations. See H&B p. 540-545.

9.7 Curved surfaces


Curved surfaces may be approximated by regular more basic shapes such as triangles and
rectangles (i.e. a polygon mesh). Then the polygon methods above may be applied on the
basic pieces.

9.8 Curved surface representations


In many applications it is required to make plots of curved surfaces. Whilst such surfaces can
be composed of more basic ones, it is more accurate to use the parametric equations for the
surfaces directly or write them, if possible, in the solved form
z = f ( x, y )
and then plot the z-values as functions of the (x,y) coordinates. Quadratic approximations are
often used to model such surfaces. The aforementioned visibility tests can then be adjusted
and applied on these.

9.9 Contour plots


In scientific applications, an xy plot for fixed values of z is commonly employed, as contour
plots. This can be obtained by solving the surface equation as
y = f ( x, z )
and plotting y as a function of x for fixed values of z in some range. Hidden surfaces or
contours may also be removed as before.

9.10 Wire-frame visibility methods (hidden-line detection methods)


Sometimes it is required not to show or hide entire surfaces, but only to show or remove their
bounding lines. Approaches here make use of depth cueing on the lines so that they may be
differentiated by colour, or remove the hidden lines entirely, or show them in a different style.
In any case methods have to be designed to single out the hidden lines or parts of them. These
methods are similar to the line clipping algorithms, but also involve depth comparisons. Thus in
one approach, line edge positions are compared with surface boundaries to establish clipping
outside these areas, but in addition we compare the depth of the endpoints with the surface
depth and/or with their intersections with the surface.

In summary,
• If both line endpoints are behind the surface then the line is entirely hidden
• If both endpoints are in front of the surface, the line is entirely visible
• If one endpoint is behind and another is in front, we calculate the intersections of the
line with the surface, find the depth values there and compare with the endpoint depths
to determine which sections of the line are hidden or visible.

(Figures: left, a line whose depth is everywhere greater than the surface depth, so its
intersections are found at the surface perimeter; right, a line whose depth is both greater and
less than the surface depth, so intersections must also be found inside the surface area.)

Depth cueing of wire-frame lines


The visibility of lines can also be controlled by varying their colour intensity as a function of
depth from the viewing position. Usually, a linear depth-cueing function such as the following is
applied:

    f_depth(d) = (d_max − d) / (d_max − d_min)                                (9.9)

where
    d     = distance of a point on the line from the viewing position (≥ 0)
    d_min = minimum depth
    d_max = maximum depth (these two are typically normalized to the range 0...1)
Clearly, from (9.9), nearer points (smaller d) will be displayed brighter than further points.

9.11 OpenGL visibility detection functions


1. glEnable(GL_CULL_FACE); //enables face culling
glCullFace(mode);
//mode=GL_BACK, GL_FRONT, GL_FRONT_AND_BACK
Can change front and back faces with glFrontFace(.)

2. glDisable(GL_CULL_FACE); //disables above

3. glutInitDisplayMode(GLUT_SINGLE | GLUT_RGB | GLUT_DEPTH);


//To use OGL depth-buffer routines, must request a depth-buffer and then

4. glClear(GL_DEPTH_BUFFER_BIT);
//Initialize depth-buffer values. Clears all values to max 1.0 in norm range
0...1.0. Should be done for every new frame display. Then use

5. glEnable(GL_DEPTH_TEST);
//To enable depth-test routines

6. glDisable(GL_DEPTH_TEST);
//Disables them

7. To clear max depth value to another ≠ 1.0 use


glClearDepth(maxDepth); //maxDepth value in 0...1.0 and follow with
glClear(GL_DEPTH_BUFFER_BIT); //clears to this non-default value

8. Projection coordinates in OGL normalized to range -1.0...1.0. Depth values


between near and far planes further normalized to 0...1.0. with 0.0 ~ near plane and
1.0 ~ far plane. Can set to a depth range within 0...1.0 with
glDepthRange(nearNormDepth, farNormDepth); //defaults are 0 and 1.0 resp.

9. glDepthFunc(testCondition); //see H&B p. 550

10. glDepthMask(writeStatus); //sets depth-buffer status to read & write with


writeStatus=GL_TRUE

11. glPolygonMode(GL_FRONT_AND_BACK, GL_LINE); //sets object display


//to wire-frame form
To remove the hidden lines we use the depth-offset method below:
• First we give the wire-frame object form in a foreground colour.
• Specify an interior fill with a depth-offset and background colour for
interior filling. Depth-offset prevents background colour fill from
affecting display of the visible edges

Example code fragment:

glEnable (GL_DEPTH_TEST);
glPolygonMode (GL_FRONT_AND_BACK, GL_LINE);
glColor3f (1.0, 1.0, 1.0); //white foreground
/* Invoke the object-description routine. */

glPolygonMode (GL_FRONT_AND_BACK, GL_FILL);


glEnable (GL_POLYGON_OFFSET_FILL);
glPolygonOffset (1.0, 1.0); //set depth-offset
glColor3f (0.0, 0.0, 0.0); //fill colour is black
/* Invoke the object-description routine again. */

glDisable (GL_POLYGON_OFFSET_FILL);

12. OGL depth-cueing to vary brightness of object as function of distance from view
point invoked with:
glEnable(GL_FOG); //selects display mode
glFogi(GL_FOG_MODE, GL_LINEAR); //applies (9.9) with d min = 0.0, d max = 1.0
Can set own min and max values for d by
glFogf(GL_FOG_START, minDepth); //minDepth = float in range 0..1.0
glFogf(GL_FOG_END, maxDepth); //maxDepth = float in range 0..1.0

CHAPTER TEN: LIGHTING AND SURFACE RENDERING

To obtain realistic graphical representation of scenes, we have to combine perspective viewing


with lighting effects. Typically, a single point’s colour on a surface is computed with some
illumination (or lighting) model. Then the surface is rendered by techniques such as
interpolation with a surface-rendering method. Illumination models are based on the laws
of Physics as well as on psychological considerations and take into account the nature of
the light source, the surface (on which it shines) material’s texture, levels of transparency or
opacity, shininess, and how humans perceive light.

10.1 Light sources


A light source is an object that is emitting light or radiant energy. Simple light source models
use single intensity values for each of the RGB components, but more complex models are
used as well. For CG we shall employ the following classes of sources.

Point light source


Here a single colour light (with 3 RGB components) is emitted along radial ray-paths equally in
all directions from a point:

This point position is used in the model to find the effect of the light rays on objects in the
scene.

Infinitely far source


This is usually a large source, like the sun, which can be treated as a point source; because it
is so far away, rays from it reach the objects in a scene along approximately parallel paths:

Intensity attenuation
From physics, it is known that the intensity (amplitude) of light drops by a factor 1/d_l^2, where
d_l is the distance from the light source. For a model using a point source, this can produce too
large a variation for scene parts near the source, but too little for parts that are far away. To
compensate for this effect, a linear term is added in the amplitude/intensity function to give the
form:

    f_radiation(d_l) = 1 / (a_0 + a_1 d_l + a_2 d_l^2)                                   (10.1)
where the values of the coefficients a_0, a_1, a_2 can be suitably adjusted, even allowing for a
different set of values for each point in a multiple-point source. Further, since we cannot use
(10.1) for a source at infinity (since d_l = ∞), knowing that all points in the scene would then be
equally illuminated, we apply the adjusted form

    f_radiation(d_l) = { 1.0,                                if the source is at infinity
                       { 1 / (a_0 + a_1 d_l + a_2 d_l^2),    if the source is near (local)    (10.2)
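
A one-function C++ sketch of (10.2); the coefficients a0, a1, a2 and the flag for an infinitely distant source are supplied by the caller and are purely illustrative choices:

//RadialAttenuation.cpp (illustrative fragment)
// Radial intensity attenuation, eq. (10.2).
double radialAttenuation (double dl, double a0, double a1, double a2, bool sourceAtInfinity)
{
    if (sourceAtInfinity)
        return 1.0;                              // all scene points equally illuminated
    return 1.0 / (a0 + a1 * dl + a2 * dl * dl);  // eq. (10.1) for a local source
}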

Directional or spotlight source

[Figure: a spotlight's cone of influence. The cone axis lies along the light direction V_light, with
angular extent θ_l on either side of it; V_obj points from the source towards an object, making an
angle α with the axis. Objects outside the cone are unlit.]

Here, if V_light is a unit vector in the direction of the light from a local source, the light has a cone of
influence with angular extent θ_l on either side of the cone axis, which lies along V_light. If α is
the angle between V_light and the unit vector V_obj pointing from the source (the cone vertex) to an object, then

    V_obj · V_light = cos α                                                               (10.3)

Restricting the cone's angular extent to 0° < θ_l ≤ 90°, the object will be within the spotlight if

    α ≤ θ_l,  i.e. if  cos α ≥ cos θ_l                                                    (10.4)


and the object is outside the cone if
    V_obj · V_light = cos α < cos θ_l                                                     (10.5)
It’s also possible to model a multi-colour point-light source by using different direction
vectors Vlight with each corresponding to a different colour.

To allow for angular attenuation of a directional light source we can use an attenuation
function such as, for example,

    f_ang.atten(α) = cos^{a_l} α,    0° ≤ α ≤ θ_l                                         (10.6)

where α is the angular distance from the cone axis and a_l is a positive attenuation coefficient
applicable when α > 0 (away from the axis, where the intensity has a maximum value of 1.0). Then
to allow for both ordinary point sources and spotlights, we can employ the attenuation function

    f_l.ang.atten(α) = { 1.0,                      if the source is not a spotlight
                       { 0.0,                      if V_obj·V_light = cos α < cos θ_l (object outside the spotlight cone)
                       { (V_obj·V_light)^{a_l},    otherwise                              (10.7)
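
The tests (10.3)-(10.7) can be sketched as below; the Vec3 type, dot() helper and parameter names are assumptions for illustration, and all direction vectors are taken to be unit vectors as in the text:

//AngularAttenuation.cpp (illustrative fragment)
#include <cmath>

struct Vec3 { double x, y, z; };
static double dot (const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Angular attenuation for a spotlight, eq. (10.7).
// Vobj, Vlight are unit vectors; cosThetaL = cos(theta_l); al = attenuation exponent.
double angularAttenuation (const Vec3& Vobj, const Vec3& Vlight,
                           double cosThetaL, double al, bool isSpotlight)
{
    if (!isSpotlight)
        return 1.0;                          // ordinary point source
    double cosAlpha = dot (Vobj, Vlight);    // eq. (10.3)
    if (cosAlpha < cosThetaL)
        return 0.0;                          // object outside the cone, eq. (10.5)
    return std::pow (cosAlpha, al);          // eq. (10.6)
}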

Extended light source


To model a large light source close to the scene objects (e.g. a long neon bulb) we can use a
collection of directional point sources, where each may have a different direction. The Warn
model mimics studio lighting by such means together with flaps placed selectively to block the
light in certain directions.

10.2 Illumination models


These models compute the interaction of the incident light with the optical properties of
the material surface on which it impinges. Only approximate (empirical) models are considered
here; more theoretical ones (based on optical physics) are also employed in the design of
some packages.

Ambient light
A simple model for ambient or background lighting is to set an overall intensity level parameter
I a say, for the whole scene, resulting in the equal illumination of all surfaces in the scene, in all
directions, producing equally scattered or diffuse reflections from the surfaces.

Diffuse reflection
As noted above, surfaces that reflect light equally in all directions, independently of the viewing
position, are called ideal diffuse reflectors. A more realistic model allows the intensity of the
reflected light to depend on the incident light intensity I_l (from a point source) as well as on
the direction of incidence to the surface. Thus another useful diffuse-reflection model is based
on the equation:

    I_{l,diff} = k_d I_l cos θ                                                            (10.8)

where k_d is a constant called the diffuse-reflection coefficient (or diffuse reflectivity) and θ is the
angle of incidence.

[Figure: incident light arriving along direction L strikes a surface with unit normal N at the angle
of incidence θ.]

Thus here, when the incident light hits the surface perpendicularly (θ = 0°), the diffuse
reflection has its maximum value I_{l,diff} = k_d I_l. For angles away from zero it decreases, until
it reaches 0 at θ = 90° (the light grazes the surface with no reflection). For cos θ < 0 the source is
behind the surface.

More generally, if N is the unit normal at a position on the surface, and L is the unit vector
from this position to a point source, then cos θ = N·L and the diffuse reflection equation for
single point-source illumination at a surface position can be written as

    I_{l,diff} = { k_d I_l (N·L),   if N·L > 0
                 { 0.0,             if N·L ≤ 0                                            (10.9)
When the point source is not at infinity, i.e. it is a nearby one, then for its direction vector L
above we can use (with obvious meanings for the symbols)

    L = (P_source − P_surf) / |P_source − P_surf|                                         (10.10)

It is possible to combine the effects of the ambient and point-light-source intensities so that the
total diffuse reflection due to a point source in an ambient background light is

    I_diff = { k_a I_a + k_d I_l (N·L),   if N·L > 0
             { k_a I_a,                   if N·L ≤ 0                                      (10.11)

Typically, k_a, k_d are set in the range 0.0...1.0 for monochromatic light and depend on the type
of surface material.
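
A sketch of (10.10)-(10.11) for a single nearby point source; Vec3 and dot() are the same assumed helpers as in the spotlight sketch, repeated here so the fragment stands alone:

//DiffuseReflection.cpp (illustrative fragment)
#include <cmath>

struct Vec3 { double x, y, z; };
static double dot (const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Total diffuse reflection with ambient term, eq. (10.11).
// N is the unit surface normal; Psource, Psurf are positions; Ia, Il are intensities.
double diffuseIntensity (const Vec3& N, const Vec3& Psource, const Vec3& Psurf,
                         double ka, double kd, double Ia, double Il)
{
    Vec3 L = { Psource.x - Psurf.x, Psource.y - Psurf.y, Psource.z - Psurf.z };
    double len = std::sqrt (dot (L, L));
    L.x /= len;  L.y /= len;  L.z /= len;    // eq. (10.10): unit vector towards the source
    double NdotL = dot (N, L);
    if (NdotL > 0.0)
        return ka * Ia + kd * Il * NdotL;    // source in front of the surface
    return ka * Ia;                          // source behind: ambient contribution only
}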

Specular reflection and the Phong model

[Figure: incident direction L, specular reflection direction R, unit normal N and viewing direction V at a
surface point; θ is the angle of incidence and φ the angle between V and R. The specular exponent controls
the spread of the highlight: a shiny surface has n_s ~ 100, a dull surface n_s ~ 1.]

On some shiny surfaces a bright spot or localized area is noticed when light shines on the
surface. When viewed from a more oblique angle it is not noticeable. This effect is called
specular reflection. To model it we introduce a specular reflection direction vector R which has
the same angle ( θ ) w.r.t. the unit normal as the vector L but on the opposite side.

Now if V is the vector to the viewer, and φ the specular reflection angle of the viewer w.r.t. R,
then for an ideal reflector (mirror), reflected light is seen only when φ = 0, i.e. when viewed
along V = R. For a non-ideal surface, there is a range of viewing angles around R inside of
which specular reflection is seen. The model due to Phong proposes specular reflection with
intensity proportional to cos^{n_s} φ for angles 0° ≤ φ ≤ 90°, with n_s called the specular reflection
exponent (see the figure above for its effect). Thus we can write for the reflected intensity

    I_{l,spec} = W · I_l cos^{n_s} φ;    W = const.

For real materials, W, the specular reflection coefficient, is a function of the incidence angle
0° ≤ θ ≤ 90°, described by Fresnel's law of reflection. Thus we write for the reflected intensity

    I_{l,spec} = W(θ) · I_l cos^{n_s} φ;    0° ≤ φ ≤ 90°, 0° ≤ θ ≤ 90°                    (10.12)

For graphs of W(θ) for various materials see H&B p. 569. These show that for transparent
(glass-like) materials specular reflection is appreciable only around θ ≈ 90° and very small for
other angles of incidence. For opaque materials, W(θ) ≈ k_s = constant for all θ. So, in
practice it is common to set W(θ) ≈ k_s = a value in the range 0.0...1.0 for each surface. Thus
a simplified specular reflection intensity expression can be obtained by noting:
• N, L and R are unit vectors
• cos φ = V·R, cos θ = N·L
• if V and L are on the same side of the normal N ⇒ no specular reflection
• for specular reflection we must have cos φ = V·R > 0 and cos θ = N·L > 0
Then,

    I_{l,spec} = { k_s I_l (V·R)^{n_s},   if V·R > 0 and N·L > 0
                 { 0.0,                   if V·R < 0 or N·L ≤ 0                           (10.13)

From the figure below

[Figure: L and its mirror image R about the unit normal N, each making projection N·L onto the normal;
the halfway vector H lies midway between L and V.]

it is clear that

    R + L = 2(N·L)N   ⇒   R = 2(N·L)N − L                                                 (10.14)

To apply the above, recall that L is calculated by taking a vector from the surface point to the
light source and then normalizing it. Similarly, V is obtained by taking a vector from the surface
point to the view point. Usually the latter is fixed, so as an easy choice we set V = (0,0,1), the unit
vector in the +ve Z direction, as a good compromise.

A less compute-intensive amendment to the Phong model above is to use the halfway vector

    H = (L + V) / |L + V|                                                                 (10.15)

and replace V·R with N·H (see H&B p. 571 for an explanation).

Further, the basic models for diffuse and specular lighting may be combined to give a
composite model. In addition, if there is more than one point light source we can form a
combined light-source model by including additive terms for the individual sources. For example,
using (10.15), a combined diffuse and n-source specular reflection model uses an intensity
function such as (see H&B p. 571)

    I = I_ambdiff + Σ_{l=1}^{n} I_l [ k_d (N·L) + k_s (N·H)^{n_s} ]                       (10.16)

In the same way, it is also possible to include terms allowing for light emitters at various points
in the scene.
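
A sketch of the single-source term of (10.16), using the halfway vector (10.15) for the specular part; as before, Vec3 and dot() are assumed helpers and N, L, V are unit vectors. For several sources the diffuse and specular terms are simply summed over the sources:

//PhongIntensity.cpp (illustrative fragment)
#include <cmath>

struct Vec3 { double x, y, z; };
static double dot (const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Ambient + diffuse + specular intensity for one point source, cf. eq. (10.16).
// N = surface normal, L = direction to the light, V = direction to the viewer.
double phongIntensity (const Vec3& N, const Vec3& L, const Vec3& V,
                       double ka, double kd, double ks, double ns,
                       double Ia, double Il)
{
    double I = ka * Ia;                              // ambient contribution
    double NdotL = dot (N, L);
    if (NdotL <= 0.0)
        return I;                                    // light behind the surface

    I += kd * Il * NdotL;                            // diffuse term, eq. (10.9)

    Vec3 H = { L.x + V.x, L.y + V.y, L.z + V.z };    // halfway vector, eq. (10.15)
    double hLen = std::sqrt (dot (H, H));
    H.x /= hLen;  H.y /= hLen;  H.z /= hLen;
    double NdotH = dot (N, H);
    if (NdotH > 0.0)
        I += ks * Il * std::pow (NdotH, ns);         // specular term
    return I;
}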

Colour
In the above, the intensity function is based on the assumption that we have monochromatic
lighting. To apply it to, say, an RGB colour model, we employ a separate I function for each
component, with correspondingly different constants k_d, k_s for each primary colour. Other
effects that can be incorporated into this basic model are transparency, opacity, translucency,
refraction and the effect of fog (haziness). See H&B p. 573-591 for a discussion of these
issues.

10.3 Polygon rendering methods


To render an entire surface, many graphics packages apply an illumination model to selected
points on the surface and then interpolate the effects at these points across the entire surface.
Typically, this process is applied by decomposing the surface into polygons, applying the
illumination model on its vertices and then linearly interpolating across each polygon along
scan lines. Another approach is to apply the illumination model to calculate the intensities at
each projected surface point using ray-tracing algorithms.

Constant-intensity (flat) surface rendering


Here we apply the illumination model at just one point on the surface and then use this value at
all points on the surface. This is the simplest method but will not give realistic effects in general.

Gouraud surface rendering


Devised by Henri Gouraud, the method linearly interpolates intensities computed at the vertices
of a polygon. For application to a general or curved surface, the surface is approximated
by a polygon mesh and the method applied to each polygon face.
The procedure followed for each polygon is:
1. Find the average unit normal vector at each vertex of that polygon, averaged over the
other polygons sharing the vertex, if any.
2. Apply the illumination model at each vertex with the normal as above, getting the intensity
there.
3. Linearly interpolate the vertex intensities across the surface of the polygon.

In 1 above, we obtain at a vertex V the average normal N_V of the n shared vertex normals N_k by

    N_V = ( Σ_{k=1}^{n} N_k ) / | Σ_{k=1}^{n} N_k |                                       (10.17)

In 3 above, for each polygon face, the linear interpolation is done by considering the
intersections of scan lines with the polygon edges.

[Figure: a polygon with vertices 1, 2, 3 (the averaged normal N_V is shown at a vertex V); a scan line
intersects the polygon edges at points 4 and 5, with p a pixel position on the scan line between them.]

For example, the intensity at the intersection point 4 is obtained by using only vertical
displacements via

    I_4 = ((y_4 − y_2)/(y_1 − y_2)) I_1 + ((y_1 − y_4)/(y_1 − y_2)) I_2                   (10.18)

This applies to one of the 3 RGB components, so we repeat for the other two with different I's.
Similarly we also obtain I_5.
Then the intensity for a point p along the scan line is computed from the equation

    I_p = ((x_5 − x_p)/(x_5 − x_4)) I_4 + ((x_p − x_4)/(x_5 − x_4)) I_5                   (10.19)
These calculations may be performed incrementally, as recurrence relations (see H&B p. 594)
for efficiency. For an example of how the method performs see H&B p. 594.
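
A sketch of the edge and scan-line interpolation (10.18)-(10.19); the vertex intensities are assumed to have been computed already with the illumination model at the averaged vertex normals (10.17), and the interpolation is repeated for each of the RGB components:

//GouraudInterp.cpp (illustrative fragment)
// Intensity at scan-line height y on the polygon edge joining vertices 1 and 2, eq. (10.18).
double edgeIntensity (double y, double y1, double I1, double y2, double I2)
{
    return ((y - y2) / (y1 - y2)) * I1 + ((y1 - y) / (y1 - y2)) * I2;
}

// Intensity at pixel position x between the edge intersections 4 and 5, eq. (10.19).
double scanlineIntensity (double x, double x4, double I4, double x5, double I5)
{
    return ((x5 - x) / (x5 - x4)) * I4 + ((x - x4) / (x5 - x4)) * I5;
}

In a scan-line renderer these divisions would of course be replaced by the incremental (recurrence) form mentioned above.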

Phong surface rendering


This is a more accurate method, based on interpolating normal vectors rather than intensities.
The procedure followed for each polygon is:-
1. Find the average unit normal vector at each vertex of that polygon, averaged over the
other polygons sharing the vertex, if any.
2. Linearly interpolate the vertex normals over the polygon surface.
3. Apply the illumination model at each point on the surface along scan lines using the
interpolated normal vector for each point.

The normal vector at a scan-line intersection point (e.g. point 4 above) is
obtained by vertically interpolating the normals at vertices 1 and 2, as before, by

    N_4 = ((y_4 − y_2)/(y_1 − y_2)) N_1 + ((y_1 − y_4)/(y_1 − y_2)) N_2                   (10.20)

We proceed similarly to the Gouraud method for interpolation along scan lines but here for
normals. Finally we apply the illumination model for each projected pixel position.

This method is more accurate but more time consuming. A faster version of it is also used (see
H&B p. 596).

10.4 Ray-tracing methods


No time to cover these – see H&B p. 597+.

10.5 OpenGL functions for illumination and surface rendering


See class handouts or H&B p. 637-660.

CHAPTER ELEVEN: COMPUTER ANIMATION

Computer animation is a process for simulating the time evolution of a scene and its depiction
on a viewing device. Such techniques are employed in the development of movie scenes,
computer games, scientific modelling, embedded video tracks in web pages, Java applets etc.

The methods used are broadly classified into two basic types:
• real-time animation, where each stage of a sequence is displayed as it is created. The
frame-generation rate must keep up with the screen refresh rate. This is
generally employed for "simple" applications, and also where user-interface controls
are required (e.g. a flight-simulator program).
• frame-by-frame animation where each frame of the motion is generated separately
and stored for later playback as a sequence. Generally used for large applications, e.g.
long movies and large-scale scientific simulations.

11.1 Raster methods


In raster systems, the time taken to refresh the screen may be much smaller than the time to
construct a frame (this happens when many complex structures are created). Then, whilst a frame
is being constructed, part of it will be shown as updated portions, while later-constructed parts
may be shown only briefly or not at all before the picture is cleared for renewal again. This
can result in erratic motion and even fractured frame displays. One way to minimize or avoid
this phenomenon is to make use of a second refresh buffer.

Double buffering
Here two refresh buffers are employed in alternation. Whilst one is offloaded to the screen,
the other receives the next frame construction from the processor. Then in the next cycle, the
latest updated buffer can be painted to the screen, whilst the first receives the frame
construction and so on. Which one is to be offloaded to the screen is signalled by some routine
call which swaps their roles and is usually called at the end of a refresh cycle, which typically
takes 1/60 sec (i.e. 60 frames/sec painted), but can be called at will. If the frame being
constructed is completed before the other is offloaded, then the overall sequence is
synchronized with the refresh rate. Otherwise, if the frame construction time > 1/60 sec, then
two or three refresh cycles will occur from the same buffer, resulting in the same picture, before the
latest one gets offloaded. The animation sequence rates tend to be irregular when frame
construction times are near multiples of the refresh time, i.e. near 1/60, 2/60, 3/60, ... sec.
One way to minimize this effect is to put a delay in the construction sequence, or to update
only some of the object motions in the scene in each cycle.

Raster animation operations


Raster operations such as pixel block transfers can be used to create animation sequences.
Others use color-table-transformations (see H&B p. 735). Hardware coding or calls to low level
routines are also available on some systems.

11.2 Developing animation sequences


This can be a complex task in practice. The following steps are usually followed:
1. List the steps in the action sequence – sketches and descriptions can be used ("storyboard
layout")
2. Give the object definitions for the objects to be used – e.g. an object can be made up from
basic shapes like polygons, splines etc.
3. Make detailed drawings of the scene at certain times, outlining all the object positions
at those times ("key frame descriptions")
4. Generate the in-between frames.

Since graphics terminals are refreshed at a rate of about 60 frames/sec, a 1-minute animation
sequence would require about 60×60 = 3600 frames to be constructed. Typically, about 5 in-
between frames per key frame are used, so that about 3600/5 = 720 key frames are needed.

Constructing an animation sequence completely in a general-purpose high-level language or
interface can be tedious, so many specialized packages and languages have been developed to
ease the burden.

Two important processes involved in animation sequences are changing object shapes
("morphing") and simulating the movement of objects (acceleration).

Examples of parts of animation sequences for morphing using linear interpolation are:

[Figure: key frame k, an in-between frame, and key frame k+1, with a point added so that edge 1-2
evolves into the connected edges 1'-2' and 2'-3'.]

Evolving a single edge (1-2) into two connected ones



[Figure: key frame k, an in-between frame with an added point, and key frame k+1.]

Evolving a triangle into a quadrilateral

A simple rule or algorithm for performing the above interpolation process is given in H&B p.
741, wherein are also presented more complex pictures of morphing like those seen in motion
pictures.
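
A sketch of the linear in-between interpolation just described, assuming the two key frames have already been preprocessed so that they hold the same number of matching vertices (extra points added as in the figures above):

//InBetween.cpp (illustrative fragment)
#include <cstddef>
#include <vector>

struct Point2 { double x, y; };

// Vertex positions for in-between frame i of nInBetweens, linearly interpolated
// between matching vertices of key frames k and k+1.
std::vector<Point2> inBetweenFrame (const std::vector<Point2>& keyK,
                                    const std::vector<Point2>& keyK1,
                                    int i, int nInBetweens)
{
    double t = double (i) / (nInBetweens + 1);       // 0 < t < 1 for the in-betweens
    std::vector<Point2> frame (keyK.size ());
    for (std::size_t v = 0; v < keyK.size (); ++v) {
        frame[v].x = (1.0 - t) * keyK[v].x + t * keyK1[v].x;
        frame[v].y = (1.0 - t) * keyK[v].y + t * keyK1[v].y;
    }
    return frame;
}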

The simulation of acceleration can be achieved by curve fitting techniques, employing linear or
nonlinear functions. An example is

[Figure: vertex positions at key frames k, k+1 and k+2, with fitted curves (dotted) along which the
in-between positions move.]
Here, given vertex positions at the key frames, we fit a curve through them. We then advance
these positions to the next key frame through a series of in-between frames. If different positions
move at different speeds (along the dotted paths), then the curves will deform as time goes
on. For constant speed (zero acceleration) the time intervals between frames are set to some
constant value ∆t, which can be computed from physical considerations. For accelerating
frames we can keep adjusting ∆t.

11.3 Simulating moving objects


If the scene or some objects in it evolve according to some physical law(s) then we can
determine their positions by solving the equations governing the motion and construct frames
at suitable time intervals separated by some value ∆t . A common occurrence is the evolution
according to the laws of mechanics, such as Newton’s 2nd law:
    F = d(mv)/dt,   or   Force = mass × acceleration (for m = const)
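
A minimal sketch of advancing an object from one frame to the next under Newton's 2nd law using a simple (Euler) time step of size ∆t; the force components are placeholders to be supplied by the application, and smaller sub-steps would be used for rapidly varying forces:

//FrameAdvance.cpp (illustrative fragment)
struct State { double x, y;  double vx, vy; };   // position and velocity of an object

// Advance one frame of duration dt under a force (fx, fy) acting on mass m.
void advanceFrame (State& s, double fx, double fy, double m, double dt)
{
    s.vx += (fx / m) * dt;     // a = F/m, then v += a*dt
    s.vy += (fy / m) * dt;
    s.x  += s.vx * dt;         // x += v*dt
    s.y  += s.vy * dt;
}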

For simulating the motion of living creatures (humans, animals etc) a common technique
involves constructing frames of articulated figures like:

[Figure: an articulated stick figure, with arrows indicating the associated limb movement.]

The articulated figure may be further enhanced by limb models as above, which are made to
bend at the joints as motion is simulated in succeeding frames.

11.4 OpenGL functions


The basic functions are:-
1. glutInitDisplayMode(GLUT_DOUBLE);
Sets up 2 buffers – a front buffer and a back buffer
2. glutSwapBuffers(); //interchanges roles of buffers
3. glGetBooleanv(GL_DOUBLEBUFFER, status); //Checks & returns GL_TRUE in array status
//if both buffers are available, else returns GL_FALSE
4. glutIdleFunc(animationFcn);
Required for continuous animation; animationFcn = a function whose parameters are
incremented when no display-window events are being processed. With argument = NULL
or 0, the idle function is disabled.

11.5 Program examples


11.5.1 A rotating hexagon: This program constructs and continuously rotates a regular
hexagon in the XY plane, about the Z axis. The origin is at the window centre and the
Z axis passes through it perpendicularly to the screen. The rotation is started by
pressing the left mouse button and stopped by pressing the right mouse button; both
actions work by calling glutIdleFunc(.), with the rotation routine or NULL respectively.
//RotatingHex.cpp
//----------------------
#include <windows.h>
#include <GL/glut.h>
#include <math.h>
#include <stdlib.h>

const double TWO_PI = 6.2831853;


GLsizei winWidth = 500, winHeight = 500; // Initial display window size.
GLuint regHex; // Define name for display list.

static GLfloat rotTheta = 0.0;

class scrPt {
public:
GLint x, y;
};

static void init (void)


{
scrPt hexVertex;
GLdouble hexTheta;
GLint k;

glClearColor (1.0, 1.0, 1.0, 0.0);


/* Set up a display list for a red regular hexagon.
* Vertices for the hexagon are six equally spaced
* points around the circumference of a circle.
*/
regHex = glGenLists (1);
glNewList (regHex, GL_COMPILE);
glColor3f (1.0, 0.0, 0.0);
glBegin (GL_POLYGON);
for (k = 0; k < 6; k++) {
hexTheta = TWO_PI * k / 6;
hexVertex.x = 150 + (GLint) (100 * cos (hexTheta));
hexVertex.y = 150 + (GLint) (100 * sin (hexTheta));
glVertex2i (hexVertex.x, hexVertex.y);
}
glEnd ( );
glEndList ( );
}

void displayHex (void)


{
glClear (GL_COLOR_BUFFER_BIT);
glPushMatrix ( );
glRotatef (rotTheta, 0.0, 0.0, 1.0);
glCallList (regHex);
glPopMatrix ( );
glutSwapBuffers ( );
glFlush ( );
}

void rotateHex (void)


{
rotTheta += 3.0;
if (rotTheta > 360.0)
rotTheta -= 360.0;
glutPostRedisplay ( );
}

void winReshapeFcn (int newWidth, int newHeight)


{
glViewport (0, 0, (GLsizei) newWidth, (GLsizei) newHeight);
glMatrixMode (GL_PROJECTION);
glLoadIdentity ( );
gluOrtho2D (-320.0, 320.0, -320.0, 320.0);
glMatrixMode (GL_MODELVIEW);
glLoadIdentity ( );
glClear (GL_COLOR_BUFFER_BIT);
}

void mouseFcn (int button, int action, int x, int y)


{
switch (button) {
//case GLUT_MIDDLE_BUTTON: // Start the rotation.
case GLUT_LEFT_BUTTON:
if (action == GLUT_DOWN)
glutIdleFunc (rotateHex);
break;
case GLUT_RIGHT_BUTTON: // Stop the rotation.
if (action == GLUT_DOWN)
glutIdleFunc (NULL);
break;
default:
break;
}
}

int main (int argc, char** argv)


{
glutInit (&argc, argv);
glutInitDisplayMode (GLUT_DOUBLE | GLUT_RGB);
glutInitWindowPosition (150, 150);
glutInitWindowSize (winWidth, winHeight);
glutCreateWindow ("Animation Example");
init ( );
glutDisplayFunc (displayHex);
glutReshapeFunc (winReshapeFcn);
glutMouseFunc (mouseFcn);
glutMainLoop ( );
}

Three screen shots from it are shown below (rotation is counter-clockwise).

11.5.2 A bouncing ball: This is an example of motion determined by the laws of mechanics:-
In particular, a ball bouncing on a hard surface undergoes damped harmonic motion
according to a differential equation (Newton’s 2nd law) determining its height y(x(t)),
where x(t) is its X-displacement as a function of time, and whose solution for y is:

    y(x) = A sin(wx + θ_0) e^(−kx)

Here A is the amplitude, w the angular frequency, θ_0 a phase constant and k the
damping constant. The ball's trajectory is then a sequence of bounces of decreasing height.

The following is a Java program that computes and simulates this motion
//Ball.java
//------------
//Simple bouncing ball animation without double buffering

import java.awt.*;
import java.awt.event.*;

public class Ball extends Frame


{
int cWidth = 600; //painting canvas dimensions
int cHeight = 400;

public static void main(String[] args)


{
new Ball(); //make frame object
}

// Frame class constructor definition


Ball()
{
//Frame window title:
super("Bouncing ball");
//add window closing listener to frame
addWindowListener(
new WindowAdapter()
{public void windowClosing(WindowEvent e)
{System.exit(0);}
}
);
//set canvas area
setSize(cWidth,cHeight);

//Create painted canvas and put in center


add("Center", new Ball_Canvas());
setVisible(true);
}
}
//-----------end constructor------------------

//Class for constructing canvas


class Ball_Canvas extends Canvas
{
int ypos=0; //ball's position, top left corner
int xpos = 0;
int bWidth = 40; //ball's bounding
int bHeight = 40; //area dimensions
int nImages = 50; //number of images in one L to R sweep
float yposF;
double amp0 = 200;
double amplitude;
int frequency = 3;
double pi = 3.14159;
double phase = pi/2.0;
double dampConst = 0.2;
double toRadians = pi/180.0;

public void paint(Graphics g)


{
//get canvas dimensions
Dimension d = getSize();
int maxX = d.width-1, maxY = d.height-1;
int floorHeight = maxY/5; //i.e 1/5 th of canvas height;
g.setColor(Color.red);

for (int i = 0; i <= nImages; i++)


{xpos = 10*i;
yposF = (float)Math.sin(phase+frequency*xpos*toRadians);
amplitude = amp0*Math.exp(- dampConst*xpos*toRadians);
ypos = Math.round((float)amplitude*yposF);
ypos = maxY - floorHeight - Math.abs(ypos);
//g.drawOval(xpos, ypos, 10, 10);
g.fillOval(xpos, ypos, bWidth, bHeight); //draw ball
delay(30000000); //hold image for some time
g.clearRect(xpos, ypos, bWidth, bHeight); //now clear for next image rect
//g.clearRect(xpos, ypos, d.width, d.height); //clears whole canvas
}
repaint(); //do over again starting from left
}

public void delay(long dConst)


{
for (int j = 0; j< dConst; j++); //something to do
}
}

Typical screen shot:

CONCLUDING APPENDIX

Subject areas omitted here:


Line clipping algorithms
Octree methods and ray tracing for hidden surface detection
Advanced surface rendering
Interactive techniques
Properties of light and colour models
Hierarchical modelling and utility packages

Other ways to do a graphics course:


Using DirectX (MS proprietary interface)
Using Java 3D (a scene-graph API layered on OpenGL or DirectX – two versions available)
Using JOGL - joint Sun and Silicon Graphics Java native binding to OpenGL

Where to go from here:


Virtual reality course
Game programming
Scientific modelling and visualization
Research – surface modelling, hierarchical modelling, fractal methods and shape grammars,
VR, animation, applications

GOOD LUCK
