
Performance of CMAQ on a Mac OS X System
Tracey Holloway, John Bachan, Scott Spak
Center for Sustainability and the Global Environment
University of Wisconsin-Madison

A presentation to the 3rd annual CMAS Models-3 conference
October 19, 2004
Thinking different.
Motivation
Methods
Performance
Hardware
Release
Ongoing Improvements
Motivations.
Simplified operation
Easier development
Easy clustering
Improved performance
Motivation: Operation.
Single platform for all research and academic computing
User-friendly interface
UNIX OS
Open source software, hardware support
Today's cluster node = tomorrow's desktop
Motivation: Development.
Better Developer Tools
Xcode (Interface Builder)
CHUD performance & debugging suite

Distribution Tools
standardized profiles

PackageMaker
FAT binaries
automated installation
Operation & Development.
Motivation: Performance.
Unique Hardware Advantages
powerful PPC 970 vector chip
auto-vectorizing compilers
2000 NASA Langley report
Populist Parallelization
mix dedicated cluster nodes with free cycles on personal & lab machines
off-the-shelf solutions
simple GUI and command-line tools

Methods.
IBM XL Fortran v8.1 compiler
auto-vectorization
equivalent to AIX
Modifications
flag conversion
build settings
array passing
> 400 man-hours
Performance.
2 Test Machines
dual 2 GHz G5, 5 GB RAM, 1 GHz bus
stock dual 1 GHz G4, 1.5 GB RAM, 133 MHz bus
Mac OS X 10.3.5
1 Test Run
First day of CMAQ 4.3 tutorial
1 day, 32 km x 32 km, 38 x 38, 6 layers
default EBI CB4 chemistry
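For scale, the size of the test domain can be worked out from the numbers on this slide. This is a minimal sketch, assuming "38 x 38" is the count of grid columns and rows and "32 km x 32 km" is the size of one cell; the variable names are illustrative and not taken from CMAQ.

```python
# Tutorial domain as listed on the slide: 38 x 38 cells of 32 km, 6 layers.
# Assumption: 38 x 38 is columns x rows and 32 km is the cell edge length.
CELL_KM = 32
NCOLS, NROWS, NLAYS = 38, 38, 6

cells_per_layer = NCOLS * NROWS        # horizontal cells in one layer
total_cells = cells_per_layer * NLAYS  # cells in the full 3-D grid
extent_km = NCOLS * CELL_KM            # domain width (and height) in km

print(cells_per_layer, total_cells, extent_km)
```

Under these assumptions the run is small, which is consistent with its use as a tutorial case.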
Benchmarks.
Tutorial Runtime by Hardware and Compiler
(seconds)

IFC = Intel Fortran Compiler 7.1
PGF = Portland Group Compiler 4.0-2
Intel machines running CMAQ 4.2.2 on 2 processors with mpich parallelization.
Source: Gail Tonnesen, Benchmarks for CPUs and Compilers for the CMAQ 4.2.2 release.
Chemistry.
Species       Mean |Δ| from reference   Max |Δ| from reference (% of cells > 1 ppb)
O3            0.1282 ppb                4.52 ppb (0.43)
NO            0.0050 ppb                0.72 ppb (0)
NO2           0.0262 ppb                2.05 ppb (0.02)
NH3           0.0126 ppb                1.67 ppb (0.0002)
SO4 (I + J)   0.0284 µg/m3              1.52 µg/m3
Source: ACONC.nc output from Day 1 of CMAQ 4.3 tutorial
Dual 2 GHz G5 running CMAQ 4.3 on 1 processor
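The statistics in the table above (mean and maximum absolute difference from the reference, and the share of cells differing by more than 1 ppb) could be computed along these lines. This is a sketch under stated assumptions: the slide does not say how the comparison was done, `diff_stats` is a hypothetical helper, and the sample values are invented for illustration rather than taken from ACONC.nc.

```python
def diff_stats(test, reference, threshold=1.0):
    """Return (mean |diff|, max |diff|, % of cells with |diff| > threshold)."""
    if len(test) != len(reference) or not test:
        raise ValueError("inputs must be non-empty and the same length")
    abs_diffs = [abs(t - r) for t, r in zip(test, reference)]
    mean_abs = sum(abs_diffs) / len(abs_diffs)
    max_abs = max(abs_diffs)
    # Share of cells whose absolute difference exceeds the threshold.
    pct_over = 100.0 * sum(d > threshold for d in abs_diffs) / len(abs_diffs)
    return mean_abs, max_abs, pct_over

# Invented O3 concentrations in ppb, one value per grid cell (illustration only).
reference = [40.0, 42.5, 38.0, 55.0]
test = [40.5, 42.5, 36.0, 55.5]

print(diff_stats(test, reference))
```

In practice the two lists would be the flattened concentration arrays for one species from the Mac run and the reference run.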
Good Chemistry.
Small difference from reference set
greater than difference among Intel machines and compilers
Noise, floating point calculations, initialization
greatest at surface level, early in run
ambient concentrations only
random distribution
no bias
does not propagate in time or space
not correlated to high or low concentrations

Consistent
G4/G5
chemistry modules
compiler flags
Better Chemistry.
Tutorial Runtime by Chemistry Module (seconds)

Dual 2 GHz G5 running CMAQ 4.3 on 1 processor


Models-3 on Mac, 10/04.
Core Platform    Libraries & Add-Ons
MM5 (Fovell)     netCDF v3.5.1
MCIP v2.2        mpich v1.2.2-6
SMOKE v2.1       I/O API v2.2
CMAQ v4.3        MCPL

Currently no PAVE,
but Vis5d, VisAd, GrADS, NCL, and
Hardware.
Dedicated Cluster: 18 G5 processors
Xserve G5 Dual 2 GHz, 2 GB RAM
Xserve RAID 3.5 TB
8 Power Mac G5 Dual 2 GHz, 5 GB RAM

Distributed Capacity: 42 G4 processors
student lab eMacs
personal G4 desktops

60 processor vector cluster


0 Full-time Sys-admins
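The processor totals on this slide can be checked with a short tally. This is a sketch that assumes a single Xserve (the slide gives no count for it) and that every "Dual" machine contributes two processors.

```python
# Dedicated cluster: each machine listed on the slide is dual-processor.
xserve_g5 = 1 * 2      # assumption: one Xserve; the slide does not give a count
power_mac_g5 = 8 * 2   # 8 Power Mac G5 towers
dedicated_g5 = xserve_g5 + power_mac_g5

distributed_g4 = 42    # lab eMacs and personal desktops, total from the slide

total_processors = dedicated_g5 + distributed_g4
print(dedicated_g5, total_processors)
```

With one Xserve the tally matches the 18 G5 processors and the 60-processor cluster quoted on the slide.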
Cost Competitive.
Apple
Xserve Dual G5 2GHz < $3500
RAID storage at $3 per GB
G5 Desktop $2000 - 4000
Compare to
Dell PowerVault RAID at $5 per GB
Dell Precision dual Xeon 2.8 GHz, $1200 - 4200
Sysadmin costs
JOHN
SCOTT
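To put the per-gigabyte storage prices in context, the sketch below prices the cluster's 3.5 TB of RAID at both rates. It assumes decimal terabytes (1 TB = 1000 GB) and, hypothetically, that the same capacity would be bought at the Dell rate; neither assumption comes from the slides.

```python
RAID_TB = 3.5
GB_PER_TB = 1000       # assumption: decimal terabytes
APPLE_USD_PER_GB = 3   # Xserve RAID rate from the slide
DELL_USD_PER_GB = 5    # Dell PowerVault rate from the slide

capacity_gb = RAID_TB * GB_PER_TB
apple_cost = capacity_gb * APPLE_USD_PER_GB
dell_cost = capacity_gb * DELL_USD_PER_GB  # hypothetical: same capacity at Dell's rate

print(apple_cost, dell_cost, dell_cost - apple_cost)
```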
Release.
Following input from the CMAS Center
alpha code to CMAS by November 2004
CMAS testing
potential support
Following CMAS testing, preliminary code, scripts, binaries, and instructions available for download at www.sage.wisc.edu/cmaq
Scott Spak will answer questions for early users: snspak@wisc.edu
Ongoing improvements.
Our planned activities
g95 - GNU compilation
parallel implementations
Condor
Xgrid
Pooch/Appleseed
further optimization
Dual 2.5 GHz benchmarks
CMAQ MADRID
A community effort?
CMAQ Unified
MIMS
PAVE
Acknowledgements.
Mary Sternitzky, UW
Seth Price, UW
Hans Vahlenkamp and NOAA GFDL
Zac Adelman and the CMAS Help Desk
Dr. Gail Tonnesen and Glen Kaukola, UCR
Models-3 Listserv

All funding provided by the University of Wisconsin-Madison.
