


The scientific user's computing demands are becoming increasingly complex and can benefit from distributed resources, but effectively marshalling these distributed systems often introduces new challenges. The authors describe how researchers can exploit existing distributed grid infrastructure to get meaningful scientific results.

The scientific user's computing demands are becoming increasingly complex: a scientific application, for example, can comprise simulations, database access, and interactive visualization for data analysis. Each task could require computing resources that are distributed geographically and come from several administrative domains. Compared to operating on a single machine, effectively marshalling these distributed systems introduces extra complications because users must pay attention to data transfer, job submission, resource allocation, and authorization across the federated resources. Grid computing middleware (such as the Globus Toolkit) simplifies this problem by providing frameworks that abstract a task's technical details, but much of this middleware is still under development and often demands a significant investment of time and effort to learn the required programming models and techniques. It will be some time before we can write high-end scientific applications and code from scratch

1521-9615/05/$20.00 © 2005 IEEE. Copublished by the IEEE CS and the AIP.


University College London

using grid-programming models and frameworks: for example, a scalable, high-performance molecular dynamics (MD) code that can work on a single problem distributed over a grid. As it stands, potentially extensive code refactoring and interfacing is required to convert legacy programs into grid applications. Consequently, most scientific computation on grids represents first-generation grid applications: typically, existing program code minimally modified to exploit grid technologies. The degree to which we can modify the code varies, but essentially, applications retain the same programming paradigms. MPICH-G2,1 for example, is an attempt to generalize the message-passing interface (MPI) over Globus grid resources, retaining the MPI interface but implementing it in a grid environment.

The RealityGrid project (www.RealityGrid.org) has adopted a fast-track approach to scientific applications on grids. Although we understand the importance of developing careful, rigorous, and well-analyzed approaches to grid-application programming, we believe that it is at least as important to begin using the available grid infrastructure for scientific applications with pre-existing code to get meaningful scientific results; the hope is to create both a technology push and an applications pull. As a first step, the RealityGrid project (which contains various subprojects including the TeraGyroid and Free Energy calculations project) has developed a software



Computational Steering API

Because the ultimate aim of grid computing is to let users transparently perform tasks across a distributed computational resource without needing to be aware of how that resource is changing with time, there is a critical need for shielding the complex and changing underlying grid infrastructure from the end user. One way is with stable, general-purpose APIs that provide a grid-programming mechanism independent of underlying details; the message-passing interface (MPI) specification and API provide such an interface for message passing in parallel programming environments. The RealityGrid steering library and its API attempt to provide such an interface for a wide spectrum of grid applications taken from a range of application domains.

To give a flavor of the RealityGrid Steering API, let's walk through its application-side routines. Some of these library calls are self-explanatory, such as the routines required for startup and shutdown (Steering_initialize and Steering_finalize). (Details of these functions and all others are available elsewhere.1) The application uses the routine Register_params to register the list of parameters, both monitored (a steering client can read them) and steerable (a steering client can send them). The call has a cumulative effect: every successful call to the Register_params function results in the addition of the specified parameters to the list. Corresponding client-side operations can discover any registered parameters.

The next set of routines provides the ability to input and output data from the simulations. Because an application might require several different types of input and output, each type must be registered by a call to Register_IOTypes, which associates a label with each type. The application's data output is initiated by a call to Emit_start, data is transferred by one or more calls to Emit_data_slice, and the operation is terminated with Emit_stop. These functions are analogous to fopen(), fwrite(), and fclose().
Likewise, data input is performed via a series of calls analogous to fopen(), fread(), and fclose(): in this case, Consume_start, Consume_data_slice, and Consume_stop. Each call to Consume_data_slice is preceded by a call to Consume_data_slice_header, which sets the type and amount of data the application is about to receive. The application is unaware of the data's source or destination and provides only the label to the library. It's the library's responsibility to determine the data's source or destination and transport mechanism. The library uses emit and consume semantics because the application shouldn't be aware of the data's destination or source.

Windback means reverting to the state captured in a previous checkpoint without stopping the application. In the RealityGrid framework, it's the application's responsibility to create a checkpoint file.2

The application gives the library an opportunity to handle events by making a call to Steering_control. One of the first important decisions an application developer typically encounters is how to determine a logical breakpoint in the code where the application is in a coherent state; that is, the monitored parameters are up to date, changes to the values of steerable parameters are possible, and a consistent representation of the system state can be captured in a checkpoint file. For most scientific codes, this should be straightforward; for example, in molecular dynamics code, it's appropriate for the time-step loop to begin with a call to Steering_control. The library automatically reports the values of monitored parameters to the steering client, updates the values of any steerable parameters that have changed, and notifies the application of any additional operations it should perform. The application is required to process the additional commands (such as emit a checkpoint, emit a sample of a certain type, or stop).
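The calling pattern described in this sidebar can be sketched in a few lines of Python. Everything in this block is hypothetical: MockSteeringLibrary is an in-process stand-in written for this illustration, its method names only echo the routine names mentioned above, and the signatures, message format, and parameter names are assumptions rather than the RealityGrid API (reference 1 documents the real interface).

```python
from collections import deque


class MockSteeringLibrary:
    """Hypothetical in-process stand-in for a steering library (not the real API)."""

    def __init__(self):
        self.params = {}         # label -> {"value": ..., "steerable": bool}
        self.pending = deque()   # messages queued by a pretend steering client
        self.reported = []       # monitored values reported back to the client

    def register_params(self, params, steerable):
        # Cumulative: each call adds to the list of registered parameters.
        for label, value in params.items():
            self.params[label] = {"value": value, "steerable": steerable}

    def client_send(self, message):
        # Stands in for a remote steering client pushing a message.
        self.pending.append(message)

    def steering_control(self, monitored):
        """Report monitored values, apply steered values, return commands."""
        for label, value in monitored.items():
            self.params[label]["value"] = value
        self.reported.append(dict(monitored))
        changed, commands = {}, []
        while self.pending:
            kind, payload = self.pending.popleft()
            if kind == "set" and self.params[payload[0]]["steerable"]:
                self.params[payload[0]]["value"] = payload[1]
                changed[payload[0]] = payload[1]
            elif kind == "command":
                commands.append(payload)
        return changed, commands


def run_simulation(lib, n_steps):
    temperature, energy, checkpoints = 300.0, 0.0, []
    lib.register_params({"temperature": temperature}, steerable=True)
    lib.register_params({"energy": energy}, steerable=False)
    for step in range(n_steps):
        # Logical breakpoint: the state is coherent at the top of the loop.
        changed, commands = lib.steering_control({"energy": energy})
        temperature = changed.get("temperature", temperature)
        for command in commands:
            # Reverse communication: the library only notifies the
            # application, which carries out the operation itself.
            if command == "checkpoint":
                checkpoints.append((step, temperature, energy))
            elif command == "stop":
                return step, temperature, checkpoints
        energy += 0.5 * temperature   # placeholder for the real time step
    return n_steps, temperature, checkpoints
```

The point of the sketch is the placement of the control call at the top of the time-step loop and the fact that checkpointing and stopping are performed by the application, not by the library.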

1. S. Pickles et al., "The RealityGrid Computational Steering API Version 1.1," RealityGrid tech. report, 2005; www.sve.man.ac.uk/Research/AtoZ/RealityGrid/Steering/ReG_Steering_API_v1.2b.pdf.
2. K.R. Mayes et al., "Towards Performance Control on the Grid," to be published in Philosophical Trans. Royal Soc. London A, vol. 363, no. 1833, 2005; www.pubs.royalsoc.ac.uk/philtransa.shtml.

infrastructure designed to enable existing high-end scientific application use on grids.2,3 In particular, we intend for this infrastructure to facilitate computational steering of existing code, letting the user influence the otherwise sequential simulation and analysis phases by merging and interleaving them.

Framework for Grid-Based Computational Steering

The RealityGrid steering library, from the application's perspective, is its point of contact with the grid world. In designing and implementing the steering library, we aimed to enable existing programs (often written in Fortran 90 or C and designed for multiprocessor supercomputers) to be rendered steerable with minimum effort. The steering library is implemented in C; has Fortran 90, Java, Python, and Perl bindings; and permits any parallelization technique (such as MPI or OpenMP) with the proviso that the application developer assumes responsibility for communicating any changes resulting from the steering activity to all processes. The steering library requires the application to register monitored and steered parameters. Similarly, the user can instruct the application to save its entire state to a checkpoint file, or restart from an existing one. (See the "Computational Steering API" sidebar for more details.) We use a system of reverse communication with the application; that is, the library notifies the application when emit, consume, checkpoint, and windback operations occur, thus delegating the task's execution to the application.

Figure 1. Schematic architecture of an archetypal RealityGrid steering configuration.4 The components communicate by exchanging messages through intermediate grid services. The dotted arrows indicate the optional use of a third-party file transfer mechanism and the option of the visualizer sending messages directly to the simulation.
[Figure labels: steering library; steering grid service; client; visualization; display; publish, find, and bind operations; data transfer (Globus-IO). Annotations: components start independently and attach/detach dynamically; multiple clients: Qt/C++, .NET on PocketPC, GridSphere Portlet (Java); remote visualization through SGI VizServer, Chromium, and/or streamed to AccessGrid.]

Figure 1 shows a schematic representation of our computational steering architecture for the case in which a visualization component is connected to a simulation component.4 A scientist might steer one or both of these components via the steering client. To create a distributed steered simulation, we launch the components independently and attach and detach them dynamically. The service-oriented middle tier solves numerous issues that are problematic for a two-tier architecture, particularly the problem of routing messages to components that reside on resources behind a firewall. By portraying the knobs (control) and dials (status) of computational steering as Web service operations, we can easily document the computational steering protocol in the Web Services Description Language, make it accessible in several other languages, and permit independent development of steering clients. Moreover, these clients can be stand-alone, customizable, Web-based, or embedded in a modular visualization environment (such as AVS, www.avs.com; Amira, www.amiravis.com; or Iris Explorer, www.nag.co.uk/welcome_iec.asp).

The mediating steering grid service (SGS) acts as a white board: the client pushes control commands to, and pulls status information from, the SGS, while the application performs the converse operations. The SGS is stateful and transient (with state and lifetime reflecting the component being steered), and is a natural application for the service data constructs as well as factory and lifetime management patterns provided by the Open Grid Services Infrastructure (OGSI).5 (Because the OGSI is now defunct, work is underway to transfer the SGS to the Web Services Resource Framework [WSRF].6) Steering clients can either use the client-side part of the steering API to communicate with the SGS or directly drive operations on the SGS using standard Web service methods.
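The white-board role of the SGS can be pictured with a small in-memory sketch. This is an illustration under stated assumptions, not the SGS itself: the class name SteeringWhiteboard, its four methods, and the message contents are all hypothetical, and a real service would expose the equivalent operations over Web service calls rather than local method calls.

```python
from collections import deque


class SteeringWhiteboard:
    """Hypothetical in-memory stand-in for the mediating steering service."""

    def __init__(self):
        self._controls = deque()   # written by the client, read by the application
        self._status = {}          # written by the application, read by the client

    # Client side: push control commands, pull status information.
    def push_control(self, command):
        self._controls.append(command)

    def pull_status(self):
        return dict(self._status)

    # Application side: the converse operations.
    def pull_controls(self):
        commands = list(self._controls)
        self._controls.clear()
        return commands

    def push_status(self, **values):
        self._status.update(values)


# Neither party ever contacts the other directly; both only talk to the board.
board = SteeringWhiteboard()
board.push_control({"set": {"coupling": 0.05}})        # client
pending = board.pull_controls()                        # application
board.push_status(step=100, applied=len(pending))      # application
snapshot = board.pull_status()                         # client
```

Because both sides initiate every exchange themselves, neither needs to accept incoming connections, which is one way to read the remark above about components that sit behind a firewall.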

Problems involving fluid flow are conventionally treated with the Navier-Stokes equations, a set of partial differential equations describing the evolution of a fluid's pressure and velocity. These equations are notoriously nonlinear and difficult to solve analytically, so they're usually discretized and solved numerically. This is nonetheless a continuum approach: the fluid is treated as an arbitrarily finely divisible material, with no information about the behavior of its constituent particles. Hence, it is insufficiently detailed for problems in which fluid molecules exhibit behavior more complicated than bulk flow. Unfortunately, the MD approach of integrating Newton's equations of motion for each molecule would be far too expensive for systems in which bulk flow remains important: a single cubic centimeter of water contains approximately 10²² molecules, which is 16 orders of magnitude larger than the largest MD simulations currently possible.

The lattice-Boltzmann (LB) method7 is a member of a family of techniques called mesoscale methods, operating at a kinetic description level more coarse-grained than MD, but more finely grained than Navier-Stokes. The LB method, which we can view as a discretized form of the Boltzmann transport equation of kinetic theory,8 is a cellular automaton technique that can contain information about the overall distribution of molecular speeds and orientations, without describing each individual molecule.

LB is a convenient formulation of fluid dynamics. The algorithm is extremely simple and fast to implement; unlike conventional fluid-dynamics techniques, it doesn't require that we solve the Poisson equation at each time step, nor does it require any complicated meshing. (All the simulations mentioned in this article were performed on regular cubic grids.) As such, the LB method is now competing with traditional computational fluid dynamics (CFD) methods for turbulent flow modeling; in fact, a company called Exa (www.exa.com) is now selling an LB-based system for aerodynamic modeling to the automotive industry. Moreover, the LB method's cellular automaton nature makes it extremely amenable to implementation on parallel computing hardware.
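To make the "simple, local, no Poisson solve" claim concrete, here is a minimal lattice-Boltzmann sketch. It is a toy chosen for brevity, not the method used in the simulations discussed here: it assumes a one-dimensional, three-velocity (D1Q3) lattice, a single fluid component, a single-relaxation-time (BGK) collision, and periodic boundaries. All names and parameter values are illustrative.

```python
W = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)   # lattice weights for velocities 0, +1, -1
C = (0, 1, -1)                          # discrete velocities in lattice units


def equilibrium(rho, u):
    """Second-order equilibrium populations for one lattice site."""
    return [w * rho * (1.0 + 3.0 * c * u + 4.5 * (c * u) ** 2 - 1.5 * u * u)
            for w, c in zip(W, C)]


def lb_step(f, tau):
    """One collide-and-stream update; f[i][x] moves with velocity C[i]."""
    n = len(f[0])
    post = [[0.0] * n for _ in C]
    for x in range(n):
        rho = sum(f[i][x] for i in range(3))
        u = sum(C[i] * f[i][x] for i in range(3)) / rho
        feq = equilibrium(rho, u)
        for i in range(3):
            # Collision: purely local relaxation towards equilibrium.
            post[i][x] = f[i][x] - (f[i][x] - feq[i]) / tau
    # Streaming: each population hops to its nearest neighbour (periodic).
    return [[post[i][(x - C[i]) % n] for x in range(n)] for i in range(3)]


def density(f):
    return [sum(f[i][x] for i in range(3)) for x in range(len(f[0]))]


# Start at rest with a small density bump in the middle of the lattice.
n_sites = 32
rho0 = [1.0 + (0.1 if x == n_sites // 2 else 0.0) for x in range(n_sites)]
f = [[equilibrium(r, 0.0)[i] for r in rho0] for i in range(3)]
mass_before = sum(density(f))
for _ in range(50):
    f = lb_step(f, tau=1.0)
mass_after = sum(density(f))
```

Each site needs only its own populations for the collision and its nearest neighbours for the streaming step, which is why the method maps so naturally onto domain-decomposed parallel hardware. Total mass is conserved by construction, so mass_before and mass_after agree up to rounding.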
Amphiphile mesophases are a good example of fluid systems that we can't treat with a continuum model, yet in which flow effects are still important. An amphiphile is a molecule composed of a water-loving head part and an oil-loving tail part; soap molecules are amphiphiles and are attracted to oil-water interfaces (each part of the molecule can sit in its preferred environment). If amphiphiles are dissolved in pure water, there is no oil to attract the tail groups, so the amphiphile molecules cluster together with the tail groups shielded from the water by the head groups. The geometry of these clusters depends on a range of parameters, such as concentration, temperature, or pH level. They can range from simple spherical micelle clusters to linear or branching wormlike micelles, irregular sponge phases, or highly regular liquid-crystal phases, which can consist of interpenetrating oil and water channels separated by a monolayer of amphiphile, or water channels separated by an amphiphile bilayer. Modeling or otherwise trying to understand these phases' structure and behavior by considering individual molecules is extremely difficult, given that approximately 2,000 to 3,000 atoms can exist in a single micelle. Researchers have had some success9 in modeling amphiphile mesophases with free-energy techniques, but these studies yield little dynamical information, whereas mesoscale fluid algorithms such as LB are ideally suited to such problems.

The TeraGyroid project, a collaboration between RealityGrid and many other groups, is built on RealityGrid's earlier smaller-scale simulations of the spontaneous self-assembly of a structure called a gyroid from a mixture of oil, water, and amphiphile. This structure has the curious property that the amphiphile molecules sit on a minimal surface with zero mean curvature at all points. The gyroid is also closely related to structures found in living cells, and it has much potential to create novel materials and photonic crystals. Previous simulations showed that it was possible to model the self-assembly process using an LB technique, but the simulations quickly became unphysical due to finite-size effects. Consequently, the simulations tended to contain a single homogeneous region of the mesophase, in contrast to the multiple-grain structure we might expect in a real-world system.
We predicted that at larger system sizes and over longer simulation runtimes, we could simulate the self-assembly of several spatially distinct domains: differently oriented gyroid structures would assemble and, where they met, defect regions would exist, analogous to domain walls in ferromagnets. To investigate these defect regions, we required simulations on grids of up to 1,024³ lattice points. The aim of the TeraGyroid project was to use grid computing techniques and resources to perform these very large-scale calculations, which would otherwise have been impossible.

TeraGyroid Technical Details

The principal problem the TeraGyroid project faced was the very large system sizes (as well as time scales) we had to simulate to reach the physical regime of interest. This challenge demanded raw computing power as well as visualization, storage, and management facilities.
Memory and CPU

Computational Steering

The simulation program, LB3D, required approximately 1 Kbyte of memory per lattice site. Hence, a system with a lattice of 128³ sites required roughly 2.2 Gbytes of total memory to run, a single system checkpoint required 2.2 Gbytes of disk space, and each block of visualization data, containing a floating-point number for each lattice site, took approximately 8 Mbytes. The largest system, at 1,024³, produced checkpoint files approaching a terabyte and emitted visualization blocks of 4 Gbytes. The massive simulation data sizes meant that distributed-memory multiprocessing was the only option, but it caused significant headaches because restarting the simulation from an on-disk checkpoint could take more than an hour.
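The sizes quoted above follow from simple arithmetic, which the short sketch below reproduces. Two inputs are assumptions rather than statements from the text: that "1 Kbyte" means exactly 1,024 bytes, and that each visualization value is a 4-byte single-precision float.

```python
BYTES_PER_SITE = 1024    # "approximately 1 Kbyte" per lattice site (assumed exact)
BYTES_PER_FLOAT = 4      # assumption: single-precision visualization data


def lattice_footprint(edge):
    """Return (state_bytes, visualization_block_bytes) for an edge**3 lattice."""
    sites = edge ** 3
    return sites * BYTES_PER_SITE, sites * BYTES_PER_FLOAT


for edge in (128, 1024):
    state, block = lattice_footprint(edge)
    print(f"{edge}^3: state {state / 1e9:.1f} GB, block {block / 1e6:.0f} MB")
```

Under these assumptions a 128³ lattice needs about 2.1 GB of state and 8 MB per visualization block, and a 1,024³ lattice about 1.1 TB and 4.3 GB. The small gap to the quoted 2.2 Gbytes is consistent with a per-site cost slightly above 1 Kbyte.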

At the outset of the project, a typical workstation could render up to 64³ data sets using either isosurfacing or volume rendering. A machine with a GeForce3 Ti200 graphics processor and an AMD Athlon XP1700 CPU could render a 64³ gyroid isosurface containing 296,606 triangles at approximately 5 frames per second (fps). Larger data sets would take several minutes of preprocessing and then require several seconds to render each frame, resulting in a jerky display that made user interaction and region-of-interest location rather difficult. Furthermore, data sets of 256³ sites or larger sometimes simply wouldn't fit into the available graphics memory on a single workstation. The project therefore required parallel rendering techniques.

TeraGyroid was a collaborative project involving many institutions and people in the UK and the US. All these people needed to be able to see what was happening during the simulation runs and to discuss the running simulations with each other. In addition, the project team wanted to demonstrate the steered simulations to an audience at the Supercomputing 2003 conference and to a worldwide audience at the SCGlobal teleconference. These requirements necessitated a federated grid, joining separate computational grids across the Atlantic Ocean to form a single environment in which scientists would have enough computational power, visualization ability, storage capacity, and collaboration facilities to perform the TeraGyroid simulations.

Because of the nature of the calculations and the resource constraints imposed on them, we expected that the TeraGyroid project would be an ideal candidate for computational steering. Due to a limited theoretical understanding of the problem, particularly of its dynamic nature, we didn't know beforehand how long the simulations would have to run before reaching a gyroid domain regime, how long that regime would last, or even which regions of the simulation parameter space would permit gyroids to form. We hoped that computational steering would let us monitor simulations as they ran so that we could terminate those that crashed or entered uninteresting regimes early on and adjust those that showed particularly interesting results to give them higher-resolution data output.

The resources allocated to the project consisted largely of time slots (ranging from six hours to three days) during which all (or a large part) of certain machines would become completely available to the project. In a conventionally operated batch-queuing system, researchers don't have to worry about trying to fill up a whole machine with simultaneous tasks: a sufficiently well-designed queuing system will simply let other people's tasks take up unused resources. This wasn't the case for TeraGyroid. The combination of the somewhat idiosyncratic resource allocation and the steered jobs meant that, at any time, calculations could run out of time on one machine, while processors on another machine (potentially on another continent) were freed up by a terminating job. This led to the desire for migratable simulations, which could move (or at least, be moved) from machine to machine to make the best use of the available processors.
Organizational and Practical Details

When steering a simulation, it became common practice to create a checkpoint file immediately before performing the steering action. This meant that if the steering caused the simulation to crash, it could be rewound to its last known valid state and the calculation could continue from there. Although this approach makes simulation crashes much less of a hindrance when performing steered parameter searches, it tends to produce a large number of checkpoint files, and keeping track of these can involve a lot of administrative work. The project's research team dealt with this administrative overhead by using the Checkpoint Tree Grid Service, developed by Manchester Computing (www.mcc.ac.uk). This service keeps a record of each generated checkpoint along with which simulation it was generated from and which steering actions led to its generation. Effectively, this means that researchers can replay any simulation from a checkpoint earlier in its life, removing the administrative burden from the scientist performing the steering.

Ensuring that all the required software ran smoothly on the required platforms took a significant amount of effort. A major problem with using a heterogeneous grid is that the location and invocation of compilers and libraries differ widely, even between machines of the same architecture. Environmental parameters, such as the location of temporary and permanent filespace, file retention policies, or executable paths, also vary widely. During the project, we dealt with these issues via ad hoc shell scripts, but this isn't a satisfactory solution because of the amount of work it requires.

We formed the TeraGyroid testbed network by federating separate grids in the UK and US: each grid had to recognize users and certificates from other grids. During the project, we dealt with this challenge by communicating with individuals directly and by posting the certificate IDs that had to be recognized on a Wiki. Again, this isn't a scalable long-term solution, and ideally, the issue should be dealt with via a third-party certificate management system. Given that grid computing requires transparent operation across multiple administrative domains, it's actually questionable whether we can even apply the term "grid" to the TeraGyroid project. Certificate management and a public key infrastructure (PKI) are still difficult technical problems10 for the project, although we're investigating new approaches to both.11,12 We also made much use of dual-homed systems, with multiple IP addresses on multiple networks, but this caused problems due to the tendency of authentication systems such as SSL to confuse host identity with IP address, requiring ugly workarounds.
Most networking software assumes a homogeneous network and delegates routing control to much lower levels. This delegation makes it difficult, for example, for a client process running on one host to move files between two other hosts using a specific network, especially when we constructed a high-bandwidth network specifically for the transfer. We also encountered problems when the computing and visualization nodes weren't directly connected to the Internet; instead, they communicated through firewalls, which is a common situation on large clusters. We used workarounds such as port forwarding (forwarding network connections from a machine outside the firewall to a machine inside) and process pinning (ensuring that a particular part of the simulation program always runs on a particular physical machine) during the project, but again, these aren't good long-term solutions.

The full TeraGyroid simulation pipeline requires certain resources (such as simulation, visualization, and storage facilities) and AccessGrid virtual venues to be available simultaneously. Systems administrators usually handled this requirement by manually reserving resources, but the ideal solution would involve automated advance reservation and coallocation procedures.
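The checkpoint bookkeeping described earlier in this section (one record per checkpoint, noting where it came from and which steering action produced it) can be pictured as a simple tree. The sketch below is a hypothetical data structure for illustration only; it is not the Checkpoint Tree Grid Service, and the class name, method names, checkpoint identifiers, and parameter values are all invented.

```python
class CheckpointTree:
    """Hypothetical record of checkpoints and the steering actions between them."""

    def __init__(self):
        self._nodes = {}   # checkpoint id -> (parent id or None, steering action)

    def add(self, checkpoint_id, parent_id=None, action=None):
        if parent_id is not None and parent_id not in self._nodes:
            raise KeyError(f"unknown parent checkpoint: {parent_id}")
        self._nodes[checkpoint_id] = (parent_id, action)

    def lineage(self, checkpoint_id):
        """Return [(checkpoint id, action that led to it), ...] from the root."""
        path = []
        current = checkpoint_id
        while current is not None:
            parent, action = self._nodes[current]
            path.append((current, action))
            current = parent
        return list(reversed(path))


# Two steering experiments branching from the same saved state.
tree = CheckpointTree()
tree.add("run0-t1000")
tree.add("run0-t2000-a", parent_id="run0-t1000", action="coupling 0.04 -> 0.05")
tree.add("run0-t2000-b", parent_id="run0-t1000", action="coupling 0.04 -> 0.06")
history = tree.lineage("run0-t2000-b")
```

Walking a lineage from the root gives the sequence of steering actions needed to replay a run, and a crashed branch can simply be abandoned in favor of a sibling that starts from the same parent checkpoint.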

Accurate Free Energies in Biomolecular Systems

Computing the free energies (FEs) of biomolecular systems is one of the most computationally demanding problems in molecular biology and, arguably, one of the most important. The requirement to compute FEs is widespread, from a need to understand a candidate drug's interaction with its target to understanding fundamental transport processes in nature. We can use several numerical approaches to compute an FE, so in addition to the scientific basis of determining which approach to use, we must decide which approach to adapt so as to effectively exploit grid capabilities. Here, we discuss two instances (from other projects that are part of the RealityGrid project) of using computational grids to compute FE differences in biological systems.

Both examples can be naturally partitioned into several independent simulations, each requiring tightly coupled parallel resources; hence, they are amenable to a grid-based approach. Because we can distribute each simulation around the grid, it's easy to use more high-performance computing (HPC) resources than are available in a single administrative domain at any given instant. As a result, we can dramatically reduce the time it takes to reach a solution. Our solutions to both these problems involve using a traditional parallel MD application that we've adapted to take advantage of grid execution



and frameworks. They adapt the legacy code for a grid framework and, as a consequence of using the grid, employ a different workflow. Thus, in addition to their scientific interest, problems involving FE computation are, among other things, good candidates for determining a scientific grid computation's effectiveness.
Understanding Protein-Peptide Binding

Gabriel Waksman and his colleagues' elucidation of the crystal structure of the Src SH2 domain13 initiated a significant research effort directed toward understanding how these proteins bind and recognize specific peptide sequences. Measuring FE differences (ΔG) and understanding the thermodynamics of SH2-mediated binding is vital to understanding fundamental physical processes at play at the molecular level. Thermodynamic integration (TI) provides a formalism to compute, in principle exactly, the FE difference between two molecules A and B (ΔΔG_AB) as they bind to a given SH2 protein domain. The key concept in TI14 is that of a thermodynamic cycle, varying the value of λ from 0 (peptide A) to 1 (peptide B). The computation of ΔΔG_AB can be transformed to one of computing the difference between two single ΔG values, which in turn are individually evaluated over a set of intermediate values of λ.

The amount of computational resources needed to compute a single TI is usually very large. Although this varies depending on the number of atoms involved in the alchemical mutation and the size of both the bound and unbound systems, we believe the computational cost has been the primary reason for the limited adoption of this technique.15

We use a grid infrastructure similar to the one we discussed in the TeraGyroid section, in terms of both hardware and software details, to calculate a difference in binding FE via TI on a grid. Let's outline the grid workflow to highlight how it differs from the standard application of this technique. Initially, a single simulation, usually at a low value of λ, is launched by the scientist on an HPC

resource. He or she monitors the simulation and assesses when to spawn a second simulation with the next value of λ based on a suitably determined convergence criterion. Depending on the exact details of the problem, this could vary from tracking a simple parameter to more complex data analysis. When the scientist decides to spawn a new simulation, a checkpoint is created and the new simulation with the next value of λ is started on another (or possibly the same) HPC resource. The original parent simulation continues for a suitable duration accumulating data, and the scientist monitors the newly spawned simulation in exactly the same manner to assess when to spawn subsequent simulations. The scientist monitors and controls each of the simulations, the number of which is constrained by the resources available within the grid.

By comparison, a regular TI performed using MD employs a serial workflow;16 each simulation runs to completion before the next is launched, often on the same machine. The acceleration over the conventional approach stems from the twin abilities of computational grids to supply increased resources and to allow simple and uniform interaction with multiple simulations, wherever they're running. Although running many simulations concurrently in different administrative domains without the use of grid technology is possible in principle, in practice, it requires an enormous effort to cope with the heterogeneity of the different computers used. The aim of grid middleware is to shield the user from these complexities, leaving him or her free to interact with the simulations as if they were running locally. We can greatly enhance a computational solution's effectiveness and impact by reducing the time it takes to achieve a result to the same as (or less than) the time it takes to do a physical experiment.
We've previously shown17 that using a modified workflow and computational grids makes the time needed to compute a binding energy qualitatively similar to the experimental time scales of two to three days. Admittedly, we achieved this for small systems; we're looking into how efficient our approach is for larger systems and the effect of the spawning criteria on the net computational cost. Given the scalability of the code we used, we have every reason to think the turnaround times of large models would be similar, provided the required number of tightly coupled HPC resources is available.
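The arithmetic behind TI is compact enough to sketch. In the standard formulation, each single ΔG is the integral over λ from 0 to 1 of the ensemble average of ∂H/∂λ, and the relative binding FE is the difference between the bound and unbound legs of the thermodynamic cycle. The block below assumes a trapezoid rule over the λ windows and uses invented window averages; in a real calculation each average would come from one of the independent simulations in the workflow above, and the function names here are illustrative.

```python
def trapezoid(xs, ys):
    """Composite trapezoid rule for samples ys taken at increasing points xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def ti_free_energy(lambdas, mean_dh_dlambda):
    """Estimate delta G as the integral over lambda in [0, 1] of <dH/dlambda>."""
    if lambdas[0] != 0.0 or lambdas[-1] != 1.0:
        raise ValueError("lambda windows must span 0 (peptide A) to 1 (peptide B)")
    return trapezoid(lambdas, mean_dh_dlambda)


def relative_binding_free_energy(lambdas, bound, unbound):
    """Thermodynamic cycle: delta-delta G = delta G(bound) - delta G(unbound)."""
    return ti_free_energy(lambdas, bound) - ti_free_energy(lambdas, unbound)


# Synthetic window averages (kcal/mol), invented purely for illustration.
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
bound = [4.0, 3.0, 2.0, 1.0, 0.0]
unbound = [3.0, 3.0, 3.0, 3.0, 3.0]
ddg = relative_binding_free_energy(lambdas, bound, unbound)
```

Because each λ window contributes an independent average, the windows can run concurrently on different resources and be combined only at the end, which is what makes the spawning workflow possible.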
Computing Free Energy Profiles

To understand the mechanism of biomolecular translocation through a pore, researchers need the FE profiles that biomolecules such as mRNA and DNA encounter when inside protein nanopores; unlike the difference in binding FE, such a profile isn't just a single number. The time scale for DNA translocation is on the order of tens of microseconds, so simulating such long time scales for large systems (about 300,000 atoms) is impossible with today's standard MD approaches.

Jarzynski's equation provides a powerful means of computing the equilibrium FE difference by applying nonequilibrium forces.18 A steered MD (SMD) simulation can be used to apply an external force, and, consequently, the physical time scale we can simulate increases. SMD simulations thus provide a natural case for applying Jarzynski's equation to a molecular system, and hence, we refer to the combined approach as the SMD-JE approach. A single, detailed, long-running simulation over physical time scales of a few microseconds would currently require months to years on one large supercomputer to reach a solution. SMD-JE lets us decompose the problem into a large number (an ensemble) of simulations over a coarse-grained physical time scale with a limited loss of detail. Thus, multiple nonequilibrium SMD-JE simulations of several million time steps (the equivalent of a several-nanosecond equilibrium simulation in terms of computational requirements) can help us study processes at the microsecond time scale. The combined SMD-JE approach thus represents a novel algorithmic advance in computing FEs.19

In spite of algorithmic advances, the computational costs of such an approach remain prohibitive. To the best of our knowledge, researchers have never used the SMD-JE approach on problems of the size and complexity needed to address the transport of DNA in protein nanopores. Using a grid infrastructure for this problem, in addition to providing uniform access to multiple replicas, facilitates effective coupling of large-scale computational and novel analysis techniques.
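Jarzynski's equation relates the equilibrium FE difference ΔF to the work W measured over an ensemble of nonequilibrium pulls: exp(-ΔF/kT) equals the ensemble average of exp(-W/kT). A minimal estimator is sketched below. The work values and units (kcal/mol) are illustrative assumptions, the function name is invented, and a production analysis would also need to address the well-known slow convergence of the exponential average.

```python
import math

K_B = 0.0019872041   # Boltzmann constant in kcal/(mol K)


def jarzynski_free_energy(work_values, temperature):
    """Estimate delta F from nonequilibrium work via <exp(-W/kT)>."""
    kt = K_B * temperature
    scaled = [-w / kt for w in work_values]
    shift = max(scaled)   # log-sum-exp shift keeps the exponentials in range
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return -kt * (shift + math.log(mean_exp))


# Work from four hypothetical SMD pulls over the same window (kcal/mol).
work = [4.0, 5.0, 6.0, 7.0]
estimate = jarzynski_free_energy(work, temperature=300.0)
```

The estimate always lies between the smallest work value and the mean work, and it is dominated by the rare low-work trajectories. Since every pull is an independent simulation, the ensemble maps directly onto many concurrent grid jobs.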
The grid approach to SMD-JE lets researchers decompose the problem both by nature and design. We're currently working to compute the complete FE profile. SMD-JE is an approach based on novel physical insight that gives us the ability to decompose a large problem into smaller ones of shorter duration. This becomes particularly effective when used with a grid infrastructure, which provides an environment that enables uniform access to, as well as launching, monitoring, and steering of, numerous application instances over many geographically distributed resources. It's unclear whether the SMD-JE approach will let us compute the FE profile of DNA translocation through a nanopore, but our ability to even attempt to address this question with problems of such size and complexity is a direct result of the synergy between state-of-the-art high-performance platforms and advances in grid infrastructure and middleware.
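To illustrate why the ensemble decomposition suits distributed resources, here is a minimal sketch of how work values from independent SMD runs could be pooled and turned into a Jarzynski estimate, that is, minus kT times the logarithm of the ensemble average of exp(-W/kT). It is not the RealityGrid implementation: the function names, the site labels, the sample work values, the 300 K default, and the kcal/mol units are all assumptions made for illustration.

```python
import math

# Boltzmann constant in kcal/(mol*K); the choice of units is an assumption.
K_B = 0.0019872041


def merge_replicas(per_resource_work):
    """Flatten work values gathered from replicas run on different resources.

    per_resource_work maps a (hypothetical) resource label to the list of
    work values, one per completed SMD trajectory on that resource.
    """
    return [w for works in per_resource_work.values() for w in works]


def jarzynski_free_energy(work_values, temperature=300.0):
    """Estimate the FE difference as -kT * ln(mean(exp(-W / kT)))."""
    if not work_values:
        raise ValueError("need at least one work value")
    beta = 1.0 / (K_B * temperature)
    exponents = [-beta * w for w in work_values]
    # Subtract the largest exponent before exponentiating so the sum
    # cannot overflow (the usual log-sum-exp trick).
    shift = max(exponents)
    total = sum(math.exp(x - shift) for x in exponents)
    log_mean = shift + math.log(total / len(exponents))
    return -log_mean / beta


if __name__ == "__main__":
    # Made-up work values (kcal/mol) from two hypothetical sites.
    per_resource = {"site_a": [12.1, 11.4, 13.0], "site_b": [10.9, 12.6]}
    works = merge_replicas(per_resource)
    print(f"{len(works)} trajectories, "
          f"estimate = {jarzynski_free_energy(works):.2f} kcal/mol")
```

Each trajectory contributes a single number to the estimate, so the runs need no communication with one another and can be launched on whichever resources are available. The estimate always lies between the smallest and the mean work value in the ensemble.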

At a certain level, we've encountered a demo-paradox: scientists want to get down to the business of doing science, but before they can use the grid for routine computational science, a lot needs to be done to make the infrastructure stable. In our experience, demonstrations play a significant role in ironing out the wrinkles, but by their very nature, demonstrations tend to be transient working solutions and thus are a distraction from the main purpose of doing science. The grid is there, and people want to use it now! Science can't wait for grid engineers and computer scientists to get every detail right. It's here that a two-pronged approach (fast versus deep) has helped us. At one level, we can work toward a rapid utilization of available infrastructure and technology; at another, we can feed our experiences and advice to resource providers like the US TeraGrid and UK NGS or to standards bodies like the Global Grid Forum (www.ggf.org) to further longer-term research goals for everyone. (See upcoming, related literature for more current details.20)

But if there is a single lesson we've learned in our endeavors over the past few years, it's that grid middleware and associated underlying technologies can and will change. The availability of robust, extensible, and easy-to-use middleware is critical to scientific grid computing efforts in the long term. Well-designed middleware should help us introduce new technologies without requiring that we refactor the application code (or the application scientist). As application-driven switched-lightpath networks become more widespread, for example, the grid middleware should interact intelligently with the optical control plane to facilitate end-to-end optical connectivity, enabling it to become just another resource,21 like CPU and storage. The development of such middleware will hopefully receive a boost as more scientists, motivated by success stories, sense the advantage of attempting to use the grid for routine science.

Acknowledgments

This work is based on the efforts of many people over several years. In particular, we thank Stephen Pickles and his group at the University of Manchester for their work on the computational steering system and Jens Harting and Philip Fowler for their work on the TeraGyroid and grid-based TI projects. We're grateful to the Engineering and Physical Sciences Research Council (www.epsrc.ac.uk) for funding much of this research through the RealityGrid grant GR/R67699. Our work was partially supported by the US National Science Foundation under the National Resource Allocations Committee (NRAC) grant MCA04N014. We used computer resources at the Pittsburgh Supercomputer Center, the US National Computational Science Alliance, the TeraGrid, and the UK National Grid Service.

References

1. N.T. Karonis, B. Toonen, and I. Foster, "MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface," J. Parallel Distributed Computing, vol. 63, no. 5, 2003, pp. 551-563.
2. J. Chin et al., "Steering in Computational Science: Mesoscale Modelling and Simulation," Contemporary Physics, vol. 44, no. 5, 2003, pp. 417-432.
3. J.M. Brooke et al., "Computational Steering in RealityGrid," Proc. UK E-Science All Hands Meeting, 2003; www.nesc.ac.uk/events/ahm2003/AHMCD/pdf/179.pdf.
4. S.M. Pickles et al., "A Practical Toolkit for Computational Steering," to be published in Philosophical Trans. Royal Soc. London A, vol. 363, no. 1833, 2005; www.pubs.royalsoc.ac.uk/philtransa.shtml.
5. M. McKeown, "OGSI::Lite: A OGSI Implementation in Perl," 2003; www.sve.man.ac.uk/Research/AtoZ/ILCT.
6. M. McKeown, "WSRF::Lite: A WSRF Implementation in Perl," 2003; http://vermont.mvc.mcc.ac.uk/WSRF-Lite.pdf.
7. S. Succi, The Lattice Boltzmann Equation for Fluid Dynamics and Beyond, Oxford Univ. Press, 2001.
8. P. Lallemand and L.-S. Luo, "Theory of the Lattice Boltzmann Method: Dispersion, Dissipation, Isotropy, Galilean Invariance, and Stability," Physics Rev. E, vol. 61, no. 6, 2000, pp. 6546-6562.
9. J.M. Seddon and R.H. Templer, Polymorphism of Lipid-Water Systems, Elsevier Science, 1995, pp. 97-160.
10. P. Gutmann, "PKI: It's Not Dead, Just Resting," Computer, vol. 25, no. 8, 2002, pp. 41-49.
11. B. Beckles, "Removing Digital Certificates from the End User's Experience of Grid Environments," Proc. UK E-Science All Hands Meeting, 2004.
12. P. Gutmann, "Plug-and-Play PKI: A PKI Your Mother Can Use," Proc. 12th Usenix Security Symp., Usenix Assoc., 2003, pp. 45-58.
13. G. Waksman et al., "Crystal Structure of the Phosphotyrosine Recognition Domain SH2 of v-src Complexed with Tyrosine-Phosphorylated Peptides," Nature, vol. 358, 1992, pp. 646-653.
14. A.R. Leach, Molecular Modelling: Principles and Applications, 2nd ed., Pearson Education, 2001.
15. C. Chipot and D.A. Pearlman, "Free Energy Calculations: The Long and Winding Gilded Road," Molecular Simulation, vol. 28, 2002, pp. 1-12.
16. W. Yang, R. Bitetti-Putzer, and M. Karplus, "Free Energy Simulations: Use of Reverse Cumulative Averaging to Determine the Equilibrated Region and the Time Required for Convergence," J. Chemical Physics, vol. 120, no. 6, 2004, pp. 2618-2628.
17. P. Fowler, S. Jha, and P. Coveney, "Grid-Based Steered Thermodynamic Integration Accelerates the Calculation of Binding Free Energies," to be published in Philosophical Trans. Royal Soc. London A, vol. 363, no. 1833, 2005; www.pubs.royalsoc.ac.uk/philtransa.shtml.
18. C. Jarzynski, "Nonequilibrium Equality for Free Energy Differences," Physical Rev. Letters, vol. 78, 1997, p. 2690.
19. S. Park et al., "Free Energy Calculation from Steered Molecular Dynamics Simulations Using Jarzynski's Equality," J. Chemical Physics, vol. 119, no. 6, 2003, p. 3559.
20. P.V. Coveney, ed., Scientific Grid Computing, to be published in Philosophical Trans. Royal Soc. London A, vol. 363, no. 1833, 2005; www.pubs.royalsoc.ac.uk/philtransa.shtml.
21. G. Karmous-Edwards, "Global E-Science Collaboration," Computing in Science & Eng., vol. 7, no. 2, 2005, pp. 67-74.

Peter V. Coveney is a professor in physical chemistry and director of the Centre for Computational Science at University College London, where he is also an honorary professor of computer science. His research interests include theoretical and computational science; atomistic, mesoscale, and multiscale modeling; statistical mechanics; high-performance computing; and visualization. Coveney has a DPhil in chemistry from the University of Oxford. He is a fellow of the Royal Society of Chemistry and the Institute of Physics. Contact him at p.v.coveney@ucl.ac.uk.

Jonathan Chin is an engineering and physical sciences council research fellow at the Centre for Computational Science at University College London. His research interests include mesoscale modeling, visualization, and grid and high-performance computing. Chin has an MSci in physics from the University of Oxford. Contact him at jonathan.chin@ucl.ac.uk.

Shantenu Jha is a postdoctoral research fellow at the Centre for Computational Science at University College London. His research interests include grid computing and computational physics. Jha has degrees in computer science and physics from Syracuse University and the Indian Institute of Technology, Delhi. He is a member of the Global Grid Forum and the American Physical Society. Contact him at s.jha@ucl.ac.uk.

Matthew Harvey is an engineering and physical sciences council postdoctoral research fellow at the Centre for Computational Science at University College London. His research interests include grid computing and combinatorial chemistry. Harvey has degrees in astrophysics and information technology from University College London. Contact him at m.j.harvey@ucl.ac.uk.