Abstract

In large-scale grids, the replication of files to different sites and the appropriate scheduling of jobs are both important mechanisms which can reduce access latencies and give improved usage of resources such as network bandwidth, storage and computing power. In the search for an optimal scheduling and replication strategy, the grid simulator OptorSim was developed as part of the European DataGrid project. Simulations of various high energy physics (HEP) grid scenarios have been undertaken using different job scheduling and file replication algorithms, with the emphasis being on physics analysis use-cases. An economy-based strategy has been investigated as well as more traditional methods, with the economic models showing advantages for heavily loaded grids with limited resources.
INTRODUCTION
The remit of the European DataGrid (EDG) [1] project was to develop an infrastructure that could support the intensive computational and data handling needs of widely distributed scientific communities. In a grid with limited resources, effective file replication and job scheduling strategies are therefore essential.
When a new replica is created at a site whose storage is full, existing files must be deleted to make room. Two traditional strategies are the LRU (Least Recently Used), which deletes those files which have been used least recently, and the LFU (Least Frequently Used), which deletes those used least frequently in the recent past. Thirdly, one can use an economic model in which sites "buy" and "sell" files using an auction mechanism, and will only delete files if they are less valuable than the new file. Details of the auction mechanism can be found in [6]. There are currently two versions of the economic model: the binomial economic model, where file values are predicted by ranking the files in a binomial distribution according to their popularity in the recent past, and the Zipf-based economic model, where a Zipf distribution is used instead.
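To make the flavour of such value-based decisions concrete, the following is a minimal sketch, not OptorSim's actual code: it predicts a file's value from a simple popularity count over the recent access history and accepts a new replica only if it is worth more than the least valuable file currently stored. All class and method names here are hypothetical.

import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an economic replacement decision. "Value" here is
// just an access count over a recent window; OptorSim's real models rank
// files in a binomial or Zipf distribution over their recent popularity.
public class ReplacementSketch {

    // Count how often each file was requested in the recent access history.
    static Map<String, Integer> popularity(List<String> recentAccesses) {
        Map<String, Integer> counts = new HashMap<>();
        for (String file : recentAccesses) {
            counts.merge(file, 1, Integer::sum);
        }
        return counts;
    }

    // Accept newFile only if it is predicted to be worth more than the
    // least valuable file the site currently stores.
    static boolean shouldReplace(String newFile, Set<String> stored,
                                 List<String> recentAccesses) {
        if (stored.isEmpty()) {
            return true; // free space available, nothing need be deleted
        }
        Map<String, Integer> counts = popularity(recentAccesses);
        String victim = stored.stream()
                .min(Comparator.comparingInt(f -> counts.getOrDefault(f, 0)))
                .get(); // the set is non-empty, so a victim always exists
        return counts.getOrDefault(newFile, 0) > counts.getOrDefault(victim, 0);
    }
}

An actual implementation would refine the value prediction, for example with the binomial or Zipf ranking described above, rather than using raw counts.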
To study these strategies by simulation, a simulator must model the grid sites with their computing and storage resources; a scheduler to decide where jobs should be sent; and the network which connects the sites. For a grid with automated file replication, there must also be a component to perform the replica management. It should be easy to investigate different algorithms for both scheduling and replication and to input different topologies and workloads.

Architecture

OptorSim is designed to fulfil the above requirements, with an architecture (Figure 1) based on that of the EDG data management components. In this model, computing resources at each grid site are represented by Computing Elements (CEs) and storage by Storage Elements (SEs); users submit grid jobs to a Resource Broker (RB), which sends them to sites, and a Replica Manager at each site handles the replica management.

Figure 1: OptorSim architecture: users submit grid jobs which are sent to grid sites, each with a Computing Element and a Replica Manager.
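The division of responsibilities can be pictured as a set of interfaces. The sketch below is illustrative only, with hypothetical names rather than OptorSim's real classes, and simply mirrors the components of Figure 1.

import java.util.List;

// Hypothetical Java interfaces mirroring the components in Figure 1;
// OptorSim's real classes differ, but split responsibilities the same way.
interface ArchitectureSketch {

    record GridJob(String name, List<String> inputFiles) {}

    interface ComputingElement {
        void submit(GridJob job);        // queue a job for local execution
    }

    interface StorageElement {
        boolean hasReplica(String file); // is a copy of this file held here?
        long freeCapacityBytes();
    }

    interface ReplicaManager {
        // Make a file available locally, replicating (and possibly deleting
        // old replicas) as the replication algorithm dictates.
        void ensureLocal(String file, StorageElement se);
    }

    interface ResourceBroker {
        // Choose a site for a job according to the scheduling algorithm.
        ComputingElement schedule(GridJob job);
    }
}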
Implementation

In OptorSim, which is written in Java™, each CE is represented by a thread, with another thread acting as the RB. The RB sends jobs to the CEs according to the specified scheduling algorithm and the CEs process the jobs by accessing the required files, running one job at a time. In the current implementation, the number of worker nodes for each CE simply reduces the time a file takes for processing, rather than allowing jobs to run simultaneously.
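As a rough illustration of this threading structure (with hypothetical names, not the actual OptorSim source), each CE can be modelled as a thread draining its own job queue, with the RB thread deciding which queue each incoming job joins:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Rough sketch of the threading model described above: each CE is a thread
// that drains its own job queue one job at a time, and the RB assigns jobs
// to CEs. Names are illustrative, not OptorSim's actual classes.
class ComputingElementThread extends Thread {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    void assign(String job) { // called by the RB's scheduling loop
        queue.add(job);
    }

    @Override
    public void run() {
        try {
            while (true) {
                String job = queue.take(); // blocks until the RB sends a job
                process(job);              // one job at a time, as in OptorSim
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // simulation shutting down
        }
    }

    private void process(String job) {
        // Access the job's files here; in OptorSim, extra worker nodes
        // shorten the per-file processing time rather than adding
        // concurrency within the CE.
    }
}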
When a job requires a file which is not stored locally, it must be transferred from another site, with a cost determined by the network; if a file is local there are no transfer costs involved. For scheduling algorithms which consider the transfer costs, the site with the lowest estimated cost of accessing a job's files is preferred.
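A minimal sketch of such an access-cost estimate follows, under the simplifying assumption that each remote file is fetched over a single link of known bandwidth; the names and the bandwidth model are illustrative, not OptorSim's.

import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative access-cost estimate for scheduling: the cost of running a
// job at a site is the total estimated transfer time of its input files,
// and files already held locally cost nothing. OptorSim's real cost model
// (shared links, varying bandwidth) is more detailed than this.
class AccessCostSketch {

    static double estimateSeconds(List<String> inputFiles,
                                  Set<String> localFiles,
                                  Map<String, Long> fileSizeBytes,
                                  double linkBandwidthBytesPerSec) {
        double cost = 0.0;
        for (String file : inputFiles) {
            if (localFiles.contains(file)) {
                continue; // local replica: no transfer cost
            }
            cost += fileSizeBytes.get(file) / linkBandwidthBytesPerSec;
        }
        return cost;
    }
}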
RESULTS

For the CMS¹ testbed (Figure 2), CERN and FNAL were given SEs of 100 GB capacity and no CEs. All master files were stored at one of these sites. Every other site was given 50 GB of storage and a CE with one worker node. For the LCG testbed (Figure 3), resources were based on those published by the LCG Grid Deployment Board for Quarter 4 of 2004 [7], but with SE capacities reduced by a factor of 100 and the number of worker nodes per CE halved, to achieve useful results. Jobs were based on the CDF (Collider Detector at Fermilab) use case [8], with a total dataset size of 97 GB.

¹ Compact Muon Solenoid, one of the experiments for the Large Hadron Collider at CERN.

Figure 2: CMS Data Challenge 2002 testbed topology, showing the CMS testbed sites, routers and link bandwidths (45 Mbps to 10 Gbps).

Figure 3: LCG August 2004 grid topology, showing Tier-0, Tier-1 and Tier-2 sites, routers and link bandwidths (33 Mbps to 10 Gbps).

CMS Data Challenge 2002 testbed

With the CMS testbed, three of the replication algorithms (LFU, binomial economic and Zipf-based economic) were compared for the four scheduling algorithms, with 1000 jobs submitted to the grid. The mean job times are shown in Figure 4. This shows that schedulers which consider the availability of data at a site give lower job times.

Figure 4: Mean job times in the CMS testbed for each combination of scheduling and replication algorithm.
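To make the distinction between data-aware schedulers concrete, the sketch below contrasts an access-cost scheduler with a queue-weighted variant along the lines of the Queue Access Cost algorithm discussed below; the Site interface and the queue-weighting formula are assumptions for illustration, not OptorSim's definitions.

import java.util.Comparator;
import java.util.List;

// Illustrative contrast between two data-aware schedulers: Access Cost
// picks the site where the job's files are cheapest to access, while a
// Queue Access Cost variant also penalises sites with long job queues.
// The Site interface and the weighting are stand-ins, not OptorSim's own.
class SchedulerSketch {

    interface Site {
        double accessCost(List<String> files); // est. file transfer time (s)
        int queueLength();                      // jobs already waiting
        double meanJobTimeSeconds();            // rough per-job estimate
    }

    static Site accessCost(List<Site> sites, List<String> files) {
        return sites.stream()
                .min(Comparator.comparingDouble(s -> s.accessCost(files)))
                .orElseThrow(() -> new IllegalStateException("no sites"));
    }

    static Site queueAccessCost(List<Site> sites, List<String> files) {
        return sites.stream()
                .min(Comparator.comparingDouble(s -> s.accessCost(files)
                        + s.queueLength() * s.meanJobTimeSeconds()))
                .orElseThrow(() -> new IllegalStateException("no sites"));
    }
}

Under this reading, a site holding all of a job's files locally has zero access cost, which is consistent with the behaviour of the LCG runs described below, where a CE at CERN attracts the jobs.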
The mean job time was also measured as the number of submitted jobs was increased from 1 to 10,000 (Figure 5). As the number of jobs grows, the economic models begin to out-perform the LFU.
Figure 5: Mean job time for increasing number of jobs in the CMS 2002 testbed. Points are displaced for clarity.

LCG August 2004 testbed

The pattern of results for the scheduling algorithms in the LCG testbed (Figure 6) is similar to that for CMS. The Access Cost and Queue Access Cost algorithms are in this case indistinguishable, however, and the mean job time for the LFU algorithm is negligibly small. This is due to the fact that in this case, CERN (which contains all the master files) has a CE. When a scheduler is considering access costs, CERN will have the lowest cost and the job will be sent there. This is also a grid where the storage resources are plentiful compared to the workload, so a simple algorithm like the LFU performs well.

Figure 6: Mean job times in the LCG testbed for the LFU, binomial economic (Eco Bin) and Zipf-based economic (Eco Zipf) replication algorithms.

CONCLUSIONS

The investigation of possible replication and scheduling algorithms is important for optimisation of resource usage and job throughput in HEP data grids, and simulation is a useful tool for this. The grid simulator OptorSim has been developed and experiments performed with various grid topologies. These show that schedulers which consider the availability of data at a site give lower job times. For heavily-loaded grids with limited resources, it has been shown that the economic models which have been developed begin to out-perform more traditional algorithms such as the LFU as the number of jobs increases, whereas for grids with an abundance of resources and a lighter load, a simpler algorithm like the LFU may be better.

OptorSim has given valuable results and is easily adaptable to new scenarios and new algorithms. Future work will include continued experimentation with different site policies, job submission patterns and file sizes, in the context of a complex grid such as LCG.

ACKNOWLEDGMENTS

This work was partially funded by the European Commission program IST-2000-25182 through the EU DataGRID Project, the ScotGrid Project and PPARC. The authors would also like to thank Jamie Ferguson for the development of the OptorSim GUI.
REFERENCES

[1] The European DataGrid Project, http://www.edg.org

[2] D. Cameron, R. Carvajal-Schiaffino, P. Millar, C. Nicholson, K. Stockinger and F. Zini, "Evaluating Scheduling and Replica Optimisation Strategies in OptorSim", Journal of Grid Computing (to appear)

[3] K. Ranganathan and I. Foster, "Identifying Dynamic Replication Strategies for a High Performance Data Grid", Proc. of the Int. Grid Computing Workshop, Denver, Nov. 2001

[4] W. Bell, D. Cameron, L. Capozza, P. Millar, K. Stockinger and F. Zini, "OptorSim - A Grid Simulator for Studying Dynamic Data Replication Strategies", Int. J. of High Performance Computing Applications 17 (2003) 4