OptorSim: A Replica Optimisation Simulator for the EU DataGrid
W. H. Bell, D. G. Cameron, R. Carvajal, A. P. Millar, C. Nicholson, K. Stockinger, F. Zini
2nd Meeting of the EPSRC Pilot Projects, NeSC, 30th January 2003
Caitriana Nicholson 2nd Meeting of the EPSRC Pilot Projects
The Project
[Diagram: layered Grid architecture]
- Virtual Organisations: High Energy Physics (CERN), Earth Observation (ESA), Biology (CNRS)
- Application Toolkits: Data-Intensive Applications Toolkit, Distributed Computing Toolkit, Problem Solving Applications Toolkit, …
- Grid Services (Middleware): resource-independent and application-independent services, e.g. authentication, authorization, resource location, resource allocation, accounting, optimized data access, data replication, grid monitoring, Grid policies, fault detection
- Grid Fabric (Resources): resource-specific implementations of basic services, e.g. transport protocols, name servers, CPU schedulers, site accounting, local policies, directory service, data storage, data access
Scheduling Jobs on the Grid
[Diagram labels: User, Scheduler, The Grid, Site 1, Site 2, Site 3]
Replica Optimisation
Optimise the use of computing, storage and network resources.
Short-term optimisation:
- Minimise the running time of the current job.
- Get me the files for my job as quickly as possible.
Long-term optimisation:
- Minimise the running time of all jobs.
- Make sure files are in the best places for all my future jobs.
=> Test optimisation algorithms with a grid simulator: OptorSim.
Various replication strategies are possible.
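The short-term goal, getting a job's files as quickly as possible, can be illustrated with a minimal Python sketch that picks the replica with the lowest estimated transfer time. This is not OptorSim's code: the `Replica` class, the function names, and the site names and bandwidth figures are all hypothetical, and the estimate assumes the transfer time is simply file size divided by available bandwidth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of a file held at a remote site (hypothetical model)."""
    site: str
    size_mb: float         # file size in megabytes
    bandwidth_mbps: float  # available bandwidth to the job's site, in Mbit/s


def transfer_time_s(replica: Replica) -> float:
    """Estimated transfer time in seconds, assuming 8 bits per byte
    and a constant available bandwidth for the whole transfer."""
    return replica.size_mb * 8.0 / replica.bandwidth_mbps


def best_replica(replicas: list[Replica]) -> Replica:
    """Return the replica with the smallest estimated transfer time."""
    if not replicas:
        raise ValueError("no replicas available")
    return min(replicas, key=transfer_time_s)


# Illustrative values only: three copies of the same 1000 MB file.
replicas = [
    Replica("site1", 1000.0, 100.0),
    Replica("site2", 1000.0, 400.0),
    Replica("site3", 1000.0, 50.0),
]
choice = best_replica(replicas)
print(choice.site, transfer_time_s(choice))  # site2 20.0
```

Long-term optimisation is a different problem: it decides where copies should be created or deleted so that future calls like `best_replica` find a nearby copy more often.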
OptorSim: a Replica Optimisation Simulator
Simulates a prototype particle physics grid, e.g. EDG, GridPP.
Inputs are:
- site policies
- experiment data files
- available resources (CPU, network bandwidth, storage)
- file access patterns (sequential, random, unitary random walk, Gaussian random walk)
Optimisation algorithms tested are:
- no replication
- always replicate, delete oldest file
- always replicate, delete least valuable file
- economic model (with and without auctions)
[Figure: EDG Testbed Sites]
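The four file access patterns listed above can be sketched as index generators over a dataset of `n_files` files. This is an illustrative reading of the pattern names, not OptorSim's implementation: wrapping around at the ends of the dataset, the starting index, and the `sigma` step width are assumptions made for the example.

```python
import random


def sequential(n_files: int, n_requests: int) -> list[int]:
    """Files are requested in order, wrapping at the end of the dataset."""
    return [i % n_files for i in range(n_requests)]


def uniform_random(n_files: int, n_requests: int, rng: random.Random) -> list[int]:
    """Each request picks any file with equal probability."""
    return [rng.randrange(n_files) for _ in range(n_requests)]


def unitary_random_walk(n_files: int, n_requests: int, rng: random.Random,
                        start: int = 0) -> list[int]:
    """Each request moves one file forward or backward from the previous one."""
    position, requests = start, []
    for _ in range(n_requests):
        requests.append(position)
        position = (position + rng.choice((-1, 1))) % n_files
    return requests


def gaussian_random_walk(n_files: int, n_requests: int, rng: random.Random,
                         sigma: float = 2.0, start: int = 0) -> list[int]:
    """Each request moves by a Gaussian-distributed step, rounded to an integer."""
    position, requests = start, []
    for _ in range(n_requests):
        requests.append(position)
        position = (position + round(rng.gauss(0.0, sigma))) % n_files
    return requests


rng = random.Random(42)  # fixed seed so a run is repeatable
print(sequential(3, 5))  # [0, 1, 2, 0, 1]
print(unitary_random_walk(10, 8, rng))
print(gaussian_random_walk(10, 8, rng))
```

Each replication strategy can then be run against the same request sequence, which makes the comparison between strategies repeatable.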
Results for EDG Configuration
So far, experiments have been conducted with the EDG sites.
Any replication is better than none!
Without auctions, the economic model is at least 10% faster for sequential access patterns. The economic model is tuned for sequential access, so this is as expected.
Using the auction mechanism looks even more promising!
Building up Realism: GridPP
Previous tests used smaller-than-life-size files and datasets, with no background traffic.
We can get a more realistic picture of how the Grid will look for GridPP:
- 6 experiments, 22 sites
- predicted available CPUs and storage
- realistic file sizes (1 GB) and dataset sizes (1 TB)
- realistic number of jobs (~60 users)
- inclusion of background network traffic
Background Network in GridPP
Measurements of actual available bandwidth between various UK EDG sites.
[Plot: available bandwidth (Mbits/sec) per day, averaged over up to 3 months]
Iperf [1] data gathered from the e-science monitoring pages [2] and the GridNM monitoring service run by Yee-Ting Li, UCL [3].
- ~10–90% of bandwidth available, depending on link
- Diurnal variation is apparent but usually insignificant.
[1] http://dast.nlanr.net/Projects/Iperf/
[2] http://gridmon.ucs.ed.ac.uk/gridmon/
[3] http://www.hep.ucl.ac.uk/~ytl/monitoring/gridnm/gridnm-client.html
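One way such measurements could feed a simulation is as a 24-entry profile of available bandwidth per hour of the day, with each transfer timed against that profile. The sketch below is an assumption about how this might be done, not a description of OptorSim; the function name, the hourly granularity, and all bandwidth values are hypothetical.

```python
def transfer_time_s(size_mbit: float, start_s: float,
                    hourly_mbps: list[float]) -> float:
    """Seconds needed to move size_mbit megabits, starting start_s seconds
    after midnight, when hourly_mbps[h] is the available bandwidth in hour h."""
    if len(hourly_mbps) != 24 or min(hourly_mbps) <= 0:
        raise ValueError("need 24 positive hourly bandwidth values")
    remaining = float(size_mbit)
    t = float(start_s)
    while True:
        hour = int(t // 3600) % 24          # profile repeats every day
        rate = hourly_mbps[hour]
        hour_end = (t // 3600 + 1) * 3600   # end of the current hour
        capacity = rate * (hour_end - t)    # megabits movable before the hour ends
        if capacity >= remaining:
            return t + remaining / rate - start_s
        remaining -= capacity
        t = hour_end


# Illustrative profile: a link with 100 Mbit/s available all day.
flat = [100.0] * 24
print(transfer_time_s(8000.0, 0.0, flat))  # 80.0 (a 1 GB file is 8000 Mbit)

# Illustrative profile: the link is congested during the first hour of the day.
congested = [1.0] + [10.0] * 23
print(transfer_time_s(3700.0, 0.0, congested))  # 3610.0
```

With a nearly flat profile, which is what "diurnal variation is usually insignificant" suggests, the result reduces to file size divided by the mean available bandwidth.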
Results With Background Traffic
Preliminary results show that including background traffic increases the mean time per job by about 10%.
Awaiting further results…
The Future
- Run tests with planned GridPP resources and more realistic HEP use cases
- Get real file access patterns, e.g. SAM
- Further tuning of algorithms
- Integrate algorithms into EDG testbed
Available for download at http://grid-data-management.web.cern.ch/grid-data-management/optimisation/optor/
OptorSim DEMO!