
1 Dynamic Data Grid Replication Strategy based on Internet Hierarchy
Sang Min Park, Jai-Hoon Kim, and Young-Bae Ko
Ajou University, South Korea
GCC 2003 Presentation

2 Contents
- Introduction to Data Grid
- Optimizations in Data Grid
- Novel Replication Strategy based on Internet Hierarchy
- Simulation
- Simulation Results
- Conclusions

3 Introduction to Data Grid
Data Grid Motivations
- Petabyte-scale data production
- Distributed data storage, with each site storing part of the data
- Distributed computing resources that process the data
Two Most Important Approaches for Data Grid
- Secure, reliable, and efficient data transport protocol (e.g., GridFTP)
- Replication (e.g., Replica catalog)
Replication
- Large files are partially replicated among sites
- Reduces data access time
- Application scheduling and dynamic replication issues are emerging

4 Introduction to Data Grid
[Figure: Typical Job Execution Scenario]

5 Optimizations in Data Grid
Goal: reducing the overall job execution time
Scheduling Optimization
- Deciding where to allocate the job
- Considers the location of replicas and the computational capabilities of sites
Short-term Optimization
- Deciding from where to fetch replicas
- Considers the available network bandwidth between sites
Long-term Optimization (Dynamic Replication Strategy)
- Applies when a site runs short of storage
- Deciding which files should remain as replicas
- Better to keep popular files, because they are likely to be used again
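
The short-term optimization step can be illustrated with a minimal sketch. It assumes the requesting site already knows which sites hold a replica and has an estimate of the available bandwidth to each; the function name and the dictionary layout are hypothetical and are not taken from the slides or from OptorSim.

```python
def pick_replica_source(replica_sites, bandwidth_mbps):
    """Pick the site to fetch a replica from (illustrative helper, not from the slides).

    replica_sites: iterable of site names that hold a replica of the file.
    bandwidth_mbps: dict mapping site name -> estimated available bandwidth
                    (Mbps) between that site and the requesting site.
    Returns the site with the highest available bandwidth, or None if no
    replica holder is reachable.
    """
    # Sort first so that ties are broken deterministically by site name.
    reachable = sorted(s for s in replica_sites if bandwidth_mbps.get(s, 0) > 0)
    if not reachable:
        return None
    return max(reachable, key=lambda s: bandwidth_mbps[s])


# Example: a replica exists at two sites; the better-connected one wins.
source = pick_replica_source(["siteA", "siteB"], {"siteA": 10, "siteB": 1000})
```

A real replica optimizer would also have to handle stale bandwidth estimates and concurrent transfers, which this sketch ignores.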

6 Existing Dynamic Replication Strategies
Replica Optimization based on Site-level Locality
- Replicate the files that are predicted to be used in the future from the perspective of a site
- Try to reduce the number of fetches
- Delete Oldest and Delete LRU methods
Economic Strategy from European Data Grid
- Developing OptorSim, a Data Grid optimization simulator
- Uses an auction protocol to trigger long-term optimization
- Site-level locality based on file access patterns
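
As a rough illustration of site-level replacement, the Delete LRU method can be sketched as follows. This is a generic least-recently-used store, not OptorSim's implementation; the class name is hypothetical, and capacity is counted in files on the assumption that all files have the same size.

```python
from collections import OrderedDict


class LruReplicaStore:
    """Site-local replica store with least-recently-used eviction (sketch)."""

    def __init__(self, capacity_files):
        self.capacity = capacity_files
        self.files = OrderedDict()  # least recently used entry comes first

    def access(self, name):
        """Return True on a local hit; on a miss, store the file and return False."""
        if name in self.files:
            self.files.move_to_end(name)  # mark as most recently used
            return True
        if len(self.files) >= self.capacity:
            self.files.popitem(last=False)  # evict the least recently used file
        self.files[name] = True  # simulated fetch from a remote site
        return False


store = LruReplicaStore(capacity_files=2)
hits = [store.access(f) for f in ["a", "b", "a", "c"]]
```

In this run only the third access is a hit, and fetching "c" evicts "b" because "a" was used more recently. Delete Oldest would differ only in skipping the `move_to_end` call, so that eviction follows arrival order instead of access order.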

7 Existing Dynamic Replication Strategies
The Limitations of Site-level Optimization
- A site has limited storage, so the degree of request locality it can capture is also limited
- It depends on predictable file access patterns, but we do not know whether such patterns will exist

8 Replication Strategy based on Bandwidth Hierarchy (BHR)
Network-level Locality
- A site is not the only possible source of locality
- Another source of locality: network-level locality
- If the replica is located at a nearby site, fetching it does not take long
[Figure labels: Fast Replica Transmission; Slow Replica Transmission; Network Region (e.g., a country)]

9 Replication Strategy based on Bandwidth Hierarchy (BHR)
[Figure: Bandwidth Hierarchy]

10 Replication Strategy based on Bandwidth Hierarchy (BHR)
Maximizing Network-level Locality
1. Avoiding replica duplication in a region
2. Considering popularity of file requests at the region level
[Figure labels: A Region; a Site; Receiving New Replica; "No space here! We should remove some file"; "Replica X is duplicated here!"; "Delete this one!"]
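
The two rules on this slide could be combined into a replacement decision along the following lines. This is only a sketch under stated assumptions: the slide does not give the exact algorithm, so the ordering of the checks, the function name, and the data structures are hypothetical, and all files are assumed to be the same size.

```python
def bhr_select_victims(new_file, local_files, region_copies, region_hits, slots_needed=1):
    """Choose local files to delete so that new_file can be stored (sketch).

    local_files:   set of file names stored at this (full) site.
    region_copies: dict file -> number of copies held by OTHER sites in the region.
    region_hits:   dict file -> request count aggregated over the whole region.
    Returns a list of files to delete, or None if new_file should not be
    replicated here.
    """
    # Rule 1: do not create a duplicate inside the region.
    if region_copies.get(new_file, 0) > 0:
        return None

    def by_popularity(f):
        return (region_hits.get(f, 0), f)  # file name breaks ties deterministically

    # Prefer deleting files that another site in the region already holds.
    duplicated = sorted((f for f in local_files if region_copies.get(f, 0) > 0),
                        key=by_popularity)
    victims = duplicated[:slots_needed]

    # Rule 2: otherwise delete only files less popular in the region than new_file.
    if len(victims) < slots_needed:
        new_popularity = region_hits.get(new_file, 0)
        unique = sorted((f for f in local_files
                         if region_copies.get(f, 0) == 0
                         and region_hits.get(f, 0) < new_popularity),
                        key=by_popularity)
        victims += unique[:slots_needed - len(victims)]

    return victims if len(victims) == slots_needed else None


# File "B" is duplicated elsewhere in the region, so it is deleted first,
# even though it is the most popular file at this site.
victims = bhr_select_victims("X", {"A", "B", "C"}, {"B": 1},
                             {"A": 5, "B": 9, "C": 2, "X": 4})
```

Deleting a duplicated file costs little because a copy stays reachable over the fast intra-region links, which is the network-level locality the strategy relies on.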

11 Simulation
OptorSim
- Data Grid dynamic replication simulation tool
- Developed as part of the European Data Grid Project
- Implemented in Java
- We implemented our own region-based optimizer in OptorSim

12 Simulation
[Figure: Simulation Environment]

13 Simulations

General configuration of parameters
Parameter                           Value
Number of jobs                      1000
Number of job types                 50
Number of files accessed per job    15
Size of single file                 1 GB
Total size of files                 750 GB

Bandwidth and Storage Size
Parameter                           Value
Intra-region bandwidth              1000 Mbps
Inter-region bandwidth              1000 Mbps
Master-router bandwidth             2000 Mbps
Storage space at site               50 GB
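
A short calculation helps relate these parameters to each other. The sketch below uses only the values listed in the tables; it assumes that each job type reads its own distinct set of files, that units are decimal (1 GB = 8000 megabits), and that a transfer gets the full link bandwidth with no protocol overhead. None of these assumptions is stated on the slide.

```python
# Values copied from the parameter tables on this slide.
NUM_JOB_TYPES = 50
FILES_PER_JOB = 15
FILE_SIZE_GB = 1
SITE_STORAGE_GB = 50
LINK_MBPS = {"intra-region": 1000, "inter-region": 1000, "master-router": 2000}


def transfer_seconds(size_gb, mbps):
    """Ideal transfer time; assumes 1 GB = 8000 megabits and no contention."""
    return size_gb * 8000 / mbps


# Assumption: job types do not share files, so the data set is the product.
total_gb = NUM_JOB_TYPES * FILES_PER_JOB * FILE_SIZE_GB

# How much of the data set a single site can hold.
files_per_site = SITE_STORAGE_GB // FILE_SIZE_GB
storable_fraction = SITE_STORAGE_GB / total_gb

# Ideal time to move one file over each kind of link.
fetch_times = {link: transfer_seconds(FILE_SIZE_GB, mbps)
               for link, mbps in LINK_MBPS.items()}

print(total_gb, files_per_site, round(storable_fraction, 3), fetch_times)
```

Under these assumptions the product matches the 750 GB total in the table, and a single site can hold 50 of the 750 files, or one fifteenth of the data set.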

14 Simulation Results
[Figure: Total job times of three strategies]

15 Simulation Results
[Figure: Total job time with varying bandwidth and storage size]

16 Conclusions
- The existing dynamic replication strategies are based only on site-level locality of file requests
- The BHR strategy is based on network-level locality
- BHR performs well when the bandwidth hierarchy is clearly apparent and the storage size at a site is small
- We extend current site-level replica optimization studies in a more scalable direction


