Rafał Słota, Michał Wrzeszcz, Renata G. Słota, Łukasz Dutka, Jacek Kitowski ACC Cyfronet AGH Department of Computer Science, AGH - UST CGW 2015 Kraków,

1 Efficient Storing of Metadata for Distributed Data Management. Rafał Słota, Michał Wrzeszcz, Renata G. Słota, Łukasz Dutka, Jacek Kitowski. ACC Cyfronet AGH; Department of Computer Science, AGH - UST. CGW 2015, Kraków, Poland, October 26-28, 2015.

2 Agenda: Distributed data management in global environment; onedata system's description; Data and metadata organization; Metadata challenges in onedata; Analyzed solutions; Proposed solution; Performance tests; Conclusions.

3 28.10.15 Distributed Data Management in Global Environment. Managing data across different storage solutions in globally dispersed environments is a hot topic. Global data management challenges are being investigated by many research and commercial groups.

4 Onedata: overall description. Onedata is a distributed data management system that virtualizes access to organizationally distributed data and hides the complexity of an environment in which there is no trust between resource providers. Data and metadata organization is key to providing: an easy view of the data for each user; automatic data management for better efficiency.

5 Onedata: work in a distributed environment. Direct access whenever possible; management of block replicas to minimize delays; caching, prefetching and fast parallel transport.

6 Data organization. Diagram elements: Spaces, Logical files, Providers, Storages, Users, Groups. Organizing logical files via spaces shields users from the problems of managing resources and data locations.

7 Results of data organization design: easy management and sharing of data for users; a limit on the metadata that each provider stores and processes.

8 Metadata organization. 3 levels of metadata describe data organization and usage: 1. Metadata used to coordinate providers' cooperation; 2. File metadata stored by each provider; 3. Current usage metadata. Usage optimization: lower level -> more frequent usage -> higher distribution.

9 Metadata challenges in onedata. Storing metadata is too slow when all metadata is kept on disk. There is a risk of losing important metadata when it is saved only in memory. Examples: metadata that describes the location of the actual data file has to be persistent; metadata that describes how files are used by current sessions should be available at most as long as the session is active, and must be available extremely fast.
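The two examples on this slide amount to a per-type durability rule. The sketch below only illustrates that idea; the names `Durability`, `MetadataKind`, `FILE_LOCATION` and `SESSION_USAGE` are hypothetical and are not taken from onedata.

```python
from dataclasses import dataclass
from enum import Enum


class Durability(Enum):
    PERSISTENT = "persistent"      # must survive restarts, so it is written to disk
    SESSION_ONLY = "session_only"  # kept in memory and dropped with the session


@dataclass(frozen=True)
class MetadataKind:
    name: str
    durability: Durability


# Hypothetical metadata kinds mirroring the two examples on the slide.
FILE_LOCATION = MetadataKind("file_location", Durability.PERSISTENT)
SESSION_USAGE = MetadataKind("session_usage", Durability.SESSION_ONLY)


def must_hit_disk(kind: MetadataKind) -> bool:
    """Return True when losing this metadata on a crash is unacceptable."""
    return kind.durability is Durability.PERSISTENT
```

Under this assumption, only `FILE_LOCATION` pays the cost of a disk write, while `SESSION_USAGE` can stay on the fast in-memory path.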

10 Analysed solutions. Various solutions: in-memory vs. persistent databases; standalone vs. built-in applications. Examples: Mnesia, Redis, Riak, Couchbase, Cassandra. No solution offers all 3 features: safety; high throughput (many operations per second); low delay.

11 Proposed solution - datastore. Models: API that defines how specific types of metadata should be stored (e.g. in global memory). Stores: elements where data is kept. Worker with API: set of functionalities for data access optimization.
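One way to picture the relation between models and stores is sketched below. This is not the onedata datastore API; the classes `Store`, `Model` and `Datastore`, the `store_level` field and the level names are assumptions made for illustration, with a plain dictionary standing in for a real backend.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


class Store:
    """An element where data is kept (a dict stands in for a real backend)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[str, Any] = {}

    def save(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)


@dataclass(frozen=True)
class Model:
    """Declares how one type of metadata should be stored."""
    name: str
    store_level: str  # hypothetical level name, e.g. "global_memory" or "disk"


class Datastore:
    """Routes each operation to the store named in the model's declaration."""

    def __init__(self, stores: Dict[str, Store]) -> None:
        self._stores = stores

    def save(self, model: Model, key: str, value: Any) -> None:
        # Prefix the key with the model name so models do not collide.
        self._stores[model.store_level].save(f"{model.name}:{key}", value)

    def get(self, model: Model, key: str) -> Optional[Any]:
        return self._stores[model.store_level].get(f"{model.name}:{key}")
```

With this layout, callers work only with a model and a key; moving a model to a different store is a one-line change in its declaration.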

12 Datastore key features and examples. Dynamic cache system: the datastore allows one store to be set as a cache for another; reads and writes are done on the cache; writes are aggregated and performed asynchronously; data is dynamically loaded into and unloaded from the cache when needed. Hooks for model cooperation: separation of models; easy reaction to other models' actions. Exemplary models: file_meta, session, task_pool.
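The cache behaviour listed above resembles a write-back cache with write aggregation. The following is a minimal sketch under that assumption, not the datastore's actual implementation: `WriteBackCache` and its methods are hypothetical names, the backing store is a plain dictionary, and `flush` is called explicitly here, whereas a real system would run it from a background task.

```python
from typing import Any, Callable, Dict, List


class WriteBackCache:
    """Cache in front of a slower backing store, with aggregated writes."""

    def __init__(self, backing: Dict[str, Any]) -> None:
        self._backing = backing
        self._cache: Dict[str, Any] = {}
        self._dirty: Dict[str, Any] = {}  # pending writes, one entry per key
        self._hooks: List[Callable[[str, Any], None]] = []
        self.backing_writes = 0  # counts writes that reached the backing store

    def add_hook(self, hook: Callable[[str, Any], None]) -> None:
        """Register a callback so other models can react to writes."""
        self._hooks.append(hook)

    def write(self, key: str, value: Any) -> None:
        self._cache[key] = value
        self._dirty[key] = value  # a later write to the same key replaces this one
        for hook in self._hooks:
            hook(key, value)

    def read(self, key: str) -> Any:
        if key not in self._cache:
            # Load on demand; raises KeyError if the key exists nowhere.
            self._cache[key] = self._backing[key]
        return self._cache[key]

    def flush(self) -> int:
        """Push pending writes to the backing store; return how many were sent."""
        flushed = len(self._dirty)
        for key, value in self._dirty.items():
            self._backing[key] = value
            self.backing_writes += 1
        self._dirty.clear()
        return flushed

    def unload(self, key: str) -> None:
        """Drop a key from the cache; refuse if it still has an unflushed write."""
        if key in self._dirty:
            raise ValueError(f"{key!r} has unflushed changes")
        self._cache.pop(key, None)
```

Because pending writes are keyed by document, repeated updates to one key between flushes cost a single write to the backing store, which is where the throughput gain comes from. The price is that anything still pending is lost if the process dies before the next flush.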

13 Performance tests. Speed vs. risk of metadata loss. Cache as a compromise.

14 Conclusions. For systems that globalize data access, efficient metadata management is a key element. The proposed datastore provides a flexible, efficient and safe solution for storing metadata. The proposed datastore allows onedata to provide data access in a globally distributed environment.

15 Thank you. onedata homepage: http://www.onedata.org
See also:
Łukasz Dutka, Michał Wrzeszcz, Tomasz Lichoń, Rafał Słota, Konrad Zemek, Krzysztof Trzepla, Łukasz Opioła, Renata Słota, and Jacek Kitowski. Onedata - a Step Forward towards Globalization of Data Access for Computing Infrastructures. ICCS 2015 Computational Science at the Gates of Nature, Procedia Computer Science, volume 51, pages 2843–2847, 2015.
M. Wrzeszcz, T. Lichoń, R. Słota, K. Zemek, K. Trzepla, Ł. Opioła, D. Nikolow, Ł. Dutka, R. Słota, and J. Kitowski. Metadata Organization and Management for Globalization of Data Access with onedata. PPAM 2015: book of abstracts, 2015, pp. 31.
Michał Wrzeszcz, Łukasz Dutka, Renata Słota, and Jacek Kitowski. VeilFS - A new face of Storage as a Service. In eChallenges e-2014, 2014 Conference, pages 1–10, Oct 2014.
Łukasz Dutka, Renata Słota, Michał Wrzeszcz, Dariusz Król, and Jacek Kitowski. Uniform and Efficient Access to Data in Organizationally Distributed Environments. eScience on Distributed Computing Infrastructure, volume 8500 of Lecture Notes in Computer Science, pages 178–194. Springer International Publishing, 2014.
Słota, R., Dutka, Ł., Wrzeszcz, M., Kryza, B., Nikolow, D., Król, D., Kitowski, J.: Storage Management Systems for Organizationally Distributed Environments - PLGrid PLUS Case Study. Lecture Notes in Computer Science, Vol. 8384, 2014, pp. 724–733.

