An iRODS-based Distributed Data Management System for CyberSKA

Presentation transcript:

An iRODS-based Distributed Data Management System for CyberSKA Cameron Kiddle, Arne Grimstrup, Russ Taylor – University of Calgary Venkat Mahadevan, Erik Rosolowsky – University of British Columbia Okanagan Olivier Eymere – IBM Canada

CyberSKA: Creating the cyberinfrastructure to support what will be the largest radio telescope ever built: the Square Kilometre Array (SKA)
IDIA 2011 (Green Bank), May 2 - 6
[Image: Artist's impression of the core of the SKA. Image credit: SPDO / Swinburne Astronomy Productions]

CyberSKA
• Initiative to develop a scalable and distributed cyberinfrastructure platform to meet evolving science needs of current and future radio telescopes such as the SKA
• Led by the University of Calgary, currently with several partner institutions from North America
• Canadian funding for CyberSKA provided by CANARIE, as part of their Network Enabled Platforms (NEP) program, and by Cybera
• Starting by establishing cyberinfrastructure to support current large-scale astrophysical data needs generated by GALFACTS, PALFA and other high data volume SKA Pathfinder projects

CyberSKA High Level Architecture

CyberSKA Data Management System
• Based on iRODS (Integrated Rule-Oriented Data System)
  - Abstracts data location from user
  - Supports data replication and cross-site backups
  - Efficient transfer of data using multiple TCP streams
  - Rule Engine to automate various tasks
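To make the "abstracts data location from user" and "replication" points concrete, here is a minimal toy model in Python. It is not the iRODS API and not the CyberSKA implementation: the ReplicaCatalog class, its method names and the example path are all hypothetical, chosen only to show how a logical path can be resolved to whichever site currently holds a copy.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ReplicaCatalog:
    """Toy catalog: maps a logical path to the sites that hold a copy."""
    replicas: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, logical_path: str, site: str) -> None:
        # Record that `site` holds a copy; ignore duplicate registrations.
        sites = self.replicas.setdefault(logical_path, [])
        if site not in sites:
            sites.append(site)

    def replicate(self, logical_path: str, target_site: str) -> None:
        # A real system would copy the bytes first; here we only track location.
        if logical_path not in self.replicas:
            raise KeyError(f"unknown logical path: {logical_path}")
        self.register(logical_path, target_site)

    def resolve(self, logical_path: str, available: Set[str]) -> str:
        # Return the first registered site that is currently reachable.
        for site in self.replicas.get(logical_path, []):
            if site in available:
                return site
        raise LookupError(f"no available replica for {logical_path}")


# Hypothetical path; the site names are the two deployed sites from the slides.
catalog = ReplicaCatalog()
catalog.register("/cyberska/demo/cube.fits", "UofC")
catalog.replicate("/cyberska/demo/cube.fits", "UBCO")
```

The caller only ever names the logical path. If the UofC copy is unreachable, resolve() falls back to the UBCO replica without the user noticing a change.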

CyberSKA Storage Sites
• UBC
• UBCO*
• UofC*
• WestGrid
• McGill
* currently deployed

CyberSKA User Interface
• User interface provided via CyberSKA Portal, which is built on the Elgg open source social networking platform
• Distributed data management system interfaces directly into the Elgg File Module
• User interface no different than if a local Elgg filestore was being used

CyberSKA Authentication/Authorization
• User authenticates with portal via login/password
• Third party application authenticates with portal via OAuth (Portal exposes a RESTful API)
• Access permissions for files maintained by portal (in Elgg database)
  - Permissions can be set to private, logged in, contacts, public, contact list, group
• Portal authenticates with Data Management Service on behalf of user/application via OAuth (Data Management Service exposes a RESTful API)
  - User/application redirected to Data Management Service
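The six permission levels listed above can be read as a simple access check. The sketch below is an assumption about what each level means (for example, that "contacts" means the owner's contacts and that "contact list" and "group" both name an explicit set of members); it is not Elgg's actual access-control code, and FileEntry and can_read are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FileEntry:
    owner: str
    access: str  # one of the six levels named on the slide
    allowed: Set[str] = field(default_factory=set)  # members of a contact list or group


def can_read(entry: FileEntry, user: Optional[str], owner_contacts: Set[str]) -> bool:
    """Decide whether `user` (None = not logged in) may read the file."""
    if entry.access == "public":
        return True
    if user is None:
        return False  # every other level requires a logged-in user
    if user == entry.owner:
        return True  # assumed: the owner can always read their own file
    if entry.access == "private":
        return False
    if entry.access == "logged in":
        return True
    if entry.access == "contacts":
        return user in owner_contacts
    if entry.access in ("contact list", "group"):
        return user in entry.allowed
    raise ValueError(f"unknown access level: {entry.access}")
```

For example, a file shared with "contacts" is readable by a user in the owner's contact set but not by an anonymous visitor.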

CyberSKA Metadata
• Title, description, tags, file permissions, iRODS file handle for files of all types stored in portal (Elgg) database
• Extra metadata for certain file types (currently just FITS files) stored in separate metadata database associated with the data management system
  - Spatially enabled PostgreSQL/PgSphere database
  - Various metadata from FITS file extracted upon ingestion
  - Metadata schema based on IVOA Resource Metadata recommendations
• Data Query Service enables spatial, temporal and spectral queries
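As an illustration of what a combined spatial, temporal and spectral query involves, here is a minimal in-memory sketch in Python. The deployed service runs such queries against the PostgreSQL/PgSphere database; this sketch does not reproduce its schema or API. The FitsRecord fields, the query function and its parameters are hypothetical, and the spatial test is a plain cone search using the haversine formula.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FitsRecord:
    name: str
    ra_deg: float    # right ascension of the field centre
    dec_deg: float   # declination of the field centre
    obs_year: float  # observation epoch, simplified to a decimal year
    freq_mhz: float  # central frequency


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation between two sky positions, in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def query(records: List[FitsRecord], ra: float, dec: float, radius_deg: float,
          years: Tuple[float, float], freqs_mhz: Tuple[float, float]) -> List[str]:
    """Names of records inside the cone, the time range and the frequency range."""
    return [
        r.name for r in records
        if angular_separation_deg(ra, dec, r.ra_deg, r.dec_deg) <= radius_deg
        and years[0] <= r.obs_year <= years[1]
        and freqs_mhz[0] <= r.freq_mhz <= freqs_mhz[1]
    ]
```

Each of the three conditions narrows the result independently, so a record must match the sky region, the observation period and the frequency band to be returned.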

CyberSKA File Upload / Download
• Simple file upload/download
  - Use normal Elgg Upload/Download mechanisms
    - Single file at a time
    - Suitable for smaller uploads/downloads
• Advanced upload/download tools, for more efficient, reliable transfers of larger files
  - Use JUpload for uploads
    - Java applet
    - Supports HTTP and FTP
    - Allows for multi-file upload
  - Use CADC (Canadian Astronomy Data Centre) Download Manager for downloads
    - Java Web Start based
    - Supports HTTP

CyberSKA Status and Next Steps
• Status: Prototype environment currently deployed at two sites (UofC, UBCO)
• Next Steps:
  - Complete testing
  - Integrate with production portal
  - Set up storage nodes at other participating sites
  - Add support for Measurement Sets in metadata database
  - Add IVOA compliant interface

CyberSKA Contact Information
Portal: