Published by Adolfo Atlee
Modified over 3 years ago
© 2012 IBM Corporation 1 ENSURE: Enabling kNowledge Sustainability, Usability and Recovery for Economic value Presenter: Michael Factor firstname.lastname@example.org The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 270000
© 2011 IBM Corporation 2 Enabling kNowledge Sustainability, Usability and Recovery for Economic value
INNOVATIONS
–EVALUATE: Cost and Value
–AUTOMATE: Preservation Lifecycle
–SCALE: using ICT innovations
–PROTECT: Content-aware data protection
USE CASES
–Healthcare
–Clinical Studies
–Financial Services
A 3-year IP project started Feb 2011
www.ensure-fp7.eu
© 2011 IBM Corporation 3 ENSURE: Key Technical Innovations
[Diagram: the four innovation areas (Evaluate, Automate, Scale, Protect) with the labels Requirements, Access, Deploy, External Events, Flow Events, Ontology, Cost, Value, Quality, Cloud, Virtual appliance, Anonymization]
© 2012 IBM Corporation 5 Evaluate Cost and Value – Input; Evaluate Cost and Value – Output
© 2012 IBM Corporation 6 Evaluate Cost and Value – Process
[Diagram components: Configurator, Economic Performance Engine, Preservation Plan Optimizer, Translation Rules, Quality Engine, Cost/risk Engine, Data Repositories, Configuration Selection, Administrator, Requirements, (Re)Deploy Solution, ENSURE Automate]
© 2012 IBM Corporation 7 Evaluate cost and value: Preservation Plan Optimizer
–A genetic algorithm generates candidate results based upon the engines
–The user chooses a solution from the Pareto frontier
–On the frontier, no dimension can be improved without degrading at least one other dimension
–Shown as Quality vs. Cost, but really n-dimensional
[Diagram labels: COE, QOE; axes: Quality, Cost]
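The Pareto-frontier selection described on this slide can be illustrated with a minimal sketch. It covers only the final filtering step in two dimensions (cost, where lower is better, and quality, where higher is better). It does not implement the genetic algorithm or the ENSURE engines. The `Plan` type, the plan names and all numbers are hypothetical.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    name: str       # hypothetical plan label
    cost: float     # lower is better
    quality: float  # higher is better


def dominates(a: Plan, b: Plan) -> bool:
    """True if a is no worse than b in both dimensions and strictly better in one."""
    no_worse = a.cost <= b.cost and a.quality >= b.quality
    strictly_better = a.cost < b.cost or a.quality > b.quality
    return no_worse and strictly_better


def pareto_frontier(plans: list[Plan]) -> list[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Made-up candidate plans, standing in for optimizer output.
candidates = [
    Plan("A", 100.0, 0.90),
    Plan("B", 60.0, 0.70),
    Plan("C", 80.0, 0.65),   # dominated by B (cheaper and better)
    Plan("D", 40.0, 0.50),
    Plan("E", 120.0, 0.90),  # dominated by A (same quality, cheaper)
]
frontier = pareto_frontier(candidates)
print([p.name for p in frontier])
```

Every plan left in `frontier` is a trade-off: moving to another frontier plan improves one dimension only by giving up some of the other, which is why the final choice is left to the user.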
© 2012 IBM Corporation 9 Automate Preservation Lifecycle: Preservation Data Aware Lifecycle Management (PDALM) Workflow Engine
PDALM: Controls system activities
–Manage workflow of the information being preserved
–Execute preservation plan (built by the Configurator)
–Handle notifications and interaction with the administrator
Example: Workflow for ingest
© 2012 IBM Corporation 10 Automate Preservation Lifecycle: Event engine
Event Engine
–Listens for preservation-related events
–Manages concurrency, priority and impact/severity of events
–Notifies relevant ENSURE components
[Diagram: event feeds (Monitored system behavior, Economic, Data/format, Regulatory, Standards) into the Event Engine; notified components: Configurator, PDALM, Scale]
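The listen, prioritize and notify behavior of the Event Engine can be sketched as a small priority queue with subscribers. This is an illustration only, not the ENSURE implementation. The `EventEngine` class, its method names and the numeric severity scale are assumptions, and concurrency is not modeled.

```python
import heapq
import itertools
from collections import defaultdict


class EventEngine:
    """Toy event engine: queue events by severity, then notify subscribers."""

    def __init__(self):
        self._queue = []                     # heap of pending events
        self._counter = itertools.count()    # tie-breaker: FIFO among equal severity
        self._listeners = defaultdict(list)  # event kind -> callbacks

    def subscribe(self, kind, callback):
        """Register a component callback for one kind of event."""
        self._listeners[kind].append(callback)

    def publish(self, kind, severity, payload):
        """Queue an event; higher severity is dispatched first."""
        # heapq is a min-heap, so the severity is negated.
        heapq.heappush(self._queue, (-severity, next(self._counter), kind, payload))

    def dispatch_all(self):
        """Deliver every queued event to the components subscribed to its kind."""
        while self._queue:
            _, _, kind, payload = heapq.heappop(self._queue)
            for callback in self._listeners[kind]:
                callback(payload)


# Hypothetical wiring: PDALM cares about regulatory events,
# the Configurator about data/format events.
log = []
engine = EventEngine()
engine.subscribe("regulatory", lambda p: log.append(("PDALM", p)))
engine.subscribe("format", lambda p: log.append(("Configurator", p)))
engine.publish("format", 2, "format X deprecated")
engine.publish("regulatory", 5, "retention period changed")
engine.dispatch_all()
print(log)
```

The regulatory event is delivered first even though it was published second, because it carries the higher severity.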
© 2012 IBM Corporation 11 Automate preservation lifecycle: ontology update
–Select ontology to update
–Upload a new version and display potential system impacts
–Apply new ontology and update system
© 2012 IBM Corporation 13 Scale: What is a cloud, why is it interesting, and what are the issues?
“Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources … that can be rapidly provisioned and released with minimal management effort or service provider interaction.” –US National Institute of Standards and Technology, Information Technology Laboratory
Benefits
–Cost savings: economies of scale, utilization improvement and standardization
–Speed and agility
–Pay-as-you-go for usage
Issues for preservation
–Rich metadata support is lacking, e.g., no search
–Differences in security models
–Encryption may limit preservation actions
–Compute near the storage (storlets)
–Logical connections among objects in the same and different clouds
–Standards
[Diagram: Cloud Delivery Models: Private Cloud (Enterprise Data Center); Community Cloud Services (Enterprise A, B, C); Public Cloud Services (User A, B, C, D, E)]
© 2012 IBM Corporation 14 Scale: Mapping Data to Multiple Clouds
–Map OAIS AIPs and the links among AIPs to the cloud data model
–Manage objects’ inter-relationships and referential integrity
–Map objects to one or more clouds
[Diagram labels: Cloud A, Cloud B, Protect]
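The idea of mapping AIPs and their links onto one or more clouds while keeping referential integrity can be sketched as follows. This is a minimal illustration, not the ENSURE data model. The placement dictionary, the AIP identifiers and the cloud names are all hypothetical.

```python
def dangling_links(placement, links):
    """Links whose source or target AIP is not stored in any cloud."""
    return [(src, dst) for src, dst in links
            if not placement.get(src) or not placement.get(dst)]


def cross_cloud_links(placement, links):
    """Links between AIPs that share no cloud, so following them spans clouds."""
    return [(src, dst) for src, dst in links
            if placement.get(src) and placement.get(dst)
            and placement[src].isdisjoint(placement[dst])]


# Hypothetical placement: AIP identifier -> set of clouds holding a copy.
placement = {
    "aip-1": {"cloud-a"},
    "aip-2": {"cloud-a", "cloud-b"},
    "aip-3": {"cloud-b"},
}
# Hypothetical links among AIPs (source, target).
links = [
    ("aip-1", "aip-2"),
    ("aip-2", "aip-3"),
    ("aip-1", "aip-3"),  # source and target live in different clouds
    ("aip-3", "aip-4"),  # aip-4 was never placed: a referential-integrity error
]
broken = dangling_links(placement, links)
spanning = cross_cloud_links(placement, links)
print(broken, spanning)
```

A check like `dangling_links` would be run after any change to the placement, while `cross_cloud_links` identifies the relationships that need cross-cloud resolution.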
© 2012 IBM Corporation 15 Scale: Accessing Content with a Virtual Appliance (VA)
–Request to access content with VA
–Instantiate VA
–Extract content into VA
–Give user access to VA with content
[Diagram labels: ENSURE, Compute Cloud, Private Application Library, Storage Cloud]
© 2012 IBM Corporation 17 Content-aware data protection: Masked/Anonymized Data
Data Owner Requirement:
–Data should be anonymized so that it cannot be associated with a specific individual
Example:
–Living people from London who fought in WWII are becoming more and more identifiable
[Diagram: Data Owners (hospital, bank, factory, Telco) send full data to Masking Services, which send masked data to Data Receivers (Medical Research, Software testing, Statistical Analysis, Pharma Research)]
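The WWII example shows why masking has to be content-aware: as a group shrinks, the same attributes single out fewer people. One common way to quantify this, which the slides do not name, is a k-anonymity style check that counts how many records share the same quasi-identifier values. The sketch below uses that technique as an assumption. The records, field names and the decade-wide generalization are hypothetical.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing all quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalize_birth_year(record, bucket=10):
    """Mask the exact birth year by rounding it down to the start of its decade."""
    masked = dict(record)
    masked["birth_year"] = (record["birth_year"] // bucket) * bucket
    return masked


# Made-up records for illustration only.
records = [
    {"city": "London", "birth_year": 1921},
    {"city": "London", "birth_year": 1923},
    {"city": "London", "birth_year": 1925},
    {"city": "London", "birth_year": 1921},
]
qi = ("city", "birth_year")
before = smallest_group(records, qi)
after = smallest_group([generalize_birth_year(r) for r in records], qi)
print(before, after)
```

A smallest group of size 1 means at least one person is uniquely identifiable from the released fields. After generalization every record falls into a group of 4. As people in the data set die, the groups shrink again, so the masking has to be re-evaluated over time.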
© 2012 IBM Corporation 18 Summary
Architect and build the next generation preservation system, ensuring knowledge is sustained and can be recovered for future value
Key Innovations:
–Evaluate Cost and Value supporting business decisions
–Automate Preservation Lifecycle
–Scale using ICT innovations
–Content-aware data protection
Three use cases to demonstrate future preservation
–Healthcare, clinical trials, and finance
Status
–Initial end-to-end demo of two use cases in the first year
–Emphasis on evolution over time for the second year
www.ensure-fp7.eu
© 2012 IBM Corporation 19 Thank You
© 2012 IBM Corporation 20 Backup
© 2012 IBM Corporation 21 Open Archival Information System (OAIS) ISO:14721:2002
[Diagrams: Functional Model; Information Model; Archival Information Package]
–SIP = Submission Information Package
–AIP = Archival Information Package
–DIP = Dissemination Information Package
© 2012 IBM Corporation 22 What is a cloud and why is it interesting?
“Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” –US National Institute of Standards and Technology, Information Technology Laboratory
Key features:
–On-demand
–Shared
–Automated
–Network access
Benefits
–Cost savings: economies of scale, utilization improvement and standardization
–Speed and agility
–Pay-as-you-go for usage
[Chart: Investment per GB vs. Quantity of Information]
© 2012 IBM Corporation 23 Security, privacy, lack of control in data placement, lock-in and compliance are key concerns with cloud
[Chart: How Much of a Concern are the Following Business Risks Posed by Cloud Business Services to your Business Function, Compared to Your Existing Risks for Non-Cloud Business Services?]
Source: “Cloud will Transform Business as We Know It: The Secret’s in the Source”, HfS Research and the London School of Economics, December 2010
© 2012 IBM Corporation 24 IBM’s five cloud delivery models
Models shown: Private Cloud (Enterprise Data Center); Managed Private Cloud (Enterprise Data Center, IBM operated); Hosted Private Cloud (IBM owned and operated); Shared Cloud Services (Enterprise A, B, C); Public Cloud Services (User A, B, C, D, E)
Characteristics listed on the slide, in the groups in which they appear:
–Enterprise owned; either enterprise operation or 3rd party; fixed price or time and materials services; internal network; dedicated assets
–3rd party owned and operated; centralized, secure delivery center; fixed price, time and materials, or pay as you go; internal network; dedicated assets
–Mix of shared and dedicated resources; shared facility and staff; pay as you go; VPN access or public internet
–Shared resources; elastic scaling; pay as you go; public internet
Community Clouds should be considered by memory institutions
© 2012 IBM Corporation 25 Scale: Cloud Gap Analysis
Clouds considered
–Amazon S3 and EC2 (enterprise)
–OpenStack Swift and Nova (open source)
–VISION Cloud (EC research)
Some common shortcomings for long-term preservation
–Limited support of user metadata
–Lack of support for searches on metadata
–Differences in supported security models
–Encryption models limit preservation actions
–Lack of support for compute near the storage
–Lack of support for logical connections among objects in the same and different clouds
© 2012 IBM Corporation 26 Scale: Computational Storage
Cloud storage generally:
–Uses server-based storage with powerful CPUs
–Serves big data accessed from anywhere over the WAN
–==> add computational modules (storlets) to the cloud storage
What is a storlet?
–A restricted module executed in the storage, close to the data
Why/when use storlets?
–Reduce bandwidth
–Security: reduce exposure of private data
–Preservation: data in storage may change and be more up-to-date
–Expose generic functions that can be used by many applications
Example storlets:
–Transformation
–Anonymization
–Data mining
–Fixity check
–Encryption/secure delete
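The fixity-check storlet listed above can be illustrated with a toy model: the check runs where the data lives and only a small result crosses the network. This is a sketch under stated assumptions and does not use any real storlet API. The `ObjectStore` class and its `register_storlet` and `invoke` methods are hypothetical stand-ins for a cloud object store.

```python
import hashlib


class ObjectStore:
    """Toy in-memory stand-in for a cloud object store (hypothetical API)."""

    def __init__(self):
        self._objects = {}
        self._storlets = {}

    def put(self, key, data: bytes):
        self._objects[key] = data

    def register_storlet(self, name, func):
        """Install a restricted function that will run next to the data."""
        self._storlets[name] = func

    def invoke(self, name, key, *args):
        """Run a storlet on a stored object; only its result leaves the store."""
        return self._storlets[name](self._objects[key], *args)


def fixity_storlet(data: bytes, expected_sha256: str) -> bool:
    """Recompute the object's SHA-256 digest and compare it with the recorded one."""
    return hashlib.sha256(data).hexdigest() == expected_sha256


store = ObjectStore()
content = b"archival information package"
store.put("aip-1", content)
store.register_storlet("fixity", fixity_storlet)

recorded = hashlib.sha256(content).hexdigest()    # digest recorded at ingest
ok = store.invoke("fixity", "aip-1", recorded)    # object is intact
bad = store.invoke("fixity", "aip-1", "0" * 64)   # mismatching digest
print(ok, bad)
```

The caller receives a single boolean rather than the object itself, which is where the bandwidth and data-exposure benefits listed on the slide come from.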
© 2012 IBM Corporation 27 Scale: Use of Open Standards and Open Source
–jClouds (open source) to access multiple clouds
–Cloud Data Management Interface (CDMI) (standard interface) for cloud access and management
–Contribute CDMI support to jClouds
–OpenStack Swift (open source) as private cloud infrastructure
© 2012 IBM Corporation 28 Content-aware data protection: Vocabulary of an Access Policy
–Who are the actors (doctor, nurse, gynecologist,...)
–What are the actions they can take (create, read, append, update,...)
–What are the data objects that are subject to access policies (PHR, GI,
–What are the purposes for which access is given (treatment, research, billing,...)
–What are the types of conditions mentioned in the access rules (time, place, consent,...)
–What types of obligations must be fulfilled before access is granted (external: notify, consent,...; data-related: anonymize,...)
Actor has permission to take action on data object for the purpose under the conditions with obligations.
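The rule template on this slide (actor, action, data object, purpose, conditions, obligations) maps naturally onto a small data structure. The sketch below is one possible encoding, not the ENSURE policy language. The `Rule` class, the `evaluate` function and the example rules are hypothetical, and the vocabulary values are taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    actor: str
    action: str
    data_object: str
    purpose: str
    conditions: tuple = ()   # predicates over a context dict; all must hold
    obligations: tuple = ()  # obligations to fulfil before access is granted


def evaluate(rules, actor, action, data_object, purpose, context):
    """Return (permitted, obligations) for the first rule that matches the request."""
    for rule in rules:
        matches = (rule.actor, rule.action, rule.data_object, rule.purpose) == (
            actor, action, data_object, purpose)
        if matches and all(cond(context) for cond in rule.conditions):
            return True, list(rule.obligations)
    return False, []  # default deny


# Hypothetical example rules.
rules = [
    Rule("doctor", "read", "PHR", "treatment",
         conditions=(lambda ctx: ctx.get("consent") is True,),
         obligations=("notify",)),
    Rule("doctor", "read", "PHR", "research",
         obligations=("anonymize",)),
]

granted = evaluate(rules, "doctor", "read", "PHR", "treatment", {"consent": True})
no_consent = evaluate(rules, "doctor", "read", "PHR", "treatment", {"consent": False})
research = evaluate(rules, "doctor", "read", "PHR", "research", {})
denied = evaluate(rules, "nurse", "update", "PHR", "billing", {})
print(granted, no_consent, research, denied)
```

Returning the obligations alongside the decision lets the caller enforce them, for example by anonymizing the data before releasing it for research.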
© 2012 IBM Corporation 29 Content-aware data protection: Compromise
Share data with changes:
Data Owner Requirement:
–Data should be anonymized so that it cannot be associated with a specific individual
Example:
–Living people from London who fought in WWII are becoming identifiable as the years pass
[Diagram: Data Owner (hospital, bank, factory) → De-Identification → Data Receiver]