Some (biased) observations
Miguel Branco

Disclaimer
The views in this set of slides are my own, and not necessarily endorsed by my employer, employees, former CERN colleagues or friends.
Information is provided “AS IS” without warranty of any kind. The information could include technical inaccuracies and/or typographical errors.
Changes are periodically made to the information herein; these changes may (or may not) be incorporated in future editions of this publication.

My background
I spent 6 years at CERN doing my best but, in practice, really making physicists’ lives miserable.

Things are moving fast
… lots of testing!
… lots of work!
… lots of ideas!
Some convergence of the underlying stacks:
  OpenStack & friends nearly ubiquitous
  A lot of ownCloud

Open Source
Nothing but open source.
Nothing new here, but that’s an important message for vendors.
  Core software open source
  Higher-end features open source
  Everything open source!

A comment on “tech focus”
1) SysAdmins seem to want a “Dropbox” that “I” can manage
2) Storage people want to provide “Easy-to-use Storage”
I’ve seen more of (1) in talks (a more accessible problem right now?)
But (1) and (2) are NOT the same thing. Suggest you consider explicitly where you are focusing.
Focus on what the science workflow needs. Where are scientists spending 90% of their time? Is it sync’ing Word?
Google Drive is 2 years old. What will document handling look like in 2 years? Will Word look like it does today?
At EPFL, some 18-year-old CS students do not know what “Office” is. They all know Google Docs. (And GitHub.)
Vendor tech is != (i.e. not all alike):
  Large vs small-ish files?
  Updates: delta sync?
  Peer-to-peer: locality?
  Low latency? High throughput?
  Security?
  Scope: metadata? Provenance?
  Future features: notifications: push vs pull?
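
To make the “Updates: delta sync?” question concrete, here is a minimal sketch of the idea: transfer only the blocks of a file that changed, not the whole file. It assumes fixed-size blocks compared by SHA-256 digest, which is a simplification and not how any particular vendor does it; `BLOCK_SIZE` and all function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; unrealistically small so the example is easy to follow


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return the SHA-256 hex digest of each fixed-size block of `data`."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Indices of blocks in `new` that are absent from, or differ from, `old`."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != digest
    ]


def make_delta(old, new, block_size=BLOCK_SIZE):
    """Collect only the blocks that must be transferred, plus the new length."""
    blocks = {
        i: new[i * block_size:(i + 1) * block_size]
        for i in changed_blocks(old, new, block_size)
    }
    return {"length": len(new), "blocks": blocks}


def apply_delta(old, delta, block_size=BLOCK_SIZE):
    """Rebuild the new content from the old content and a delta."""
    n_blocks = (delta["length"] + block_size - 1) // block_size
    out = bytearray()
    for i in range(n_blocks):
        if i in delta["blocks"]:
            out += delta["blocks"][i]  # block was transferred
        else:
            out += old[i * block_size:(i + 1) * block_size]  # reuse local block
    return bytes(out[:delta["length"]])
```

With fixed-size blocks, overwriting bytes in place touches only the affected blocks, but inserting a single byte near the start shifts every later block and makes them all look changed. Tools such as rsync use rolling checksums to cope with that case, which is one reason “delta sync” is not equivalent across products.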

Verification
Sync is complicated
How good is good enough? From Word files to mission-critical data? A Higgs analysis from data stored on “xxxBox”?
Mission-critical data in the scientific context is likely read-only (easier problem) … but datasets are bunches of files, and if some are missing, results are biased (hard problem). So, well, this is not obvious.
“Academic” view: bugs/weirdness are OK (they are obligatory) as long as they are understood
My gut feeling: if your efforts focus on “scientific mission-critical data”, then verification is the factor that distinguishes you from “other” solutions
“For our requirements – see here – our system works exactly as expected”
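
One way to picture the “bunches of files” problem is a dataset manifest: record a checksum per file when the dataset is published, then check a synced copy against it before running an analysis. The sketch below is only an illustration under stated assumptions: files are modelled as an in-memory mapping from name to bytes (a real system would read from storage), and the manifest layout and function names are hypothetical.

```python
import hashlib


def sha256_hex(content):
    """SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(content).hexdigest()


def build_manifest(files):
    """Map each file name to the checksum of its content."""
    return {name: sha256_hex(content) for name, content in files.items()}


def verify_dataset(manifest, files):
    """Compare a synced copy (`files`) against the published `manifest`."""
    missing = sorted(name for name in manifest if name not in files)
    corrupted = sorted(
        name for name in manifest
        if name in files and sha256_hex(files[name]) != manifest[name]
    )
    unexpected = sorted(name for name in files if name not in manifest)
    return {
        "missing": missing,
        "corrupted": corrupted,
        "unexpected": unexpected,
        # Safe to analyse only if every listed file is present and intact.
        "complete": not missing and not corrupted,
    }


# Hypothetical three-file dataset and a synced copy with one file lost
# and one file altered.
published = {"run1.dat": b"events-1", "run2.dat": b"events-2", "run3.dat": b"events-3"}
manifest = build_manifest(published)
synced = {"run1.dat": b"events-1", "run3.dat": b"events-X"}
report = verify_dataset(manifest, synced)
```

Here `report["missing"]` lists `run2.dat` and `report["corrupted"]` lists `run3.dat`, so the analysis can refuse to run and avoid silently producing a biased result. Because mission-critical scientific data is likely read-only, the manifest only needs to be computed once per dataset.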

Security: Authentication, Authorization, Accounting
Academic research groups really are hierarchical. So are companies, so whatever companies use should work for you as well.
Except, well, companies are directed acyclic graphs and science collaborations are not.
These issues were mentioned in passing in nearly all talks, but inconclusively.
A non-trivial problem to address.

Federation
Another dimension of complexity.
Banks are also federated. So whatever works for them… Ah no.
Science often needs to cut across admin/bureaucratic barriers. A “no no” in banking. How do you handle this?
Intuitively, it is huge: “scientists” ‘freely’ sharing data across institutions

Encryption
Client-side encryption??? Needed? Not needed?
Depends on what data is being stored.

CERNbox
Representative of the various efforts around. I appreciate the focus on “look-and-feel”!
Convergence would be beneficial
This stuff is hard – particularly the last 20% – so there is no point in diverging.
Like every other system presented, I have my reservations:
  Trying to support multiple, seemingly widely different backends / use cases
  Presumably a 90% solution. Fair enough, but which 90% exactly?

Let’s sync & share what we learn!
Workshop also indirectly really useful to:
  Document systems’ behavior (in presentations)
  Learn from experiences; deployment models under consideration (future “best practices”)
  [ Do more of this; particularly the templates Massimo/Jakub sent around ]
Future collaboration themes:
  Dev side: Verification, Testing infrastructures (Deterministic even?)
  Policy side: AAA (Authentication, Authorization, Accounting), Federation
  SysAdmin side: Deployment experiences, Interoperability w/ legacy

Analysis on top of file sharing
Baby steps still – e.g. Ganga
Potentially a game changer for how physicists work (I’ve seen the alternative… it is NOT pretty)
Exciting to me to have this as the “underlying infrastructure”, because it enables the #reallycoolstuff on top:
  Interactive analysis
  Share code, data, plots, histograms, results, reports together
  Storage that gives you meaningful data
Suggest you spend time looking into this explicitly: it is a game changer.

And what about you?
What are your views?