Service and Support for Science IT Lunchveranstaltung Herbst 2014.


1 Service and Support for Science IT Lunchveranstaltung Herbst 2014

2 S 3 IT Peter Kunszt. PhD in Theoretical Physics (University of Bern). Postdoc in Astrophysics/Cosmology, Johns Hopkins University, Baltimore, USA: Sloan Digital Sky Survey Science Archive (Virtual Observatory), 3 years. CERN IT Department, Geneva: Data Management section head and project manager for EU projects, 5 years. CSCS, Lugano: built the Swiss Tier-2 for CERN, Swiss Grid Initiative, 3 years. ETH Zürich: head of the SyBIT projects for SystemsX.ch, 5 years. UZH: heading S 3 IT.

3 S 3 IT Outline: Science is changing. Challenges due to the change. Addressing the challenge. Infrastructure. Organization of S3IT.

4 S 3 IT A Digital World Scientific Discovery driven by new instrumentation

5 S 3 IT An Exponential World. Scientific data doubles every year. Changes the nature of scientific computing. Cuts across disciplines … eScience. It becomes increasingly difficult to extract knowledge. Slide by Alex Szalay, JHU
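The claim that data doubles every year is a claim of exponential growth; a quick sketch (with a hypothetical starting volume of 1 TB, not a figure from the slides) shows why the knowledge-extraction problem outpaces intuition:

```python
# Yearly doubling compounds fast: after n years the volume is 2**n
# times the starting volume (starting volume is illustrative only).
def growth(years: int, start_tb: float = 1.0) -> float:
    """Volume in TB after `years` of annual doubling."""
    return start_tb * 2 ** years

for n in (1, 5, 10):
    print(f"after {n:2d} years: {growth(n):7.0f} TB")
# Ten doublings already mean a roughly 1000-fold increase (2**10 = 1024).
```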

6 S 3 IT Not only scientific data! 20% of the world’s servers go into data centers run by the “Big 5” – Google, Microsoft, Yahoo, Amazon, eBay. An Exponential World. Slide by Alex Szalay, JHU

7 S 3 IT Science is Changing THOUSAND YEARS AGO science was empirical describing natural phenomena LAST FEW HUNDRED YEARS theoretical branch using models, generalizations LAST FEW DECADES a computational branch simulating complex phenomena TODAY data intensive science, synthesizing theory, experiment and computation with statistics ►new way of thinking required! Slide by Alex Szalay, JHU

8 S 3 IT Change of Culture: Single-person discoveries → Large collaborations

9 S 3 IT Change of Culture: Single-person discoveries → Large collaborations → Citizen Science

10 S 3 IT Luckily, there’s Moore’s Law

11 S 3 IT Moore’s Law – no mo(o)re?

12 S 3 IT Scientific Data Analysis Today. Data is produced everywhere and will never sit at a single location. Data grows as fast as our computing/instrument power. Many labs have their own power-workstation or mini-cluster, and are hitting the cooling and power wall. Moore’s law is not what it used to be – solutions get more complex. Trouble storing even the produced data stream. Not scalable, not maintainable…

13 S 3 IT Fire and forget... Often, you do not want to be bothered with computing details IT JUST NEEDS TO WORK!

14 S 3 IT Widening Complexity Gap. Standard computing (desktop computing, storage; helpdesk, support; Internet, Wikipedia, …) ← GAP → research needs (algorithms; models, statistics; visualizations; data analysis; publication). Providers: local IT resources, central IT services, research laboratories, core facilities.

15 S 3 IT Challenge: Scale Up High Throughput Instruments –Much larger data volumes –Increased data complexity Large Collaborations –More people –More experiments and measurements –More coverage BIG Everything

16 S 3 IT Science IT. Connect IT and Science. Dedicated support for computations and data analysis. SPEED: faster time to solution. ACCESS to competitive infrastructure. ENABLE: remove barriers, open new possibilities. Speed – Access – Enablement

17 S 3 IT Sure, we all believe in miracles...

18 S 3 IT Gray’s Laws of Data Engineering Jim Gray: Scientific computing is increasingly about data Need scale-out solution for analysis Take the analysis to the data! Start with “20 queries” Go from “working to working” Slide from Alex Szalay, JHU

19 S 3 IT What does that mean? 1. Understand your data = understand your problem. 2. Reduce data wherever possible – think about what is worth keeping vs. reproducing. 3. Focus on doable chunks of work and questions. 4. Build programs and systems that can scale.

20 S 3 IT And: No one size fits all – all research is different by nature

21 S 3 IT Providing the ‚Miracle‘: A lot of this is Scientific Work by itself! Computer Science Scientific Computing, Research Informatics Data Science Department of Informatics Institute for Computational Science Domain Informatics – Bioinformatics, Medical Informatics, Geoinformatics,.... Lots of PhD theses to come!

22 S 3 IT The results of that research need to be applied! Software engineering. Code optimization and scaling. Visualization. Applied statistical analysis. Automation of workloads. Data storage and management. Maintenance! Students don’t have time for that and don’t get any recognition.

23 S 3 IT Like a Formula 1 Racing Team

24 S 3 IT Informatikdienste. Teamwork in Science IT. Internal: research groups and projects; faculties, institutes, departments; Department of Informatics; Institute for Computer Science; domain informatics; core facilities; Zentrale Informatik; local IT support. External: CSCS; Science IT @ ETH, Univ of X; vendors; industrial partners.

25 S 3 IT Science IT as a Service BOOTSTRAP: Consultancy Research context and perspective Categorization of problem in terms of Simulation, Data, Processing, Publication Map to available infrastructure Plan Support Service as a Project (time, cost,..) DELIVERY: Project execution Setup of infrastructure, software, integration Automation, analysis, visualization Training of the users on the workflow Continuous Support, feedback and iterative improvement FINISH Conclusion and publication Reusability and sustainability measures

26 S 3 IT Categorization of Infrastructure. Computer is the Research Instrument: ‚Supercomputing‘ – Simulations of phenomena – Needs the largest computers you can get – Theoretical physics, astrophysics, mathematics, computational chemistry, biochemistry, meteorology – Simulations also generate a lot of data and models – Continuous usage. Our job: Provide access to the necessary infrastructure – Support – Maintenance – Software optimization – Data storage and handling

27 S 3 IT Categorization of Infrastructure. Computer is a tool, a workhorse – Statistical analysis, parameter studies – ‚Big Data‘ processing – Visualization – Life science, biochemistry, geography, medicine, digital humanities, banking and finance, computer science, … – Very heterogeneous requirements – Non-continuous use, but can be large! 1. Server computing – interactive work, a person sitting in front of the system. 2. Cluster computing – automated workloads, many computers at once.

28 S 3 IT Categorization of Infrastructure. 1. Server computing – interactive work, a person sitting in front of the system. 2. Cluster computing – automated workloads, many computers at once. Our job: provide access to individualized, custom servers and clusters – Assure scalability – Keep costs down – Maintenance, support – Automation tooling, workflows – Data management and data processing – Standardized processes

29 S 3 IT Supercomputing. UZH has operated local supercomputing for almost 10 years. Irchel Datacenter. Supported and operated centrally. BUT: Getting ‚old‘ – over 4 years old. Not competitive. Not power-efficient. Expensive to maintain.

30 S 3 IT The Science Cloud. Scale by re-centralizing individual local infrastructure – One hardware size fits all – But individual delivery of clusters and servers! – Possible due to virtualization. Physical hardware → ‚virtual‘ simulated hardware.

31 S 3 IT Infrastructure Today. Schrödinger supercomputer: maintained by S3IT/ZI, supported by S3IT/ZI, standardized tools. Local computing: locally installed and maintained, own tools and developments.

32 S 3 IT Infrastructure 2015. UZH Science Cloud: maintained by S3IT/ZI, supported by S3IT/ZI, toolset both standard and own tools. UZH HPC@CSCS: maintained by CSCS, supported by S3IT/ZI, standardized toolset. Local computing: locally maintained, own tools and developments.

33 S 3 IT Infrastructure 2016. UZH HPC@CSCS: maintained by CSCS, supported by S3IT. UZH Science Cloud: maintained by S3IT, supported by S3IT. Local computing: locally maintained, own tools and developments.

34 S 3 IT Storage Storage Storage

35 S 3 IT The cost of a petabyte is not the problem. From backblaze.com, October 2014

36 S 3 IT Reading petabytes in a short time is! Current problems are not easy to parallelize and scale! 10-30 TB ‘easy’, 100-200 TB doable, 500 TB+ very difficult. Moving 100 TB over the network (sequentially): 1 Gbps – 10 days; 10 Gbps – 1 day, but needs a dedicated connection! Physically? – FedEx
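The transfer times on this slide follow from simple arithmetic; a back-of-the-envelope sketch (idealized link, no protocol overhead) reproduces the ~10-day and ~1-day figures:

```python
# Idealized sequential transfer time: bits to move divided by link rate.
def transfer_days(terabytes: float, gbps: float) -> float:
    """Days to move `terabytes` over a link of `gbps` Gbit/s (no overhead)."""
    bits = terabytes * 1e12 * 8          # TB -> bits (decimal units)
    seconds = bits / (gbps * 1e9)        # ideal dedicated connection
    return seconds / 86400

print(f"100 TB @  1 Gbps: {transfer_days(100, 1):.1f} days")   # ~9.3 days
print(f"100 TB @ 10 Gbps: {transfer_days(100, 10):.1f} days")  # ~0.9 days
```

In practice shared links, protocol overhead, and storage read speeds push these numbers higher, which is why the slide notes that 10 Gbps only helps with a dedicated connection.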

37 S 3 IT Categorization of Science IT Problems. Stages: Theory and Models → Simulation → Data Processing → Publication. Cross-cutting concerns: scaling, automation, sharing, publishing, analysis, management.

38 S 3 IT Managing the Data Lifecycle. Conscious data production, reduction and usage: ask the questions you need to ask as quickly as possible; delete data wherever possible, keep starting points (freezer?); aim for reproducibility at all levels from first principles – document the steps. Automated data processing pipelines: scale by automation; ability to re-run simulations and data analysis at the push of a button; extract and keep all metadata. Publication and archiving: publish everything so it is easily reproducible by others; archive only what you need or what is mandatory.
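The lifecycle principles above can be sketched as a toy pipeline. This is an illustrative example only, not an S3IT tool: each step is a plain function, and every run records metadata (input fingerprint, step list) so the analysis can be repeated at the push of a button.

```python
import hashlib
import json
import time

def reduce_step(raw: list[float]) -> list[float]:
    """Toy data reduction: keep only the values worth keeping."""
    return [x for x in raw if x > 0]

def analyze_step(data: list[float]) -> float:
    """Toy analysis: mean of the reduced data."""
    return sum(data) / len(data)

def run_pipeline(raw: list[float]) -> dict:
    reduced = reduce_step(raw)
    result = analyze_step(reduced)
    # Keep metadata so the run is reproducible from first principles:
    # a fingerprint of the input plus the ordered list of steps applied.
    return {
        "input_sha256": hashlib.sha256(json.dumps(raw).encode()).hexdigest(),
        "steps": ["reduce_step", "analyze_step"],
        "result": result,
        "timestamp": time.time(),
    }

meta = run_pipeline([-1.0, 2.0, 4.0])
print(meta["result"])  # 3.0
```

Re-running with the same input yields the same fingerprint and result, which is the property that makes "delete data, keep starting points" a safe strategy.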

39 S 3 IT Setup of Service and Support for Science IT. Director and Office. Infrastructure: HPC, Cloud, Storage. Projects: Life Science, Geoscience, Physics, Humanities, … Collaborations: National, International, Industrial.

40 S 3 IT Setup of Service and Support for Science IT Informatikdienste Zentrale Informatik Vice President Law & Economics

41 S 3 IT Organisation: Core Team, Site Teams, Embedded Experts (EE). Embedded Experts: project work directly in the research groups. Site Teams: joining forces with local institute IT experts; support and provisioning of access and software on site. Core Team: consultancy in core competences, central infrastructure, project management.

42 S 3 IT Funding University Core Funding for the core team Site Teams: Funding through local institutes, faculties or 3rd party projects – service charges Embedded Experts: 3rd party projects – co-applicants on project proposals

43 S 3 IT What to expect of S 3 IT. Science IT as a Service – consulting, support through projects. Software, workflows and infrastructure are optimized to the needs of the research problem, not the other way round. Our operational concept follows the cloud model to meet the very heterogeneous needs of the UZH research groups, while working with standardized, commodity hardware and software. Scalability, extensibility and reusability are our guiding principles.

44 S 3 IT Already a success story. Started with 1 person in January; 10 people now, 5 on core funding, 5 on other funds. 20 project applications written; 10 projects already approved and running; 10 projects in the pipeline, some already pre-approved. Many new project ideas as a result of consultation. Projects include life science (imaging, genomics, proteomics), HPC optimization, digital humanities, art history, and industrial collaborations.

45 S 3 IT What next? How long does the data growth continue? How far can we scale? What new technologies will make our life easier or harder? Let's find out together.

46 S 3 IT Please visit us. Next Lunch Events: Generating Hypotheses from Large Data – Lars Malmström, S3IT. Big Data in Art History – Thomas Hänsli, Digital Art History UZH. Visual Analytics, understanding interactions – Markus Grau, Business Alliances; Guido Oswald, SAS. Cognitive Systems, how ‚artificial intelligence‘ will revolutionize research and education – Karin Fey, IBM. Cloud Services at the UZH – ZI. http://www.s3it.uzh.ch/

