Technical computing for science and industry. Fabrizio Gagliardi, Microsoft Corporation.

1 Technical computing for science and industry. Fabrizio Gagliardi, Microsoft Corporation

2 Outline: introductory remarks and a few words on history; reviewing the emergence of e_Science (the intensive computing side, the massive data side); the opportunity of e_Science; the challenges of e_Science; a Microsoft contribution; conclusions.

3 Introductory remarks. Who am I? A computer scientist who has spent 30 years at CERN (and in other scientific laboratories) developing HPC systems for physics and other sciences. Started in real-time, data acquisition and networking. Pioneered ES, AI, MPP systems, cluster computing and, in the last 7 years, Grid computing. Initiator of EU-DataGrid, EGEE and more than 10 other HPC and Grid projects (mostly within the EU IST programmes). Co-founder of the Global Grid Forum (started in Amsterdam in 2001 together with EU-DataGrid). See my last article in IEEE Spectrum Magazine (July 2006).

4 Introductory remarks 2. Joined Microsoft on 1 November 2005. Role: EMEA and LATAM Director for Technical Computing. My mission: promoting Microsoft Computing into Science and Science into Microsoft Computing by exploring and building important collaborations with science in Europe, the Middle East, Africa and Latin America.

5 A short history of Grid schools. Grids started in Europe at CERN around 1999-2000 and in Italy at about the same time (CHEP 2000 in Padua). HEP was desperate for internationally distributed computing resources and additional non-HEP funding. HEP computing and its data access model are “simple” and a perfect fit for the Grid (“HEP is Grid computing par excellence”, said Ian Foster in Padua). The first EU flagship project (EU-DataGrid) was proposed in 2000 and started on March 1, 2001, together with the first GGF conference in Amsterdam. Gino Nicolais’s visit to CERN in 2001 and the CERN computing school in Vico Equense in 2002. First International Grid school in Vico in 2003.

6 A short history of Grid schools

9 A short history of Grid schools 2. The international school of Grid computing continued in Vico in 2004 and 2005; then an EU project took over in 2006 with a school in Ischia, and the school moved to Sweden this year (http://www.iceage-eu.org/issgc07/index.cfm). GILDA and the Grid activity in Catania took a leading role in EDG and EGEE training and test activity. Good outreach and dissemination work towards industry in Catania (promoted and led by Roberto Barbera) led to the establishment of the COMETA consortium and to the interest and support of Microsoft. Some interesting pioneering activity on porting gLite to Windows in the context of the MS-sponsored CXP evaluation. Microsoft is prime sponsor of this year’s “First International Grid school for industrial applications”.

10 Microsoft and e_Science

11 A New Science Paradigm. A thousand years ago: Experimental Science (description of natural phenomena). Last few hundred years: Theoretical Science (Newton’s Laws, Maxwell’s Equations …). Last few decades: Computational Science (simulation of complex phenomena). Today: e-Science or Data-centric Science: unify theory, experiment, and simulation, using massive computing and large data exploration and mining: data captured by instruments, data generated by simulations, data generated by sensor networks. Scientists mostly work on computers. (With thanks to Jim Gray)

12 Life Sciences; Multidisciplinary Research; New Materials, Technologies & Processes; Math and Physical Science; Social Sciences; Earth Sciences; Computer & Information Sciences; Accelerating Discovery

13 CERN LHC: 40 million particle collisions every second, reduced by online computers to a few hundred “good” events per second, which are recorded on disk and magnetic tape at 100-1,000 MegaBytes/sec: ~15 PetaBytes per year for all four experiments.
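
A rough consistency check of the figures on this slide, as a minimal sketch rather than anything from the talk. It assumes decimal units, a full calendar year of recording, and takes "a few hundred" as 200 events per second; the function names are illustrative only.

```python
# Back-of-the-envelope check of the LHC figures quoted on the slide.
SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000 s, ignoring any downtime
MB_PER_PB = 10**9                    # decimal units: 1 PB = 10^9 MB


def reduction_factor(collisions_per_sec, kept_per_sec):
    """Number of collisions per event that the online computers keep."""
    return collisions_per_sec / kept_per_sec


def average_rate_mb_per_sec(volume_pb_per_year):
    """Average recording rate needed to accumulate the yearly volume."""
    return volume_pb_per_year * MB_PER_PB / SECONDS_PER_YEAR


if __name__ == "__main__":
    # 40 million collisions/s is from the slide; 200 kept events/s is an
    # ASSUMPTION standing in for "a few hundred".
    print(f"online reduction factor: {reduction_factor(40_000_000, 200):,.0f}")
    # ~15 PB per year is from the slide.
    print(f"average rate: {average_rate_mb_per_sec(15):.0f} MB/s")
```

Under these assumptions the yearly volume corresponds to an average of a little under 500 MB/s, which sits inside the quoted 100-1,000 MB/s range.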

14 Technology evolution has helped…
Year: 1991 | 1998 | 2005
System: Cray Y-MP C916 | Sun HPC10000 | Small Form Factor PCs
Architecture: 16 x Vector, 4GB, Bus | 24 x 333MHz UltraSPARC II, 24GB, SBus | 4 x 2.2GHz Athlon64, 4GB, GigE
OS: UNICOS | Solaris 2.5.1 | Windows Server 2003 SP1
GFlops: ~10
Top500 #: 1 | 500 | N/A
Price: $40,000,000 | $1,000,000 (40x drop) | < $4,000 (250x drop)
Customers: Government Labs | Large Enterprises | Every Engineer & Scientist
Applications: Classified, Climate, Physics Research | Manufacturing, Energy, Finance, Telecom | Bioinformatics, Materials Sciences, Digital Media
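
The price drops quoted on this slide can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions that are not stated on the slide: the "~10 GFlops" figure is taken to apply to all three systems, and "< $4,000" is treated as exactly $4,000, so the 2005 ratios are lower bounds. All names are illustrative.

```python
# Price figures from the slide, keyed by year.
PRICES_USD = {1991: 40_000_000, 1998: 1_000_000, 2005: 4_000}
GFLOPS = 10  # ASSUMPTION: "~10 GFlops" applies to all three systems


def drop_factor(old_price, new_price):
    """How many times cheaper the newer system is."""
    return old_price / new_price


def price_per_gflop(price_usd, gflops=GFLOPS):
    """Purchase price divided by the quoted performance."""
    return price_usd / gflops


if __name__ == "__main__":
    print("1991 -> 1998:", drop_factor(PRICES_USD[1991], PRICES_USD[1998]))
    print("1998 -> 2005:", drop_factor(PRICES_USD[1998], PRICES_USD[2005]))
    print("1991 -> 2005:", drop_factor(PRICES_USD[1991], PRICES_USD[2005]))
    for year, price in PRICES_USD.items():
        print(year, "USD per GFlop:", price_per_gflop(price))
```

The first two ratios match the "40x drop" and "250x drop" labels; multiplied together they give a drop of at least 10,000x between 1991 and 2005 for roughly the same performance.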

15 Top 500 Architectures / Systems

16 Enabling Grids for E-sciencE (INFSO-RI-508833). LCG depends on two major science Grid infrastructures (plus regional Grids): EGEE (Enabling Grids for E-Science) and OSG (US Open Science Grid). High Energy Physics (LCG). Scale (June 2006): ~200 sites in 40 countries; ~25 000 CPUs; > 10 PB storage; > 35 000 jobs per day; > 100 Virtual Organizations.

17 Enabling Grids for E-sciencE (EGEE-II INFSO-RI-031688). Grids in Biomedical Sciences. A multiplication of projects around the world (example: the National Bioinformatics Initiative in Holland). The example of EGEE: more than 20 applications in medical imaging, bioinformatics and drug discovery; large-scale deployment of in silico drug discovery initiatives. [Figure: T01 (E119A) and T01 energy statistics; compound numbers versus docking energy and binding energy (kcal/mol); labels 1f8b, 1f8c, 2qwe, 55%, 11.58%.] Impact of mutations on drug efficiency against H5N1. In silico docking on malaria on 5 grid infrastructures is breaking the world record for in silico docking throughput.

18 Future ITER fusion reactor. Applications with distributed calculations: Monte Carlo, separate estimates, … Multiple ray tracing, e.g. TRUBA. Stellarator optimization: VMEC. Transport and kinetic theory: Monte Carlo codes.
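
Monte Carlo with separate estimates suits distributed resources because each job is independent and only a small result has to travel back. The sketch below illustrates that pattern with a textbook stand-in (estimating pi); it is not a fusion code and is not taken from TRUBA or VMEC, and all function names are hypothetical.

```python
import random
import statistics


def estimate_pi(n_samples, seed):
    """One independent Monte Carlo estimate of pi (one would-be grid job)."""
    rng = random.Random(seed)  # private generator, no shared state
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


def combine(estimates):
    """Merge separate estimates of equal sample size by averaging them."""
    return statistics.fmean(estimates)


def run_jobs(n_jobs, n_samples):
    # Run sequentially here; on a grid each call would be a separate job
    # with its own seed, returning a single number.
    return [estimate_pi(n_samples, seed) for seed in range(n_jobs)]


if __name__ == "__main__":
    estimates = run_jobs(n_jobs=8, n_samples=20_000)
    print("per-job estimates:", estimates)
    print("combined estimate:", combine(estimates))
```

Averaging is only valid as written because every job uses the same number of samples; jobs of different sizes would need a weighted mean.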

19 The data deluge. e_Science is now dominated by huge amounts of data. Many discoveries are hidden in those data, but… How to organize, mine and understand the data? How to address these issues in a scientist-friendly environment? This is where commodity computing tools developed by Microsoft for business and industry could help…

20 Data, Data, Data (courtesy of Carole Goble)

21 (Courtesy of Carole Goble)

22 The opportunity in e_Science. Replacing experimental activity (or part of it) with computing simulation and modelling based on large distributed computing infrastructures is what is now called e_Science. Allowing the sharing of resources (not only computing, but also data and people’s knowledge) is what motivated the emergence of grid computing and the establishment of international virtual organisations, which replace local resident scientists. This is a major paradigm shift, which requires scientists to become expert in complex computing methods.

23 The challenges (still) in e_Science. The applied scientist is obliged to become a computer scientist as well. Far too much time is spent developing often over-engineered computing solutions, distracting the applied scientist from their primary mission. This has shifted the conventional scientific computing paradigm and could limit scientific discovery in the future and produce major setbacks.

24 The Problem for the e-Scientist. Data ingest. Managing petabytes. Common schemas. How to organize it? How to reorganize it? How to coexist and cooperate with others? Data query and visualization tools. Support/training. Performance: execute queries in a minute; batch (big) query scheduling. [Diagram: experiments & instruments, simulations, literature and other archives supply facts; questions go in and answers come out.]

25 Can “Here and Now” technologies accelerate discovery? Can “Business” tools and techniques for dealing with … be used in scientific research to allow researchers to be scientists and not computer scientists?

26-27 Computational Modeling; Real-world Data; Interpretation & Insight; Persistent Distributed Data; Workflow, Data Mining & Algorithms

28 Conclusion. We need to make computing easier to use, so that scientists can concentrate their energy on their science rather than on the computing tools. Only in this way will e_Science succeed in accelerating discovery and producing new breakthroughs. Microsoft is investigating solutions in collaboration with leading scientists around the world through its Technical Computing Initiative.

29 Technical Computing @ Microsoft. Mission statement: ‘Promoting Computing into Science and Science into Computing’

30 Four ‘Pillars’ of Technical Computing @ Microsoft: Commitment to Science; Global Collaboration; Technology Excellence; Interoperability

31 Technical Computing at Microsoft. Advanced Computing for Science and Engineering: application of new algorithms, tools and technologies to scientific and engineering problems. High Productivity Computing: application of high performance clusters, information worker tools and database technologies to industrial and scientific applications. Radical Computing: research in potential breakthrough technologies.

32 Fighting HIV with Computer Science. A major problem: over 40 million infected. Drug treatments are effective but are an expensive life commitment. A vaccine is needed for third world countries. An effective vaccine could eradicate the disease. Methods from computer science are helping with the design of a vaccine. Machine learning: finding biological patterns that may stimulate the immune system to fight the HIV virus. Optimization methods: compressing these patterns into a small, effective vaccine.

33 Microsoft-sponsored research at the Center for Bioinformatics and Genome Biology and the Fundacion Ciencia para la Vida, Chile (courtesy of David Holmes)

34 Technical Computing and HPC. Collaboration with MS HPC product groups complements and extends the MS HPC institutes. Some examples: HPC for aerospace at Southampton; cancer research, financial and climate modeling at Oxford OeRC; HPC for the automotive industry at HLRS Stuttgart; HPC support for computational systems biology at the MSRC joint centre with the University of Trento in Italy.

35 HPC Innovation Centers (Microsoft HPC Institutes): Cornell Theory Center, Ithaca, NY, USA; University of Tennessee, Knoxville, TN, USA; TACC – University of Texas, Austin, TX, USA; University of Virginia, Charlottesville, VA, USA; University of Utah, Salt Lake City, UT, USA; Tokyo Institute of Technology, Tokyo, Japan; HLRS – University of Stuttgart, Stuttgart, Germany; Southampton University, Southampton, UK; Shanghai Jiao Tong University, Shanghai, PRC; Nizhni Novgorod University, Nizhni Novgorod, Russia.

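36 [Figure: power density (W/cm²) of Intel processors (4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®) by year, ’70 to ’10, on a logarithmic axis from 1 to 10,000, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle and the Sun’s surface.] Source: Intel Developer Forum, Spring 2004, Pat Gelsinger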

37 Radical Computing: The End of Moore’s Law? Future of silicon chips: “100’s of cores on a chip in 2015” (Justin Rattner, Intel). Challenge for the IT industry and the Computer Science community: can we make parallel computing on a chip easier than message-passing? Challenge for the scientific community: how will the multi-core transition affect scientific computing?

38 Radical Computing @ BSC. Major collaboration at the Barcelona Supercomputing Center (Prof. Mateo Valero) on the development of a software environment to support many-core and multicore architectures, in collaboration with Microsoft Research in Cambridge.

39 Summary. Microsoft wishes to work with the university research and business communities to: develop interoperable high-level services, workflows, tools and data services (make computing easy); accelerate progress in a small number of societally important scientific applications (make a difference); explore radical new directions in computing, and ways and applications to exploit on-chip parallelism. www.microsoft.com/science

