Published by Clementine Greer. Modified over 8 years ago.
1
Technical computing for science and industry Fabrizio Gagliardi Microsoft Corporation
2
Outline: Introductory remarks and a few words on history; Reviewing the emergence of e_Science (the intensive computing side, the massive data side); The opportunity of e_Science; The challenges of e_Science; A Microsoft contribution; Conclusions
3
Introductory remarks Who am I? A computer scientist who has spent 30 years at CERN (and in other scientific laboratories) developing HPC systems for physics and other sciences. Started in real-time, data acquisition and networking. Pioneered ES, AI, MPP systems, cluster computing and, in the last 7 years, Grid computing. Initiator of EU-DataGrid, EGEE and more than 10 other HPC and Grid projects (mostly within the EU IST programmes). Co-founder of the Global Grid Forum (started in Amsterdam in 2001 together with EU-DataGrid). See my latest article in IEEE Spectrum Magazine (July 2006)
4
Introductory remarks 2 Joined Microsoft on 1 November 2005. My mission: promoting Microsoft Computing into Science and Science into Microsoft Computing by exploring and building important collaborations with science in Europe, the Middle East, Africa and Latin America. EMEA and LATAM Director for Technical Computing
5
A short history of Grid schools Grids started in Europe at CERN around 1999-2000 and in Italy about the same time (CHEP 2000 in Padua). HEP was desperate for internationally distributed computing resources and additional non-HEP funding. HEP computing and its data access model are “simple” and a perfect fit for the Grid (“HEP is Grid computing par excellence”, said Ian Foster in Padua). First EU flagship project (EU-DataGrid) proposed in 2000 and started on March 1, 2001, together with the first GGF conference in Amsterdam. Gino Nicolais’s visit to CERN in 2001 and CERN computing school in Vico Equense in 2002. First International Grid school in Vico in 2003
6
A short history of Grid schools
9
A short history of Grid schools 2 The international school of Grid computing continued in Vico in 2004 and 2005; then an EU project took over in 2006 with a school in Ischia, and the school moved to Sweden this year (http://www.iceage-eu.org/issgc07/index.cfm). GILDA and the Grid activity in Catania took a leading role in EDG and EGEE training and test activity. Good outreach and dissemination work towards industry in Catania (promoted and led by Roberto Barbera) led to the establishment of the COMETA consortium and to the interest and support of Microsoft. Some interesting pioneering activity on porting gLite to Windows in the context of the MS-sponsored CXP evaluation. Microsoft is prime sponsor of this year’s “First International Grid school for industrial applications”
10
Microsoft and e_Science
11
A New Science Paradigm
Thousand years ago: Experimental Science
- description of natural phenomena
Last few hundred years: Theoretical Science
- Newton’s Laws, Maxwell’s Equations …
Last few decades: Computational Science
- simulation of complex phenomena
Today: e-Science or Data-centric Science
- unify theory, experiment, and simulation
- using massive computing and large data exploration and mining:
  Data captured by instruments
  Data generated by simulations
  Data generated by sensor networks
Scientists mostly work on computers
(With thanks to Jim Gray)
12
Life Sciences Multidisciplinary Research New Materials, Technologies & Processes Math and Physical Science Social Sciences Earth Sciences Computer & Information Sciences Accelerating Discovery
13
CERN LHC 40 million particle collisions every second, reduced by online computers to a few hundred “good” events per second, which are recorded on disk and magnetic tape at 100-1,000 MegaBytes/sec. ~15 PetaBytes per year for all four experiments
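The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a rough consistency check: it assumes decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes) and continuous recording for a full calendar year, neither of which is stated on the slide. The function names are illustrative.

```python
# Rough consistency check of the LHC recording figures quoted on the slide.
# Assumptions (not from the slide): decimal units and year-round recording.

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds

BYTES_PER_MB = 1e6
BYTES_PER_PB = 1e15


def petabytes_per_year(mb_per_sec: float) -> float:
    """Yearly volume in PB for a sustained recording rate in MB/s."""
    return mb_per_sec * BYTES_PER_MB * SECONDS_PER_YEAR / BYTES_PER_PB


def average_mb_per_sec(pb_per_year: float) -> float:
    """Average sustained rate in MB/s implied by a yearly volume in PB."""
    return pb_per_year * BYTES_PER_PB / SECONDS_PER_YEAR / BYTES_PER_MB


if __name__ == "__main__":
    for rate in (100, 1000):
        print(f"{rate} MB/s sustained -> {petabytes_per_year(rate):.2f} PB/year")
    print(f"15 PB/year -> {average_mb_per_sec(15):.1f} MB/s on average")
```

Under these assumptions, 100 to 1,000 MB/s corresponds to roughly 3 to 32 PB per year, so the quoted ~15 PB per year sits inside the range implied by the recording rate.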
14
Technology evolution has helped…
Year | 1991 | 1998 | 2005
System | Cray Y-MP C916 | Sun HPC10000 | Small Form Factor PCs
Architecture | 16 x Vector, 4GB, Bus | 24 x 333MHz Ultra-SPARCII, 24GB, SBus | 4 x 2.2GHz Athlon64, 4GB, GigE
OS | UNICOS | Solaris 2.5.1 | Windows Server 2003 SP1
GFlops | ~10
Top500 # | 1 | 500 | N/A
Price | $40,000,000 | $1,000,000 (40x drop) | < $4,000 (250x drop)
Customers | Government Labs | Large Enterprises | Every Engineer & Scientist
Applications | Classified, Climate, Physics Research | Manufacturing, Energy, Finance, Telecom | Bioinformatics, Materials Sciences, Digital Media
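The price row of the comparison above can be checked directly. The sketch below recomputes the quoted drop factors and, as an additional step that is not on the slide, a price per GFlops. That last figure rests on an assumption: that the single "~10 GFlops" value applies to all three systems. The 2005 price is an upper bound ("< $4,000"), so its ratios are lower bounds.

```python
# Recompute the price drops quoted in the technology-evolution comparison.
# Prices are those quoted on the slide; 2005 is an upper bound ("< $4,000").
PRICES_USD = {1991: 40_000_000, 1998: 1_000_000, 2005: 4_000}

# Assumption (not confirmed by the slide): ~10 GFlops for all three systems.
ASSUMED_GFLOPS = 10.0


def drop_factor(earlier: int, later: int) -> float:
    """How many times cheaper the later system is than the earlier one."""
    return PRICES_USD[earlier] / PRICES_USD[later]


def usd_per_gflops(year: int) -> float:
    """Price per GFlops under the ASSUMED_GFLOPS assumption."""
    return PRICES_USD[year] / ASSUMED_GFLOPS


if __name__ == "__main__":
    print("1991 -> 1998:", drop_factor(1991, 1998))
    print("1998 -> 2005:", drop_factor(1998, 2005))
    print("1991 -> 2005:", drop_factor(1991, 2005))
    for year in sorted(PRICES_USD):
        print(year, "USD per GFlops:", usd_per_gflops(year))
```

The two quoted factors (40x and 250x) compound to a 10,000x drop between 1991 and 2005.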
15
Top 500 Architectures / Systems
16
Enabling Grids for E-sciencE INFSO-RI-508833 LCG depends on two major science Grid infrastructures (plus regional Grids): EGEE - Enabling Grids for E-Science; OSG - US Open Science Grid. High Energy Physics (LCG) Scale (June 2006): ~200 sites in 40 countries; ~25 000 CPUs; > 10 PB storage; > 35 000 jobs per day; > 100 Virtual Organizations
17
Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688 Grids in Biomedical Sciences A multiplication of projects around the world – Example: the National Bioinformatics Initiative in Holland. The example of EGEE – More than 20 applications in medical imaging, bioinformatics and drug discovery – Large scale deployment of in silico drug discovery initiatives. [Figure: T01 (E119A) and T01 energy statistics; number of compounds versus energy (kcal/mol) for docking energy and binding energy; labels 1f8b, 1f8c, 2qwe, 55%, 11.58%] Impact of mutations on drug efficiency against H5N1. In Silico Docking On Malaria on 5 grid infrastructures is breaking the world record for in silico docking throughput
18
Future ITER Fusion reactor Applications with distributed calculations: Monte Carlo, Separate estimates, … Multiple Ray Tracing: e.g. TRUBA. Stellarator Optimization: VMEC. Transport and Kinetic Theory: Monte Carlo Codes
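The "Monte Carlo, separate estimates" pattern named on this slide fits distributed resources well because each batch of samples is independent and only a small summary has to be sent back. The sketch below illustrates that pattern with a toy estimate of π; it is not taken from TRUBA, VMEC or any fusion code. The names `run_batch`, `combine` and `estimate_pi` are hypothetical, and a local thread pool stands in for what would be separate jobs on a Grid.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_batch(seed: int, n_samples: int) -> tuple:
    """One independent job: count random points inside the unit quarter circle."""
    rng = random.Random(seed)  # private generator, so jobs share no state
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, n_samples


def combine(results) -> float:
    """Merge the separate estimates by pooling the hit counts and sample counts."""
    total_hits = sum(hits for hits, _ in results)
    total_samples = sum(n for _, n in results)
    return 4.0 * total_hits / total_samples


def estimate_pi(n_jobs: int = 8, n_samples: int = 20_000) -> float:
    """Run n_jobs independent batches (here: local threads) and combine them."""
    seeds = range(n_jobs)  # one distinct seed per job
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        results = list(pool.map(run_batch, seeds, [n_samples] * n_jobs))
    return combine(results)


if __name__ == "__main__":
    print("Combined estimate of pi:", estimate_pi())
```

Each job returns only two integers, so the combining step costs almost nothing compared with the sampling. That is the property that makes this class of calculation easy to spread across many sites.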
19
The data deluge e_Science is now dominated by huge amounts of data. Many discoveries are hidden in those data, but… How to organize, mine and understand the data? How to address these issues in a scientist-friendly environment? This is where commodity computing tools developed by Microsoft for business and industry could help…
20
Data, Data, Data Courtesy of Carole Goble
21
Courtesy of Carole Goble
22
The opportunity in e_Science Replacing experimental activity (or part of it) with computing simulation and modelling based on large distributed computing infrastructures is what is now called e_Science. Allowing sharing of resources, not only computing but also data and people’s knowledge, is what motivated the emergence of grid computing and the establishment of international virtual organisations, which replace local resident scientists. This is a major paradigm shift, which requires scientists to become experts in complex computing methods
23
The challenges (still) in e_Science The applied scientist is obliged to become a computer scientist as well. Far too much time is spent developing often over-engineered computing solutions, distracting the applied scientist from their primary mission. This has shifted the conventional scientific computing paradigm and could limit scientific discovery in the future and produce major setbacks
24
The Problem for the e-Scientist Data ingest; Managing Petabytes; Common schemas; How to organize it? How to reorganize it? How to coexist & cooperate with others? Data Query and Visualization tools; Support/training; Performance: Execute queries in a minute, Batch (big) query scheduling. [Diagram labels: Experiments & Instruments, Simulations, Literature, Other Archives, facts, questions, answers]
25
25 Can “Here and Now” technologies accelerate discovery? Can “Business” Tools and techniques for dealing with be used in scientific research to allow researchers to be scientists and not computer scientists…
26
26 Computational Modeling Real-world Data Interpretation & Insight Persistent Distributed Data Workflow, Data Mining & Algorithms
27
27 Computational Modeling Real-world Data Interpretation & Insight Persistent Distributed Data Workflow, Data Mining & Algorithms
28
Conclusion We need to advance in making computing easy to use, so that scientists can concentrate their energy on their science rather than on the computing tools. Only in this way will e_Science be successful in accelerating discovery and producing new breakthroughs. Microsoft is investigating solutions in collaboration with leading scientists around the world through its Technical Computing Initiative
29
29 Technical Computing @ Microsoft Mission Statement: ‘Promoting Computing into Science and Science into Computing’
30
30 Four ‘Pillars’ of Technical Computing @ Microsoft Commitment to Science Global Collaboration Technology Excellence Interoperability
31
31 Technical Computing at Microsoft Advanced Computing for Science and Engineering Application of new algorithms, tools and technologies to scientific and engineering problems High Productivity Computing Application of high performance clusters, information worker tools and database technologies to industrial and scientific applications Radical Computing Research in potential breakthrough technologies
32
32 Fighting HIV with Computer Science A major problem: Over 40 million infected Drug treatments are effective but are an expensive life commitment Vaccine needed for third world countries Effective vaccine could eradicate disease Methods from computer science are helping with the design of vaccine Machine learning: Finding biological patterns that may stimulate the immune system to fight the HIV virus Optimization methods: Compressing these patterns into a small, effective vaccine
33
MICROSOFT SPONSORED RESEARCH AT THE CENTER FOR BIOINFORMATICS AND GENOME BIOLOGY AND THE FUNDACION CIENCIA PARA LA VIDA, CHILE Courtesy of David Holmes
34
Technical Computing and HPC Collaborations with MS HPC product groups complement and extend the MS HPC institutes. Some examples: HPC for Aerospace at Southampton; cancer research, financial and climate modeling at Oxford OeRC; HPC for the automotive industry at HLRS Stuttgart; HPC support to computational systems biology at the MSRC joint centre with the University of Trento in Italy
35
HPC Innovation Centers: Microsoft HPC Institutes
Cornell Theory Center, Ithaca, NY, USA
University of Tennessee, Knoxville, TN, USA
TACC – University of Texas, Austin, TX, USA
University of Virginia, Charlottesville, VA, USA
University of Utah, Salt Lake City, UT, USA
Tokyo Institute of Technology, Tokyo, Japan
HLRS – University of Stuttgart, Stuttgart, Germany
Southampton University, Southampton, UK
Shanghai Jiao Tong University, Shanghai, PRC
Nizhni Novgorod University, Nizhni Novgorod, Russia
36
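[Figure: power density (W/cm²), logarithmic scale from 1 to 10,000, versus year from 1970 to 2010, for Intel processors 4004, 8008, 8080, 8085, 8086, 286, 386, 486 and Pentium®, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle and the Sun’s surface] Intel Developer Forum, Spring 2004 - Pat Gelsinger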
37
Radical Computing: The End of Moore’s Law? Future of silicon chips: “100’s of cores on a chip in 2015” (Justin Rattner, Intel). Challenge for IT industry and Computer Science community: Can we make parallel computing on a chip easier than message-passing? Challenge for the Scientific Community: How will the Multi-Core transition affect scientific computing?
38
Radical Computing @ BSC Major collaboration at the Barcelona Supercomputing Center (Prof. Mateo Valero) on development of a S/W environment to support many-core and multicore architectures, in collaboration with Microsoft Research in Cambridge
39
Summary Microsoft wishes to work with the university research and business communities to: develop interoperable high-level services, work flows, tools and data services (make computing easy); accelerate progress in a small number of societally important scientific applications (make a difference); explore radical new directions in computing and ways and applications to exploit on-chip parallelism. www.microsoft.com/science