Cloud Computing for e-Science with CARMEN Paul Watson Newcastle University.


1 Cloud Computing for e-Science with CARMEN Paul Watson Newcastle University

2 e-Science “e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it” (John Taylor, former Director General of the UK Research Councils)

3 Two Strands to talk...

4 Research Challenge Understanding the brain is the greatest informatics challenge. Enormous implications for science: Medicine, Biology, Computer Science

5 Collecting the Evidence 100,000 neuroscientists generate huge quantities of data – molecular (genomic/proteomic) – neurophysiological (time-series activity) – anatomical (spatial) – behavioural

6 Neuroinformatics Problems Data is: expensive to collect but rarely shared; in proprietary formats & locally described. The result is: a shortage of analysis techniques that can be applied across neuronal systems; limited interaction between research centres with complementary expertise

7 Data in Science Bowker’s “Standard Scientific Model” 1. Collect data 2. Publish papers 3. Gradually lose the original data (The New Knowledge Economy & Science & Technology Policy, G.C. Bowker) Problems: – papers often draw conclusions from data that is not published – inability to replicate experiments – data cannot be re-used

8 Codes in Science Three stages for codes 1. Write code and apply to data 2. Publish papers 3. Gradually lose the original codes Problems: – papers often draw conclusions from codes that are not published – inability to replicate experiments – codes cannot be re-used

9 CARMEN enables sharing and collaborative exploitation of data, analysis code and expertise that are not physically collocated

10 CARMEN Project UK EPSRC e-Science Pilot, £5M ( ), 20 Investigators: Stirling, St. Andrews, Newcastle, York, Sheffield, Cambridge, Imperial, Plymouth, Warwick, Leicester, Manchester

11 CARMEN Consortium Newcastle: Colin Ingram, Paul Watson, Stuart Baker, Marcus Kaiser, Phil Lord, Evelyne Sernagor, Tom Smulders, Miles Whittington; York: Jim Austin, Tom Jackson; Stirling: Leslie Smith; Plymouth: Roman Borisyuk; Cambridge: Stephen Eglen; Warwick: Jianfeng Feng; Sheffield: Kevin Gurney, Paul Overton; Manchester: Stefano Panzeri; Leicester: Rodrigo Quian Quiroga; Imperial: Simon Schultz; St. Andrews: Anne Smith

12 Industry & Associates

13 Focus on Neural Activity: cracking the neural code. Raw voltage signal data is typically collected using single or multi-electrode array recording. [Figure: voltage traces for neurone 1, neurone 2, neurone 3]

14 Epilepsy Exemplar Data analysis guides surgeon removing brain tissue WARNING! The next 2 Slides show an exposed brain

15 Epilepsy Exemplar Data analysis guides surgeon removing brain tissue. Recording from removed tissue (up to 20 GB/h). On-line analysis by distributed collaborators will enable the experiment to be defined during data collection. Repository will enable integration of rare case types from different labs. Advances in Treatment

16 e-Science Requirements Summary Sharing –data –code Capacity –vast data storage (100TB+ in CARMEN) –support data intensive analysis
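To put the capacity figure in context, here is a rough back-of-the-envelope calculation. It combines the 100 TB store with the peak recording rate of 20 GB/h quoted for the epilepsy exemplar. The decimal conversion (1 TB = 1000 GB) and the single continuously recording rig are assumptions made only for illustration.

```python
# Rough capacity estimate; the unit convention and the single-rig
# scenario are assumptions, not figures from the CARMEN project.
STORE_TB = 100            # "100TB+ in CARMEN"
RATE_GB_PER_HOUR = 20     # peak rate quoted for the epilepsy exemplar
GB_PER_TB = 1000          # decimal units (assumption)

hours_to_fill = STORE_TB * GB_PER_TB / RATE_GB_PER_HOUR
days_to_fill = hours_to_fill / 24

print(f"{hours_to_fill:.0f} hours of recording (about {days_to_fill:.0f} days)")
```

Under these assumptions a single rig recording at the peak rate would need several months of continuous operation to fill the store, so the capacity requirement comes from many labs contributing data plus the derived data produced by analysis.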

17 CARMEN Cloud Architecture Data storage and analysis User access over Internet (typically via browser) Users upload data & services Users run analyses

18 e-Science Cloud Services Amazon (& Google) offer cloud computing –Basic storage & compute services –e.g. Amazon S3 & EC2 e-Science needs a set of higher-level services to support user needs Which services?....

19 CARMEN Cloud (CAIRN) Search for Data & Analysis Code; Raw & Derived Data Store; Structured Metadata Store Enabling Search & Annotation; Analysis Code Store

20 Dynasoar Code Repository and Deployment – long term storage Code factored as Web Services – Standard (WS-I) interface – Internals not important (Java, MatLab, C, C#, C++, ...) Deployers for a variety of service types – .war files (Tomcat), Virtual Machines (VMWare, Virtual PC), .NET assemblies, database stored procedures

21 Dynasoar: Dynamic Deployment A request to s4. The deployed service remains in place and can be re-used, unlike job scheduling

22 Dynasoar A request for s2 is routed to an existing deployment of the service
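The routing behaviour on the two Dynasoar slides can be sketched as follows. This is a minimal illustration, not Dynasoar's actual implementation: the `Host` and `Router` classes, the "fewest deployed services" placement rule, and the use of plain Python callables in place of stored Web Service code are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A compute node onto which services can be deployed (hypothetical)."""
    name: str
    deployed: set = field(default_factory=set)


class Router:
    """Send a request to an existing deployment, deploying on demand otherwise."""

    def __init__(self, hosts, repository):
        self.hosts = hosts
        self.repository = repository  # service name -> callable standing in for stored code
        self.deployments = 0          # number of on-demand deployments performed

    def handle(self, service, payload):
        # Prefer a host where the service is already deployed.
        host = next((h for h in self.hosts if service in h.deployed), None)
        if host is None:
            if service not in self.repository:
                raise KeyError(f"service {service!r} is not in the repository")
            # Assumed placement rule: pick the host with the fewest services.
            host = min(self.hosts, key=lambda h: len(h.deployed))
            host.deployed.add(service)  # the deployment stays in place for later requests
            self.deployments += 1
        return host.name, self.repository[service](payload)


repository = {"s2": lambda x: x * 2, "s4": lambda x: x + 4}
router = Router([Host("node-a", {"s2"}), Host("node-b")], repository)

first = router.handle("s4", 1)     # s4 is not deployed anywhere yet
second = router.handle("s4", 10)   # re-uses the deployment made above
existing = router.handle("s2", 3)  # routed to the existing s2 deployment
```

The second request for `s4` triggers no new deployment, which is the contrast with job scheduling drawn on the slide: the cost of moving code to a host is paid once and then amortised over later requests.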

23 Performance Gains

24 Scalability

25 CARMEN Cloud (CAIRN) Search for Data & Analysis Code; Raw Signal Data Search & Visualisation; Enactment of scientific analysis processes; Raw & Derived Data Store; Security Policies Controlling Access to Data & Code; Structured Metadata Store Enabling Search & Annotation; Analysis Code Store

26 Controlled Sharing Scientist: “My collaborators can now see it”; “Everyone can see it”; “Only I am allowed to see this data”

27 Security Solution XACML – standard way to encode rules as (subject, action, resource) triples Rules checked on each access
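The idea of checking (subject, action, resource) triples on each access can be sketched in a few lines. This is a deliberate simplification: real XACML policies are XML documents with Permit/Deny effects and combining algorithms. The `Rule` type, the `*` wildcard, and the default-deny behaviour are assumptions made for illustration.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One permit rule as a (subject, action, resource) triple (hypothetical type)."""
    subject: str
    action: str
    resource: str


def _matches(pattern, value):
    # "*" is an assumed wildcard meaning "any value".
    return pattern == "*" or pattern == value


def is_permitted(rules, subject, action, resource):
    """Default-deny check: allow only if at least one rule matches the request."""
    return any(
        _matches(r.subject, subject)
        and _matches(r.action, action)
        and _matches(r.resource, resource)
        for r in rules
    )


rules = [
    Rule("alice", "*", "recording-1"),    # the owner may do anything
    Rule("bob", "read", "recording-1"),   # a collaborator may only read
]

print(is_permitted(rules, "bob", "read", "recording-1"))
print(is_permitted(rules, "bob", "write", "recording-1"))
```

Because the check runs on every access, widening access from "only me" to "my collaborators" amounts to adding a rule, with no change to the stored data.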

28 Controlled Sharing - conflicts Scientist: “My collaborators can now see it”; “Only I am allowed to see this data”. Funder: “All data must be accessible to everyone after the end of the project”

29 Addressing Conflicts Each party expresses policy as XACML rules Rules are converted to formal language –XACML -> VDM++ Run formal model to detect conflicts
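To illustrate what "detecting conflicts" means, the sketch below replaces the formal VDM++ model with a brute-force search: it enumerates every request over a small finite set of subjects, actions and resources, and reports those for which one rule permits and another denies. The `Policy` type and the example rules are hypothetical, and time conditions such as "after the end of the project" are left out.

```python
from itertools import product
from typing import NamedTuple


class Policy(NamedTuple):
    """A rule with an explicit effect, tagged with the party that wrote it."""
    owner: str     # party that expressed the rule
    effect: str    # "permit" or "deny"
    subject: str   # "*" is an assumed wildcard
    action: str
    resource: str


def applies(policy, subject, action, resource):
    return (
        policy.subject in ("*", subject)
        and policy.action in ("*", action)
        and policy.resource in ("*", resource)
    )


def find_conflicts(policies, subjects, actions, resources):
    """Return every request that is both permitted and denied by some rule."""
    conflicts = []
    for request in product(subjects, actions, resources):
        effects = {p.effect for p in policies if applies(p, *request)}
        if {"permit", "deny"} <= effects:
            conflicts.append(request)
    return conflicts


policies = [
    Policy("scientist", "permit", "scientist", "read", "dataset"),
    Policy("scientist", "deny", "public", "read", "dataset"),
    Policy("funder", "permit", "*", "read", "dataset"),
]

conflicts = find_conflicts(policies, ["scientist", "public"], ["read"], ["dataset"])
print(conflicts)
```

Exhaustive enumeration only works for small, finite request spaces; a formal model, as the slide describes, is the more general route.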

30 CARMEN CAIRN [component sources] OMII: Grimoire; DAME: Signal Data Explorer; OMII/myGrid: Taverna; OGSA-DAI, SRB, DAME; Gold: Role & Task based Security; myGrid & CISBAN; Dynasoar

31 Using CARMEN for a typical scenario 1. Data Collection from a Multi-Electrode Array 2. Data Visualisation and Exploration 3. Spike Detection 4. Spike Sorting 5. Analysis 6. Visualisation of Analysis Results Currently, this is a semi-manual process. CARMEN has automated this...
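A toy version of such an automated chain is sketched below, covering only the spike detection and analysis steps. The slides do not say which algorithms CARMEN's services use, so the threshold-crossing detector, the firing-rate analysis, the dictionary passed between stages, and all function names are assumptions chosen for brevity.

```python
from functools import reduce
from statistics import mean, pstdev


def detect_spikes(ctx, k=2.0):
    """Mark samples where the signal crosses mean + k * std upward (assumed method)."""
    signal = ctx["signal"]
    threshold = mean(signal) + k * pstdev(signal)
    spikes = [
        i for i in range(1, len(signal))
        if signal[i] >= threshold and signal[i - 1] < threshold
    ]
    return {**ctx, "spikes": spikes}


def analyse(ctx):
    """Compute the mean firing rate in spikes per second."""
    duration_s = len(ctx["signal"]) / ctx["sample_rate_hz"]
    return {**ctx, "rate_hz": len(ctx["spikes"]) / duration_s}


def run_workflow(stages, ctx):
    """Feed the output of each stage into the next one."""
    return reduce(lambda acc, stage: stage(acc), stages, ctx)


# Synthetic recording: 20 samples at 10 Hz with two obvious spikes.
signal = [0] * 20
signal[5] = 10
signal[15] = 10

result = run_workflow([detect_spikes, analyse], {"signal": signal, "sample_rate_hz": 10})
print(result["spikes"], result["rate_hz"])
```

Expressing each step as an independent stage with a common input and output shape is what allows a workflow engine to chain services written by different labs.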

32 Web Portal

33 Raw Data Exploration with Signal Data Explorer

34 Defining the process with Workflow

35 Running a Workflow

36 Running the Workflow [Architecture diagram; labels: External Client; Security; Workflow Engine; TAVERNA; Registry; Available Services; Query; Repository; Dynamically Deployed Services in Dynasoar; Spike Sorting Service; Reporting; INPUT Data; OUTPUT; Metadata; SRB; FileSystem; RDBMS]

37 Graphical Output

38 Movie Output

39 CARMEN (www.carmen.org.uk) is delivering an e-Science infrastructure that: can be applied across a diverse range of applications; uses a Cloud/Software as a Service architecture; enables cooperation and interdisciplinary working; aims to deliver new results in neuroscience, computer science and medicine

