eScience Meeting, Edinburgh, November 2006. CARMEN: Code Analysis, Repository and Modelling for e-Neuroscience. Jim Austin, Colin Ingram, Leslie Smith, et al.

Presentation transcript:

Slide 1: CARMEN: Code Analysis, Repository and Modelling for e-Neuroscience
eScience Meeting, Edinburgh, November 2006
Jim Austin, Colin Ingram, Leslie Smith, Paul Watson, Stuart Baker, Roman Borisyuk, Stephen Eglen, Jianfeng Feng, Kevin Gurney, Tom Jackson, Marcus Kaiser, Phillip Lord, Stefano Panzeri, Rodrigo Quian Quiroga, Simon Schultz, Evelyne Sernagor, V. Anne Smith, Tom Smulders, Miles Whittington.

Slide 2: The CARMEN Project
CARMEN is a new e-Science Pilot Project (UK research council funded) in Neuroinformatics.
Objectives:
- To create a grid-enabled, real-time virtual laboratory environment for neurophysiological data
- To develop an extensible toolkit for data extraction, analysis and modelling
- To provide a repository for archiving, sharing, integration and discovery of data
- To achieve wide community and commercial engagement in developing and using CARMEN
CARMEN is a 4-year project: if it is to last longer, it must become financially self-sufficient.
[Figure labels: neurone 1, neurone 2, neurone 3]

Slide 3: CARMEN Consortium: Leadership & Infrastructure
Colin Ingram, Paul Watson, Leslie Smith, Jim Austin

Slide 4: CARMEN Consortium: Work Packages
[Figure labels: University of St Andrews, The University of Sheffield]

Slide 5: Background: What is Neuroinformatics?
Informatics applied to Neuroscience (of all sorts)
- Experimental Neuroscience
  - Data recording and data analysis have used computers for a long time.
  - But a great deal more can be achieved by pooling data and analysis services.
- Cognitive and Computational Neuroscience
  - Modelling
  - Matching models to more experimental data
  - Matching models to known appropriate behaviour
  - Defining and running more sophisticated models
  - Running models in real time
- Clinical Neuroscience
  - Data-based understanding of neuropathology
  - Neuropharmaceutical assays and assessment

Slide 6: What is Neuroinformatics bringing to Experimental Neuroscience?
Getting leverage from e-Science capabilities to allow better use of data.
Example: dataset re-use
- The experimenter does the experiment, records data, analyses the data, writes the paper, and perhaps makes the data available to a small number of colleagues.
- …and then? The dataset languishes, first on a spinning disk, then on some DVDs, and later still is lost to view as the experimenter changes lab…
- Yet the data could be of use to other researchers…

Slide 7: What are the basic problems holding back dataset re-use? (1)
Two major technical problems: data format and metadata.
Data format
- There are different systems for neuroscience data collection.
- The data format is a particular structure. The structure may be:
  - Proprietary: defined by a particular piece of software, and not made public
  - Locally generated: defined by a locally written piece of software, but not necessarily well documented
  - Public, but no suitable converter exists for the intending user

Slide 8: What are the basic problems holding back dataset re-use? (2)
Metadata problems
- The data itself is useless unless the re-user knows exactly what the data represents. (Presumably the experimenter knew.) But did they record this information in an accessible way?
- Metadata is data about the dataset:
  - How was it generated?
  - What were the experimental conditions?
  - What was the culture, or what preparation, or what animal…?
  - What was the temperature of the recording?
  - Etc.
- If the data is to be readily re-used, these metadata problems need to be solved in a directly usable way.
  - Simply describing the protocol in English is not enough: we can't automate reading J. Neurosci. yet!
  - There needs to be an automatically processable way of describing the experimental protocol.
  - This is particularly true if datasets are used for a large-scale survey of data, e.g. for data mining.
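The call for an "automatically processable" description can be illustrated with a minimal sketch. The field names, the JSON encoding and the example values are assumptions made for illustration only; they are not the CARMEN metadata schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class RecordingMetadata:
    """Hypothetical machine-readable description of one recording."""
    preparation: str            # culture, slice preparation, animal, ...
    temperature_celsius: float  # temperature of the recording
    sampling_rate_hz: float
    protocol: str               # free-text summary kept for human readers


def to_json(meta: RecordingMetadata) -> str:
    """Serialise the metadata so that software, not only people, can read it."""
    return json.dumps(asdict(meta), sort_keys=True)


def matches(meta_json: str, **conditions) -> bool:
    """Return True if every requested field holds the requested value."""
    record = json.loads(meta_json)
    return all(record.get(key) == value for key, value in conditions.items())


# Illustrative values only; they do not describe a real experiment.
example = RecordingMetadata("retinal slice", 34.0, 25000.0, "spontaneous activity")
encoded = to_json(example)
```

Because each condition is a named field rather than a sentence in a methods section, a survey tool could filter thousands of such records without a human reading any of them.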

Slide 9: Enabling Neuroinformatics-based collaboration: solving data format problems
- Force users to adopt a common format?
  - Alienates users: they won't do it unless they can see real benefits
- Support documented formats
  - Adopt a common internal format, providing translators to and from this format
- Rely on proprietary format owners to come aboard because of customer pressure
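A minimal sketch of the "common internal format with translators" option follows. The two formats, the internal representation (a list of samples in volts) and all function names are invented for illustration; the slide does not specify any of them.

```python
from typing import Callable, Dict, List

# Assumed internal representation: a list of samples in volts.
Internal = List[float]

# Hypothetical registries: format name -> reader or writer.
_readers: Dict[str, Callable[[str], Internal]] = {}
_writers: Dict[str, Callable[[Internal], str]] = {}


def register(name: str,
             reader: Callable[[str], Internal],
             writer: Callable[[Internal], str]) -> None:
    """Add one translator pair (to and from the internal format)."""
    _readers[name] = reader
    _writers[name] = writer


def convert(text: str, source: str, target: str) -> str:
    """Translate between any two registered formats via the internal format."""
    return _writers[target](_readers[source](text))


# Two toy formats, invented for this example.
register("csv_volts",
         lambda text: [float(v) for v in text.split(",")],
         lambda data: ",".join(repr(v) for v in data))
register("lines_millivolts",
         lambda text: [float(v) / 1000.0 for v in text.splitlines()],
         lambda data: "\n".join(repr(v * 1000.0) for v in data))
```

The design point is that supporting a new format means writing one reader and one writer, rather than a separate converter for every existing format.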

Slide 10: Enabling Neuroinformatics-based collaboration: solving metadata problems
This is a difficult problem, and there are a number of attempts at solving it:
- BrainML (Cornell), brainml.org: a developing initiative to provide a standard XML metaformat for exchanging neuroscience data. It focuses on layered definitions built over a common core in order to support community-driven extension.
- NeuroML: an XML-based description language for defining and exchanging neuronal cell, network and modelling data, including reconstructions of cell anatomy, membrane physiology, electrophysiological data, network connectivity, and model specification.
Relevant not only for Neuroinformatics and experimental neuroscience: this is part of a cross-cutting problem for all aspects of neuroscience.

Slide 11: Solving metadata problems (continued)
As well as metadata systems for neuronal systems, there are related metadata systems which can be used by BrainML and NeuroML:
- ChannelML: for defining ion channel models
- MorphML: for defining the morphology of a neuron
- SBML (Systems Biology Markup Language): models of biochemical reaction networks. SBML is particularly well advanced.
- CellML: to store and exchange computer-based mathematical models
- MathML: for describing mathematical notation and capturing both its structure and content
Metadata is a big but soluble problem. It is a multi-level problem, but the systems above provide a multi-level solution.

Slide 12: Enabling Neuroinformatics-based collaboration: sociological problems
- There is a reluctance to permit re-use amongst some experimental neuroscientists.
- What do experimental neuroscientists get from allowing others to reuse their data?
  - If the answer is only better science, then some experimental neuroscientists will not come on board.
  - They need to be convinced that sharing their hard-earned datasets will be of benefit to them:
    - Names on papers?
    - The ability to be involved in the further research?
    - At the very least, some credit!
- Some neuroscientists fear that their data will be used without their knowledge.
- There is therefore some reticence amongst the experimental neuroscience community.

Slide 13: Solving sociological problems
- There are technical aspects to solutions:
  - Security aspects of the holding of data: ensure that datasets can be secured, for example so that they can only be re-used with the experimenter's permission.
  - Security is critically important for data which is still being analysed prior to publication.
- …and non-technical aspects too:
  - Bringing experimental neuroscientists on board: ensuring that the Neuroinformatics community is properly cross-disciplinary, with good representation from the experimentalists.
  - Getting journals on-side: many journals are demanding that raw/processed data be made available in order to check results.
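The requirement that a dataset "can only be re-used with the experimenter's permission" might be sketched as follows. The class, its fields and the grant rule are hypothetical; the slides do not describe CARMEN's actual security mechanism.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Dataset:
    """Hypothetical dataset record carrying its own access list."""
    owner: str                                       # the experimenter
    granted: Set[str] = field(default_factory=set)   # users allowed to re-use

    def grant(self, requester: str, approved_by: str) -> None:
        """Record a permission; only the owner is allowed to approve it."""
        if approved_by != self.owner:
            raise PermissionError("only the owner may grant access")
        self.granted.add(requester)

    def can_read(self, user: str) -> bool:
        """The owner always has access; others need an explicit grant."""
        return user == self.owner or user in self.granted
```

Access is denied by default, which matches the concern about data that is still being analysed prior to publication.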

Slide 14: Neuroinformatics and clinical neuroscience
- Clinical neuroscience is about treatment of:
  - Mental illness
  - Brain diseases
  - Trauma
- Neuroinformatics has major application here, ranging from 3D imaging technologies to EEG recordings: much broader than the focus of CARMEN.
- CARMEN is primarily concerned with neural recordings.
  - These can provide data on neurochemical effects on neural function.
  - Overall brain states (mental illness, disease) are believed to originate in the neurochemistry (research in depression and schizophrenia suggests this).

Slide 15: Neuropharmaceutical assays
- Neural cell cultures of different types:
  - Slice preparations
  - Cultures grown from neural cell lines
  - Cultures from neonate neurons
  …have recordings made from them with and without added neuropharmaceuticals.
- Interest is in changes in behaviour in these preparations.
- Requires instrumentation and analysis techniques
- Sharing these results can lead to major advances
  - Pharmaceutical companies are interested
  - Security implications

Slide 16: Work Packages
- WP0: Data Storage & Analysis
- WP1: Spike Detection & Sorting
- WP2: Information Theoretic Analysis of Derived Signals
- WP3: Data-Driven Parameter Determination in Conductance-Based Models
- WP4: Intelligent Database Querying
- WP5: Measurement and Visualisation of Spike Synchronisation
- WP6: Multilevel Analysis and Modelling in Networks

Slide 17: CARMEN Objectives
- Create a virtual laboratory for neurophysiological data
- Provide a repository for:
  - archiving, sharing, integration and discovery of data
  - services that operate on the data
- Develop an extensible toolkit for data extraction, analysis and modelling
- Achieve wide community and commercial engagement in CARMEN
- CARMEN must become financially self-sufficient after 4 years

Slide 18: Data in Science
Bowker's "Standard Scientific Model" [1]:
1. Collect data
2. Publish papers
3. Gradually lose the original data
Problems:
- papers often draw conclusions from unpublished data
- inability to replicate experiments
- data cannot be re-used
[1] The New Knowledge Economy and Science and Technology Policy, G.C. Bowker, E

Slide 19: but… codes can be lost too
Bowker's Model:
- Collect data
- Publish papers
- Gradually lose the original data
- Problems: papers often draw conclusions from unpublished data; inability to replicate experiments; data cannot be re-used
- Solution: Data Repositories
Computational Science:
- Write codes
- Publish papers
- Gradually lose the codes
- Problems: papers often draw conclusions from the results of unpublished codes; inability to replicate experiments; codes cannot be re-used
- Solution: Service Repositories

Slide 20: CARMEN Active Information Repository

Slide 21: Dynamic Service Deployment - Dynasoar

Slide 22: Data Exploration
- DAME developed a tool to analyse large volumes of distributed signal data
- CARMEN will extend this to:
  - allow search and management of labelled data
  - link the search results to data descriptions to allow better ranking and data analysis

Slide 23: Metadata
- Import tools enable users to describe experimental conditions
- Analysis services describe their own functionality
- Registry of data and services:
  - Is there any data captured under conditions x, y & z?
  - What services are available to process this spike train data?
- Automatic provenance generation
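The two registry questions on this slide can be sketched as simple lookups. The registry layout, the entry fields and the service names below are all hypothetical and stand in for whatever the real registry stores.

```python
from typing import Dict, List

# Hypothetical registry entries; identifiers and values are illustrative only.
DATA_REGISTRY: List[Dict[str, object]] = [
    {"id": "rec-001", "kind": "spike_train", "preparation": "slice", "temperature_c": 34},
    {"id": "rec-002", "kind": "raw_voltage", "preparation": "culture", "temperature_c": 37},
]

SERVICE_REGISTRY: List[Dict[str, str]] = [
    {"name": "sort_spikes", "accepts": "raw_voltage", "produces": "spike_train"},
    {"name": "synchrony_index", "accepts": "spike_train", "produces": "scalar"},
]


def find_data(**conditions) -> List[str]:
    """'Is there any data captured under conditions x, y and z?'"""
    return [str(entry["id"]) for entry in DATA_REGISTRY
            if all(entry.get(key) == value for key, value in conditions.items())]


def services_for(kind: str) -> List[str]:
    """'What services are available to process this kind of data?'"""
    return [service["name"] for service in SERVICE_REGISTRY
            if service["accepts"] == kind]
```

Both queries rely on data and services describing themselves in a shared vocabulary, which is why the import tools and self-describing services are listed first on the slide.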

Slide 24: e-Science Stretch
- Tool for locating patterns in time-series data across multiple levels of abstraction
- Dynamic service provisioning over a grid
- Extensible, standardised metadata for neuroscience
- Fine-grained access control
- Integrating data from multiple repositories

Slide 25: Work Packages
- WP0: Data Storage & Analysis
- WP1: Spike Detection & Sorting
- WP2: Information Theoretic Analysis of Derived Signals
- WP3: Data-Driven Parameter Determination in Conductance-Based Models
- WP4: Intelligent Database Querying
- WP5: Measurement and Visualisation of Spike Synchronisation
- WP6: Multilevel Analysis and Modelling in Networks

Slide 26: Spike Detection & Sorting (WP1: Stirling & Leicester)
[Figure labels: Analogue recording (digitised); Cluster 1; Cluster 2; Doesn't fit!]
(Clustering using wave_clus)

Slide 27: CARMEN and spike detection and sorting
- The idea is to provide many services:
  - Several different types of spike detection algorithms
  - Several different types of spike sorting techniques (including different types of data reduction, as well as different types of clustering)
- Allow the user to test with a variety of techniques, and then choose the techniques they prefer
- High-speed links should allow immediate transfer of some datasets to Grid-based systems
  - Allow the experimentalist to choose near-real-time detection and sorting for immediate feedback, to assist during the experiment
  - Slower (and more effective) techniques for later off-line analysis
- Allow comparison of different techniques on a wide variety of data: which is best, and for what?
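As one example of what a single spike detection service might compute, here is a minimal amplitude-threshold sketch. It uses a median-based noise estimate of the kind described for wave_clus (threshold at a multiple of median(|x|) / 0.6745); the function name, the default parameters and the refractory rule are assumptions, and this is not presented as the CARMEN service itself.

```python
from statistics import median
from typing import List


def detect_spikes(signal: List[float], k: float = 4.0, refractory: int = 3) -> List[int]:
    """Return sample indices where the signal exceeds k times the noise estimate.

    The noise level is estimated as median(|x|) / 0.6745, which is less
    inflated by the spikes themselves than the standard deviation would be.
    After each detection, crossings within `refractory` samples are ignored.
    """
    sigma_n = median([abs(x) for x in signal]) / 0.6745
    threshold = k * sigma_n
    spikes: List[int] = []
    last = -refractory - 1
    for i, x in enumerate(signal):
        if x > threshold and i - last > refractory:
            spikes.append(i)
            last = i
    return spikes
```

A fast rule like this suits the near-real-time use on the slide; the slower, more effective techniques would replace or follow it in off-line analysis.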

Slide 28: Information Theoretic Analysis of Electrically- and Optically-Derived Signals (WP2: Imperial College, Manchester, and UCL)
- Action potentials will be detected both electrically and optically.
- Action potentials (spikes) are the primary electrical communication mechanism between neurons.
- How can one interpret neuronal action potentials?
  - Information theory can be used to establish the neuronal code by quantifying how much information is carried by different potential neuronal coding mechanisms.
  - Issues: sampling problems, spike correlation, multimodal recording
- By using Grid technology, we can assemble large quantities of optical and multi-electrode recordings and apply existing and novel techniques to their analysis.
- We can make these techniques available as services.
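"Quantifying how much information is carried" usually means estimating the mutual information between stimulus and response. The sketch below is the simplest plug-in estimator over discrete (stimulus, response) pairs; the function name and the toy labels are assumptions, and the slides do not say which estimators WP2 uses.

```python
from collections import Counter
from math import log2
from typing import Hashable, Iterable, Tuple


def mutual_information(pairs: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Plug-in estimate of I(S;R) in bits from observed (stimulus, response) pairs.

    I(S;R) = sum over (s, r) of p(s, r) * log2( p(s, r) / (p(s) * p(r)) ),
    with every probability replaced by its observed frequency.
    """
    observed = list(pairs)
    n = len(observed)
    joint = Counter(observed)
    stim = Counter(s for s, _ in observed)
    resp = Counter(r for _, r in observed)
    return sum(
        (count / n) * log2(count * n / (stim[s] * resp[r]))
        for (s, r), count in joint.items()
    )
```

With few trials this estimator is biased upwards, which is one form of the "sampling problems" listed on the slide and one reason pooling large quantities of recordings helps.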

Slide 29: Data-Driven Parameter Determination in Conductance-Based Models (WP3: Sheffield: Gurney et al.)
- Neuron modelling uses conductance-based models.
  - Many ionic species cross the neural membrane.
  - Ion channels embedded in the membrane accomplish this transport.
  - There are many different ion channel types.
  - Setting the parameters for each type would enable better understanding of neuron operation.
- Determining parameters for neural models is difficult and requires a great deal of data.
- The parameters are not constant, and vary with (e.g.):
  - Cell type
  - Presence and concentration of neuromodulators
  - Temperature
- CARMEN aims to provide this volume of data, and hence to enable many of these parameters to be determined.
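For orientation, a single-compartment conductance-based model is commonly written in the following textbook form. The symbols are assumptions introduced here and do not come from the slides.

```latex
C_m \frac{dV}{dt} = -\sum_{i} \bar{g}_i \, m_i^{p_i} h_i^{q_i} \, (V - E_i) + I_{\mathrm{ext}}(t)
```

Here $C_m$ is the membrane capacitance, $V$ the membrane potential, $\bar{g}_i$ the maximal conductance of ion channel type $i$, $m_i$ and $h_i$ its activation and inactivation gating variables (with integer exponents $p_i$, $q_i$), $E_i$ its reversal potential, and $I_{\mathrm{ext}}$ the injected current. The quantities to be determined from data include each $\bar{g}_i$ and the kinetics of $m_i$ and $h_i$, which is why the number of parameters grows with the number of channel types.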

Slide 30: Compartmental modelling of morphology
[Figure labels: Real neural morphology; approximation; (Passive) electrical equivalent]

Slide 31: Fitting the model to current clamp data
[Figure labels: Data (Wilson); Model; scale bars 100 ms, 50 mV]
Wood et al., Neurocomputing, 2003

Slide 32: Measurement and Visualisation of Spike Synchronisation (WP5: Newcastle, Plymouth)
- More advanced analytical techniques are required to handle large-scale, simultaneous recordings arising from MEAs.
- Visualisation is critical to understanding what is happening.
- WP5 aims to:
  - develop reliable and robust analysis techniques to address these issues, particularly sweeping statistical methods to test if measures show significant changes
  - develop novel visualisation methods for displaying the results from these techniques, particularly those working in high-dimensional space
  - conduct real-time analysis of spike coding through a distributed Grid-enabled virtual laboratory
- Advanced (but fast) visualisation techniques are important to the whole community using CARMEN.

Slide 33: Gravitational Clustering
Particle aggregation in gravitational clustering (Gerstein and Lindsey (2006)). Each particle represents a cell; the charge on each cell is incremented with each spike, and a force occurs between particles dependent on the charges.
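The description above can be turned into a deliberately simplified sketch. The exponential charge decay, the update rule in which positions move in proportion to the force, and the parameter values are assumptions made for illustration; the published algorithm has further refinements (for example in how charges are normalised) that are omitted here.

```python
from math import sqrt
from typing import List


def gravitational_clustering(spikes: List[List[int]], decay: float = 0.5,
                             gain: float = 0.01) -> List[List[float]]:
    """Move one particle per cell according to pairwise charge products.

    spikes[i][t] is 1 if cell i fired in time bin t, else 0.
    Returns the final particle positions (one coordinate per cell).
    """
    n = len(spikes)
    steps = len(spikes[0])
    # Particle i starts on the i-th unit vector, so all pairs are equidistant.
    pos = [[1.0 if d == i else 0.0 for d in range(n)] for i in range(n)]
    charge = [0.0] * n
    for t in range(steps):
        # Charges decay, then are incremented by each spike in this bin.
        charge = [decay * q + spikes[i][t] for i, q in enumerate(charge)]
        moves = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                diff = [pos[j][d] - pos[i][d] for d in range(n)]
                dist = sqrt(sum(x * x for x in diff))
                if dist < 1e-9:
                    continue
                # Step towards particle j, scaled by the product of charges.
                strength = gain * charge[i] * charge[j] / dist
                for d in range(n):
                    moves[i][d] += strength * diff[d]
        pos = [[pos[i][d] + moves[i][d] for d in range(n)] for i in range(n)]
    return pos


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two particle positions."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cells that tend to fire together have large charges at the same moments, so their particles attract more strongly and aggregate; inspecting the pairwise distances over time then reveals groups of synchronised cells.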

Slide 34: Multilevel Analysis and Modelling in Networks (WP6: Newcastle, St. Andrews, Cambridge)
- This WP aims to integrate the work of WP1-4, using the technology of WP0.
- Understanding activity dynamics within neuronal networks is a major challenge in neuroscience, and requires simultaneous recording from large numbers of neurons.
- This WP will provide:
  - integration of existing and novel network analysis techniques into CARMEN in order to build comprehensive models of network dynamics
  - data of exceptional quality and detailed provenance for the CARMEN repository, for analysis of network properties
  - development of new dynamic Bayesian network algorithms to trace paths of neural information flow in networks
- For example: waves of activity in early turtle retina, recorded using Ca++ sensitive dye. (Thanks to Evelyne Sernagor, ION, Newcastle University)

Slide 37: Concluding remarks
- CARMEN is a recent project (funding started October 2006).
  - The baseline support technology is still being assembled.
- It's not the first attempt at making neurophysiological recordings re-usable.
  - But CARMEN will contain more than recordings: services, workflows, and the capability of using multiple data formats.
- CARMEN builds on earlier e-Science projects.
  - Re-use, not re-invention
- We have experimental neuroscientists, informaticians, and computational neuroscientists all on board.
  - Tackling the broad range of issues from multiple perspectives.