An Introduction to Taverna Dr. Georgina Moulton and Stian Soiland The University of Manchester


1 An Introduction to Taverna Dr. Georgina Moulton and Stian Soiland The University of Manchester (georgina.moulton@manchester.ac.uk; ssoiland@cs.man.ac.uk) (on behalf of the myGrid team)

2 Outline of the day Introduction to workflows Introduction to Taverna –Case-studies Hands-on Taverna workshop –Build your own workflows –Explore features of Taverna Taverna in a little more detail

3 What you will learn No prior knowledge of workflow technology is required. By the end of the tutorial participants will know how to –install the workbench software, import and run existing workflows and build their own from components available on the public internet. –use the semantic search technologies in myGrid to assist this process by enabling service discovery –do basic troubleshooting of workflows using Taverna's fault tolerance and debug mechanisms –manage the import and export of data to and from the workflow system.

4 What is Taverna? Taverna enables the interoperation between databases and tools by providing a toolkit for composing, executing and managing workflow experiments Access to local and remote resources and analysis tools Automation of data flow Iteration over large data sets

5 Workflows Workflow language specifies how processes (web services) fit together. Describes what you want to do, not how you want to do it. High level workflow diagram separated from any lower level coding – you don’t have to be a coder to build workflows. Workflow is a kind of script or protocol that you configure when you run it. –Easier to explain, share, relocate, reuse and repurpose. –Workflow model –Workflow is the integrator of knowledge [Diagram labels: Sequence, Repeat Masker Web Service, GenScan Web Service, Blast Web Service, Predicted Genes out]

6 Two types of workflows Data workflows –A task is invoked once its expected data has been received, and when complete passes any resulting data downstream Control workflows –A task is invoked once the tasks it depends on have completed [Diagram: tasks A, B, C, D, E, F]
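The data-workflow rule on this slide (a task fires once its expected data has arrived, then passes its result downstream) can be sketched in a few lines of Python. This is an illustration only, not Taverna's enactor: the function name `run_dataflow`, the task names and the toy string operations are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# A task is described by the names of the data items it needs and a function.
Task = Tuple[List[str], Callable]


def run_dataflow(tasks: Dict[str, Task], initial: Dict[str, object]):
    """Run each task as soon as all of its inputs are available.

    The result of a task is stored under the task's own name, so that
    downstream tasks can list that name as one of their inputs.
    """
    data = dict(initial)
    pending = dict(tasks)
    order = []
    while pending:
        ready = [name for name, (inputs, _) in pending.items()
                 if all(i in data for i in inputs)]
        if not ready:
            # Nothing can fire: some input is never produced.
            raise ValueError("tasks with missing inputs: " + ", ".join(sorted(pending)))
        for name in ready:
            inputs, func = pending.pop(name)
            data[name] = func(*(data[i] for i in inputs))
            order.append(name)
    return data, order


# Hypothetical three-task workflow: A and B both read the input, C joins them.
example_tasks = {
    "C": (["A", "B"], lambda a, b: a + b),
    "A": (["sequence"], str.upper),
    "B": (["sequence"], lambda s: s[::-1]),
}
results, firing_order = run_dataflow(example_tasks, {"sequence": "acgt"})
print(firing_order, results["C"])
```

Note that the order in which tasks are declared does not matter; C is listed first but only fires after A and B have produced their data.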

7 Williams-Beuren Syndrome (WBS) Contiguous sporadic gene deletion disorder 1/20,000 live births, caused by unequal crossover (homologous recombination) during meiosis Haploinsufficiency of the region results in the phenotype Multisystem phenotype – muscular, nervous, circulatory systems Characteristic facial features Unique cognitive profile Mental retardation (IQ 40-100, mean~60, ‘normal’ mean ~ 100 ) Outgoing personality, friendly nature, ‘charming’

8 Williams-Beuren Syndrome Microdeletion [Figure: physical map of the WBS region on Chr 7 (~155 Mb), showing the ~1.5 Mb microdeletion at 7q11.23, duplicated blocks A, B and C (cen, mid and tel copies), WBS and SVAS patient deletions, clones CTA-315H11 and CTB-51J22, and a ‘Gap’. Gene labels: GTF2I, RFC2, CYLN2, GTF2IRD1, NCF1, WBSCR1/E1f4H, LIMK1, ELN, CLDN4, CLDN3, STX1A, WBSCR18, WBSCR21, TBL2, BCL7B, BAZ1B, FZD9, WBSCR5/LAB, WBSCR22, FKBP6, POM121, NOLR1, GTF2IRD2, WBSCR14, STAG3, PMS2L, FKBP6T, GTF2IP, NCF1P, GTF2IRD2P] Eichler E, Clark R & She X. An Assessment of the Sequence Gaps: Unfinished Business in a Finished Human Genome. Nature Reviews Genetics (2004) 5:345-354. Hillier L et al. The DNA Sequence of Human Chromosome 7. Nature (2003) 424:157-164

9 Filling a genomic gap in silico Two steps to filling the genomic gap: 1. Identify new, overlapping sequence of interest 2. Characterise the new sequence at nucleotide and amino acid level Number of issues if we are to do it the traditional way: 1. Frequently repeated – info rapidly added to public databases 2. Time consuming and mundane 3. Don’t always get results 4. Huge amount of interrelated data is produced

10 Traditional Bioinformatics
12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt
12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt
12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct
12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt
12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt
12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt
12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg
12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga
12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc
12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa
12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa

11 Requirements Automation Reliability Repeatability Few programming skills required Works on distributed resources

12 The Williams Workflows A: Identification of overlapping sequence B: Characterisation of nucleotide sequence C: Characterisation of protein sequence

13 The Biological Results [Figure labels: CTA-315H11, CTB-51J22, ELN, WBSCR14, RP11-622P13, RP11-148M21, RP11-731K22] 314,004 bp extension. All nine known genes identified (40/45 exons identified): CLDN4, CLDN3, STX1A, WBSCR18, WBSCR21, WBSCR22, WBSCR24, WBSCR27, WBSCR28. Four workflow cycles totalling ~10 hours. The gap was correctly closed and all known features identified

14 Workflow Advantages Automation –Capturing processes in an explicit manner –Tedium! Computers don’t get bored/distracted/hungry/impatient! –Saves repeated time and effort Modification, maintenance, substitution and personalisation Easy to share, explain, relocate, reuse and build Releases Scientists/Bioinformaticians to do other work Record –Provenance: what the data is like, where it came from, its quality –Management of data (LSID - Life Science Identifiers)

15 Benefit to the Scientist? Automated plumbing –Systematic. Making boring stuff easier so can do more funky stuff. Data chaining replaces manual hand-offs. Accelerated creation of results. Repetitive and unbiased analysis. Potentially reproducible but not always. Easier to use (but maybe not design) –Gives non-developers access to sophisticated codes and applications. Avoids need to download-install-learn how to use someone else's code. A framework to leverage a community’s applications, services, datasets and codes –Honours original codes and applications. Heterogeneous coding styles and tools sets. The best applications. –Promoting community metadata and common formats & standards A framework for extensibility, adaptability & innovation. –Add my code, reuse and repurpose

16 It’s more than plumbing…. Workflows are protocols and records. –Explicit and precise descriptions of a scientific protocol –Scientific transparency. Easier to explain, share, relocate, reuse and repurpose and remember. –Provenance of results for credibility. Workflows are know-how. –Specialists create applications; experts design and set parameters; inexperienced punch above their weight with sophisticated protocols Workflows are collaborations. –Multi-disciplinary workflows promote even broader collaborations.

17 In silico experiment lifecycle

18 Finding and Sharing Tools [Diagram labels: Taverna Workbench, 3rd Party Applications and Portals, Workflow Enactor, Service Management, Results Management, Log, Metadata, Default Data Store, Custom Store, DAS, KAVE, BAKLAVA, Feta, myExperiment, Utopia, Clients, LSIDs, Workflow enactor] Part of a bigger picture (which we will talk about in more detail later)

19 Taverna Workflow Workbench

20 Taverna Taverna is : –A workflow language based on a dataflow model. –A graphical editing environment for that language. –An invocation system to run instances of that language on data supplied by a user of the system. When you download it you get all this rolled into a single piece of desktop software The enactor can be run independently of the GUI Java based, runs on Windows, Mac OS, Linux, Solaris …. It doesn't necessarily run "on a grid". Can be used to access resources, either on a grid, or anywhere else.

21 OMII-UK Funded through the Open Middleware Infrastructure Institute (OMII-UK) as part of the my Grid project run by Carole Goble Four years old, funding secured through 2008 and beyond. Development team at Manchester & Hinxton, UK Wide group of ‘friends and allies’ across the world particularly within UK eScience Implemented in Java, released under LGPL licence.

22

23 Biomart query Soaplab operation wrapping an EMBOSS tool Workflow diagram Tree view of workflow structure Available services Version 1.5.1 Shown running on a Mac but written in Java, Runs & developed on Windows, OS X and Linux.

24 An Open World Open domain services and resources. Taverna accesses 3500+ operations. Third party. All the major providers –NCBI, DDBJ, EBI … Enforce NO common data model. Quality Web Services considered desirable.

25 Services Taverna can interoperate the following by default : –SOAP based web services –Biomart data warehouses –Soaplab wrapped command line tools –BioMoby services and object constructors (talk tomorrow) –Inline interpreted scripting (Java based) Other service classes can be added through an extension point (but you probably don’t need to)

26

27 Multi-disciplinary ~37000 downloads. Ranked 210 on SourceForge. Users in US, Singapore, UK, Europe, Australia. Systems biology, Proteomics, Gene/protein annotation, Microarray data analysis, Medical image analysis, Heart simulations, High throughput screening, Phenotypical studies, Plants, Mouse, Human, Astronomy, Aerospace, Dilbert Cartoons

28 What do Scientists use Taverna for? Data gathering and annotating –Distributed data and knowledge Data analysis –Distributed analysis tools and Data mining and knowledge management –Hypothesis generation and modelling

29 Case Study – Graves Disease Autoimmune disease that causes hyperthyroidism Antibodies to the thyrotropin receptor result in constitutive activation of the receptor and increased levels of thyroid hormone Original myGrid Case Study Ref: Li P, Hayward K, Jennings C, Owen K, Oinn T, Stevens R, Pearce S and Wipat A (2004) Association of variations in NFKBIE with Graves' disease using classical and myGrid methodologies. UK e-Science All Hands Meeting 2004

30 Graves Disease The experiment: Analysing microarray data to determine genes differentially-expressed in Graves Disease patients and healthy controls Characterising these genes (and any proteins encoded by them) in an annotation pipeline From affymetrix probeset identifier, extract information about genes encoded in this region. For each gene, evidence is extracted from other data sources to potentially support it as a candidate for disease involvement

31 Annotation Pipeline Evidence includes: SNPs in coding and non-coding regions Protein products Protein structure and functional features Metabolic Pathways Gene Ontology terms

32

33 Data Analysis Access to local and remote analysis tools You start with your own data / public data of interest You need to analyse it to extract biological knowledge

34 Case study: Investigating Genotype-Phenotype Correlations in Trypanotolerance Fisher P, Hedeler C, Wolstencroft K, Hulme H, Noyes H, Kemp S, Stevens R, Brass A. (2007) A systematic strategy for large-scale analysis of genotype phenotype correlations: identification of candidate genes involved in African trypanosomiasis. Nucleic Acids Res. 35(16):5625-33

35 Which genes are between two genes? Which genes are up-regulated in a data set? In which pathways is a set of genes involved? Why is one mouse resistant and another one susceptible? What did the immune system of the susceptible mouse do inappropriately? Which of the strain differences between resistant/susceptible mice are significant? Which of the differently activated pathways in resistant/susceptible mice are significant? Top-down: Immunology-driven SusceptibilityInfected host Which genes in a region are differently expressed in resistant/ susceptible mice and have SNPs? What are the expression levels of all genes involved in a particular pathway in resistant/susceptible mice? Which strain differences can be found in resistant/ susceptible mice? Which pathways are differently activated in resistant/susceptible mice? Simple Composition Biological Lessons General Complex Bottom-up: Data driven Genome Transcriptome Pathway Data sources

36 Bioinformatics Challenges Linking from genotype to phenotype –Integrated ‘omics (GIMS) –Microarray analysis –Working with the literature –Presentation of results to non-bioinformaticians –Separating cause and effect

37 Genotype to Phenotype

38 GenotypePhenotype ? Current Methods 200 What processes to investigate?

39 ? 200 Microarray + QTL Genes captured in microarray experiment and present in QTL (Quantitative Trait Loci ) region Genotype Phenotype Metabolic pathways Phenotypic response investigated using microarray in form of expressed genes or evidence provided through QTL mapping

40 Trypanosomiasis (“Sleeping sickness”) Trypanosoma species parasite Human sleeping sickness => T.brucei Cattle => T.congolense and T.vivax Major problem for cattle production in sub-Saharan Africa Symptoms include: –severe anaemia –weight loss –foetal abortion –cachexia and associated intermittent fever –oedema –general loss of condition Some breeds of cattle tolerate mild and moderate infections

41 Trypanosomiasis Quantitative Trait Loci data available for cattle and mouse Issues – to identify the genetic difference responsible for resistance and breed them into productive cattle. Only need to be right (not for the right reasons) Model system in mice

42 Mouse Model A/J or Balb/C strains are susceptible C57BL/6 (B6) are resistant QTL regions defined (Iraqi, Kemp, Gibson) –Tir1 (17.4-18.3cM Chr 17), Tir2 and Tir3 Which genes are responsible for resistance?

43 Tir1 region Contains > 130 genes including TNF and MHC region Markers not mapped Can microarray help? Issues: What tissue, what time?

44 Data used Samples were taken from Liver, Spleen and Kidney at time points 0, 3, 7, 9 and 17 days post infection for all three strains of mouse. In total 225 oligonucleotide arrays were used to capture cellular responses to infection, with 5 biological replicates per condition, and samples from 5 mice were used to create each biological replicate. LOTS OF DATA. RIGOROUS STATISTICAL ANALYSIS
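The array count on this slide can be checked by enumerating the experimental design: 3 tissues × 5 time points × 3 strains gives 45 conditions, and 5 biological replicates per condition gives 225 arrays. The short sketch below does the enumeration; the strain names are taken from the Mouse Model slide, and the variable names are chosen for this example.

```python
from itertools import product

tissues = ["Liver", "Spleen", "Kidney"]
time_points_days = [0, 3, 7, 9, 17]
strains = ["A/J", "Balb/C", "C57BL/6"]
replicates_per_condition = 5

# One condition per (tissue, time point, strain) combination.
conditions = list(product(tissues, time_points_days, strains))
total_arrays = len(conditions) * replicates_per_condition

print(len(conditions), "conditions,", total_arrays, "arrays")
```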

45

46 CHR QTL Gene A Gene B Pathway A Pathway B Pathway linked to phenotype – high priority Pathway not linked to phenotype – medium priority Pathway C Phenotype literature Gene C Pathway not linked to QTL – low priority Genotype

47

48 Key: A – Retrieve genes in QTL region B – Annotate genes with external database Ids C – Cross-reference Ids with KEGG gene ids D – Retrieve microarray data from MaxD database E – For each KEGG gene get the pathways it’s involved in F – For each pathway get a description of what it does G – For each KEGG gene get a description of what it does

49 Workflow Breakdown Stage 1: Microarray Analyses using MADAT and R Stage 2: Finding genes in the QTL Stage 3: Finding pathways Stage 4: Ranking gene lists by SNPs in susceptible vs resistant strains

50 Finding Genes in the QTL Find where probesets are in genomic sequence Get list of genes in that region – by searching mmusculus_Ensembl and then both Uniprot and Entrez gene Get the corresponding ids from KEGG Concatenate gene lists and remove duplicates
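The last step on this slide, concatenating gene lists and removing duplicates, is a simple operation that a shim script might perform. The sketch below is one way to do it while keeping the order of first appearance; the function name and the placeholder identifiers are assumptions for illustration and are not real KEGG IDs.

```python
from typing import Iterable, List


def merge_gene_lists(*gene_lists: Iterable[str]) -> List[str]:
    """Concatenate several ID lists, keeping only the first copy of each ID."""
    seen = set()
    merged = []
    for genes in gene_lists:
        for gene in genes:
            if gene not in seen:
                seen.add(gene)
                merged.append(gene)
    return merged


# Placeholder IDs standing in for the results of the Uniprot and Entrez lookups.
from_uniprot = ["gene_1", "gene_2", "gene_3"]
from_entrez = ["gene_2", "gene_4"]
print(merge_gene_lists(from_uniprot, from_entrez))
```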

51 Finding Pathways For each gene, get a list of known pathways For each pathway get associated descriptions Merge pathways and remove duplicates

52 Finding SNPs and Ranking Genes Each QTL gene is analysed to determine any SNPs in the region Any SNPs are scored for informativeness Each gene is ranked in terms of its SNPs scores In the mouse model, strain AJ is more susceptible to trypanosome infection, so informative SNPs are those that are unique to this strain and have the same allele in all the other strains
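The informativeness rule on this slide can be written down directly: a SNP is informative when the susceptible strain carries an allele that no other strain has, and all the other strains agree with each other. The sketch below assumes a gene's score is simply its count of informative SNPs; the slide does not say how the RankGene service actually scores SNPs, so that scoring, the function names and the strain labels in the example are assumptions.

```python
from typing import Dict, List, Tuple

Snp = Dict[str, str]  # strain name -> allele observed at this SNP


def is_informative(alleles: Snp, susceptible: str = "AJ") -> bool:
    """True if the susceptible strain's allele is unique and all others agree."""
    if susceptible not in alleles:
        return False
    others = {allele for strain, allele in alleles.items() if strain != susceptible}
    return len(others) == 1 and alleles[susceptible] not in others


def rank_genes(snps_by_gene: Dict[str, List[Snp]]) -> List[Tuple[str, int]]:
    """Rank genes by number of informative SNPs (assumed scoring), highest first."""
    scores = {gene: sum(is_informative(snp) for snp in snps)
              for gene, snps in snps_by_gene.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Ties are broken alphabetically by gene name so that the ranking is deterministic.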

53 Parallelising Tasks SNP analysis and pathway analysis can happen concurrently. They are using the same initial data, but there is no dependency between them Workflows allow this parallelisation of tasks and the behaviour is often implicit – the user does not have to specify that this should happen
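In a workflow system this parallelism is implicit; in ordinary code it has to be requested. The sketch below shows the same idea explicitly with Python's standard `concurrent.futures` module: two analyses that share an input but do not depend on each other are submitted together. The two analysis functions are empty placeholders, not the real SNP or pathway services.

```python
from concurrent.futures import ThreadPoolExecutor


def run_independent(shared_input, *analyses):
    """Run analyses that share an input but not results, concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(analyses))) as pool:
        futures = [pool.submit(analysis, shared_input) for analysis in analyses]
        # Results come back in the order the analyses were given.
        return [future.result() for future in futures]


def count_snps(genes):
    # Placeholder for a remote SNP analysis call.
    return {gene: 0 for gene in genes}


def find_pathways(genes):
    # Placeholder for a remote pathway lookup.
    return {gene: [] for gene in genes}


snp_result, pathway_result = run_independent(["gene_1", "gene_2"], count_snps, find_pathways)
```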

54 Locations of Services Madat microarray software analysis package, including R statistical package – University of Manchester AffyMetrix data for mouse 430_2 BioMart mouse Ensembl database – searching Uniprot and Entrez gene – EBI, UK Kegg Gene IDs - Kanehisa Laboratory, Kyoto, Japan Kegg pathways and descriptions - Kanehisa Laboratory, Kyoto, Japan RankGene – SNP analysis service – University of Manchester

55 Workflow Features Shims Nested Workflows Iterations Output Collections

56 Shims 12 beanshell scripts in the workflow Beanshells allow users to build small, bespoke scripts for connecting incompatible services Example – ‘CreateReport’ takes the results from the BioMart mouse Ensembl query and collects them into a report

57 Fix up the services to be compatible Shims – libraries of adapters. Data incompatibility between services?

58 Nested workflows A processor can be a workflow itself. Encourages the reuse of workflows within a more complex scenario. Greater abstraction of an overall process making it more manageable.

59

60 Beanshell Mmusculus_gene_ensembl has 1 input and 7 outputs Each output has the results of iterating over each gene in the QTL The ‘CreateReport’ beanshell allows iteration 1 from each result to be collected together, followed by iteration 2 etc, so that they can be examined at a later date
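What the slide describes is a transposition: several parallel output lists, each with one entry per gene, are regrouped so that entry 1 of every list sits together, then entry 2, and so on. The actual 'CreateReport' script is a Beanshell (Java-like) script that is not shown in the slides; the Python below is only a sketch of the same regrouping, with a hypothetical function name and made-up output port names.

```python
from typing import Dict, List


def create_report(outputs: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Group the i-th entry of every output list into one record per iteration."""
    lengths = {len(values) for values in outputs.values()}
    if len(lengths) > 1:
        raise ValueError("all outputs must have one entry per iteration")
    names = list(outputs)
    rows = zip(*(outputs[name] for name in names))
    return [dict(zip(names, row)) for row in rows]


# Two of the outputs, each holding one value per gene in the QTL (made-up values).
report = create_report({
    "gene_id": ["gene_1", "gene_2"],
    "chromosome": ["17", "17"],
})
```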

61 Local Java Processor Split_by_regex Splits a list of gene identifiers into single gene identifiers for pathway analyses The list of genes produced from the QTL analysis are presented as one per line This Shim splits this file at every new line so that each gene identifier has its own file for further analysis
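The behaviour described here, splitting a one-per-line list at every new line, can be sketched with a regular expression split. This is an illustration rather than the code of Taverna's local worker: the default pattern and the choice to drop empty items (for example after a trailing newline) are assumptions.

```python
import re
from typing import List


def split_by_regex(text: str, pattern: str = r"\r?\n") -> List[str]:
    """Split text wherever the pattern matches, discarding empty pieces."""
    return [piece for piece in re.split(pattern, text) if piece]


gene_ids = split_by_regex("gene_1\ngene_2\r\ngene_3\n")
```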

62 Before Workflows Genotype and phenotype correlations are difficult - fragmented data held in numerous data resources Error prone mappings between resources Lots of repetitive operations Scientists would choose most likely candidates for further investigation based on prior knowledge or experience – potentially missing important correlations

63 Workflow Results Trypanosomiasis resistance A strong candidate gene was found –Daxx gene not found using manual investigation methods –The gene was identified from analysis of biological pathway information –Possible candidate identified by Yan et al (2004): Daxx SNP info –Sequencing of the Daxx gene in the wet lab showed mutations that are thought to change the structure of the protein –Mutation was published in scientific literature, noting its effect on the binding of Daxx protein to p53 protein – p53 plays direct role in cell death and apoptosis, one of the Trypanosomiasis phenotypes

64 Conclusions Automation has allowed a systematic analysis of a large data space The expression of genes and their pathways can be investigated with no prior knowledge – bias from prior knowledge is not introduced into the experiment The workflow is a permanent record of the experimental methods used – increasing the reproducibility and providing a starting-point for future modifications and additions to the protocols

65 An Automatic Annotation Pipeline Genome annotation pipelines – workflow assembles evidence for predicted genes / potential functions Human expert can ‘review’ this evidence before submission to the genome database Collaboration with the Bergen Center for Computational Science (computational biology unit) – Gene Prediction in Algal Viruses, a case study. Presented at NETTAB 2005 http://www.nettab.org/2005/docs/NETTAB2005_LanzenPoster.pdf

66 User Interaction Handling Interaction Service and corresponding Taverna processor allows a workflow to call out to an expert human user Used to embed the Artemis annotation editor within an otherwise automated genome annotation pipeline To set this up refer to Taverna project site Collaboration with the University of Bergen Ref: Poster, Nettab 2005

67 Iteration Repeated application of a process to multiple data items. A processor that expects a single item is given a list as input; the enactor engine invokes the processor once per item and collates the results into a new list.
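Implicit iteration can be pictured as a wrapper around each processor: if a list arrives where a single item is expected, the wrapper calls the processor once per item and gathers the results into a list of the same shape. The sketch below covers only the single-input case and is not the enactor's real implementation; `invoke` is a name chosen for this example.

```python
def invoke(processor, value):
    """Call processor on value, iterating automatically over (nested) lists."""
    if isinstance(value, list):
        return [invoke(processor, item) for item in value]
    return processor(value)


# A processor written for one sequence works unchanged on a list of sequences.
lengths = invoke(len, ["ACGT", "AC"])
```

The person designing the workflow never writes the loop; connecting a list-producing output to a single-item input is enough.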

68 Conditional Branching

69 Current Workflow Issues Web Service Stability Workflow discovery and reuse Experimental design Workflow implications

70 Web Service Stability Distributed computing –Users do not normally own the services they use –Workflow system providers do not usually own services What can we do when a service fails? –Find an alternative, add an alternative to invoke automatically –Allow users to rank services by their reliability

71 What about when a service fails? Most services are owned by other people No control over service failure Some are research level Workflows are only as good as the services they connect! To help - Taverna can: Notify failures Instigate retries Set criticality Substitute alternative services
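The four mechanisms listed here (notify, retry, criticality, alternates) combine naturally into one calling pattern. The sketch below is a generic illustration of that pattern, not Taverna's configuration or API: the function name, its parameters and the interpretation of "critical" as "abort the run if every attempt fails" are assumptions made for the example.

```python
import time


def call_with_failover(services, payload, retries=2, delay=0.0, critical=True):
    """Try each service in turn, retrying each one before moving to the next.

    services: callables, primary first, then alternates.
    retries:  extra attempts per service after the first failure.
    critical: if True, raise when everything fails; otherwise return None.
    """
    errors = []
    for service in services:
        name = getattr(service, "__name__", "service")
        for attempt in range(1, retries + 2):
            try:
                return service(payload)
            except Exception as exc:  # a real client would catch narrower errors
                errors.append(f"{name} attempt {attempt}: {exc}")  # failure notification
                time.sleep(delay)
    if critical:
        raise RuntimeError("all services failed: " + "; ".join(errors))
    return None
```

A non-critical step that returns None lets the rest of the workflow continue, at the cost of a missing result downstream.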

72

73 Technology Stability Web services are gaining in popularity, most major bioinformatics service providers supply a web service interface to their resources The Web Service Description Language (WSDL) is currently a recommendation candidate with the W3C (World Wide Web Consortium).

74 Workflow Discovery and Reuse Workflows are useful experimental artefacts Reusing or repurposing existing workflows can save time and effort and should result in the perpetuation of ‘good’ workflow design Workflow fragments are also useful for starting new experiments

75 Workflows are hard work Often complex. –Need intelligent steering and analysis. –Need explanations to ensure used properly and safely. Challenging and expensive to develop. –Development assistance. Don’t start from scratch. –Take a long time to build good ones & a lot of know-how. You can still build crap workflows –Enable scientists to be scientists, not programmers. –Enable scientists to be creative yet sound.

76 Workflows are commodities Valuable first class assets in their own right. –To be pooled and shared and traded and reused –Within communities and across communities –Of pieces, of wholes, of when and how to –Pattern books. Validated community workflow packs –Publish and review Enable Mediocre Scientists to do the mundane just as much as the Great and Good. –Conservation of Work principle But…. Reusability often confined to the project it was conceived. Social and technical challenges for sharing and reuse.

77 Taverna: Record, Reuse, Recycle, Repurpose Trichuris muris - the mouse whipworm Trypanosomiasis cattle workflow reused without change over a new dataset Identified the biological pathways involved in sex dependence in the mouse model, previously believed to be involved in the ability of mice to expel the parasite. A manual two year study of candidate genes had failed to do this. Paul Fisher et al A Systematic Strategy for Large-Scale Unbiased Analysis of Genotype-Phenotype Correlations Bioinformatics in review

78 Workflow Reuse – Workflows are Scientific Protocols – Share them! Addisons Disease SNP design Protein annotation Microarray analysis myGrid Workflow Repository http://workflows.mygrid.org.uk/repository

79 A workflow marketplace

80 Started February 2007 A community social network for sharing workflows A gateway to other publishing environments A platform for launching workflows Soon you will be able to see a workflow and just launch it – not just Taverna, but others like Kepler, Triana

81 Experimental Design Workflows can generate large amounts of data Gathering data is no longer a problem – but how do you analyse such large numbers of files and high volumes? –Visualisation –Building data models –Populating data schemas

82 Experimental Design - Visualisation Other tools can use workflows as background processes UTOPIA is a visualisation tool for DNA and proteins UTOPIA displays workflow results in an interactive way, allowing scientists to explore their data and initiate further additional experiments

83 UTOPIA

84 WF Execution Engine Portal Middleware Results Provenance Warehouse Resources Design GUI Application Datasets Workflow Warehouse Service / Component Catalogue Resource Information Services Provenance http://www.kooprime.com Utopia

85 Experimental Design – Building Data Models Generate EMBL records as a step in the workflow Generate GFF models as a step in the workflow Results are now just one file

86 Experimental Design – Populating Data Schemas SBML (Systems Biology Mark-up Language) is the standard format for generating and sharing systems biology data A Taverna plug-in (using libSBML) allows SBML models to be consumed or produced from workflow experiments

87 MCISB Case Study Manchester Centre for Integrative Systems Biology –Case study for their informatics infrastructure Superimposing array data onto pathway maps using Taverna workflows [Peter Li, Doug Kell, 2006]

88 MCISB Study Involves Combining public and local data Gathering, creating and storing data in SBML models Visualising results using pre-existing SBML-based tools Why? To view transcriptome data from the context of pathways To see the effects of up/down regulation on pathways

89 Pathway maps are first saved as SBML models using bespoke SBML software - CellDesigner Glycolysis pathway

90 MaxD Read gene names of enzymes from SBML Cell Designer model Query Big Expt data in MaxD using gene names Calculate colour of enzyme nodes based on mRNA expression levels Create new SBML model Transcriptome pathway workflow
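The "calculate colour of enzyme nodes" step needs some mapping from an expression value to a colour. The slides do not give the mapping that was used, so the sketch below assumes one: a log2 expression ratio is clipped to a fixed range and mapped from white (no change) towards red (up-regulated) or blue (down-regulated). The function name, the colour scheme and the default limit are all hypothetical.

```python
def expression_colour(log2_ratio: float, limit: float = 2.0) -> str:
    """Map a log2 expression ratio to a hex colour (assumed red/blue scheme)."""
    clipped = max(-limit, min(limit, log2_ratio))
    intensity = int(round(255 * abs(clipped) / limit))
    fade = 255 - intensity  # the channels that fade out as the change grows
    if clipped >= 0:
        return f"#ff{fade:02x}{fade:02x}"  # white towards red
    return f"#{fade:02x}{fade:02x}ff"      # white towards blue
```

The resulting colour string would then be written onto the corresponding enzyme node when the new SBML model is created.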

91 Integration of libSBML into Workflow

92 libSBML using the API Consumer Building the SBML model was possible because of Taverna’s extensibility Did not have to write new services – just used the API consumer to add SBML library API consumer can be downloaded from the my Grid website

93 JC_C-0.07-1_MeasurementJC_N-0.07-1_Measurement Decreased levels of GPM3 New SBML models viewed using Cell Designer

94 Results Visualisation of one data source over another by the building of a common model during the workflow run Promotes re-use by providing a simple interface for interpretation by the scientist Offers a proof of principle for the overlay of other omics data: –Proteomics data from PRIDE –Metabolomic data from MEMO

95 Transparency

96 Provenance Who, What, Where, When, Why?, How? Context Interpretation Logging & Debugging Reproducibility and repeatability Evidence & Audit Non-repudiation Credit and Attribution Credibility Accurate reuse and interpretation Smart re-running Cross experiment mining Just good scientific practice Smart Tea BioMOBY

97 Tracking Which Ensembl gene does pathway mmu004620 come from?

98 Provenance Workflow experiments can span several months, re-running the same workflow or comparing the results of several different workflows Scientists need to record: Who performed the experiment and when What the experiment was What services were invoked What the final and intermediate results were Some workflow systems enable this process provenance to be collected automatically – for Taverna, we have the my Grid LogBook
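The items on this slide (who, when, what experiment, which services, which results) map onto a small record structure. The sketch below is not the myGrid LogBook data model; it is a minimal illustration, with class and field names chosen for this example, of how recording every service invocation makes it possible to ask later which step produced a given result.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class ProcessRun:
    """One invocation of a service, with the data that went in and came out."""
    service: str
    inputs: Dict[str, str]
    outputs: Dict[str, str]


@dataclass
class WorkflowRun:
    """Who ran which workflow and when, plus every service invocation."""
    workflow: str
    experimenter: str
    started: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    process_runs: List[ProcessRun] = field(default_factory=list)

    def record(self, service: str, inputs: Dict[str, str], outputs: Dict[str, str]) -> None:
        self.process_runs.append(ProcessRun(service, dict(inputs), dict(outputs)))

    def derived_from(self, value: str) -> List[ProcessRun]:
        """Return the invocations that produced the given output value."""
        return [run for run in self.process_runs if value in run.outputs.values()]


# Placeholder names: trace a pathway result back to the gene it was looked up from.
run = WorkflowRun(workflow="qtl_to_pathways", experimenter="example_user")
run.record("lookup_pathways", {"gene": "gene_1"}, {"pathway": "pathway_1"})
sources = run.derived_from("pathway_1")
```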

99

100 runsWorkflow launchedBy Organisation provenance Workflow Experimenter Organisation belongsTo hasInput executesProcessRun e.g. web service invocation of BLAST @ NCBI iteration e.g. BLAST @ NCBI Workflow run Process ProcessRun ProcessIteration Workflow provenance workflowOutput Data Data/ knowledge provenance Atomic Data derivedFrom Knowledge statements e.g. similar_sequence_to Knowledge statements e.g. similar_sequence_to createdBy Data Collection containsData isA runsProcess hasProcesses

101 Workflow Implications Workflows generate lots of data in a model that is easy to understand More and better resources are available to more scientists Changes in communication between laboratory scientists and in silico scientists Changes in work practices towards hypothesis generation Workflows can be shared, published, verified, repurposed and reused and scientists should always receive the credit for their creation.

102 Taverna Summary Automation Implicit iteration Implicit parallelisation Support for nested workflow construction Error handling –Retry, failover and automatic substitution of alternates

103 Extensibility Accepts many types of services: - web services, beanshell scripts, local java scripts, JDBC connections…etc Easy to add your own services Plug-in architecture Easy to build new processor types Easy to extend to include alternative results viewers

104 More information… Taverna http://taverna.sourceforge.net myGrid http://www.mygrid.org.uk OMII-UK http://www.omii.ac.uk

105 Carole Goble, Norman Paton, Robert Stevens, Anil Wipat, David De Roure, Steve Pettifer OMII-UK Tom Oinn, Daniele Turi, Katy Wolstencroft, June Finch, Stuart Owen, David Withers, Stian Soiland, Franck Tanoh, Matthew Gamble Research Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, Antoon Goderis, Alastair Hampshire, Qiuwei Yu, Wang Kaixuan, Current contributors Matthew Pocock, James Marsh, Khalid Belhajjame, PsyGrid project, Bergen people, EMBRACE people User Advocates and their bosses Simon Pearce, Claire Jennings, Hannah Tipney, May Tassabehji, Andy Brass, Paul Fisher, Peter Li, Simon Hubbard, Tracy Craddock, Doug Kell Past Contributors Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizaukaus, Kevin Glover, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Ananth Krishna, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Juri Papay, Savas Parastatidis, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Martin Senger, Nick Sharman, Victor Tan, Paul Watson and Chris Wroe. Industrial Dennis Quan, Sean Martin, Michael Niemi (IBM), Chimatica, Funders EPSRC, Wellcome Trust

