
1 VectorBase Gene expression data in VectorBase Fotis Kafatos, George Christophides, Bob MacCallum & Seth Redmond Imperial College London (thanks also to EBI, Sanger and ND)

2 VectorBase Outline
1. Project goals
2. What’s currently available
3. Current challenges and future plans

3 VectorBase Project goals
For vector biologists:
– Easy access to gene expression data
– Consistent data processing
For array specialists:
– ArrayExpress submission
– Advanced analysis tools
– Array annotation

4 VectorBase BULK LOADER EXPRESSION DATA STORAGE & ANALYSIS
BASE: BioArray Software Environment http://base.thep.lu.se/
– Open source, active development and user community
– LIMS, data storage, export and analysis
– Web-based, user/group access control
– BASE 2.x adoption will bring Affy support

5 Data submission
– Community submission guidelines available
– First batch of experiments loaded by us
– Bulk data loader
– Sample/experiment annotation requires intervention from curators

6 VectorBase BULK LOADER EXPRESSION DATA STORAGE & ANALYSIS ArrayExpress ‘PUBLIC’ STORAGE
– Data held in BASE is largely MIAME compliant
– Script for semi-automated export in TAB2MAGE format
– One experiment submitted so far

7 VectorBase BULK LOADER EXPRESSION DATA STORAGE & ANALYSIS ArrayExpress ‘PUBLIC’ STORAGE

8 VectorBase BULK LOADER EXPRESSION DATA STORAGE & ANALYSIS ArrayExpress ‘PUBLIC’ STORAGE DATA SUMMARIES
– BASE web interface offers powerful and extendable analysis environment
– Can be used for multi-site collaborations on pre-publication data
– Steep learning curve/not 100% intuitive
– Not easily linked to
– We provide simpler views so the casual user can quickly draw biological inferences

15 VectorBase

16 Standardised data
All displayed data is processed in the same way:
1. Poor quality spots removed
– Currently using submitted spot flags
2. Normalisation
– “lowess” for two-colour experiments
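A minimal sketch of the two processing steps on this slide, assuming each spot carries red and green channel intensities plus a submitter-supplied quality flag. The `Spot` class and the `lowess_fit` and `normalise` functions are hypothetical names introduced for illustration, and the smoother is a plain single-pass locally weighted linear fit without the robustness iterations of full lowess. It is not the code used in BASE.

```python
import math
from dataclasses import dataclass


@dataclass
class Spot:
    probe: str
    red: float      # assumed Cy5 channel intensity
    green: float    # assumed Cy3 channel intensity
    flagged: bool   # True if the submitter marked the spot as poor quality


def lowess_fit(xs, ys, frac=0.5):
    """Locally weighted linear fit of ys on xs, evaluated at every x."""
    n = len(xs)
    if n == 0:
        return []
    k = min(n, max(2, math.ceil(frac * n)))  # points per local window
    fitted = []
    for x0 in xs:
        # Window half-width: distance to the k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
        # Tricube weights, zero outside the window.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalise(spots, frac=0.5):
    """Drop flagged spots, then subtract the intensity-dependent trend.

    Returns a dict mapping probe name to normalised log2(red/green).
    """
    # Step 1: remove poor quality spots using the submitted flags.
    good = [s for s in spots if not s.flagged and s.red > 0 and s.green > 0]
    # Step 2: fit M (log ratio) against A (mean log intensity) and subtract.
    a = [0.5 * math.log2(s.red * s.green) for s in good]
    m = [math.log2(s.red / s.green) for s in good]
    trend = lowess_fit(a, m, frac)
    return {s.probe: mi - ti for s, mi, ti in zip(good, m, trend)}
```

Applying the same `frac` to every experiment is what would keep the processing consistent across data sets; the value 0.5 is an arbitrary choice for this sketch.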

17 VectorBase

18 BULK LOADER EXPRESSION DATA STORAGE & ANALYSIS ArrayExpress ‘PUBLIC’ STORAGE DATA SUMMARIES PROBE MAPPING
– 3 probe types
– 6 array designs
Mapping handled via Ensembl pipeline:
– Oligo → exonerate
– PCR → e-PCR
– cDNA → exonerate2genes

19 VectorBase GENOMIC DATA AUTOMATIC ANNOTATION GENOME BROWSER VectorBase BULK LOADER EXPRESSION DATA STORAGE & ANALYSIS ArrayExpress ‘PUBLIC’ STORAGE DATA SUMMARIES PROBE MAPPING GFF3
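The diagram links probe mapping to the genome browser through GFF3. As a hedged illustration of what one mapped probe could look like as a GFF3 record, the sketch below writes the nine tab-separated GFF3 columns. The function names, the `VectorBase` source label and the choice of `nucleotide_match` as feature type are assumptions, not details taken from the slides.

```python
from urllib.parse import quote


def gff3_line(seqid, start, end, strand, probe_id,
              source="VectorBase", feature_type="nucleotide_match"):
    """Format one probe alignment as a GFF3 feature line.

    Coordinates are 1-based and inclusive, as GFF3 requires.
    """
    if start < 1 or end < start:
        raise ValueError("GFF3 coordinates must satisfy 1 <= start <= end")
    if strand not in ("+", "-", ".", "?"):
        raise ValueError("strand must be one of '+', '-', '.', '?'")
    # Percent-encode characters that are reserved in the attributes column.
    pid = quote(probe_id, safe=" ")
    attributes = f"ID={pid};Name={pid}"
    columns = [seqid, source, feature_type, str(start), str(end),
               ".",      # score: not available in this sketch
               strand,
               ".",      # phase: only meaningful for CDS features
               attributes]
    return "\t".join(columns)


def gff3_document(lines):
    """Prepend the mandatory version pragma to a list of feature lines."""
    return "\n".join(["##gff-version 3", *lines]) + "\n"
```

For example, `gff3_document([gff3_line("2L", 100, 159, "+", "probe_001")])` yields a two-line file that a GFF3-aware browser track could load, under the assumptions above.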

20 VectorBase contigview

21 VectorBase featureview

22 VectorBase

23 BULK LOADER EXPRESSION DATA STORAGE & ANALYSIS VECTOR BIOLOGISTS ARRAY BIOLOGISTS GENOME BIOLOGISTS ArrayExpress ‘PUBLIC’ STORAGE VectorBase GENOMIC DATA AUTOMATIC ANNOTATION GENOME BROWSER DATA SUMMARIES PROBE MAPPING DATA MINING

27 VectorBase BioMart
Beta version currently available
– http://base.vectorbase.org:9999/biomart/martview
Improvements still needed:
– Experiment annotations
– Alignments (i.e. handle split alignments)
Federation with current marts
Integration with new data?

28 VectorBase Current challenges and future plans
– How do you want to query?
– CVs & ontologies
– APIs
– Community submission
– Manual annotation

29 VectorBase Querying strategy
What do you want to query on?
– Fetch all genes upregulated under condition X
– Fetch all experiments with gene X and condition Y
– Fetch all probes with expression similar to probe X
All essentially boil down to:
– Define probe (genes etc.)
– Define significant expression: ANOVA? Up/down-regulation with respect to what?
– Define experimental conditions: sample annotation, experimental design
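The decomposition above can be sketched as three independent predicates over expression measurements: one for the probe or gene, one for significance, one for the experimental condition. Everything in the block is hypothetical, including the `Measurement` fields, the default thresholds (log2 ratio of at least 1, p-value of at most 0.05) and the function names; the slides leave the definition of significant expression open.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Measurement:
    probe: str
    gene: Optional[str]   # None when the probe is not mapped to a gene
    experiment: str
    condition: str        # would come from sample annotation
    log2_ratio: float     # relative to an assumed reference sample
    p_value: float


Predicate = Callable[[Measurement], bool]


def query(measurements: Iterable[Measurement],
          probe_ok: Predicate,
          expression_ok: Predicate,
          condition_ok: Predicate) -> List[Measurement]:
    """Keep measurements that pass all three definitions."""
    return [m for m in measurements
            if probe_ok(m) and expression_ok(m) and condition_ok(m)]


def genes_upregulated(measurements, condition, min_log2=1.0, max_p=0.05):
    """'Fetch all genes upregulated under condition X'."""
    hits = query(
        measurements,
        probe_ok=lambda m: m.gene is not None,
        expression_ok=lambda m: m.log2_ratio >= min_log2 and m.p_value <= max_p,
        condition_ok=lambda m: m.condition == condition,
    )
    return sorted({m.gene for m in hits})


def experiments_with(measurements, gene, condition):
    """'Fetch all experiments with gene X and condition Y'."""
    hits = query(
        measurements,
        probe_ok=lambda m: m.gene == gene,
        expression_ok=lambda m: True,   # no significance filter here
        condition_ok=lambda m: m.condition == condition,
    )
    return sorted({m.experiment for m in hits})
```

Keeping the three definitions separate means a different significance test, such as an ANOVA-based call, could replace the threshold predicate without touching the probe or condition filters.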

30 BULK LOADER EXPRESSION DATA STORAGE & ANALYSIS VECTOR BIOLOGISTS ARRAY BIOLOGISTS GENOME BIOLOGISTS CV / ONTOLOGY ArrayExpress ‘PUBLIC’ STORAGE GENOMIC DATA AUTOMATIC ANNOTATION GENOME BROWSER DATA SUMMARIES PROBE MAPPING DATA MINING

31 STORAGE & ANALYSIS ‘PUBLIC’ STORAGE GENOME BROWSER DATA SUMMARIES DATA MINING BULK LOADER EXPRESSION DATA GENOMIC DATA AUTOMATIC ANNOTATION CV / ONTOLOGY ArrayExpress Array API ? AE API ?e! API MartJ / MQL PROBE MAPPING

32 VectorBase Array API
Perl / Java objects for retrieval / handling of array data
– Dual purpose:
  Consistency & efficiency of VB expression website
  Computational access to VB data for all
– Objects must be:
  General, DB-independent
  Compatible with pre-existing Bio API (BioPerl / BioJava)
– NB: May be pre-existing solution: ArrayExpress API? BioPerl-Expression? MAGE-OM-stk
http://neuron.cse.nd.edu/vectorbase/index.php/Array_API_proposal

33 VectorBase

34 Community data submission
Carrot?
– Help with ArrayExpress submission
– Analysis tools
– Dissemination
Stick?
– Outreach (courses, conferences)
– Networking

35 VectorBase GE data → manual annotators
Gene-build designed arrays
– Negative evidence less compelling
EST clone-based arrays
– http://tinyurl.com/vlkwo

36 VectorBase Longer term plans
– Host-parasite GE data integration & analysis
– GE-clusters → “upstream” regions → regulatory elements, upstream TFs
– RNAi phenotypes
– Images

37 VectorBase

39 CVs & ontologies
Integrate MGED and specialist ontologies for
– Body parts
– Developmental stages
– Disease processes
– …
Allows comparison across experiments with similar experimental conditions

40 BioMart
Most biomarts:
– Gene-based
– Mostly ‘binary’ data, e.g. a gene either has a signal domain or doesn’t
– Easily linked with other (gene-based) biomarts
VB Biomart:
– Probe-based; many probes not aligned
– Expression data less clear, e.g. define ‘differential expression’
– Exports gene/transcript IDs for linking to other Marts

41 VectorBase Clustering
A priority?
– Easy to do on reporter level within experiments
– Harder to do at gene level across all experiments
  Binary gene profile: “yes/no differentially expressed in experiment”?
– Amazon-style links to “genes which may have similar expression profiles”?
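One way the binary gene profile idea could feed "similar genes" links is sketched below. It assumes each gene is represented by the set of experiments in which it was called differentially expressed, and it ranks other genes by Jaccard similarity of those sets. The similarity measure, the function names and the gene identifiers are illustrative choices, not something the slides specify.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared experiments divided by all experiments in either profile."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_genes(gene: str,
                  profiles: Dict[str, Set[str]],
                  top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank other genes by how closely their yes/no profile matches `gene`.

    `profiles` maps a gene ID to the set of experiments in which that gene
    was called differentially expressed (the "yes" entries of its profile).
    """
    target = profiles[gene]
    scored = [(other, jaccard(target, calls))
              for other, calls in profiles.items() if other != gene]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]
```

Because the profile only records a yes/no call per experiment, the ranking ignores direction and magnitude of change, which is part of why the slide frames gene-level clustering across experiments as the harder problem.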

42 VectorBase BASE 2.x
– Adoption delayed, now in progress
– Brings Affymetrix support
– Cleaner/modern interface
– Better API (Java)

