Update on StemBase, the database of the Stem Cell Genomics Project Miguel Andrade, OHRI, Ottawa.

4 The Stem Cell Genomics Project
Objective: acquire a complete understanding of the genetic factors that:
–specify stem cell identity and function; and
–regulate commitment and differentiation
Rationale:
–Stem cells play an essential role in the human body as they provide the starting material for every organ and tissue
–Knowledge of regulatory genes acting in and on stem cells is necessary to exploit their full therapeutic potential

5 [Diagram] Stem Cell Network (+20 groups) → samples → OGIC – Genomics Platform (Pearl Campbell) → data → Bioinformatics Group → StemBase → Public
Yauk CL, Berndt ML, Williams A, Douglas GR. 2004. Comprehensive comparison of six microarray technologies. Nucleic Acids Research. 32(15): e124.

6 Data sharing model
[Diagram] Stages: Submitter (network investigators), Data production, Bioinformatics analysis, Working StemBase, Published StemBase, Public StemBase
Steps: 1 Sample; 2 Data; 3 Data; 4 Approve release? (YES: 5 Publish data; NO: 5 Embargo or remove); 6 Publicize data
Time limits: 4 months each sample; 12 months entire StemBase

7 SCN Sample Contributors
Jane Aubin, Mickie Bhatia, John Dick, Connie Eaves, Jacques Galipeau, Alain Garnier, Marina Gertsenstein, May Griffith, John Hassell, Norman Iscove, Michael McBurney, Rod McInnes, Kelly McNagny, Lynn Megeney, James Piret, Derrick Rancourt, Janet Rossant, Lawrence Rosenberg, Michael Rudnicki, Luc Sabourin, Guy Sauvageau, Ruth Slack, Jacques P. Tremblay, Valerie Wallace, T. Michael Underhill, Sue Varmuza, Samuel Weiss, Juan Carlos Zuniga-Pflucker, Peter Zandstra

8 Data collection 191 samples 1/4 Human 3/4 Mouse … Rat

10 StemBase
Database of gene expression data in mouse and human stem cells
–Affymetrix DNA microarray data: 185 samples
–Serial Analysis of Gene Expression: 6 samples
Study genes important for stem cell function
Perez-Iratxeta, C., G. Palidwor, C.J. Porter, N.A. Sanche, M.R. Huska, B.P. Suomela, E.M. Muro, P. Krzyzanowski, E. Hughes, P.A. Campbell, M.A. Rudnicki and M.A. Andrade. 2005. Study of stem cell function using microarray experiments. FEBS Letters. 579, 1795-1801.

11 Usage (1 Nov 2005)
–258 non-SCN accounts
–3,000 logins
–First commercial license

12 Public web server

22 Data submission to GEO

23 Data submission to GEO
–Required by some journals
–Largest gene expression data repository
StemBase:
–High quality data
–Focused on stem cells
–More functionality
GEO as a traffic catch

24 Mouse / MOE430
[Figure: two-dimensional plot of samples; axes Dim1 and Dim2]
Sample labels: Dermis, Adipose, Neural, Myoblasts, Bone marrow, Muscle, Myospheres, Osteoblasts, Retinal primary, Retinal first passage, Mammospheres, Mammospheres undifferentiated, Neurospheres, Cancer, Embryoid bodies, R1 serum64, R1 serum6999, D4D, D4E, D4A, J1, C2E, C2D, C2A, R1, V6.5
Carolina Perez-Iratxeta

25 Human / HGU133
[Figure: two-dimensional plot of samples; axes Dim1 and Dim2]
Sample labels: Cord blood, Bone marrow, Peripheral, Fetal, Myoblasts, Myoblasts differentiated, I6, Retinal first passage, Retinal primary, Hela, M-O7e, M-O7e Smad7, Kidney
Carolina Perez-Iratxeta

26 Analysis of genes correlated with Oct4
[Diagram labels: ES, EB; Oct4; self-renewal; totipotency; lineage commitment]
Hypothesis: Analysis of genes correlated with Oct4 in stem cells will allow the identification of genes that are important for stem cell identity and lineage commitment
Pearl Campbell
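One way to make this hypothesis concrete is to rank genes by how closely their expression follows Oct4 across samples. The sketch below is only an illustration, not the StemBase implementation: it assumes Pearson correlation as the similarity measure (the slide does not name one), the function names are hypothetical, and the gene names other than Oct4 and all expression values are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(profiles, reference="Oct4"):
    """Return (gene, r) pairs sorted from most to least correlated with the reference."""
    ref = profiles[reference]
    scores = [
        (gene, pearson(ref, values))
        for gene, values in profiles.items()
        if gene != reference
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical log-scale expression values across five samples (assumption).
profiles = {
    "Oct4":  [9.0, 8.5, 6.0, 4.0, 3.5],
    "GeneA": [8.8, 8.6, 6.2, 4.1, 3.0],  # follows Oct4
    "GeneB": [3.0, 3.5, 6.0, 8.0, 8.5],  # opposite trend
    "GeneC": [5.0, 5.2, 4.9, 5.1, 5.0],  # roughly flat
}

ranking = rank_by_correlation(profiles)
for gene, r in ranking:
    print(f"{gene}\t{r:+.3f}")
```

Genes at the top of such a ranking would be candidates that rise and fall with Oct4; genes at the bottom move in the opposite direction and may be just as informative for lineage commitment.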

27 [Figure labels: Haematopoietic, Embryonic, Osteoblasts, Neurospheres, Cancer, C2C12]
Paul Krzyzanowski

28 Time Series
Time series analysis: differentiation of three types of mESC (R1, J1, V6.5) into embryoid bodies
11 time points: 0hr, 6hr, 12hr, 18hr, 24hr, 36hr, 48hr, 4day, 7day, 9day and 14day
Hailesellasse et al. 2005. Search for genes important in mouse embryonic stem cell differentiation. Submitted.

29 Comparisons: 12h<0h, 6h<0h, 6h>0h, 12h>0h
P1 Egr1, P2 Klf4, P3 Klf4, P4 D930005D10Rik, P5 Pim3, P6 Rtn4
P7 -, P8 Socs3, P9 Fblim1, P10 Epha5, P11 Sfrs5, P12 Tagln
P13 1190003J15Rik, P14 Herc1, P15 Spry4, P16 Fblim1, P17 Mras, P18 Anxa3
P19 Zfp54, P20 Rbm5, P21 4930431H11Rik, P22 Khsrp, P23 1110061A14Rik, P24 Rbm14
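The comparisons on this slide set expression at 6h and 12h against the 0h baseline. A minimal sketch of that kind of labelling follows. It is not the method of the submitted paper: the cut-off, the function name and every expression value are assumptions chosen for illustration.

```python
def classify_early_response(expr, threshold=1.0):
    """Label a gene by its change at 6h and 12h relative to 0h.

    expr maps the time-point labels "0h", "6h" and "12h" to log2 expression.
    threshold is the minimum absolute log2 difference counted as a change
    (an assumed cut-off, not taken from the slides).
    """
    labels = []
    for t in ("6h", "12h"):
        diff = expr[t] - expr["0h"]
        if diff >= threshold:
            labels.append(f"{t}>0h")
        elif diff <= -threshold:
            labels.append(f"{t}<0h")
    return labels


# Hypothetical log2 expression values (assumption).
genes = {
    "gene_up_early":  {"0h": 5.0, "6h": 7.2, "12h": 5.4},
    "gene_down_late": {"0h": 8.0, "6h": 7.6, "12h": 6.1},
    "gene_up_both":   {"0h": 4.0, "6h": 5.5, "12h": 6.5},
    "gene_flat":      {"0h": 6.0, "6h": 6.2, "12h": 5.9},
}

patterns = {name: classify_early_response(values) for name, values in genes.items()}
for name, labels in patterns.items():
    print(name, labels or ["no change"])
```

A gene can carry more than one label, which would be consistent with the same gene appearing under several P entries above.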

30 [Figure: protein domain architectures of the selected genes]
From down-regulated genes / signaling related; from down-regulated genes / nucleotide binding; from up-regulated genes
Genes: a. Fblim1, b. 1190003J15Rik, c. Socs3, d. Epha5 (Eph receptor A5), e. Pim3, f. Tagln, g. Herc1, h. Spry4, i. Mras, h. Anxa3, j. Rtn4, l. D930005D10Rik, m. Klf4, n. Egr1, o. Sfrs5, p. Rbm14, q. Rbm5, r. Zfp54, s. Khsrp
Domain labels: SPRY, HECTc, RCC1, WD40, TR_THY, SOCS box, RRM_1, G-patch, SAM 2, Calponin repeat, Annexin repeats, Zn finger domains, zf-RanBP

31 [Figure: selected genes grouped by Gene Ontology term]
Genes: Egr1, Socs3, Sfrs5, Zfp54, Klf4, Khsrp, Rbm5, Tagln, Epha5, Fblim1, Herc1, D930005D10Rik, Rtn4, Pim3, 1190003J15Rik, Spry4, Mras, Anxa3, Rbm14
GO terms: Zinc ion binding (GO:0043167); Nucleic acid binding (GO:0003676); DNA binding; RNA binding; Signal transduction (GO:0007165); Protein amino acid phosphorylation (GO:0006468); Morphogenesis (GO:000653); Protein modification (GO:0006464)

32 Future
Integration of proteomics data
New analysis features
External datasets:
–More stem cell data
–Cancer sets (e.g. NCI)
–Normal tissue

34 DVD release Oct 2005
Raw microarray data in StemBase as of 23 Oct 2005
–154 samples
–426 replicates
–4 DVDs
Lily Jin / Gareth Palidwor

35 Proteomics: Lawrence Puente, Lynn Megeney
Genomics Core Facility: Pearl Campbell
Ottawa Genome Centre: Michael Rudnicki (director), William Read (project manager)
Stem Cell Network: Norman Iscove (U of T), Tim Hughes (U of T)

36 Bioinformatics
Miguel Andrade, Enrique Muro, Gareth Palidwor, Carolina Perez-Iratxeta, Lily Jin (-Sept05), Christopher Porter, Paul Krzyzanowski, Kagnew Hailesellasse (-Aug05), Neal Sanche (03-04), Andrew Kysyk

37 Thanks!

