Genopolis Microarray DB a Progress Report Marco Brandizi Dec 12, 2005 Dottorato in Informatica XIX Ciclo.

1 Genopolis Microarray DB a Progress Report Marco Brandizi Dec 12, 2005 Dottorato in Informatica XIX Ciclo

2 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

3 DNA, gene, mRNA, protein (diagram labels: Genes, Machine, Cell/Life)

4 Microarray Data, conceptual model

5 Microarray Data Management Issues. Exp. data vs. seq. data: context dependent (living system, exp. conditions); lack of a standard unit of measure; several normalization methods; multiple platforms and methods. No standard for data annotation: vocabularies and terminology coherence; details about experiment, source, protocols, exp. conditions.

6 Microarray Data Management Issues / 2. Evidence about data quality. What to store? Raw images; computed values; normalized values. How to find data: systems aware of complex vocabularies (ontologies); data mining and exp. comparison tools. Data access control.

7 MIAME Experiment Modeling

8 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

9 GCA Features. Curated experimental design representation: MIAME-compliant (although with a simplified model); use of controlled vocabularies; experiment checking/publishing, with supervision. Targeted to the Affymetrix platform: chip description is simple, imported from NetAffx; single-channel technology. Access control: users are organized into groups and access roles; experiments belong to user groups.

10 GCA Features. Data retrieval and visualization: Gene Browser, a graphical visualization interface based on the matrix model; Search & Save data. Current content: a set of time courses of DCs stimulated with different stimuli. Implementation & deployment: LAMP application (Linux + Apache + MySQL + PHP); Model-View-Controller as far as possible: business objects layer, presentation widgets (DAO-lib), other application control layers.

11 GCA Features. In short: gene expression database software, focused on Affymetrix technology, serving as a facility for a distributed community of users.

12 GCA Data Model

14 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

15 GCA Login

16 GCA Editing

17 GCA Experiment Checking

18 GCA Import of chip annotations

19 GCA CVs and protocols

21 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

22 GCA Gene Browser

27 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

28 GCA Access Management

29 Diagram labels: Bicocca, Besta, Granucci, ADMIN, Norman, Tiranti, Andrea, Brandizi, Experiment 123, Ottavio. Edge labels: All but admin; All rights; R, W, -publish; Read only.
User               Permissions
Brandizi, Andrea   All
Granucci           Read
Norman             Read, Write
Tiranti            All (except admin)
Ottavio            None
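
The permissions table on this slide can be read as a small access control list for Experiment 123. The following is only a minimal sketch of that idea, not the GCA implementation (which the deck describes as PHP). The names `Permission`, `EXPERIMENT_123` and `can` are hypothetical, and treating "All (except admin)" as read + write + publish is an assumption based on the diagram labels.

```python
from enum import Flag


class Permission(Flag):
    """Rights a user may hold on one experiment (names are assumptions)."""
    NONE = 0
    READ = 1
    WRITE = 2
    PUBLISH = 4
    ADMIN = 8
    ALL_BUT_ADMIN = READ | WRITE | PUBLISH
    ALL = READ | WRITE | PUBLISH | ADMIN


# Rows copied from the slide's table for Experiment 123.
EXPERIMENT_123 = {
    "Brandizi": Permission.ALL,
    "Andrea": Permission.ALL,
    "Granucci": Permission.READ,
    "Norman": Permission.READ | Permission.WRITE,   # "R, W, -publish"
    "Tiranti": Permission.ALL_BUT_ADMIN,
    "Ottavio": Permission.NONE,
}


def can(user, needed, acl=EXPERIMENT_123):
    """Return True if `user` holds every right in `needed`; unknown users get nothing."""
    granted = acl.get(user, Permission.NONE)
    return (granted & needed) == needed
```

With this table, `can("Norman", Permission.WRITE)` holds while `can("Norman", Permission.PUBLISH)` does not, which matches the "R, W, -publish" label.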

34 Access management. Based on a core library. Recent developments (security lib): code has been changed so that it uses the security lib; all the code that interacts with the user has been wrapped with access management controls. Malicious access attempts have also been considered: manually typing a URL; manually requesting an uploaded file (to be completed). Does it work? Yes, pretty sure, but more testing is needed.
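
One way to picture "wrapping" user-facing code with access management controls is a guard applied to every request handler, so that a hand-typed URL or a direct file request goes through the same check as a normal click. This is a hedged sketch in Python, not the GCA security lib. `RIGHTS`, `requires`, `AccessDenied` and both handlers are hypothetical names invented for illustration.

```python
from functools import wraps


class AccessDenied(Exception):
    """Raised when a user lacks the right required by a handler."""


# Hypothetical rights table: (user, experiment id) -> allowed actions.
RIGHTS = {
    ("norman", 123): {"read", "write"},
    ("granucci", 123): {"read"},
}


def requires(action):
    """Wrap a handler so it runs only if the user holds `action` on the experiment."""
    def decorator(handler):
        @wraps(handler)
        def guarded(user, experiment_id, *args, **kwargs):
            allowed = RIGHTS.get((user, experiment_id), set())
            if action not in allowed:
                raise AccessDenied(f"{user} may not {action} experiment {experiment_id}")
            return handler(user, experiment_id, *args, **kwargs)
        return guarded
    return decorator


@requires("read")
def download_uploaded_file(user, experiment_id, filename):
    # Placeholder body: a real handler would stream the stored file.
    return f"contents of {filename}"


@requires("write")
def edit_experiment(user, experiment_id, title):
    # Placeholder body: a real handler would update the database.
    return f"experiment {experiment_id} renamed to {title}"
```

Because the check lives in the wrapper rather than in the page that links to the handler, a request that bypasses the user interface is refused in the same way.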

35 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

36 Search and Save

44 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

45 MAGE Export. Will allow exporting a GCA experiment to MAGE/ArrayExpress. A collaboration with EBI in the context of u-GENE. So far: schema of GCA->MAGE (in AE-compatible form); basic code fragments (business objects in Java). Still to do: full code; mappings to the MGED Ontology; tests with AE.

46 MAGE Export

47 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

48 GCA on cluster architecture. Three machines, the minimum to have a cluster: Master (Xeon 3.2 GHz, 2 GB RAM) + Master Clone that ensures high availability; computation nodes (P4 3 GHz, 512 MB). 1 TB of SCSI disk, shared via NFS. Based on: Debian (Linux); Linux Virtual Server (load balancer); Heartbeat (high availability).

49 GCA on cluster architecture

50 Code needs slight changes. PHP side and sessions: objects that are saved in the session need to be reloaded properly (see http://it2.php.net/manual/en/language.oop.magic-functions.php#14473); __wakeup() is already used; __sleep() with a proper return value is still to be implemented. MySQL side: for the stable DB, we need to specify the type of DB access, read-only (RO) vs. read/write (RW) mode: RO access uses the local copy of the DB, RW access uses the master copy. For the temporary DB, only the master copy exists (port 3307, current deployment).
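
The MySQL-side rules on this slide amount to a small routing decision: which server a connection should target, given the database and the access mode. The sketch below illustrates that decision only, under stated assumptions. The host names, the `DbEndpoint` type and the `pick_endpoint` function are hypothetical, and port 3306 for the stable DB is assumed (it is the MySQL default); only port 3307 for the temporary DB comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbEndpoint:
    host: str
    port: int


MASTER_HOST = "master.example"  # hypothetical name for the cluster master
LOCAL_HOST = "localhost"        # the node's own copy of the stable DB
STABLE_PORT = 3306              # assumption: MySQL default port
TEMP_PORT = 3307                # from the slide: temporary DB, current deployment


def pick_endpoint(database, mode):
    """Choose where to connect for `database` ("stable" or "temporary") in `mode` ("ro" or "rw")."""
    if mode not in ("ro", "rw"):
        raise ValueError(f"unknown access mode: {mode!r}")
    if database == "temporary":
        # Only the master copy exists, whatever the access mode.
        return DbEndpoint(MASTER_HOST, TEMP_PORT)
    if database == "stable":
        # Reads go to the local copy, writes to the master copy.
        host = LOCAL_HOST if mode == "ro" else MASTER_HOST
        return DbEndpoint(host, STABLE_PORT)
    raise ValueError(f"unknown database: {database!r}")
```

Making the caller state the mode explicitly is what lets read-heavy pages be served from each node's local copy while all writes still reach the master.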

51 GCA on cluster architecture Possible other uses of cluster Heavy computations (normalizations) Integration with R (Grad. Thesis of L. Vanotti, Grad. Th. of M. Sesana) Other Integrations with R (AMDA) Other related services Knowledge management app. Groupware Integration (DC-Thera) Cytoscape Integration (DC-Thera) Service mgmt. app. R computations BUT it has been designed for GCA

52 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

53 The mA Experiments Cycle

54 “Closing the loop”

55 What we need to model

58 How to model: Semantic Web Technologies. “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001

59 Microarrays Annotation Ontology Microarray entities Annotation entities

60 Microarrays Annotation Ontology Annotation (source, target, child, parent, rank)
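
The slide lists only the attribute names (source, target, child, parent, rank), so the following is a speculative sketch of how such a record might be held in memory, not the ontology itself. The meanings given to the fields are assumptions: `source` is taken as who or what produced the annotation, `target` as the annotated entity, `parent`/`children` as links between a comment and its answers, and `rank` as the position among sibling answers. The `text` field and the `reply` method are hypothetical additions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Annotation:
    source: str   # assumed: user or tool that produced the annotation
    target: str   # assumed: identifier of the annotated entity
    text: str     # hypothetical payload, not listed on the slide
    rank: int = 0  # assumed: order among sibling annotations
    parent: Optional["Annotation"] = field(default=None, repr=False)
    children: List["Annotation"] = field(default_factory=list)

    def reply(self, source, text):
        """Attach an answer to this annotation and return it."""
        child = Annotation(
            source=source,
            target=self.target,
            text=text,
            rank=len(self.children),
            parent=self,
        )
        self.children.append(child)
        return child
```

A comment on a saved search followed by an answer would then be a root `Annotation` with one child whose `parent` points back to it.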

61 Microarrays Annotation Ontology

63 Examples of use: annotating a saved search. Comments and answers to comments. Originating operations (import, intersection, merge...). Which user is working on this data set. Why the data set is being saved: functional family; "I'm studying IL2"; "I'm studying schistosomiasis".

64 Examples of use: AMDA

65 Representing an AMDA report in a structured form. DEGs and gene clusters: "This is a DEG set, computed by AMDA (PAM method) on samples s1, s2, s3". Correlation between chips (storing values and links to chip pairs). Functional annotations of genes, by means of KEGG (with reporting of significance). Import of analysis annotations into GCA. Presenting analysis annotations together with data sets.

66 Outline Introduction GCA Application Main features Demo Demo/Gene Browser Recent added features Access control Search & Save Ongoing and future MAGE Export Migration on cluster Management of knowledge about Higher Level Analysis Other possible developments

67 GCA: other possible developments. Templates for experiment insertion. Advanced CVs (taxonomies, mapping to the MGED Ontology). Knowledge management features (with or without annotation ontology). eGroupware, and links between eGroupware forums/documents and GCA experiments/data sets. Integration with AMDA (with or without annotation ontology). Export of the API via Web Services technology. Integration with Taverna or Cytoscape. Connections with pathway databases (e.g., by means of Pathway Processor).

68 Thank you! marco.brandizi@unimib.it Find this presentation at: http://bioguest.btbs.unimib.it/~brandizi/

