Basic Genomic Characteristic

2 Basic Genomic Characteristic
• AIM: to collect as much general information as possible about your gene
• Nucleotide sequence databases
  ○ NCBI GenBank
  ○ EMBL Nucleotide Sequence Database
  ○ DDBJ
• For protein sequences
  ○ UniProtKB
  ○ NCBI Reference Sequence (RefSeq)

3 Nucleotide sequence DB
• The 3 databases form an international collaboration. Each of the three groups collects a portion of the total sequence data reported worldwide, and all new and updated database entries are exchanged between the groups on a daily basis.
• You do not need to check all of them!

4 Nucleotide sequence DB

9 NCBI Entrez
Presents all the information available at NCBI for a gene. Entrez is an integrated search tool across all the NCBI databases.
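As a rough illustration of searching the same gene across several NCBI databases, the sketch below builds query URLs for the E-utilities `esearch` service, the programmatic interface to Entrez. The helper name `build_esearch_url` and the example gene symbol `HBB` are assumptions chosen for illustration; the code only constructs URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities esearch service (programmatic Entrez search).
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(database: str, term: str, max_results: int = 20) -> str:
    """Return an esearch URL for `term` in one Entrez database (hypothetical helper)."""
    params = {
        "db": database,          # Entrez database name, e.g. "gene" or "protein"
        "term": term,            # Entrez query string
        "retmax": max_results,   # maximum number of identifiers to return
        "retmode": "json",       # ask for a JSON response
    }
    return f"{ESEARCH_BASE}?{urlencode(params)}"


# The same query can be sent to several Entrez databases.
EXAMPLE_QUERY = "HBB[Gene Name] AND Homo sapiens[Organism]"
EXAMPLE_URLS = {db: build_esearch_url(db, EXAMPLE_QUERY) for db in ("gene", "nuccore", "protein")}

for db, url in EXAMPLE_URLS.items():
    print(db, url)
```

Opening one of the printed URLs in a browser returns the matching record identifiers for that database, which can then be looked up on the corresponding NCBI page.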

10 Genome Browsers
• NCBI Sequence Viewer
• UCSC Genome Browser
• ENSEMBL

11 NCBI Sequence Viewer
Example view of the human beta-globin region on chromosome 11 (chr11).

12 UCSC Genome Browser

13 ENSEMBL

14 ENSEMBL – genome view

15 ENSEMBL – Gene tree

16 NCBI OMIM database
• Nucleotide databases and genome browsers provide information on the gene's nucleotide sequence (exons, introns, alternative splicing sites…) but give very little information on gene function.
• The OMIM database provides a summary of the literature concerning a gene.

17 NCBI OMIM database

18 Protein Databases
• Protein databases provide useful information about the function of a gene, e.g. conserved protein domains.
• UniProt is the reference database.
• InterPro offers automatic protein annotation based on conserved domains.
• RefSeq

19 Protein databases - UniProt

23 Similarity search
• If your gene has no protein information:
  ○ Protein sequence available: BLASTP against a non-redundant protein database
  ○ Protein sequence unavailable: BLASTX against a non-redundant protein database
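The choice between the two programs can be sketched as a small decision helper: BLASTP takes a protein query, while BLASTX translates a nucleotide query before searching a protein database. The function names and the alphabet-based heuristic are assumptions made for illustration, and the two query strings are short illustrative fragments rather than curated records.

```python
# Letters allowed in IUPAC nucleotide notation (including ambiguity codes).
IUPAC_NUCLEOTIDE = set("ACGTURYSWKMBDHVN")


def looks_like_nucleotide(sequence: str) -> bool:
    """Heuristic: True if every residue letter is a valid IUPAC nucleotide code.

    A protein made only of such letters would be misclassified, so treat the
    result as a hint, not a definitive answer.
    """
    letters = set(sequence.upper()) - set("-* \n\t")
    return bool(letters) and letters <= IUPAC_NUCLEOTIDE


def choose_blast_program(sequence: str) -> str:
    """Return 'blastx' for a nucleotide query and 'blastp' for a protein query."""
    return "blastx" if looks_like_nucleotide(sequence) else "blastp"


# Illustrative fragments only.
print(choose_blast_program("MVHLTPEEKS"))          # protein query
print(choose_blast_program("ATGGTGCACCTGACTCCT"))  # nucleotide query
```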

24 Protein 3D structure
• Many proteins have a determined 3D structure. The biggest databases are:
  ○ PDB
  ○ NCBI Structure Group
  ○ Dali
• They offer tools for visualization.

25 PDB database
The visualization tools allow you to see the structure and the ligands (if present), rotate the image and zoom in.

26 3D structure prediction
• Structures are still available for only a limited number of proteins.
• There is an ongoing effort to predict protein structures based on sequence similarity.
• Still not very accurate!
• SwissModel
• PSIPRED
• PredictProtein

27 Swiss-Model

28 Protein interaction databases
• AIM: find proteins that interact with your target
• IntAct: EBI resource to find interactors
• BioGRID: a freely available interaction database covering model organisms and humans
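Interaction databases typically report binary interactions (pairs of proteins). A minimal sketch of turning such pairs into a lookup of interactors for a target is shown below. The protein names are hypothetical placeholders and the pair list is invented for illustration; it is not data exported from IntAct or BioGRID.

```python
from collections import defaultdict


def build_interaction_map(pairs):
    """Map each protein to the set of its direct interaction partners."""
    partners = defaultdict(set)
    for protein_a, protein_b in pairs:
        # Binary interactions are undirected, so record both directions.
        partners[protein_a].add(protein_b)
        partners[protein_b].add(protein_a)
    return partners


# Hypothetical binary interactions (placeholder names, not real records).
EXAMPLE_PAIRS = [
    ("TARGET", "PROT_A"),
    ("PROT_B", "TARGET"),
    ("PROT_A", "PROT_C"),
]

interaction_map = build_interaction_map(EXAMPLE_PAIRS)
print(sorted(interaction_map["TARGET"]))
```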

29 IntAct

30 Regulatory and metabolic pathways
• The classic: KEGG

31 miRNA specific resources
• Databases:
  ○ miRNAMap: presents useful information such as secondary structure, tissue-specific expression and predicted target genes
  ○ HMDD: specific for disease-miRNA associations
  ○ miRBase: a searchable database of published miRNA sequences and annotation
• Target prediction tools:
  ○ miRecords: a good repository that shows confirmed target genes and predictions from several other software tools

32 C. elegans specific tools
• WormBase: the main resource of information on C. elegans
• Expression pattern database: Hope lab
• Expression Pattern Database
• The Nematode Expression Pattern DataBase
• Caenorhabditis elegans Genetics and Genomics: provides links to many useful resources for C. elegans

33 Expression databases
• Allow exploratory analyses of multiple experiments
• Experiments need to be linked
• Require much information about how experiments were conducted = sources of variation
• Very different from genomic databases
• MIAME standard

34 MIAME
• Experimental design
• Microarray design
• Extraction, preparation and labelling
• Hybridisation conditions
• Measurements: images, quantifications, parameters
• Systematic error adjustments and transformations
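The six MIAME components listed above can be treated as a checklist. The sketch below reports which components are missing from a submission record. The key names and the example record are assumptions made for illustration; they are not an official MIAME schema or the submission format of any database.

```python
# Hypothetical keys, one per MIAME component on the slide above.
MIAME_SECTIONS = (
    "experimental_design",
    "microarray_design",
    "extraction_preparation_labelling",
    "hybridisation_conditions",
    "measurements",
    "error_adjustments_and_transformations",
)


def missing_miame_sections(record: dict) -> list:
    """Return the checklist sections that are absent or empty in `record`."""
    return [section for section in MIAME_SECTIONS if not record.get(section)]


# Invented example record with two sections left out.
example_record = {
    "experimental_design": "dose response, 3 doses, 3 biological replicates",
    "microarray_design": "commercial array, catalog number on file",
    "extraction_preparation_labelling": "total RNA, single-colour labelling",
    "measurements": "scanned images and quantification files",
}

print(missing_miame_sections(example_record))
```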

35 MIAME

36 Gene Expression Omnibus
• NCBI administered
• ~280,000 samples
• >100 organisms
• >1,000,000,000 measurements

37 Gene Expression Omnibus

42 ArrayExpress
• EBI administered
• >7000 experiments
• Provides p-values
• Bioconductor package

43 ArrayExpress

48 GEO and ArrayExpress
Databases provide:
• The raw data for each hybridization (e.g., CEL or GPR files)
• The final processed (normalized) data for the set of hybridizations in the experiment (study) (e.g., the gene expression data matrix used to draw the conclusions from the study)
• The essential sample annotation including experimental factors and their values (e.g., compound and dose in a dose response experiment)
• The experimental design including sample data relationships (e.g., which raw data file relates to which sample, which hybridizations are technical, which are biological replicates)
• Sufficient annotation of the array (e.g., gene identifiers, genomic coordinates, probe oligonucleotide sequences or reference commercial array catalog number)
• The essential laboratory and data processing protocols (e.g., what normalization method has been used to obtain the final processed data)
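The sample data relationships in the list above can be sketched as a small table linking each raw data file to its sample and experimental factor values. In this sketch, several files from the same sample count as technical replicates, and different samples under the same factor values count as biological replicates. File names, sample names and factor values are invented for illustration and do not come from a real GEO or ArrayExpress record.

```python
from collections import defaultdict

# Invented annotation table: one row per hybridization.
hybridizations = [
    {"raw_file": "hyb1.CEL", "sample": "S1", "compound": "drugX", "dose": "10uM"},
    {"raw_file": "hyb2.CEL", "sample": "S1", "compound": "drugX", "dose": "10uM"},
    {"raw_file": "hyb3.CEL", "sample": "S2", "compound": "drugX", "dose": "10uM"},
    {"raw_file": "hyb4.CEL", "sample": "S3", "compound": "drugX", "dose": "0uM"},
]


def group_replicates(rows):
    """Group raw files by experimental condition, then by sample.

    Within one condition, each sample is a biological replicate; a sample
    with more than one raw file has technical replicates.
    """
    by_condition = defaultdict(lambda: defaultdict(list))
    for row in rows:
        condition = (row["compound"], row["dose"])
        by_condition[condition][row["sample"]].append(row["raw_file"])
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {cond: dict(samples) for cond, samples in by_condition.items()}


replicates = group_replicates(hybridizations)
for condition, samples in replicates.items():
    print(condition, samples)
```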

49 Problems:
• Difficult to compare experiments
• Significant genes not highlighted
• Poor visualization of results
• ArrayExpress is trying to solve these problems with its Atlas

50 Genevestigator
• A Java visualization tool that summarizes results from thousands of high-quality transcriptomic experiments
• Much easier to compare samples
• Open access to only some of the data and 1 probeset/gene

51 Genevestigator

52 ONCOMINE

