Basic Genomic Characteristics
• AIM: to collect as much general information as possible about your gene
• Nucleotide sequence databases
  ○ NCBI GenBank
  ○ EMBL Nucleotide Sequence Database
  ○ DDBJ
• For protein sequences
  ○ UniProtKB
  ○ NCBI Reference Sequence (RefSeq)

Nucleotide sequence DB
• The three databases (GenBank, EMBL, DDBJ) form an international collaboration, the International Nucleotide Sequence Database Collaboration (INSDC).
• Each of the three groups collects a portion of the total sequence data reported worldwide, and all new and updated database entries are exchanged between the groups on a daily basis.
• Because the entries are shared, you do not need to check all of them!

Nucleotide sequence DB

NCBI Entrez
• Presents all the information available at NCBI for a gene.
• Entrez is an integrated search tool across all the NCBI databases.
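A minimal sketch of querying Entrez programmatically, assuming NCBI's E-utilities `efetch` endpoint and the `nuccore` database. The helper name `efetch_url` is illustrative, and the accession NM_000518 (the RefSeq mRNA record for human beta globin) is used only as an example. The code builds the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(accession, db="nuccore", rettype="gb", retmode="text"):
    """Build an efetch URL for one record (illustrative helper, not an NCBI API)."""
    query = urlencode(
        {"db": db, "id": accession, "rettype": rettype, "retmode": retmode}
    )
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


# Example: URL for the GenBank flat file of an example accession.
print(efetch_url("NM_000518"))
```

Opening the printed URL in a browser, or passing it to any HTTP client, should return the record as a GenBank flat file; changing `rettype` to `fasta` requests the sequence alone.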

Genome Browsers
• NCBI Sequence Viewer
• UCSC Genome Browser
• ENSEMBL

NCBI Sequence Viewer
This is an example view of the human beta-globin region on chromosome 11.

UCSC Genome Browser

ENSEMBL

ENSEMBL – genome view

ENSEMBL – Gene tree

NCBI OMIM database
• Nucleotide databases and genome browsers provide information on the gene's nucleotide sequence (exons, introns, alternative splicing sites…) but give you very little information on gene function.
• The OMIM database provides a summary of all the literature concerning a gene.

NCBI OMIM database

Protein Databases
• Protein databases provide useful information about the function of a gene, e.g. conserved protein domains.
• UniProt is the reference database.
• InterPro offers automatic protein annotation based on conserved domains.
• RefSeq

Protein databases - UniProt

Similarity search
• Use a similarity search if your gene has no protein information.
• Protein sequence available: run BLASTP against a non-redundant protein database.
• Protein sequence unavailable: run BLASTX (nucleotide query translated in all six reading frames) against a non-redundant protein database.
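The choice between the two programs, and what BLASTX does with a nucleotide query before searching, can be sketched as follows. This is an illustration only, not how BLAST is implemented: the function names are hypothetical, the standard genetic code is assumed, and the short input sequence is a toy example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def choose_blast_program(has_protein_sequence):
    """Pick the program named on the slide for the available query type."""
    return "blastp" if has_protein_sequence else "blastx"


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Return the three forward (+1..+3) and three reverse (-1..-3) translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


print(choose_blast_program(has_protein_sequence=False))
for frame, peptide in six_frame_translations("ATGGTGCAT").items():
    print(frame, peptide)
```

Each of the six conceptual translations is then compared with the protein database, which is why a nucleotide query can still retrieve protein matches.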

Protein 3D structure
• Many proteins have an experimentally determined 3D structure. The largest resources are:
  ○ PDB
  ○ NCBI Structure Group
  ○ Dali
• They offer tools for visualization.

PDB database
The visualization tools allow you to see the structure and the ligands (if present), rotate the image, and zoom in.
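Outside the viewer, the ligands in an entry can also be listed from the coordinate file itself. The sketch below assumes the legacy fixed-column PDB text format, in which non-polymer groups are stored in `HETATM` records with the residue name in columns 18-20 and the chain identifier in column 22. The function name `list_ligands` and the three sample lines are made up for illustration and do not come from a real entry.

```python
def list_ligands(pdb_text, skip=("HOH",)):
    """Return sorted (residue name, chain) pairs found in HETATM records.

    Water (HOH) is skipped by default because it is rarely the ligand of interest.
    """
    ligands = set()
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and len(line) >= 22:
            res_name = line[17:20].strip()  # columns 18-20
            chain = line[21]                # column 22
            if res_name not in skip:
                ligands.add((res_name, chain))
    return sorted(ligands)


# Illustrative lines in PDB column layout (coordinates omitted).
SAMPLE_PDB = "\n".join([
    "ATOM      1  N   VAL A   1",
    "HETATM 4383  CHA HEM A 142",
    "HETATM 4500  O   HOH A 201",
])

print(list_ligands(SAMPLE_PDB))
```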

3D structure prediction
• Structures are still available for only a limited number of proteins.
• There are efforts to predict protein structures based on sequence similarity.
• Still not very accurate!
• Tools:
  ○ SwissModel
  ○ PSIPRED
  ○ PredictProtein

Swiss-Model

Protein interaction databases
• AIM: find proteins that interact with your target
• IntAct: EBI resource for finding interactors.
• BioGRID: a freely available interaction database covering model organisms and humans.
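Once interaction pairs have been exported from such a database, finding the partners of a target is a simple lookup. The sketch below assumes the export has already been reduced to (protein A, protein B) pairs; the function name and the protein labels are placeholders, not real database records.

```python
from collections import defaultdict


def build_partner_index(pairs):
    """Map every protein to the set of proteins it is reported to interact with."""
    index = defaultdict(set)
    for a, b in pairs:
        # Interactions are treated as undirected, so record both directions.
        index[a].add(b)
        index[b].add(a)
    return index


# Placeholder pairs standing in for rows of a downloaded interaction table.
pairs = [
    ("TARGET", "PARTNER_1"),
    ("PARTNER_2", "TARGET"),
    ("PARTNER_1", "PARTNER_3"),
]

index = build_partner_index(pairs)
print(sorted(index["TARGET"]))
```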

IntAct

Regulatory and metabolic pathways
• The classic resource: KEGG

miRNA-specific resources
• Databases:
  ○ miRNAMap: presents useful information such as secondary structure, tissue-specific expression, and predicted target genes.
  ○ HMDD: specific for disease-miRNA associations.
  ○ miRBase: a searchable database of published miRNA sequences and annotation.
• Target prediction tools:
  ○ miRecords: a good repository that shows confirmed target genes and predictions from several other programs.

C. elegans-specific tools
• WormBase: the main resource of information on C. elegans.
• Expression pattern databases:
  ○ Hope lab Expression Pattern Database
  ○ The Nematode Expression Pattern DataBase
• Caenorhabditis elegans Genetics and Genomics: provides links to many useful resources for C. elegans.

Expression databases
• Allow exploratory analyses of multiple experiments
• Experiments need to be linked
• Require much information about how experiments were conducted (the sources of variation)
• Very different from genomic databases
• MIAME standard

MIAME (Minimum Information About a Microarray Experiment)
• Experimental design
• Microarray design
• Extraction, preparation and labelling
• Hybridisation conditions
• Measurements: images, quantifications, parameters
• Systematic error adjustments and transformations
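The checklist above can be applied mechanically to an experiment's metadata before submission. The sketch below is a hypothetical illustration: the key names are invented for this example and are not field names defined by MIAME, GEO, or ArrayExpress.

```python
# One key per MIAME section listed above (key names are assumptions).
MIAME_SECTIONS = (
    "experimental_design",
    "array_design",
    "sample_preparation",   # extraction, preparation and labelling
    "hybridisation",
    "measurements",
    "normalisation",        # systematic error adjustments and transformations
)


def missing_miame_sections(metadata):
    """Return the checklist sections that are absent or empty in `metadata`."""
    return [section for section in MIAME_SECTIONS if not metadata.get(section)]


# Example: a partially described experiment.
metadata = {
    "experimental_design": "dose response, 3 doses",
    "measurements": "raw CEL files",
}
print(missing_miame_sections(metadata))
```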

Gene Expression Omnibus
• NCBI administered
• ~280,000 samples
• >100 organisms
• >1,000,000,000 measurements

Gene Expression Omnibus

ArrayExpress
• EBI administered
• >7000 experiments
• Provides p-values
• Bioconductor package

ArrayExpress

GEO and ArrayExpress
Databases provide:
• The raw data for each hybridization (e.g., CEL or GPR files)
• The final processed (normalized) data for the set of hybridizations in the experiment (study) (e.g., the gene expression data matrix used to draw the conclusions from the study)
• The essential sample annotation, including experimental factors and their values (e.g., compound and dose in a dose-response experiment)
• The experimental design, including sample-data relationships (e.g., which raw data file relates to which sample, which hybridizations are technical replicates, which are biological replicates)
• Sufficient annotation of the array (e.g., gene identifiers, genomic coordinates, probe oligonucleotide sequences or reference commercial array catalog number)
• The essential laboratory and data processing protocols (e.g., what normalization method has been used to obtain the final processed data)
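The sample-data relationships mentioned above are what let you tell technical from biological replicates. A minimal sketch, assuming each hybridization is described by a small record; the field names, file names, and sample labels are invented for illustration and do not follow any GEO or ArrayExpress schema.

```python
from collections import defaultdict


def group_by_sample(hybridizations):
    """Group raw data files by the biological sample they were measured from."""
    by_sample = defaultdict(list)
    for hyb in hybridizations:
        by_sample[hyb["biological_sample"]].append(hyb["raw_file"])
    return dict(by_sample)


# Hypothetical annotation for three hybridizations.
hybridizations = [
    {"raw_file": "hyb1.CEL", "biological_sample": "sample_A"},
    {"raw_file": "hyb2.CEL", "biological_sample": "sample_A"},
    {"raw_file": "hyb3.CEL", "biological_sample": "sample_B"},
]

groups = group_by_sample(hybridizations)
# Files that share a sample are technical replicates of each other;
# files from different samples under the same conditions are biological replicates.
print(groups)
```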

Problems:
• Difficult to compare experiments
• Significant genes are not highlighted
• Poor visualization of results
• ArrayExpress is trying to solve these problems with its Atlas

Genevestigator
• A Java visualization tool that summarizes results from thousands of high-quality transcriptomic experiments
• Much easier to compare samples
• Open access covers only some of the data and one probe set per gene

Genevestigator

ONCOMINE
