SPH 247 Statistical Analysis of Laboratory Data, May 12, 2015


1 SPH 247 Statistical Analysis of Laboratory Data, May 12, 2015

2 Annotation
Given that one has found one or more genes that are differentially expressed, there are a number of useful things to know. What is the putative function? What pathways are known to contain this gene? What other proteins interact with the given protein? Etc.
May 12, 2015, BST 226 Statistical Methods for Bioinformatics

3 Two-color array example
    > alldata[1,]
     [1] 473 888 170 1137 86 290 109 226 370 659 359 484 102 293 174
    [16] 324 196 638 102 293
    > geneID[1,]
      Name ID
    1 NM_006182 discoidin domain receptor family, member
http://www.ncbi.nlm.nih.gov/genome/guide/human/resources.shtml

4
Official Symbol: DDR2 (provided by HGNC)
Official Full Name: discoidin domain receptor tyrosine kinase 2 (provided by HGNC)
Primary source: HGNC:2731
Locus tag: RP11-572K18.1
See related: Ensembl:ENSG00000162733; HPRD:01868; MIM:191311; Vega:OTTHUMG00000034423
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Homo sapiens
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo
Also known as: TKT; MIG20a; NTRKR3; TYRO10
Summary: Receptor tyrosine kinases (RTKs) play a key role in the communication of cells with their microenvironment. These molecules are involved in the regulation of cell growth, differentiation, and metabolism. In several cases the biochemical mechanism by which RTKs transduce signals across the membrane has been shown to be ligand induced receptor oligomerization and subsequent intracellular phosphorylation. This autophosphorylation leads to phosphorylation of cytosolic targets as well as association with other molecules, which are involved in pleiotropic effects of signal transduction. RTKs have a tripartite structure with extracellular, transmembrane, and cytoplasmic regions. This gene encodes a member of a novel subclass of RTKs and contains a distinct extracellular region encompassing a factor VIII-like domain. Alternative splicing in the 5' UTR results in multiple transcript variants encoding the same protein. [provided by RefSeq, Jul 2008]

5 Affy Example
    > source("http://bioconductor.org/biocLite.R")
    > biocLite("annaffy")
    > biocLite("hgu95av2.db")
    > library(annaffy)
    > library(affy)
    Loading required package: Biobase
    Loading required package: tools
    …
    Loading required package: GO
    Loading required package: KEGG

6
    > probeids <- featureNames(eset)[allp2.adj < .05]
    > probeids[1:5]
    [1] "1005_at" "1009_at" "1034_at" "1035_g_at" "1045_s_at"
    > symbols <- aafSymbol(probeids,"hgu95av2.db")
    Loading required package: hgu95av2
    > symbols[1]
    An object of class "aafList"
    [[1]]
    An object of class "aafSymbol"
    [1] "DUSP1"
    > getText(symbols[1])
    [1] "DUSP1"
    > descs <- aafDescription(probeids,"hgu95av2.db")[1]
    > getText(descs)[1]
    [1] "dual specificity phosphatase 1"
    > gos <- aafGO(probeids,"hgu95av2.db")

7
    > gos[1]
    An object of class "aafList"
    [[1]]
    An object of class "aafGO"
    [[1]][[1]]
    An object of class "aafGOItem"
    @id   "GO:0006470"
    @name "protein amino acid dephosphorylation"
    @type "Biological Process"
    @evid "IEA"
    [[1]][[2]]
    An object of class "aafGOItem"
    @id   "GO:0006979"
    @name "response to oxidative stress"
    @type "Biological Process"
    @evid "TAS"
    [[1]][[3]]
    An object of class "aafGOItem"
    @id   "GO:0007049"
    @name "cell cycle"
    @type "Biological Process"
    @evid "IEA"

8
    [[1]][[4]]
    An object of class "aafGOItem"
    @id   "GO:0004726"
    @name "non-membrane spanning protein tyrosine phosphatase activity"
    @type "Molecular Function"
    @evid "TAS"
    [[1]][[5]]
    An object of class "aafGOItem"
    @id   "GO:0005515"
    @name "protein binding"
    @type "Molecular Function"
    @evid "IPI"
    [[1]][[6]]
    An object of class "aafGOItem"
    @id   "GO:0016787"
    @name "hydrolase activity"
    @type "Molecular Function"
    @evid "IEA"

9
    [[1]][[7]]
    An object of class "aafGOItem"
    @id   "GO:0017017"
    @name "MAP kinase tyrosine/serine/threonine phosphatase activity"
    @type "Molecular Function"
    @evid "IEA"
There are actually 33 terms.
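The seven GO items printed on slides 7 to 9 can be summarized by ontology and evidence code. The following is a minimal base-R sketch, assuming the items are retyped by hand into a data frame. The name `go_items` is hypothetical; annaffy returns aafGO objects, not data frames, and only 7 of the 33 terms are available here.

```r
# GO items for the first probe (DUSP1), copied by hand from slides 7 to 9.
# `go_items` is an illustrative name, not an annaffy object.
go_items <- data.frame(
  id   = c("GO:0006470", "GO:0006979", "GO:0007049", "GO:0004726",
           "GO:0005515", "GO:0016787", "GO:0017017"),
  type = c(rep("Biological Process", 3), rep("Molecular Function", 4)),
  evid = c("IEA", "TAS", "IEA", "TAS", "IPI", "IEA", "IEA"),
  stringsAsFactors = FALSE
)

# Cross-tabulate ontology (rows) against evidence code (columns).
go_counts <- table(go_items$type, go_items$evid)
print(go_counts)
```

A table like this makes it easy to see how much of a gene's annotation rests on electronic inference rather than on curated statements.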

10 GO Evidence Codes
IEA = inferred from electronic annotation (e.g., BLAST). Uncurated.
TAS = traceable author statement (i.e., someone said so).

11
IDA = inferred from direct assay
IEP = inferred from expression pattern
IGI = inferred from genetic interaction
IMP = inferred from mutant phenotype
IPI = inferred from physical interaction
ISS = inferred from sequence similarity
NAS = non-traceable author statement
ND = no biological data available
NR = not recorded
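The evidence codes listed on slides 10 and 11 lend themselves to a simple lookup table. The sketch below uses a named character vector in base R; `go_evidence` and `describe_evidence` are hypothetical names introduced for illustration, and the table holds only the codes named on these slides.

```r
# Evidence codes and meanings exactly as listed on slides 10 and 11.
go_evidence <- c(
  IEA = "inferred from electronic annotation",
  TAS = "traceable author statement",
  IDA = "inferred from direct assay",
  IEP = "inferred from expression pattern",
  IGI = "inferred from genetic interaction",
  IMP = "inferred from mutant phenotype",
  IPI = "inferred from physical interaction",
  ISS = "inferred from sequence similarity",
  NAS = "non-traceable author statement",
  ND  = "no biological data available",
  NR  = "not recorded"
)

# Hypothetical helper: expand codes to text; unknown codes come back as NA.
describe_evidence <- function(codes) unname(go_evidence[codes])

codes <- c("IEA", "TAS", "IPI")
print(describe_evidence(codes))

# IEA is the only code the slides describe as uncurated.
is_curated <- codes != "IEA"
print(codes[is_curated])
```

Filtering on `is_curated` is one way to restrict attention to annotations that a person has reviewed.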

12 Online Access
    > gbs <- aafGenBank(probeids,"hgu95av2.db")
    > getURL(gbs[[1]])
    [1] "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=nucleotide&term=X68277%5BACCN%5D&doptcmdl=GenBank"
    > lls <- aafLocusLink(probeids,"hgu95av2.db")
    > getURL(lls[[1]])
    [1] "http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gene&Cmd=DetailsSearch&Term=1843"
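The two URLs returned above follow a fixed pattern around an identifier: the GenBank accession X68277 (with a percent-encoded `[ACCN]` field tag) and the gene ID 1843. The sketch below rebuilds the same strings in base R. The helper names `genbank_url` and `gene_url` are hypothetical and are not part of annaffy; the URL templates are copied from the slide output, and whether NCBI still serves them is not checked here.

```r
# Rebuild the NCBI URLs shown on slide 12 from their identifiers.
# Helper names are illustrative; templates are copied from the slide output.
genbank_url <- function(accession) {
  # reserved = TRUE percent-encodes "[" and "]" as %5B and %5D.
  term <- utils::URLencode(paste0(accession, "[ACCN]"), reserved = TRUE)
  paste0("http://www.ncbi.nlm.nih.gov/entrez/query.fcgi",
         "?cmd=search&db=nucleotide&term=", term, "&doptcmdl=GenBank")
}

gene_url <- function(gene_id) {
  paste0("http://www.ncbi.nlm.nih.gov/sites/entrez",
         "?Db=gene&Cmd=DetailsSearch&Term=", gene_id)
}

print(genbank_url("X68277"))
print(gene_url(1843))
```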

13 Abstracts
    > pmids <- aafPubMed(probeids,"hgu95av2.db")
    > pmids[[1]]
    An object of class "aafPubMed"
     [1] 15111304 14764702 14500648 12935821 12477932 9659899 8977099 8796349 8682498 8622863 8390675 8302582
    [13] 8226977 7848919 7789998 7774938
    > pmids[1]
    An object of class "aafPubMed"
     [1] 1406996 7535770 7806236 8106404 8168826 8221888 8389479 8390041
     [9] 9571625 9599409 10617468 11062068 11278799 12080474 12356755 12391149
    [17] 12432554 12477932 12506119 12765304 12890671 12947325 12960255 14551204
    [25] 14680833 14702039 14724291 15059515 15173070 15247770 15339908 15448190
    [33] 15489334 15569826 15590693 15614136 15677475 16044158 16081065 16224818
    [41] 16286470 16289033 16293973 16387640 17073741 17131384 17489738
    > browseURL(getURL(lls[[1]]))

14 Abstracts
    > pmids <- aafPubMed(probeids,"hgu95av2.db")
    > pmids[[1]]
    An object of class "aafPubMed"
      [1] 1406996 7535770 7593328 7806236 8106404 8168826 8221888 8389479
      [9] 8390041 9571625 9599409 10022904 10617468 11062068 11278799 11359773
     [17] 12080474 12356755 12391149 12432554 12477932 12506119 12676937 12765304
     [25] 12890671 12947325 12960255 14551204 14680833 14702039 14724291 15059515
     [33] 15173070 15247770 15339908 15448190 15489334 15569826 15590693 15614136
     [41] 15677475 16044158 16081065 16109884 16224818 16286470 16289033 16293973
     [49] 16387640 16951204 17073741 17131384 17489738 17560945 17638884 17681939
     [57] 17690186 17761948 17848570 18003751 18089824 18095520 18178562 18314542
     [65] 18367666 18403641 18434324 18441094 18477563 18490444 18519678 18723442
     [73] 18726921 18782768 19020052 19032224 19117950 19262425 19411256 19417026
     [81] 19553005 19604093 19648110 19697705 19724859 19755862 19793766 19797979
     [89] 19913121 19940143 20043958 20089808 20100830 20145951 20228065 20353815
     [97] 20375469 20555314 20628086 20672047 20673984 20689807 20708668 20803086
    [105] 20805296 20829434 20868379 20890299 20936779 20953200 21060794 21454676
    [113] 21507959 21547253 21610072 21715333 21733716 21803700 21840882 21906983
    [121] 21959016 21960346 21963094 21987572 21988832 22014408 22020934 22200679
    [129] 22301548 22505724 22610099 22743041 22797925 22924482 22991462 23030431
    [137] 23097457 23149933 23169297 23333304 23625220 23728617 23825193 24080497
    [145] 24086554 24155659 24267255 24308939 24339725 24578378 24658355 24692548
    [153] 25204653 25312268 25370852 25377473 25391492

15 Direct Browsing
    > browseURL(getURL(lls[[1]]))
    > browseURL(getURL(gbs[[1]]))
    > browseURL(getURL(pmids[1]))

16 Top Genes
    > probeids.ord <- featureNames(eset)[order(allp2.adj)]
    > getText(aafSymbol(probeids.ord[1:10],"hgu95av2.db"))
     [1] "S100A2" "" "RPLP1" "GM2A" "" "RPS17" "GAPDH" "COPA"
     [9] "PSPHP1" ""
    > getText(aafDescription(probeids.ord[1:10],"hgu95av2.db"))
     [1] "S100 calcium binding protein A2"
     [2] ""
     [3] "ribosomal protein, large, P1"
     [4] "GM2 ganglioside activator"
     [5] ""
     [6] "ribosomal protein S17"
     [7] "glyceraldehyde-3-phosphate dehydrogenase"
     [8] "coatomer protein complex, subunit alpha"
     [9] "phosphoserine phosphatase pseudogene 1"
    [10] ""
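The ranking step on this slide, `featureNames(eset)[order(allp2.adj)]`, sorts probe IDs by adjusted p-value. The sketch below shows the same indexing idiom on made-up data in base R. All names and values are hypothetical stand-ins, and the Benjamini-Hochberg adjustment is an assumption, since the slides do not say how `allp2.adj` was computed.

```r
# Toy stand-ins for featureNames(eset) and the raw p-values; values are made up.
probe_names <- c("probe_a", "probe_b", "probe_c", "probe_d")
raw_p <- c(0.030, 0.0002, 0.40, 0.004)

# Assumed adjustment method (Benjamini-Hochberg); the slides do not specify one.
adj_p <- p.adjust(raw_p, method = "BH")

# order() returns the indices that sort adj_p ascending, so indexing the
# names with it ranks the probes from most to least significant.
ranked <- probe_names[order(adj_p)]
print(ranked)

# The threshold selection used earlier for `probeids` keeps the original order.
significant <- probe_names[adj_p < 0.05]
print(significant)

# Probes without a gene symbol come back as "", as on the slide;
# nzchar() drops them. These symbols are hypothetical placeholders.
symbols <- c("GENE1", "", "GENE3", "")
ranked_symbols <- symbols[order(adj_p)]
print(ranked_symbols[nzchar(ranked_symbols)])
```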

17
    > aafGO(probeids.ord[7],"hgu95av2.db")
    An object of class "aafGOItem"
    @id   "GO:0000226"
    @name "microtubule cytoskeleton organization"
    @type "Biological Process"
    @evid "ISS"
    [[1]][[2]]
    An object of class "aafGOItem"
    @id   "GO:0004365"
    @name "glyceraldehyde-3-phosphate dehydrogenase (NAD+) (phosphorylating) activity"
    @type "Molecular Function"
    @evid "ISS"
    [[1]][[3]]
    An object of class "aafGOItem"
    @id   "GO:0004365"
    @name "glyceraldehyde-3-phosphate dehydrogenase (NAD+) (phosphorylating) activity"
    @type "Molecular Function"
    @evid "NAS"
    [[1]][[4]]
    An object of class "aafGOItem"
    @id   "GO:0005515"
    @name "protein binding"
    @type "Molecular Function"
    @evid "IPI"
There are actually 41 terms.

