1 Annotation for Gene Expression Analysis with Reactome.db Package Utah State University – Spring 2012 STAT 6570: Statistical Bioinformatics Cody Tramp.


2 References
Ligtenberg W. 2011. Reactome.db: How to use the reactome.db package.
www.reactome.org

3 Reactome.db Overview
“Open source, open access, manually curated, and peer-reviewed pathway database” (www.reactome.org)
Reactome.db is an R interface that allows queries to the SQL database containing pathway information
Contains functions for converting between annotation IDs and names for GO, Entrez, and Reactome

4 Getting Help on Specific Reactome.db Functions
#Load the Reactome.db package
library(reactome.db)
#Check for main manual pages
?reactome.db #This won't get the actual manual
#List all reactome.db objects
ls("package:reactome.db")
# [1] "reactome"              "reactome_dbconn"   "reactome_dbfile"
# [4] "reactome_dbInfo"       "reactome_dbschema" "reactomeEXTID2PATHID"
# [7] "reactomeGO2REACTOMEID" "reactomeMAPCOUNTS" "reactomePATHID2EXTID"
#[10] "reactomePATHID2NAME"   "reactomePATHNAME2ID" "reactomeREACTOMEID2GO"
#Look up specific manual for an object
?reactome_dbInfo #Still not very useful (poor documentation)

5 How IDs and names are stored in Reactome.db
The reactome.db package links to a SQL database
Functions are interfaces to the database
SQL databases are relational databases (think of Excel spreadsheets, but better)
Data is stored as key:value pairs

Key    Value
15869  Homo sapiens: Metabolism of nucleotides
68616  Homo sapiens: Assembly of the ORC complex at the origin of replication
68689  Homo sapiens: CDC6 association with the ORC:origin complex
68827  Homo sapiens: CDT1 association with the CDC6:ORC:origin complex
68867  Homo sapiens: Assembly of the pre-replicative complex
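The key:value idea can be tried in plain R without loading the package. The sketch below copies a few ID-to-name rows from these slides into a named character vector; `path_names` and `lookup_name` are illustrative names chosen here, not part of reactome.db.

```r
# Toy key:value store: Reactome IDs (keys) -> pathway names (values).
# Rows are copied from the slides; this is not a live query of reactome.db.
path_names <- c(
  "15869" = "Homo sapiens: Metabolism of nucleotides",
  "68616" = "Homo sapiens: Assembly of the ORC complex at the origin of replication",
  "68689" = "Homo sapiens: CDC6 association with the ORC:origin complex"
)

# Hypothetical helper: return the value stored under one key (NA if absent).
lookup_name <- function(key, store = path_names) {
  unname(store[as.character(key)])
}

lookup_name(15869)     # numeric keys are converted to character first
lookup_name("99999")   # a key that is not in the store gives NA
```

A named vector only works when each key has a single value. The Entrez and GO maps are many-to-many, which is one reason a table from toTable() is easier to work with.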

6 Reactome.db Function Uses (NOTE: all return a key:value list)
Converting Between Entrez and Reactome
reactomeEXTID2PATHID = Entrez ID to Reactome.db ID
reactomePATHID2EXTID = Reactome.db ID to Entrez ID
> xx <- toTable(reactomeEXTID2PATHID)
> head(xx)
  reactome_id gene_id
1      168253   10898
2      168254   10898
3      168253    8106
4      168254    8106
5      168253    5610
6      168254    5610
Use toTable() instead of the as.list() shown in the manuals
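Because one gene can sit in several pathways and one pathway holds several genes, the table form can be regrouped in either direction. The following is a minimal sketch using a hand-typed copy of the six rows printed on this slide, so it runs without reactome.db. The names `gene2path` and `path2gene` are illustrative.

```r
# Toy copy of the head(xx) rows shown on this slide (not a live query).
xx <- data.frame(
  reactome_id = c("168253", "168254", "168253", "168254", "168253", "168254"),
  gene_id     = c("10898", "10898", "8106", "8106", "5610", "5610"),
  stringsAsFactors = FALSE
)

# Entrez ID -> all Reactome IDs listed for that gene
gene2path <- split(xx$reactome_id, xx$gene_id)

# Reactome ID -> all Entrez IDs listed for that pathway
path2gene <- split(xx$gene_id, xx$reactome_id)

gene2path[["10898"]]
path2gene[["168253"]]
```

With the real table, the same two split() calls should give lists comparable to what as.list() returns for the two maps, assuming the column names match those printed above.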

7 Reactome.db Function Uses (NOTE: all return a key:value list)
Converting between GO ID and Reactome ID
reactomeREACTOMEID2GO = Reactome.db ID to GO IDs
reactomeGO2REACTOMEID = GO ID to Reactome.db ID
> xx <- toTable(reactomeGO2REACTOMEID)
> head(xx)
  reactome_id      go_id
1      168276 GO:0019054
2      168276 GO:0019048
3      168276 GO:0044068
4      168276 GO:0022415
5      168276 GO:0051701
6      168276 GO:0044003

8 Reactome.db Function Uses (NOTE: all return a key:value list)
Retrieving Pathway Names from Reactome IDs
reactomePATHNAME2ID = Reactome.db Name to Reactome.db ID
reactomePATHID2NAME = Reactome.db ID to Reactome.db Name
> xx <- toTable(reactomePATHID2NAME)
> head(xx)
  reactome_id path_name
1       15869 Homo sapiens: Metabolism of nucleotides
2       68616 Homo sapiens: Assembly of the ORC complex at the origin of replication
3       68689 Homo sapiens: CDC6 association with the ORC:origin complex
4       68827 Homo sapiens: CDT1 association with the CDC6:ORC:origin complex
5       68867 Homo sapiens: Assembly of the pre-replicative complex
6       68874 Homo sapiens: M/G1 Transition

9 Reactome.db Function Uses (NOTE: all return a key:value list)
reactomeMAPCOUNTS = number of mapped keys for each map in the package (mainly useful for error checking)
> xx <- as.list(reactomeMAPCOUNTS)
> xx
$reactomeEXTID2PATHID
[1] 28363
$reactomeGO2REACTOMEID
[1] 3217
$reactomePATHID2EXTID
[1] 8320
$reactomePATHID2NAME
[1] 13778
$reactomePATHNAME2ID
[1] 13876
$reactomeREACTOMEID2GO
[1] 47575

10 Ex: Find apoptosis induction-related ID (compare to Notes 6.1 slide 10)
# Get data.frame summarizing all reactome.db pathways including a certain string
xx <- toTable(reactomePATHNAME2ID)
all.pathways <- xx$path_name # get name of each reactome.db pathway
t <- grep('apoptosis', all.pathways) # get index where pathway name includes 'apoptosis'
#use agrep() for approximate term searching
reactome.Term <- unlist(all.pathways[t])
reactome.IDs <- unlist(xx$reactome_id[t])
reactome.frame <- data.frame(reactome.ID=reactome.IDs, reactome.Term=reactome.Term)
rownames(reactome.frame) <- 1:length(reactome.IDs)
reactome.frame # 13 terms

11 Ex: Find apoptosis induction-related ID (compare to Notes 6.1 slide 10)

12 Ex. Pathway Term Search Function
##Define function to search for pathways with a given key word
##agrep.bool is indicator to use agrep (TRUE) or grep (FALSE)
searchPathways2REACTOMEID <- function(term, agrep.bool) {
  xx <- toTable(reactomePATHNAME2ID)
  all.pathways <- xx$path_name # get name of each reactome.db pathway
  #get index where term is found
  if (agrep.bool) {
    t <- agrep(term, all.pathways)
  } else {
    t <- grep(term, all.pathways)
  }
  unlist(xx$reactome_id[t])
}
apop.IDs <- searchPathways2REACTOMEID("apoptosis", FALSE)
length(apop.IDs) #13 pathways matched
apop.IDs <- searchPathways2REACTOMEID("apoptosis", TRUE)
length(apop.IDs) #85 pathways matched
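The gap between 13 and 85 matches comes from how the two search functions work: grep() needs the literal, case-sensitive pattern to appear in the name, while agrep() also accepts names that contain something within a small edit distance of it. The sketch below shows the difference on three made-up pathway names. These are illustrative strings, not reactome.db output.

```r
# Illustrative pathway names (made up for this sketch, not from reactome.db)
pathways <- c(
  "Homo sapiens: Apoptosis",
  "Homo sapiens: Regulation of apoptosis",
  "Homo sapiens: Metabolism of nucleotides"
)

# Literal, case-sensitive substring match: misses the capitalized "Apoptosis"
exact <- grep("apoptosis", pathways)

# Still literal, but ignoring case
nocase <- grep("apoptosis", pathways, ignore.case = TRUE)

# Approximate match (default max.distance = 0.1): tolerates small differences
approx <- agrep("apoptosis", pathways)

exact
nocase
all(exact %in% approx)   # every exact hit is also an approximate hit
```

If the extra agrep() hits are mostly capitalization variants, grep() with ignore.case = TRUE may be a tighter alternative. Checking the matched names by eye before using the IDs is a reasonable precaution either way.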

13 Getting GO Terms from single Reactome ID
##Get List of GO Terms from Reactome ID
xx <- toTable(reactomeGO2REACTOMEID)
t <- xx$reactome_id == "15869"
GOTerms <- xx$go_id[t]
> GOTerms
 [1] "GO:0055086" "GO:0006139" "GO:0044281"
 [4] "GO:0034641" "GO:0044238" "GO:0008152"
 [7] "GO:0006807" "GO:0044237" "GO:0008150"
[10] "GO:0009987"
> xx <- toTable(reactomeGO2REACTOMEID)
> head(xx)
  reactome_id      go_id
1      168276 GO:0019054
2      168276 GO:0019048
3      168276 GO:0044068
4      168276 GO:0022415
5      168276 GO:0051701
6      168276 GO:0044003

14 Getting GO Terms from list of Reactome IDs
##Define function to get all GO Terms for all Reactome IDs in a list
getGOTerms <- function(list_reactome) {
  xx <- toTable(reactomeGO2REACTOMEID)
  #collect the GO IDs mapped to each Reactome ID, in input order
  listGO <- lapply(list_reactome, function(id) xx$go_id[xx$reactome_id == id])
  unlist(listGO)
}
GOTerms.all <- getGOTerms(apop.IDs) #apop.IDs: the 13 grep() matches from slide 12
length(GOTerms.all) #136 GO Terms from 13 apop.IDs
Should have yielded 169 terms (Notes 4.1 slide 10); reactome.db might not be complete
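When comparing term counts such as 136 versus 169, it helps to know whether a count includes repeats: collecting GO IDs one pathway at a time lists a shared term once per pathway. The slides do not say whether the 136 contains duplicates, so the sketch below only illustrates the check. The "15869" rows come from slide 13; the "toy_pathway" rows are hypothetical and exist only to create an overlap.

```r
# Toy stand-in for toTable(reactomeGO2REACTOMEID).
# "15869" rows are from slide 13; "toy_pathway" rows are hypothetical.
xx <- data.frame(
  reactome_id = c("15869", "15869", "15869", "toy_pathway", "toy_pathway"),
  go_id       = c("GO:0055086", "GO:0008152", "GO:0008150",
                  "GO:0008152", "GO:0008150"),
  stringsAsFactors = FALSE
)

ids <- c("15869", "toy_pathway")

# Per-ID lookup, concatenated: shared GO terms appear once per pathway
go.all <- unlist(lapply(ids, function(id) xx$go_id[xx$reactome_id == id]))

# Distinct GO terms only
go.unique <- unique(go.all)

length(go.all)      # counts repeats
length(go.unique)   # counts each GO term once
```

Running length(unique(GOTerms.all)) on the real result would show which of the two counts the 136 corresponds to.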

15 Reactome.org Online Tools

16 Pathway Viewer on reactome.org
http://www.reactome.org/userguide/Usersguide.html#Introduction

17 Pathway Viewer on reactome.org Details Panel

18 Pathway Viewer on reactome.org
http://www.reactome.org/entitylevelview/PathwayBrowser.html#DB=gk_current&FOCUS_SPECIES_ID=48887&FOCUS_PATHWAY_ID=71387&ID=76213&VID=3422142

19 Reactome Pathway Symbols
Figure labels: Upregulation and participating proteins; Inhibition
http://www.reactome.org/entitylevelview/PathwayBrowser.html#DB=gk_current&FOCUS_SPECIES_ID=48887&FOCUS_PATHWAY_ID=71387&ID=76213&VID=3422142

20 Reactome Database Assignment Method
Genes seem to be assigned to pathways in a manner similar to the GO database:
- If a gene is up-regulated, it is included
- Genes that are down-regulated in a condition are NOT mapped to the condition/pathway
I haven't received an official response from reactome.org, but from general browsing this seems to be the case

21 Pathway Analysis Tool
http://www.reactome.org/ReactomeGWT/entrypoint.html#PathwayAnalysisDataUploadPage

22 Pathway Analysis Tool
http://www.reactome.org/ReactomeGWT/entrypoint.html#PathwayAnalysisDataUploadPage

23 Expression Set Data Analysis

24 Expression Set Data Analysis

25 Summary
Reactome.db provides an interface to the SQL database containing IDs
Functions for converting between ID types
No functionality for gene testing through R
Online tools include pathway maps and ID lookup tables
Some limited expression testing (with unknown statistical methods)
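Although the package itself offers no gene testing, the gene lists it returns could feed a standard over-representation test in base R. The sketch below is one common approach (a one-sided hypergeometric test), not a method from reactome.db or reactome.org, and all four counts are hypothetical. In practice the pathway gene set would come from reactomePATHID2EXTID and the differentially expressed (DE) list from the expression analysis.

```r
# Hypothetical counts, for illustration only
N <- 10000  # genes measured on the array
m <- 200    # of those, genes annotated to the pathway of interest
k <- 500    # genes called differentially expressed (DE)
x <- 25     # DE genes that are also in the pathway

# P(X >= x) under the hypergeometric null of no association
p.hyper <- phyper(x - 1, m, N - m, k, lower.tail = FALSE)

# The same test written as a one-sided Fisher exact test on the 2x2 table
tab <- matrix(c(x, m - x, k - x, N - m - k + x), nrow = 2,
              dimnames = list(DE = c("yes", "no"), pathway = c("in", "out")))
p.fisher <- fisher.test(tab, alternative = "greater")$p.value

c(hypergeometric = p.hyper, fisher = p.fisher)
```

Both calls compute the same upper-tail probability, so either form can be used. Testing many pathways this way would also call for a multiple-testing adjustment such as p.adjust().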

26 Questions?

