
1 Analysis: Discovery of possible regulatory motifs (Scenario 5). What follows is a simulation of the proposed graphical interface. As you go through the simulation, please consider what capabilities you would want to serve your research and annotation interests. A narrative to help you go through the simulation appears in a red-bordered box, such as the one below. To begin: 1. Click on Slide Show (on the upper toolbar). 2. Click View Show. 3. Click the Continue button.

2 Analysis: Discovery of possible regulatory motifs (Scenario 5). You’ve decided you want to know what regulates the expression of nif genes, which encode the machinery for nitrogen fixation. Here’s your strategy: 1. Collect nif genes from Anabaena PCC 7120 into a set. 2. Include in the set orthologs of the Anabaena genes. 3. Extract 5’ sequences from all genes in the set. 4. Analyze the set of 5’ sequences for motifs. 5. (Search for other genes with the same motifs.) Click Continue.

3 Menu bar: Build set, Display set, Modify set, Set operation. Click on Build set to begin finding orfs with the desired specifications.

4 Build set: Choose set type. Options: All items in, All open reading frames of, All amino acid sequences of, All intergenic regions of, Human-annotated orfs of, Private set, Public set. The first goal is to find all open reading frames within Anabaena PCC 7120 annotated as nif genes, so click on All open reading frames of.

5 Build set: All open reading frames of (Choose database). Options: Arthrobacter platensis, Gloeobacter violaceus, Microcystis aeruginosa, Nostoc punctiforme, Nostoc PCC 7120, Prochlorococcus MED4, Prochlorococcus MIT9313, Prochlorococcus S120, Synechococcus PCC6301, Synechococcus PCC7942, Synechococcus WH, Synechocystis PCC 6803, Thermosynechococcus, Trichodesmium, Unicellular, Filamentous, All. Click on Anabaena PCC 7120.

6 Build set: All open reading frames of Anabaena PCC 7120 such that: (buttons: Variable, Data, Operation, Function, Done). You want to compare the description of each orf with “nif”. To get a tool to extract the description, click on Function.

7 Build set: All open reading frames of Anabaena PCC 7120 such that: (Choose function (item)). Options: Closest ortholog of, Protein product of, Upstream region of, Downstream region of, Description of, Category of, Annotation level of. Click on Description of.

8 Build set: All open reading frames of Anabaena PCC 7120 such that: Description of (item) (Op). Options: =, includes, excludes. You want to find orfs whose description includes the word “nif”. Click on includes.

9 Build set: All open reading frames of Anabaena PCC 7120 such that: Description of (item) includes (Type description term(s)): nif. You can type in any characters to search for. For this simulation, the term “nif” is provided. Press the Enter key.

10 Build set: All open reading frames of Anabaena PCC 7120 such that: Description of (item) includes nif. No more specifications. Press the Done button.

11 Build set: All open reading frames of Anabaena PCC 7120 such that: Description of (item) includes nif. Done. Options: Save results and script, Save only results. If this were a complicated search, you might want to save the specifications as a script. In this case, just save the results by clicking on Save only results.

12 Build set: All open reading frames of Anabaena PCC 7120 such that: Description of (item) includes nif. Type name of set: 7120 nif genes. All orfs of Anabaena whose descriptions include “nif” will be collected into a set. You can name the set anything you want. For this simulation, a name is provided. Press the Enter key.

13 Set: 7120 nif genes
Anab7120:all0687 hupL [NiFe] uptake hydrogenase large subunit, C terminus
Anab7120:all0687 hupL [NiFe] uptake hydrogenase large subunit, N terminus
Anab7120:all0688 hupS [NiFe] uptake hydrogenase small subunit
Anab7120:alr0692 similar to nifU
Anab7120:alr0874 nifH2 dinitrogenase reductase
Anab7120:asr1309 similar to nifU
Anab7120:alr1407 nifV1 homocitrate synthase
Anab7120:asr1408 nifZ iron-sulfur cofactor synthesis
Anab7120:asr1409 nifT
This is the result of the search. The set is displayed both as a list of orfs and as a graphical representation of the genetic neighborhood of each orf. You can find out more about an orf by clicking its name or its arrow. For now, just press Continue.

14 Set: 7120 nif genes (same list as on slide 13). This search, like most, is only a beginning. It brought up some unintended hits (“nif” found “NiFe”). More seriously, it brought up many genes that are probably in the middle of operons and unlikely to be preceded by regulatory motifs. The genetic neighborhood gives clues as to operon structure. Select the two orfs most likely to begin operons by clicking on the circles next to alr0874 and alr1407.
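
The unintended hits are what you would expect if includes is a case-insensitive substring test. That is an assumption, since the interface is only simulated. A minimal Python sketch, using the descriptions from the result list on slide 13, compares that loose test with a stricter pattern (hypothetical, not part of the proposal) that requires a gene-style name such as nifH2:

```python
import re

# (orf ID, description) pairs copied from the search result on slide 13.
HITS = [
    ("all0687", "hupL [NiFe] uptake hydrogenase large subunit, C terminus"),
    ("all0687", "hupL [NiFe] uptake hydrogenase large subunit, N terminus"),
    ("all0688", "hupS [NiFe] uptake hydrogenase small subunit"),
    ("alr0692", "similar to nifU"),
    ("alr0874", "nifH2 dinitrogenase reductase"),
    ("asr1309", "similar to nifU"),
    ("alr1407", "nifV1 homocitrate synthase"),
    ("asr1408", "nifZ iron-sulfur cofactor synthesis"),
    ("asr1409", "nifT"),
]


def includes(description, term):
    """Loose test: case-insensitive substring match (assumed behavior)."""
    return term.lower() in description.lower()


# Hypothetical stricter test: lowercase "nif", one capital letter,
# optional digits, bounded on both sides (nifU, nifH2, nifV1, ...).
NIF_NAME = re.compile(r"\bnif[A-Z]\d*\b")

loose = [orf for orf, desc in HITS if includes(desc, "nif")]
strict = [orf for orf, desc in HITS if NIF_NAME.search(desc)]

print(len(loose), "loose hits;", len(strict), "strict hits")
print("dropped by the strict test:", sorted(set(loose) - set(strict)))
```

The stricter pattern removes the hydrogenase hits, but it cannot tell which genes begin operons. That judgment still depends on the genetic neighborhood.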

15 Set: 7120 nif genes (same list as on slide 13). Let’s suppose you proceed in a like fashion through the rest of the list. Press Done.

16 Set: 7120 nif genes
Anab7120:alr0874 nifH2 dinitrogenase reductase
Anab7120:alr1407 nifV1 homocitrate synthase
Anab7120:all1438 nifE nitrogenase Fe/Mo cofactor
Anab7120:all1455 nifH dinitrogenase reductase
Anab7120:all1517 nifB nitrogen fixation protein
Anab7120:alr2968 nifV2 homocitrate synthase
The set now consists of the six Anabaena nif genes that you judged most likely to be preceded by transcriptional signals. It might be interesting to see where this set is located on the genome. To do this, click Display set, then make some room by clicking on Show graphic. Display set options: Show orf ID, Show gene name, Show description, Show coordinates, Show graphic, Show neighbors: +/- 1, Show map.

17 Set: 7120 nif genes (same list as on slide 16). Replace the space-consuming description with coordinates by clicking on Show description, then Show coordinates, and finally Show map. Display set options: Show orf ID, Show gene name, Show description, Show coordinates, Show graphic, Show neighbors: +/- 1, Show map.

18 Set: 7120 nif genes, listed by orf ID and gene name only: Anab7120:alr0874 nifH2, Anab7120:alr1407 nifV1, Anab7120:all1438 nifE, Anab7120:all1455 nifH, Anab7120:all1517 nifB, Anab7120:alr2968 nifV2. (The narration and Display set options from slide 17 remain on screen.)

19 Set: 7120 nif genes
Anab7120:alr0874 nifH2 1008496 -> 1009389
Anab7120:alr1407 nifV1 1671878 -> 1673011
Anab7120:all1438 nifE 1696389 <- 1697831
Anab7120:all1455 nifH 1713396 <- 1714283
Anab7120:all1517 nifB 1776670 <- 1778097
Anab7120:alr2968 nifV2 3609625 -> 3611012
(The narration and Display set options from slide 17 remain on screen.)

20 Set: 7120 nif genes (orfs and coordinates as on slide 19). Map: Anabaena chromosome, 6413771 bp. Four of the six putative nif operons are clustered near 1.7 Mb... but back to business. Our idea was to extend the set to include orthologs in other nitrogen-fixing cyanobacteria. To do this, click Set operation, then Transformations, then Ortholog of. Set operation options: Maintenance, Set operations, Analysis tools, Discovery tools, Transformations. Transformations options: Closest ortholog of, Protein product of, Upstream region of, Downstream region of, Ortholog of.
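
The clustering claim can be checked with a little arithmetic on the coordinates from slide 19. The Python sketch below is only an illustration: the 0.1 Mb window used to decide what counts as “near 1.7 Mb” is an arbitrary assumption, not something the interface defines.

```python
CHROMOSOME_BP = 6413771  # Anabaena chromosome length shown on the map

# (gene, start, end) from the coordinate display on slide 19.
GENES = [
    ("nifH2", 1008496, 1009389),
    ("nifV1", 1671878, 1673011),
    ("nifE", 1696389, 1697831),
    ("nifH", 1713396, 1714283),
    ("nifB", 1776670, 1778097),
    ("nifV2", 3609625, 3611012),
]

CENTER_MB = 1.7
WINDOW_MB = 0.1  # assumed cutoff for "near"; chosen for illustration only


def midpoint_mb(start, end):
    """Midpoint of a gene in megabases."""
    return (start + end) / 2 / 1e6


clustered = [
    name for name, start, end in GENES
    if abs(midpoint_mb(start, end) - CENTER_MB) <= WINDOW_MB
]

# Distance from the first base to the last base of the clustered genes.
span_bp = (
    max(end for name, start, end in GENES if name in clustered)
    - min(start for name, start, end in GENES if name in clustered)
)

print("clustered:", clustered)
print(f"cluster spans {span_bp} bp "
      f"({100 * span_bp / CHROMOSOME_BP:.1f}% of the chromosome)")
```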

21 Set operation: Orthologs of ( (Choose set type). Options: All open reading frames of, All amino acid sequences of, All intergenic regions of, Human-annotated orfs of, Public set, Private set. You want the orthologs of the orfs in the set you just made. This set is yours (a private set), as opposed to certain sets that are available to all users. Click Private set.

22 Set operation: Orthologs of ( Private set (Choose set). Options: 7120 IS895 seqs, 7120 nif genes, 7120 STTR7 regions, Light-specific genes, Npun STTR7 regions. The list of choices will consist of whatever sets you may have created. Choose the one you just made: 7120 nif genes.

23 Set operation: Orthologs of ( Private set: 7120 nif genes in ) (Choose database). Options: Arthrobacter platensis, Gloeobacter violaceus, Microcystis aeruginosa, Nostoc punctiforme, Anabaena PCC 7120, Prochlorococcus MED4, Prochlorococcus MIT9313, Prochlorococcus S120, Synechococcus PCC6301, Synechococcus PCC7942, Synechococcus WH8102, Synechocystis PCC 6803, Thermosynechococcus, Trichodesmium erythraeum, Unicellular, Filamentous, All. At present, the set of filamentous cyanobacteria includes just the nitrogen-fixing strains Nostoc punctiforme, Trichodesmium erythraeum, and Anabaena. Click on Filamentous.

24 Set operation: Orthologs of ( Private set: 7120 nif genes in Filamentous ). Type name of set: all nif genes. All orthologs of the selected nif genes will be combined and saved in a set of your choice. For this simulation, a name is provided. Press the Enter key.

25 Set: all nif genes
Anab7120:alr0874 nifH2 dinitrogenase reductase
Anab7120:alr1407 nifV1 homocitrate synthase
Anab7120:all1438 nifE nitrogenase Fe/Mo cofactor
Anab7120:all1455 nifH dinitrogenase reductase
Anab7120:all1517 nifB nitrogen fixation protein
Anab7120:alr2968 nifV2 homocitrate synthase
NostPunc:637.025 nifH2 dinitrogenase reductase
NostPunc:510.011 nifV1 homocitrate synthase
NostPunc:651.072 nifE nitrogenase Fe/Mo cofactor
NostPunc:510.021 nifB nitrogen fixation protein
The set now consists of nif genes from all filamentous cyanobacteria. From this set we want to extract the upstream sequences. Click on Set operation, then click on Transformations and Upstream region of. Set operation options: Maintenance, Set operations, Analysis tools, Discovery tools, Transformations. Transformations options: Ortholog of, Protein product of, Upstream region of, Downstream region of.

26 Set operation: Upstream region of ( (Choose set type). Options: All open reading frames of, Human-annotated orfs of, Public set, Private set. Again you want the orfs from a set you made yourself, so click on Private set.

27 Set operation: Upstream region of ( Private set ) (Choose set). Options: 7120 IS895 seqs, 7120 nif genes, 7120 STTR7 regions, all nif genes, Light-specific genes, Npun STTR7 regions. The set you just defined magically appears on the list (no chance for misspelling). Click on it.

28 Set operation: Upstream region of ( Private set: all nif genes ). Type name of set: all nif genes – 5’. Give this new set of 5’ regions a descriptive name (done here for you). Press the Enter key.

29 Set: all nif genes – 5’
Anab7120.C:1006982-1008496d
Anab7120.C:1671462-1671878d
Anab7120.C:1697832-1698138c
Anab7120.C:1713264-1713395c
Anab7120.C:1778098-1779034c
Anab7120.C:3609273-3609624d
NostPunc.637:37288-37376d
NostPunc.510:15955-16325d
NostPunc.651:60311-60584c
NostPunc.510:5239-6338c
The resulting set consists of sequences, not orfs, and so the elements are defined by coordinates. Clicking on a coordinate brings up the sequence display (see Scenario 6). Clicking on a graph of an orf brings up the orf’s annotation page. Click Continue.
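
The simulation does not say how the interface decides where an upstream region ends, and the regions listed above vary in length. As a rough sketch only, the Python function below extracts a fixed-length 5’ region, which is an assumption and not the proposal’s rule. It uses 1-based inclusive coordinates and the d (direct) and c (complementary) strand labels seen in the list; the toy genome string is invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, start, end, strand, length):
    """Return (first, last, sequence) for the region 5' of a gene.

    start and end are 1-based inclusive gene coordinates with start < end.
    strand is "d" (gene reads left to right) or "c" (gene reads right to
    left, so its 5' side lies past `end` and must be reverse-complemented).
    """
    if strand == "d":
        first = max(1, start - length)
        last = start - 1
        seq = genome[first - 1:last]
    elif strand == "c":
        first = end + 1
        last = min(len(genome), end + length)
        seq = reverse_complement(genome[first - 1:last])
    else:
        raise ValueError("strand must be 'd' or 'c'")
    return first, last, seq


# Invented 16-base toy genome with a "gene" at positions 7-9.
TOY_GENOME = "AAACCCGGGTTAACGT"
direct = upstream_region(TOY_GENOME, 7, 9, "d", 3)
complementary = upstream_region(TOY_GENOME, 7, 9, "c", 3)
print(direct)
print(complementary)
```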

30 Set: all nif genes – 5’ (same list as on slide 29). The final step in this procedure is to analyze the set of upstream sequences of nif genes, hoping to find a common motif. Click on Set operation, then Analysis tools. Tools based on Position-Specific Scoring Matrices (PSSMs) are most often used for the task. Click on one of these: Meme. Set operation options: Maintenance, Set operations, Analysis tools, Discovery tools, Transformations. Analysis tools options: Align, PSSM: Gibbs sampler, PSSM: Meme, Make HMM.
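
To make the term concrete, here is a minimal Python sketch of a position-specific scoring matrix. It is not Meme or the Gibbs sampler: those tools discover motif sites in unaligned sequences, whereas this sketch assumes the sites are already aligned and only shows how the matrix is built and used for scoring. The site sequences, the uniform background, and the pseudocount are all invented for illustration.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length aligned sites.

    Each column maps a base to log2(observed frequency / background),
    with a uniform background of 0.25 and a pseudocount per base.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pssm = []
    for i in range(width):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / 0.25)
            for base in ALPHABET
        })
    return pssm


def score(pssm, window):
    """Sum the per-position scores of a window as wide as the PSSM."""
    return sum(column[base] for column, base in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, 0-based offset) of the best-scoring window."""
    width = len(pssm)
    return max(
        (score(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Invented aligned sites and an invented sequence to scan.
toy_sites = ["TGTA", "TGTA", "TGTC", "TGAA"]
toy_pssm = build_pssm(toy_sites)
hit_score, hit_offset = best_hit(toy_pssm, "CCTGTACC")
print(f"best window starts at offset {hit_offset} with score {hit_score:.2f}")
```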

31 Set operation: PSSM: Meme of ( (Choose set type). Options: Public set, Private set. Click Private set and then all nif genes – 5’ to give Meme the set of 5’ sequences.

32 Set operation: PSSM: Meme of ( Private set ) (Choose set). Options: 7120 IS895 seqs, 7120 nif genes, 7120 STTR7 regions, all nif genes, all nif genes – 5’, Npun STTR7 regions. Click Private set and then all nif genes – 5’ to give Meme the set of 5’ sequences.

33 Set operation: PSSM: Meme of ( Private set: all nif genes – 5’ ). Type name of results: PSSM:all nif – 5’. Give the results a name, press Enter, and the task is accomplished.

34 Analysis: Discovery of possible regulatory motifs (Scenario 5). Summary: The interface facilitates operations on sets of genes and sequences. The interface puts at your disposal powerful tools (that already exist), without the need to figure out a different computer environment. Taken together, these capabilities make it possible for those not particularly adept at computer programming to focus on the function of noncoding sequences. But don’t be fooled: the interface does not yet exist. That’s the point of the proposal!

