2 Too many matches…
A typical question: What are the potential TF sites involved in regulation of my gene of interest?
A typical approach: "Let's run MatInspector over the promoter region of my gene."
3 Too many matches…
A typical question: Where do I get my input promoter DNA sequence from?
A typical approach: "Let's extract it from NCBI. 3 kb upstream of the TSS, to be sure to have the promoter…"
4 Too many matches…
A typical result: Which of those matches are relevant? How do I get rid of all those "false positives"?
5 TF binding sites…
Important facts to consider:
There is not a single false positive match.
MatInspector gives you all physical TF binding sites.
A physical TFBS is found every 10 to 15 bp throughout the genome.
A single isolated TF binding site carries no function.
TFs work through complexes, which are represented at the sequence level by sets of TF binding sites in certain distance relationships and orientations -> promoter frameworks.
6 TF binding sites…
Okay, so what is a physical TF binding site? And what is a functional TF binding site?
7 Physical binding sites have no function in transcription on their own
False positives?
A physical binding site is invariable.
A physical binding site is a fixed part of the genome (= weight matrix / IUPAC string).
Physical binding sites can be detected by MatInspector.
This DNA sequence can usually bind its cognate protein(s).
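To make "a physical binding site = weight matrix / IUPAC string" concrete, here is a minimal Python sketch that scans both strands of a sequence for an IUPAC string. It is not how MatInspector works (MatInspector uses weight matrices). The function names and the pattern TATAWAW are illustrative assumptions, not taken from the slides.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement of every IUPAC code (R <-> Y, K <-> M, B <-> V, D <-> H).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(pattern):
    """Reverse complement of an IUPAC string."""
    return pattern.upper().translate(COMPLEMENT)[::-1]


def find_physical_sites(sequence, iupac_pattern):
    """Return sorted (start, strand) tuples, 0-based, for matches on either strand.

    Hypothetical helper for illustration only; a pure string match says
    nothing about whether the site is functional.
    """
    sequence = sequence.upper()
    hits = set()
    for strand, pattern in (("+", iupac_pattern.upper()),
                            ("-", reverse_complement(iupac_pattern))):
        # The lookahead allows overlapping matches to be reported.
        regex = re.compile("(?=(" + "".join(IUPAC[c] for c in pattern) + "))")
        for match in regex.finditer(sequence):
            hits.add((match.start(), strand))
    return sorted(hits)


if __name__ == "__main__":
    print(find_physical_sites("GGCTATAAAAGGC", "TATAWAW"))
```

Such a scan reports every position where the string fits, which is exactly why the match list is long: each hit is a physical site, and none of them is functional on its own.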
8 Physical vs functional TFBS
A functional binding site depends on context!
A functional binding site requires a cellular context: one binding site, five cell types... but binding proteins are present in only 2 cell types! -> No functional binding site in the other 3 cell types!
A functional binding site requires a genomic context: even when binding proteins are present... biological function may require additional binding sites! (Module)
Transcriptional function is defined by the cellular and genomic context.
9 The core promoter - just another module
Transcriptional modules
A transcriptional module is the smallest functional unit.
A transcriptional module consists of two or more TFBSs.
Strand orientation, relative order and distance of the TFBSs are important.
A module also has a strand orientation and can shift within a promoter.
Transcriptional modules are present in promoters and enhancers.
Transcriptional modules integrate signals via the interacting TFs.
[Figure: core promoter with TATA box and INR; module elements F1 (+), F2 (-), F3 (+/-)]
10 Why does nature use modules?
[Figure: promoters containing the elements A, B and C]
No common organization? Common modules!
11 Transcriptional modules
Promoter modules can work in three different ways: synergistic, antagonistic, synergistic.
Module types shown: "short range module" (distance ≤ 50 bp), "composite elements", "short range module" (distance ≤ 50 bp) or "looping module" (distance up to 300 bp).
Binding affinity: High / Low is possible (three of the cases shown); High / High only (one case).
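The module definition above (two or more TFBSs with fixed strand orientation, relative order and distance) can be sketched in a few lines of Python. This is an illustrative sketch, not the ModelInspector algorithm. The class names, the families "F1" and "F2" and the positions are all hypothetical, and for brevity it ignores that the module as a whole can also occur on the opposite strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    family: str    # TF matrix family (hypothetical labels such as "F1")
    position: int  # anchor position within the promoter, in bp
    strand: str    # "+" or "-"


@dataclass(frozen=True)
class TwoSiteModule:
    first: str
    first_strand: str
    second: str
    second_strand: str
    min_distance: int  # bp between the two anchors
    max_distance: int


def find_module_matches(sites, module):
    """Return (first_site, second_site) pairs satisfying order, strand and distance."""
    matches = []
    for a in sites:
        if (a.family, a.strand) != (module.first, module.first_strand):
            continue
        for b in sites:
            if (b.family, b.strand) != (module.second, module.second_strand):
                continue
            # Positive only if b lies downstream of a, which enforces the order.
            distance = b.position - a.position
            if module.min_distance <= distance <= module.max_distance:
                matches.append((a, b))
    return matches


if __name__ == "__main__":
    sites = [Site("F1", 100, "+"), Site("F2", 130, "-"),
             Site("F2", 400, "-"), Site("F1", 500, "-")]
    short_range = TwoSiteModule("F1", "+", "F2", "-", 5, 50)
    for a, b in find_module_matches(sites, short_range):
        print(a, b)
```

Of the four physical sites in the example, only one pair satisfies all three constraints at once. This is the sense in which a module is far more selective than its individual binding sites.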
12 Modules are the basic elements of regulatory pathways and networks
Transcriptional modules
Transcriptional modules define target genes of pathways.
NFkappaB regulates a number of "target genes".
NFkappaB is involved in regulation of target genes of several pathways.
[Figure: NFkappaB target genes IL-6, IL-8, ICAM-1, SAA-1, SAA-2, ELAM-1, IFN-ß, IP-10, G-CSF, IL-2, HLA-A, HLA-B, IL-1, E-Selectin and IRF-1; module labels NFkB/CREB, NFkB/C/EBP, NFkB/IRF-1, NFkB/NFkB, CREB, C/EBP; one gene group is marked "Induced by 2 pathways!"]
14 Transcription regulation implies a regulatory network
Transcriptional modules
Transcription regulation mechanism
[Figure labels: promoter, exon, gene A (transcript n), primary transcript, protein complex, gene B (transcript p), gene C (transcript m)]
15 Same lock - different keys: same gene - different biological context
Transcriptional modules
Context dependent expression by different protein complexes
[Figure labels: TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, TBP, TATA, INR]
16 Transcriptional modules
Context specific transcription regulation
Example: Analysis of the RANTES promoter in different cell lines
Experimentally verified evidence that TFBSs from modules which are crucial for regulation in one biological context (cell type) are totally irrelevant in another!
Fessele, S., Maier, H., Zischek, C., Nelson, P.J., Werner, T. (2002) "Regulatory context is a crucial part of gene function", Trends in Genetics 18.
17 Module matches reduce experimental efforts by orders of magnitude
Transcriptional modules
Modules contribute strongly to functional promoter analysis.
Modules are usually linked to at least one known biological function.
A module match in a promoter makes this gene a good candidate.
A module match in a promoter does not prove the gene to be a target.
Additional independent evidence is required to prove the target.
A module match immediately suggests experimental verification.
18 Promoter sequences
Very interesting, but how does all this help me with my original question?
The question still is: What are the potential TF sites involved in regulation of my gene of interest?
19 Promoter sequences
More things to consider before asking that question!
There was another one: Where do I get my input promoter DNA sequence from?
"Let's extract it from NCBI. 3 kb upstream of the TSS, to be sure to have the promoter…"
20 Promoter sequences
More things to consider…
3 kb is too large for meaningful analysis.
Even going 10 kb upstream of the TSS is no guarantee of having the relevant promoter sequence.
Multiple promoters are the rule, not an exception.
The non-coding first exon is always part of the promoter.
Huh? What does this mean? Where do I get this damn promoter now?
21 Genes usually have alternative transcripts with alternative promoters
Alternative transcripts/promoters
Which promoter? One gene = one promoter?
[Figure: three alternative transcripts of gene A, each with its own promoter]
22 Alternative transcripts/promoters
Context dependent expression via different promoters
Example: Glucokinase
[Figure labels: coding exons, hepatic promoter, pancreatic promoter]
Y. Tanizawa, A. Matsutani, K.C. Chiu and M.A. Permutt, "Human glucokinase gene: isolation, structural characterization, and identification of a microsatellite repeat polymorphism", Mol. Endocrinol., Jul 1992; 6:
23 Alternative transcripts/promoters
Comparative genomic map of glucokinase (GCK)
Promoter set 1: pancreatic promoter
Promoter set 2: hepatic promoter
Data from ElDorado
24 Alternative transcripts/promoters
Important facts to consider:
Alternative promoter usage is often tied to regulation of tissue specific gene expression.
Alternative promoter usage is of very high biological relevance.
There are several examples where aberrant regulation of the identical primary transcript leads to severe biological effects.
25 Alternative transcripts/promoters
Aromatase: a switch in promoter usage is associated with disease
[Figure: aromatase gene map with alternative first exons (labelled 1.1, 1.4, 1.f, 1.6, 1.3), exons II to X and the AATAAA signal; promoter usage in normal breast versus breast cancer]
The gene product is absolutely identical. The only difference is in the alternative promoter usage. At the transcript level this can be seen only in the non-coding first exon.
26 Promoter Analysis
The aim of in silico promoter analysis - summary
1. Identification of the promoter sequence
2. Prediction of physical transcription factor binding sites
3. Functional context
4. Context dependent functional transcription factor binding sites
[Figure labels: context 1, context 2, context 3, ..., context n]
27 ElDorado promoter sequence retrieval (www.genomatix.de)
Yes! I know all of this! I just wanted to know where I can get my promoter sequence(s) easily!
If you don't have one already, sign up for a free evaluation account first...
...then log in here!
29 ElDorado promoter sequence retrieval
Choose the organism.
Either enter the locus ID or the gene name here…
...an accession number…
…or an exact contig position.
Or choose a sequence file from your directory...
...upload a file from your local disk…
…or copy & paste a raw sequence here. It can be cDNA or whatever you have. It will be mapped exactly onto the genomes within seconds.
30 ElDorado promoter sequence retrieval
Input in this section delivers results based on gene name or keyword search. Over a million names, synonyms and gene IDs help you find what you want, fast! (HMGCS1, for example)
IMPORTANT! Affymetrix probe-set-ID input: Our annotation is NOT based on the Affymetrix NetAffx assignment! It is based instead on genomic mapping of each single probe. A transcript will be retrieved if at least one probe of the set (usually 11 probes) matches. For mixed probe sets (cross-hybridisation), all relevant transcripts will be retrieved, which might lead to a result with transcripts from different loci.
Input in this section delivers results based on ultra-fast sequence mapping. Copy and paste raw sequence data here (min. 15 nucleotides) or enter an accession number. In contrast to the entry of an accession number above, here the sequence is actually retrieved from the database and mapped onto the genome(s).
NOTE: many EST-based accession numbers have poor sequence homology and deliver no result.
31 ElDorado promoter sequence retrieval
…here you can choose which chip's probes to see...
…licensed customers can add their own sequence data.
32 ElDorado promoter sequence retrieval
This gives you an interactive graphical representation of the genomic context of your gene.
33 ElDorado promoter sequence retrieval
Orange indicates your input, in this case a gene name. It is very informative when your query is based on sequence data: then you see the mapping positions.
Switch the display of components on and off.
Mapping positions of single Affymetrix probes!
Scale/slide the retrieved genomic "window".
Here you can scale the view.
Select regions of the graphics and save them to a file.
Everything is clickable - just play around!
34 ElDorado promoter sequence retrieval
Now we have zoomed into the promoter region.
Clicking on this transcriptional start region (TSR)... displays this hyperlink to...
35 ElDorado promoter sequence retrieval
...this profile of the different experimentally verified TSS (CAGE tags) in the different tissue types.
36 ElDorado promoter sequence retrieval
This is a table-like representation of all annotated elements. It is especially useful for quick and easy retrieval of the DNA sequence(s) of interest.
37 ElDorado promoter sequence retrieval
Tick/un-tick the boxes of what you would like to see, and then...
38 ElDorado promoter sequence retrieval
This, for instance, tells you that this SNP deletes three potential TF binding sites and creates a new one. A potentially regulatory SNP...
39 ElDorado promoter sequence retrieval
From here you can directly run a MatInspector analysis for this promoter...
...again, play around with the interactive graphics...
Click the symbols and jump right into MatBase, the TF knowledge base.
40 ElDorado promoter sequence retrieval
Now, finally, the first way to extract a promoter sequence... and/or any other element displayed in the list below.
Choose your desired length. Unless you have good reason to change the length of the proximal promoter, leave the defaults!
41 ElDorado promoter sequence retrieval
This shows you all annotated alternative transcripts, plus all Affymetrix probe set single probe mappings, plus another way to extract your promoter sequence(s).
42 ElDorado promoter sequence retrieval
You know this already...
Three different known transcripts for this locus... and four distinct promoters! How this comes about, I'll tell you in a minute.
43 ElDorado promoter sequence retrieval
Tick the promoter of your interest... choose the format... and extract the sequence.
Or submit the promoter directly to MatInspector for graphical analysis. It works on a single sequence, too.
Or submit sequences directly to one of those tasks. But they make sense only with multiple sequences. More on that later!
44 ElDorado promoter sequence retrieval
But why do I have four promoters here? And two don't even have a transcript assigned, as it is written here! And what's all that CompGen thing about?
That is the multiple promoter thing I showed you before. Remember the GCK example, liver and pancreas?
Now to the CompGen promoters. They are derived by a proprietary comparative genomics approach.
46 ElDorado promoter sequence retrieval
The tick-boxes you know already... We need them for later promoter retrieval.
Exhaustive cross-mapping of all transcripts to all genomes of all organisms in ElDorado generates our homology groups.
For our example we have a homologous locus assigned in chimp, macaque, human, rat, dog, cow, opossum, chicken, and zebrafish.
Note the Promoter Set number!
47 ElDorado promoter sequence retrieval
Get a feeling for the degree of phylogenetic conservation of the respective promoter.
See how much experimental evidence supports this promoter.
48 ElDorado promoter sequence retrieval
A Promoter Set represents phylogenetically conserved promoters.
You should be familiar with this view by now. Here the orange indicates a promoter belonging to a promoter set.
With these tick-boxes you can switch the display of the different Promoter Sets on and off.
49 ElDorado promoter sequence retrieval
Don't waste my time here! How do I get my promoter sequence now? And which one of all those promoters should I take?
Well, which one? If you do not have any other information (experimental or from the literature), I would recommend that you consider all available alternative promoters for further analysis.
50 ElDorado promoter sequence retrieval
Don't waste my time here! How do I get my promoter sequence now? And which one of all those promoters should I take?
I showed you two easy ways of promoter sequence retrieval, two mouse clicks each, some minutes ago. There are more... oh... you cannot access these options?
51 ElDorado promoter sequence retrieval
You should license GenomatixSuite with at least the 10-fold evaluation account upgrade.
Otherwise it is slightly more cumbersome... Use one of the options I showed you before and get the contig and positional information... and use that for sequence retrieval from your second-favorite system after Genomatix, e.g. NCBI.
Hint: If you are interested in the TF results rather than the sequence, use the "search for common transcription factor binding sites" option as shown before.
52 From physical to functional TF site
Quite interesting… But I am not a single step closer to the answer to my real question: What are the potential TF sites involved in regulation of my gene of interest?
Well, I think you are. The essential first step is to analyze the right sequence at a length that allows for meaningful results.
Now that you have the real promoter sequence(s), let's see how to go on from here...
53 From physical to functional TF site
Then we have to look for additional evidence that some of the physical TF sites might be functional ones.
Best would be to go for a chromatin IP experiment. However, for that you would need some hints as to which TF to make or buy antibodies for. Further computer analysis is required anyhow!
The ideal situation for determining potential functional binding sites would be to have a set of genes apparently being co-regulated in the given cellular and experimental context, for instance from a microarray experiment. A comparative promoter analysis with FrameWorker would very likely give you a pattern of involved TFs, as shown in numerous publications (see our web site at "About us -> Publications").
But I have only a single gene. And that's the one I am interested in!
There are three different roads to go...
54 From physical to functional TF site
We talked about promoter modules before. Search your sequence for promoter modules with ModelInspector. Our Promoter Module Library contains over 550 promoter modules, each of them experimentally verified to carry transcriptional regulatory activity. A module match increases the probability that an involved TF site is functional.
Look for phylogenetically conserved patterns of TF sites in a comparative genomics promoter set with FrameWorker. TFs that are part of such phylogenetically conserved frameworks have a higher probability of being functional.
Do extensive literature data mining with BiblioSpherePE for known TF correlations, pathway analysis and gene set creation for comparative promoter analysis. TFs showing biological activity in another experimental context are functional (at least in that context).
Okay, how do I do this?
Let's go!
55 ElDorado promoter sequence retrieval
Let's start with an analysis for promoter modules...
56 Search for promoter modules
If you are licensed, you can have a quick look at the promoter module library. Each module is experimentally verified to carry regulatory activity.
57 Search for promoter modules
Choose a sequence file from your directory, or copy & paste a raw sequence here, or… you know the rest!
Don't click anything below, unless you want to scan an entire database!
58 Search for promoter modules
Go for vertebrate modules...
Click here! You can wait for the result…
61 Search for promoter modules
Now we have focused down to 21 very interesting positions in this promoter, with modules that are composed of a total of 11 different transcription factor binding sites.
Our arbitrarily chosen example HMGCS1 belongs to the cholesterol biosynthesis pathway. Some of the promoter modules found do have proven function in sterol regulation!
Wow! That's impressive! But that example is a mock-up, isn't it?
Not really. It is a nice example to show this approach. Very frequently one finds functionally related modules. However, there is no guarantee… It just adds another line of evidence.
62 Phylogenetically conserved frameworks
Okay, how does the other thing help? What did you call it, phylogenetically conserved frameworks?
That's right. For this approach you first need a set of phylogenetically conserved promoters. Remember, several slides before?
63 ElDorado promoter sequence retrieval
Inspect and choose your Promoter Set... and tick the promoters of one set. In this example I choose Promoter Set 3 for human, rat, dog and cow.
...scroll to the top of the page...
64 Phylogenetically conserved frameworks
...scroll down...
From here you can have a look at TF binding sites which are common to the input promoters.
Great! That is what I really want to know: Which TF sites do they have in common?
65 Phylogenetically conserved frameworks
Great! That is what I really want to know: Which TF sites do they have in common?
Be careful!! This is no more than a tiny hint! I can show you many cases where totally unrelated exons have more TF sites in common than closely co-regulated promoters.
What you are really looking for is a conserved pattern of TF sites. And we are going to look for one. But first let's have a look at the nucleotide sequence level...
66 Phylogenetically conserved frameworks
DiAlign TF gives an overlay of a true multiple sequence alignment (not pairwise) and common TF sites.
Check DiAlign for other sequences (including amino acids)! It is extremely fast and especially powerful for finding short homologies in largely unrelated sequences.
67 Phylogenetically conserved frameworks
The parameters should be self-explanatory. You can always click for help.
68 Phylogenetically conserved frameworks
Here is an output example.
69 Phylogenetically conserved frameworks
Why did you do this? What does it tell me?
It is pretty informative to get a feeling for the degree of homology, which parts are more conserved than others, and which TF binding sites reside in the homologous parts.
Then, it is of interest to see where the evolutionary pressure was on functional conservation (TFBS) rather than on sequence conservation.
70 Phylogenetically conserved frameworks
Why did you do this? What does it tell me?
Then, if you do a framework analysis on two highly homologous sequences, you run into a combinatorial explosion. FrameWorker checks for it and might give you a warning. However, in this case everything is fine...
71 Phylogenetically conserved frameworks
Now we finally go to the FrameWorker analysis!
72 Phylogenetically conserved frameworks
Here you can choose the matrix library.
Here you can restrict the search to TFs known to be associated with certain tissues.
This filter is a positive filter! Only TFs known to be associated with a tissue are listed here. A TF not being listed for a certain tissue does NOT mean that it is not expressed there! It just has not been reported yet.
73 Phylogenetically conserved frameworks
More options gives you... well, more options!
Don't change those parameters unless you know exactly what you are doing!
74 Phylogenetically conserved frameworks
If you know that a certain TF is involved in the regulation of your gene, make it a mandatory element and search only for frameworks containing it. Mandatory elements are most helpful in focusing your analysis. If you don't know one a priori, I'll show you later in BiblioSpherePE how to get to those.
Toggle multiple choices by holding the "Ctrl" key when clicking!
This decides the number of input sequences which have to show a common pattern of TF sites.
This sets the distance constraints between two adjacent TF sites. More important than the absolute distance is the distance variance. Always start at the default values (unless you already know better) and relax gradually if nothing meaningful is found.
One word on this parameter. It decides the minimum/maximum number of TF sites allowed in one framework. In this case I increased the default value from 6 up to 10, since we want to identify the largest conserved pattern in this phylogenetic promoter set. We might lower this later.
This option gives you an idea of the specificity of the frameworks found. It checks how often a framework would be found in a background of random human promoter sequences. Use it with care! It slows down FrameWorker considerably!
And always think about the HELP pages!
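To illustrate how the quorum and the distance constraints interact, here is a minimal Python sketch for a single pair of adjacent TF sites. It is not the FrameWorker algorithm. The function name is hypothetical, the example distances are invented, and "distance variance" is assumed here to mean the allowed spread (maximum minus minimum) of the pair's distance across the input sequences.

```python
def pair_is_conserved(distances, quorum_percent, min_dist, max_dist, max_spread):
    """Decide whether a pair of TF sites forms a common pattern.

    distances      -- one entry per input promoter: distance in bp between the
                      two sites, or None if the promoter lacks the pair
    quorum_percent -- percentage of input promoters that must show the pattern
    min_dist, max_dist -- allowed absolute distance range in bp
    max_spread     -- assumed meaning of "distance variance": largest allowed
                      difference between the distances in different promoters
    """
    if not distances:
        return False
    # Number of promoters required by the quorum, rounded up.
    required = (quorum_percent * len(distances) + 99) // 100
    in_range = sorted(d for d in distances
                      if d is not None and min_dist <= d <= max_dist)
    # Sliding window: largest group of promoters whose distances differ
    # by at most max_spread.
    best = 0
    left = 0
    for right, value in enumerate(in_range):
        while value - in_range[left] > max_spread:
            left += 1
        best = max(best, right - left + 1)
    return best >= max(required, 1)


if __name__ == "__main__":
    # Hypothetical distances in four orthologous promoters; one lacks the pair.
    observed = [42, 45, 40, None]
    print(pair_is_conserved(observed, 100, 5, 200, 10))
    print(pair_is_conserved(observed, 75, 5, 200, 10))
```

In this sketch, lowering the quorum from 100% to 75% is what lets the pattern through, while tightening the allowed spread removes it again. That is the trade-off behind "start at default values and relax gradually".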
75 Phylogenetically conserved frameworks
All four promoters have 18 TF sites in common. This number might differ from the "search for common TF" job earlier, since now we take strand specificity into account.
The longest frameworks contain 8 TF sites. There are 4 different frameworks. If you click the link, you jump directly to the graphical representation.
76 Phylogenetically conserved frameworks
Here you see the detailed description of the framework. It is perfectly conserved throughout the species.
You can save this framework in your personal directory for subsequent sequence or database scans.
Here you have a graphical representation. You already know how this works...
Scroll down to the bottom of the page...
77 Phylogenetically conserved frameworks
At the bottom of the output you find this list. Now we have identified not only the TFs but also the exact positions which are worth a closer look.
You can scan all of our promoter databases with your saved frameworks for promoters with similar organization.
Why should I do this? Would this give me additional information?
78 Phylogenetically conserved frameworks
Why should I do this? Would this give me additional information?
In this example with an 8 element framework and almost no distance variation between the TF sites, you will find exactly 1 match in over human promoters: the input gene.
How to use this approach with less selective frameworks for identification of similarly organized promoters? I'll show you later…
79 Knowledge based analysis
Fine! I think I have now seen two strategies. You mentioned three?
Yes. The third is knowledge driven and is based on a combination of literature data mining, sequence analysis and pathway/network analysis. For this you first need to download and install the Java client of BiblioSpherePE.
81 Knowledge based analysis
For a more detailed introduction to BiblioSpherePE please have a look at
82 Knowledge based analysis
Choose "single gene" here... (HMGCS1)
...un-tick this box... We are interested in the full network around our gene, not only the connected transcription factors.
84 Knowledge based analysis
Here you have a list of all other genes connected to your input gene by at least one co-citation in the entire PubMed at the abstract level.
This sets the context sensitive filter stringency. The most stringent level that includes computer-based semantic analysis is the ordered "Gene1 - function word - Gene2" level (B3). (B4) shows expert curated gene-gene relationships only. Expert knowledge is derived from different sources, like Genomatix experts, Molecular Connection's NetPro database, STKE, etc.
Click around, and see what happens!
85 Knowledge based analysis
This filters the co-citation frequency.
I have intentionally chosen an example with no expert curation available, since I want to demonstrate how to generate new knowledge!
86 Knowledge based analysis
Here you see the network around HMGCS1: all other genes connected at the GFG level.
87 Knowledge based analysis
Here, connected transcription factors only, at the GFG level.
88 Knowledge based analysis
Now all connected transcription factors.
89 Knowledge based analysis
A connection line between two genes means that there is a bibliographic connection at the abstract level (B0)...
90 Knowledge based analysis
"Mouse over" and clicking gives you more information...
91 Knowledge based analysis
The green indicates that there is a binding site for SREBF1 (V$SREB) in at least one of the promoters of HMGCS1.
92 Knowledge based analysis
There is more encoded in the connection lines...
93 Knowledge based analysis
The little symbols give you some information about the gene and its association with pathways.
94 Knowledge based analysis
The tagged text tells us that the TF SREBF1 is involved in regulation of HMGCS1.
Some more helpful options from this page...
95 Knowledge based analysis
This you know already...
You can get all info about any gene you click up there... over here...
97 Knowledge based analysis
...as well as this.
Hey, hey, hey! Stop it! I want to know about the regulation of my gene, not to play around with your Biblio...thing!
98 Knowledge based analysis
BiblioSphere PathwayEdition!
We have already found TFs of interest, known to be involved in regulation of our gene. Now let's see the biological environment of our gene and find a group of related genes which might share some regulatory motifs.
Let's go back and display all genes contained in this network...
99 Knowledge based analysis
Let's load the GO filter "biological process"...
100 Knowledge based analysis
Go to the table view by this tab...
Here you see the tree for the selected filter. Expand and collapse by clicking on the +/-.
101 Knowledge based analysis
The Z-score gives you a measure of whether certain categories are significantly over- or under-represented in the displayed gene set.
Top scoring is sterol and cholesterol metabolism... Everything above 3 is statistically significant!
Clicking here opens the tree on the left and highlights the category as well as the respective genes in the pathway view.
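The slides do not say how BiblioSpherePE computes its Z-score, so the following Python sketch shows only one common way to score over-representation of a category in a gene set: a z-score under a hypergeometric null model. The function name and all numbers are illustrative assumptions, not values from the example.

```python
import math


def category_z_score(hits_in_set, set_size, category_size, background_size):
    """Z-score for over- or under-representation of a category in a gene set.

    Assumes a hypergeometric null model: the gene set is a random draw of
    set_size genes from background_size annotated genes, of which
    category_size belong to the category. Raises ZeroDivisionError when the
    variance is zero (e.g. an empty set or a category covering everything).
    """
    p = category_size / background_size
    expected = set_size * p
    variance = (set_size * p * (1 - p)
                * (background_size - set_size) / (background_size - 1))
    return (hits_in_set - expected) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical numbers: 5 of 10 genes fall into a category that holds
    # 100 of 10,000 background genes.
    print(round(category_z_score(5, 10, 100, 10_000), 1))
```

A category that contains exactly the expected number of genes scores 0. The further the observed count lies above the expectation, measured in standard deviations, the higher the score.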
102 Knowledge based analysis
This finally applies the filter to your gene set.
Superimpose as many filters as you'd like!
103 Knowledge based analysis
We see two TFs in here, SREBF1 and SREBF2, both Sterol Regulatory Element Binding Protein factors.
The "redraw" button
Double-click on SREBF1 in order to see all connections to that TF.
105 Knowledge based analysis
...the colors encode...
Highlight those genes with your mouse, and copy them...
106 Knowledge based analysis
Now we have expanded our single input gene with a set of seven additional genes! And we already know quite a lot about them!
They are all connected with my original gene in PubMed.
All genes belong, with very high statistical significance, to the GO category "Cholesterol Metabolic Process".
SREB transcription factors seem to play a role in the regulation of those genes.
Now let's check whether the promoters of those genes share a complex framework. For that we first need to export those genes into GenomatixSuite's Gene2Promoter.
107 Back to sequence level
Oh my god... more... Where do I find this now?
Relax! It's easy and not far away...
108 Back to sequence level
Paste here the gene symbols which we just copied in BiblioSpherePE: APOA1, LDLR, SREBF2, VLDLR, FDFT1, FDPS, MVK, HMGCS1
Don't forget this! Otherwise you will be asked for all findings in all organisms.
110 Back to sequence level
Hey, stop! Haven't I seen this before?
You are right! It is pretty much the same display as the comparative genomics page which we generated earlier. The difference in this case is that we now compare promoters of different genes within one organism…
111 Back to sequence level
Eight loci with 26 different unique promoters! 9,216 combinations possible for exhaustive analysis! Combinatorial explosion!
We have to find a way to circumvent this.
How should I know which ones? How do I do this?
Since we are concentrating on SREB TF sites, let's concentrate on those promoters which contain a V$SREB binding site.
Very easy! Just scroll down to the bottom of the page...
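Where does a number like 9,216 come from? An exhaustive analysis picks one promoter per locus, so the number of combinations is the product of the promoter counts of the loci. The slides do not list the counts per locus. The distribution in this Python sketch is a hypothetical one, chosen only because it is consistent with the totals given (8 loci, 26 promoters, 9,216 combinations).

```python
from math import prod


def promoter_combinations(promoters_per_locus):
    """Number of ways to choose exactly one promoter from each locus."""
    return prod(promoters_per_locus)


if __name__ == "__main__":
    # Hypothetical promoter counts for the eight loci (assumption, not from
    # the ElDorado output): they sum to 26 and multiply to 9,216.
    assumed_counts = [2, 2, 3, 3, 4, 4, 4, 4]
    print(sum(assumed_counts), promoter_combinations(assumed_counts))
```

Every promoter removed from a locus divides the product, which is why filtering for promoters that contain a given binding site shrinks the search space so quickly.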
112 Back to sequence level
Select the desired TF matrix family here...
113 Back to sequence level
...and all relevant promoters are already checked for you.
Now we have reduced the set to 12 different promoters from 8 different loci, each containing at least one SREB site.
114 Back to sequence level
Scroll to the bottom of the Gene2Promoter result page... We have done this before...
115 Back to sequence level
You see? Now we have tolerable combinatorics and can perform an exhaustive promoter analysis.
116 Back to sequence level
Remember? We have been here before, too...
...but now we choose V$SREB as a mandatory element for our framework.
Hint: you can select multiple elements by holding the "Ctrl" key while clicking.
...and with these parameters you have to play around a little bit. Start at default. Gradually relax the stringency. Go down in Quorum Constraint step by step, or allow for a higher distance variance (e.g. 20, 30, 40, 50, etc.).
The lower the distance variance and the more elements per model, the higher the resulting model selectivity.
117 Back to sequence level
For example, at a quorum of 30%, an allowed distance range of 5 to 200 bp, a distance variance of 50 bp and a maximum of 10 elements allowed, we find quite a lot of frameworks in the different promoter combinations.
There are frameworks with 6 elements! This is quite significant and expected to be extremely selective.
With 6 elements I expect to find only the 3 genes from which these models were derived: SREBF2, HMGCS1, and MVK.
Tick the boxes of the models for a subsequent database search for other promoters with similar organization.
118 Back to sequence level
Scroll all the way down... This list is quite interesting! Here we have the different TF sites in this set of frameworks.
This list represents those TFs which we should concentrate on when analyzing the regulation of the original input gene. It is pretty comparable to the list from our phylogenetic approach before.
There is now good evidence that those factors play a role in regulation in the biological context of cholesterol metabolism.
Now let's see how selective this model is...
119 Back to sequence level
It is just one click away...
120 Back to sequence level
This should look familiar to you! But now we are going for the database section...
Unless you have a good reason to do otherwise, always go for the database of promoters of annotated genes. This allows for GO-group Z-scoring of the database hits later on...
121 Back to sequence level
This is a termination parameter. If this number of hits is reached before the end of the database, the search is terminated.
Careful! Some browsers crash with too many hits to display in HTML! (>10,000)
A database search usually takes several minutes. In order to avoid a server time-out, go for the option that sends you a mail with a direct link to your result file (it will be kept in your "Results Directory", too).
122 Back to sequence level
Eight matches! In four sequences. Each model matches exactly once per sequence...
The three genes of our "training set"... out of a total of different promoters. Wow!
123 Back to sequence level
...plus one additional "new" gene! This one was not in our input list and is identified only by common promoter organization!
124 Back to sequence level
Those four genes are now extremely likely to share common regulation in the given biological context!
The TFs in the framework are now the top candidates for further inspection.
125 Back to sequence level
STOP!! First I had too many matches in MatInspector, now there are too many slides!!
126 New Knowledge
I am terribly sorry for that! However, eukaryotic transcriptional regulation is pretty complex. Our group of researchers has been working in this field for more than two decades.
As you have seen, our tools, though pretty easy to use, require some explanation and sometimes a slightly different mindset, going beyond looking at single, isolated TF binding sites.
I hope I was able to show you some basic strategies to follow.
Nevertheless, let's have a final look at the additional gene which we found with the database search in our example...
129 New Knowledge
MMAB is a transferase involved in vitamin B(12) activation and linked to a disease: methylmalonyl aciduria.
130 New Knowledge
Feeding all 4 genes from ModelInspector into BiblioSpherePE shows that they are all connected, plus...
131 In our example, we
started with a single gene (HMGCS1) [ElDorado],
put it into biological context and concentrated on a potential regulator (SREB) [BiblioSpherePE],
identified common promoter organization (TF framework) [GEMS Launcher, FrameWorker],
searched for additional genes with similar promoter organization [GEMS Launcher, ModelInspector] and
put the genes back into biological context.
Literature confirmed that we indeed found a co-regulated network and identified the molecular basis for it.
This could NEVER be achieved by statistical analysis of isolated TFBS.
132 There is so much more in GenomatixSuite PE
I said nothing about matrix generation, nor about direct experimental planning for knock-out/knock-in experiments with SequenceShaper, expanding the hit list by shortening the framework, etc.
Get in touch with us and we will give you a tour through the entire system in a web meeting.
Some informative links: