Presentation is loading. Please wait.

Presentation is loading. Please wait.

HomologyIf twp proteins are homologous, they have a common fold and a common ancestor If two proteins have >25% identity across their entire length, they.

Similar presentations


Presentation on theme: "HomologyIf twp proteins are homologous, they have a common fold and a common ancestor If two proteins have >25% identity across their entire length, they."— Presentation transcript:

1 HomologyIf twp proteins are homologous, they have a common fold and a common ancestor If two proteins have >25% identity across their entire length, they are likely to be Homologs. However, sometimes true homologs have quite low sequence identity! OrthologsHomologous (and equivalent) proteins from different species. Arise from speciation. ParalogsHomologous (and equivalent) proteins found in same species. Divergence of sequences NOT from speciation. AlignmentsHow to score? Minimum # of mutations?, Physicochemical properties (as perceived by us)?, Or learn from nature? Scoring schemesPAM, BLOSUM Gap penalties, low sequence complexity filtering

2 E valuesWhat it means in words E = Kmne -λS Alignment algorithmsBLAST (Basic Local Alignment Search Tool) FASTA (Fast Alignment) Smith-Waterman Needleman-Wunsch Why use local alignment algorithm?

3 IF A is related to B and B is related to C, then A is related to C (use orthologs, paralogs, known homologs PSI BLAST 1 normal BLAST run then subsequent iterations modify the scoring matrix – starts “rewarding” conservation at key positions in the alignment 3D-PSSMDatabase of structures grouped into fold families (homologs) Careful 1D PSSM is created for each fold family. The 1D PSSM is augmented by info from structural alignment of all family members(3D-PSSM). Also, entire library is used to assign solvation potentials (how likely is a glutamate to be only 5% exposed? 10% exposed. Etc.) Query sequence Get 1D-PSSM from nr database. Do 2ndry structure prediction. Now compare the query sequence to each library entry in 3 ways: Use the library Entry’s 1D-PSSM(augmented), see how the query compares to the 3D-PSSM of The library entry. See how the library entry compares to the 1D-PSSM for the query.

4 Hidden Markhov Models All transitions from one geometric shape to another are governed by probabilities that have been calculated (learned) using sequence alignments of proteins that define the fold. This mathematical engine (small one depicted above) can generate other sequences that obey the “rules” that define the fold (of a given family). The engine may have to be run MANY times before it spews out a sequence identical to one actually used to define the fold. But it can be used in a reverse way to estimate the likelihood that a query sequence could have been generated by the engine. If it is very likely, the query sequence has the same fold!! What’s hidden? One answer is “The original model” – The FIRST ancestral protein (e.g. The original PH domain) that set the mold for all others.


Download ppt "HomologyIf twp proteins are homologous, they have a common fold and a common ancestor If two proteins have >25% identity across their entire length, they."

Similar presentations


Ads by Google