HMMER tutorial 羅偉軒 Account IP: 140.129.78.120 Account: binfo2005 Password: 2005binfo.


1 HMMER tutorial 羅偉軒 g39208007@ym.edu.tw

2 Account IP: 140.129.78.120 Account: binfo2005 Password: 2005binfo

3 HMMER http://hmmer.wustl.edu/
The theory behind profile HMMs: R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological sequence analysis: probabilistic models of proteins and nucleic acids, Cambridge University Press, 1998.

4 Flowchart http://bioweb.pasteur.fr/seqanal/motif/hmmer-uk.html

5 Format of input alignment files
– Output of the CLUSTAL family of programs
– Wisconsin/GCG MSF format
– The input format for the PHYLIP phylogenetic analysis programs
– Aligned FASTA format
– Stockholm format (HMMER's native format, used by the Pfam and Rfam databases)
– SELEX format
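As an illustration of two of the formats listed above, here is a minimal Python sketch that converts aligned FASTA text into a single-block Stockholm alignment. The function names are made up for this example, and the output uses only the basic Stockholm elements (the `# STOCKHOLM 1.0` header, one `name sequence` line per sequence, and the closing `//`); it is not a replacement for sreformat.

```python
def parse_aligned_fasta(text):
    """Return a list of (name, aligned_sequence) pairs from aligned FASTA text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the sequence name.
            words = line[1:].split()
            name, chunks = (words[0] if words else "unnamed"), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_stockholm(records):
    """Format (name, aligned_sequence) pairs as a one-block Stockholm alignment."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    width = max(len(name) for name, _ in records)
    lines = ["# STOCKHOLM 1.0", ""]
    for name, seq in records:
        # Pad names so that the sequences line up in one column.
        lines.append(f"{name.ljust(width)}  {seq}")
    lines.append("//")
    return "\n".join(lines) + "\n"
```

Calling `to_stockholm(parse_aligned_fasta(open("seed.fa").read()))` on a hypothetical aligned FASTA file `seed.fa` would give text that can be saved with a `.sto` extension.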

6 Searching a sequence database with a single profile HMM
– Build a profile HMM with hmmbuild
> hmmbuild globin.hmm globins50.msf
– Calibrate the profile HMM with hmmcalibrate
> hmmcalibrate globin.hmm
– Search the sequence database with hmmsearch
> hmmsearch globin.hmm Artemia.fa
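The three steps above can be scripted. The following Python sketch only assembles the same three command lines shown on the slide; the helper names (`search_pipeline`, `as_shell`, `run_pipeline`) are invented for this example, and `run_pipeline` assumes the HMMER 2 programs are installed and on the PATH.

```python
import shlex
import shutil
import subprocess


def search_pipeline(alignment, hmm_file, seq_db):
    """Return the build, calibrate, and search commands as argument lists."""
    return [
        ["hmmbuild", hmm_file, alignment],   # build the profile HMM
        ["hmmcalibrate", hmm_file],          # calibrate it for E-values
        ["hmmsearch", hmm_file, seq_db],     # search the sequence database
    ]


def as_shell(commands):
    """Render the commands as shell lines, quoting arguments where needed."""
    return "\n".join(" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands)


def run_pipeline(commands):
    """Run each command in order; stop at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)
```

For example, `print(as_shell(search_pipeline("globins50.msf", "globin.hmm", "Artemia.fa")))` prints the three commands from the slide without running anything.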

9 Local alignment versus global alignment
In HMMER, whether local or global alignments are allowed is part of the model, rather than being accomplished by running a different algorithm. You therefore need to choose what kind of alignments to allow when you build the model with hmmbuild. By default, hmmbuild builds models that allow alignments that are global with respect to the HMM and local with respect to the sequence, and that allow multiple domains to hit per sequence.
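Because the alignment style is fixed at build time, switching styles means rebuilding the model with a different hmmbuild option. The sketch below records the options as I recall them from the HMMER 2 documentation (`-f`, `-g`, `-s`); treat them as assumptions and confirm with `hmmbuild -h` on your installation. The mode labels and the function name are made up for this example.

```python
# Assumed HMMER 2 hmmbuild options; verify against `hmmbuild -h` before use.
ALIGNMENT_MODES = {
    # Default: global w.r.t. the HMM, local w.r.t. the sequence, multiple domains.
    "ls": [],
    # Multiple hits per sequence, each allowed to be a local (fragment) match.
    "fs": ["-f"],
    # A single global alignment per sequence.
    "s": ["-g"],
    # A single local alignment per sequence.
    "sw": ["-s"],
}


def hmmbuild_args(mode, hmm_file, alignment):
    """Return the hmmbuild argument list for the chosen alignment mode."""
    if mode not in ALIGNMENT_MODES:
        raise ValueError(f"unknown alignment mode: {mode!r}")
    return ["hmmbuild", *ALIGNMENT_MODES[mode], hmm_file, alignment]
```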

10 Searching a query sequence against a profile HMM database
– Creating your own profile HMM database
> hmmbuild -A myhmms rrm.sto
> hmmbuild -A myhmms fn3.sto
> hmmbuild -A myhmms pkinase.sto
> hmmcalibrate myhmms
– Parsing the domain structure of a sequence with hmmpfam
> hmmpfam myhmms 7LES_DROME

11 Creating and maintaining multiple alignments with hmmalign
Another use of profile HMMs is to create multiple sequence alignments of large numbers of sequences. A profile HMM can be built from a “seed” alignment of a small number of representative sequences, and this profile HMM can then be used to efficiently align any number of additional sequences.
> hmmalign -o globins630.ali globin.hmm globins630.fa

14 HMMER scoring and determining significance
HMMER gives you at least two scoring criteria to judge by: the HMMER raw score (a bit score) and an E-value. The E-value is calculated from the bit score. It tells you how many false positives you would expect to see at or above this bit score. HMMER bit scores reflect whether the sequence is a better match to the profile model (positive score) or to the null model of nonhomologous sequences (negative score).
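To make the two criteria concrete, here is a toy Python sketch. It scores a sequence against an ungapped position-specific profile as a sum of log2 odds ratios (a simplification: a real profile HMM also has insert and delete states), and it converts a score to an E-value using an extreme value distribution, which is my understanding of what hmmcalibrate fits. All numbers, including `mu` and `lam`, are invented for illustration.

```python
import math


def bit_score(sequence, profile, background):
    """Sum of per-position log2(model probability / null probability)."""
    if len(sequence) != len(profile):
        raise ValueError("toy profile is ungapped: lengths must match")
    return sum(
        math.log2(column[residue] / background[residue])
        for residue, column in zip(sequence, profile)
    )


def evd_pvalue(score, mu, lam):
    """P(S >= score) under an extreme value distribution with location mu, scale lam."""
    exponent = -lam * (score - mu)
    if exponent > 700:  # far below mu: the probability is effectively 1
        return 1.0
    # 1 - exp(-exp(exponent)), written with expm1 for precision at small values.
    return -math.expm1(-math.exp(exponent))


def e_value(score, mu, lam, n_sequences):
    """Expected number of sequences scoring >= score in a database of n_sequences."""
    return n_sequences * evd_pvalue(score, mu, lam)


# Toy two-column profile over a four-letter alphabet (made-up probabilities).
toy_profile = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}] * 2
toy_background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
```

With these toy values, `bit_score("AA", toy_profile, toy_background)` is positive (the sequence fits the model better than the null) and `bit_score("CC", toy_profile, toy_background)` is negative. For a fixed database size, a higher bit score always gives a lower E-value.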

15 hmmsearch output

18 Building a model
– hmmbuild: Build a model from a multiple sequence alignment
Using a model
– hmmalign: Align sequences to an existing model (outputs a multiple alignment)
– hmmconvert: Convert a model into different formats
– hmmcalibrate: Takes an HMM and empirically determines parameters that are used to make searches more sensitive, by calculating more accurate expectation value scores (E-values)
– hmmemit: Emit sequences probabilistically from a profile HMM
– hmmsearch: Search a sequence database for matches to an HMM
HMM databases
– hmmfetch: Get a single model from an HMM database
– hmmindex: Index an HMM database (not available on the WEB server)
– hmmpfam: Search an HMM database for matches to a query sequence
Other programs
– alistat: Show some simple statistics about a sequence alignment file
– seqstat: Show some simple statistics about a sequence file
– getseq: Retrieve a (sub-)sequence from a sequence file (not available on the WEB server)
– sreformat: Reformat a sequence or alignment file into a different format

19 References
– HMMER user guide
– Eddy SR. (1998) Profile hidden Markov models. Bioinformatics.

26 Related links
– HMMER http://hmmer.wustl.edu/
– SAM http://www.cse.ucsc.edu/research/compbio/sam.html
– PFTOOLS http://www.isrec.isb-sib.ch/ftp-server/pftools/
– HMMpro http://www.netid.com/html/hmmpro.html
– GENEWISE http://www.ebi.ac.uk/Wise2/
– PROBE ftp://ftp.ncbi.nih.gov/pub/neuwald/probe1.0/
– META-MEME http://metameme.sdsc.edu/
– BLOCKS http://www.blocks.fhcrc.org/
– PSI-BLAST http://www.ncbi.nlm.nih.gov/BLAST/newblast.html

27 Homework: Search for homologies with hidden Markov models
Goal: obtain an HMM for myb domains and use this HMM to search the UniProtKB/Swiss-Prot protein database.
– Obtain the UniProtKB/Swiss-Prot entry of the myb proto-oncogene protein (AC P10242, entry MYB_HUMAN).
– Take the amino acid sequence of the myb protein and search the NCBI nr protein database with BLASTp.
– Select 10 myb domains while screening the hits of the BLASTp search, and copy the corresponding parts of the sequences to a file in FASTA format.
– Do a multiple sequence alignment of these ten myb domains with ClustalW.
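Copying the domain regions by hand is error-prone, so the step of cutting out hit regions and saving them as FASTA can be scripted. This is a minimal Python sketch under the assumption that you have each full sequence and the 1-based start and end of the domain from the BLASTp report; the function names and the placeholder sequence in the usage note are invented for this example.

```python
def extract_domain(sequence, start, end):
    """Return residues start..end of sequence (1-based, inclusive coordinates)."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= length")
    return sequence[start - 1:end]


def format_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text with fixed-width lines."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"
```

For a placeholder sequence `"ACDEFGHIKLMNPQRSTVWY"`, `extract_domain(seq, 3, 6)` returns `"DEFG"`. Naming each record as `name/start-end` keeps track of where a domain came from, and the text returned by `format_fasta` can be written to the file that is passed to ClustalW.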

28 Homework: Search for homologies with hidden Markov models (cont.)
– Download HMMER from http://hmmer.wustl.edu/ and install it.
– Build and calibrate an HMM of these myb domains with hmmbuild and hmmcalibrate.
– Use hmmsearch to search the UniProtKB/Swiss-Prot protein library with the HMM of the myb domains.
– Screen the hits, build a new HMM that includes selected hits, and run hmmsearch again.
– How many hits do you get? What are they?
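The "screen the hits" step can be sketched as a simple filter. The Python example below assumes you have already collected the hits as (name, bit score, E-value) tuples, for instance by copying them from the hmmsearch report; it does not parse hmmsearch output. The function name and the E-value cutoff of 1e-5 are arbitrary choices for illustration, not a recommendation from the tutorial.

```python
def select_new_members(hits, seed_names, max_evalue=1e-5):
    """Return names of hits at or below max_evalue that are not already in the seed."""
    seen = set(seed_names)
    chosen = []
    for name, score, evalue in hits:
        if evalue <= max_evalue and name not in seen:
            chosen.append(name)
            seen.add(name)  # avoid adding the same sequence twice
    return chosen
```

The selected sequences would then be added to the alignment, and hmmbuild, hmmcalibrate, and hmmsearch run again on the enlarged alignment.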

29 HMM

30 Some examples

