The BioPerl project is an international association of developers of open source Perl tools for bioinformatics, genomics and life science research.


1 12.1

2 12.2 BioPerl
The BioPerl project is an international association of developers of open source Perl tools for bioinformatics, genomics and life science research.
Things you can do with BioPerl:
- Read and write sequence files in different formats, including Fasta, GenBank, EMBL, SwissProt and more
- Extract gene annotation from GenBank, EMBL and SwissProt files
- Read and analyse BLAST results
- Read and process phylogenetic trees and multiple sequence alignments
- Analyse SNP data
- And more

3 12.3 BioPerl
BioPerl modules are named Bio::XXX. The BioPerl wiki (http://bio.perl.org/) has documentation and examples of how to use them, and working through it is the best way to learn them.
We recommend beginning with the "How-tos": http://www.bioperl.org/wiki/HOWTOs
For a more in-depth look at the BioPerl modules: http://doc.bioperl.org/releases/bioperl-1.5.2/

4 12.4 BioPerl: the SeqIO module
The Bio::SeqIO module allows input/output of sequences from/to files, in many formats:
    use Bio::SeqIO;
    # $inputfilename holds the path of the EMBL input file
    # (the file name is not legible in this transcript)
    my $in  = Bio::SeqIO->new("-file" => $inputfilename, "-format" => "EMBL");
    my $out = Bio::SeqIO->new("-file" => ">seq2.fasta", "-format" => "Fasta");
    # Read the records one by one and write each of them in Fasta format
    while ( my $seqObj = $in->next_seq() ) {
        $out->write_seq($seqObj);
    }
A list of all the sequence formats BioPerl can read is at: http://www.bioperl.org/wiki/HOWTO:SeqIO#Formats
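The read-and-write loop above works for any pair of formats that Bio::SeqIO supports. The following is a minimal sketch of that idea, not part of the original slides: the helper name convert_seq_file, the temporary file names and the two demo sequences are made up for illustration, and the conversion shown (Fasta to GenBank) is just one possible pair.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use File::Temp qw(tempdir);

# Hypothetical helper: copy every record from one file to another,
# changing the format on the way. Returns the number of records written.
sub convert_seq_file {
    my ($in_file, $in_format, $out_file, $out_format) = @_;
    my $in  = Bio::SeqIO->new(-file => $in_file,      -format => $in_format);
    my $out = Bio::SeqIO->new(-file => ">$out_file",  -format => $out_format);
    my $count = 0;
    while ( my $seqObj = $in->next_seq() ) {
        $out->write_seq($seqObj);
        $count++;
    }
    $out->close();    # flush the output before anyone reads the file
    return $count;
}

# Made-up demo data: a Fasta file with two short records
my $dir   = tempdir(CLEANUP => 1);
my $fasta = "$dir/demo.fasta";
open(my $fh, '>', $fasta) or die "Cannot write $fasta: $!";
print $fh ">seqA first demo sequence\nATGGCGTACGTTAGC\n";
print $fh ">seqB second demo sequence\nATGCCCGGGTTTAAA\n";
close($fh);

my $converted = convert_seq_file($fasta, "Fasta", "$dir/demo.gb", "GenBank");
print "Converted $converted records\n";
```

Wrapping the loop in a function keeps the two format names as the only things that change between conversions.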

5 12.5 BioPerl: the Seq module
    use Bio::SeqIO;
    my $in = Bio::SeqIO->new("-file" => "<seq.fasta", "-format" => "Fasta");
    while ( my $seqObj = $in->next_seq() ) {
        print "ID:".$seqObj->id()."\n";            # 1st word in header
        print "Desc:".$seqObj->desc()."\n";        # rest of header
        print "Length:".$seqObj->length()."\n";    # seq length
        print "Sequence: ".$seqObj->seq()."\n";    # seq string
    }
The Bio::SeqIO method next_seq returns a Bio::Seq object. This object provides methods such as id() (returns the first word of the header line, up to the first space), desc() (the rest of the header line), length() (the sequence length) and seq() (the sequence string). You can read more about it at: http://www.bioperl.org/wiki/HOWTO:Beginners#The_Sequence_Object
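To make the split between id() and desc() concrete, here is a small sketch that parses a single made-up Fasta record held in memory. It is not from the original slides; the record "gene1" and its sequence are invented, and the in-memory filehandle is used only so that the example does not depend on a file on disk.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Made-up Fasta record, kept in a string instead of a file
my $fasta = ">gene1 example record for the demo\nATGGCCTGA\n";
open(my $fh, '<', \$fasta) or die "Cannot open in-memory data: $!";

# -fh takes an already opened filehandle instead of a file name
my $in     = Bio::SeqIO->new(-fh => $fh, -format => "Fasta");
my $seqObj = $in->next_seq();

# Collect the four values discussed on the slide
my %field = (
    id     => $seqObj->id(),       # first word of the header line
    desc   => $seqObj->desc(),     # rest of the header line
    length => $seqObj->length(),   # number of residues
    seq    => $seqObj->seq(),      # the sequence string itself
);
print "$_: $field{$_}\n" for qw(id desc length seq);
```

For this record the script prints the id "gene1", the description "example record for the demo", the length 9 and the sequence ATGGCCTGA.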

6 12.6 BioPerl: Parsing a GenBank file
Bio::SeqIO can read the adenovirus genome file for us, and the Bio::Seq object it returns gives access to the parsed features. Part of a GenBank feature table:
     gene            1..1846
                     /gene="NDP"
                     /note="ND"
                     /db_xref="LocusID:4693"
                     /db_xref="MIM:310600"
     CDS             409..810
                     /gene="NDP"
                     /note="Norrie disease (norrin)"
                     /codon_start=1
                     /product="Norrie disease protein"
                     /protein_id="NP_000257.1"
                     /db_xref="GI:4557789"
                     /db_xref="LocusID:4693"
                     /db_xref="MIM:310600"
                     /translation="MRKHVLAASFSMLSLL
                     SHPLYKCSSKMVLLARCEGHCSQAS
                     PLVSFSTVLKQPFRSSCHCCRPQTS
                     LTATYRYILSCHCEEC"
The parsed output:
    primary tag: gene
     tag: gene
     value: NDP
     tag: note
     value: ND
     tag: db_xref
     value: LocusID:4693
     value: MIM:310600
    primary tag: CDS
     tag: gene
     value: NDP
     tag: note
     value: Norrie disease (norrin)
    ......

7 12.7 BioPerl: Parsing a GenBank file
Bio::SeqIO can read the adenovirus genome file for us; the features are then available from the Bio::Seq object it returns:
    use Bio::SeqIO;
    # $inputfilename holds the path of the GenBank file
    my $in = Bio::SeqIO->new("-file" => $inputfilename, "-format" => "GenBank");
    my $seqObj = $in->next_seq();
    # Each feature has a primary tag (gene, CDS, ...) and a set of tags,
    # and each tag can carry several values
    foreach my $featObj ($seqObj->get_SeqFeatures) {
        print "primary tag: ", $featObj->primary_tag, "\n";
        foreach my $tag ($featObj->get_all_tags) {
            print " tag: ", $tag, "\n";
            foreach my $value ($featObj->get_tag_values($tag)) {
                print " value: ", $value, "\n";
            }
        }
    }
Output:
    primary tag: gene
     tag: gene
     value: NDP
     tag: note
     value: ND
     tag: db_xref
     value: LocusID:4693
     value: MIM:310600
    primary tag: CDS
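The nested loop above prints every tag of every feature. Often only one tag of one feature type is needed, and the same methods can be combined with has_tag() to pick it out. The sketch below is an illustration under stated assumptions rather than part of the original slides: the helper name find_tag_values is hypothetical, and instead of reading a GenBank file it builds a small stand-in record in memory with Bio::SeqFeature::Generic, using the coordinates and tag values of the NDP example.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Hypothetical helper: for every feature with the given primary tag,
# return one "start..end tag=value" string per value of the given tag.
sub find_tag_values {
    my ($seqObj, $primary_tag, $tag) = @_;
    my @found;
    foreach my $featObj ($seqObj->get_SeqFeatures) {
        next unless $featObj->primary_tag eq $primary_tag;
        next unless $featObj->has_tag($tag);    # skip features lacking the tag
        foreach my $value ($featObj->get_tag_values($tag)) {
            push @found, $featObj->start . ".." . $featObj->end . " $tag=$value";
        }
    }
    return @found;
}

# Stand-in for a parsed GenBank record (placeholder sequence, 2000 bases)
my $seqObj = Bio::Seq->new(-id => "demo", -seq => "ACGT" x 500);

my $gene = Bio::SeqFeature::Generic->new(
    -start => 1, -end => 1846, -primary => "gene",
    -tag   => { gene => "NDP" },
);
my $cds = Bio::SeqFeature::Generic->new(
    -start => 409, -end => 810, -primary => "CDS",
    -tag   => { gene => "NDP", product => "Norrie disease protein" },
);
# A tag can hold several values; add them one at a time
$cds->add_tag_value("db_xref", "LocusID:4693");
$cds->add_tag_value("db_xref", "MIM:310600");

$seqObj->add_SeqFeature($gene);
$seqObj->add_SeqFeature($cds);

my @xrefs = find_tag_values($seqObj, "CDS", "db_xref");
print "$_\n" for @xrefs;
```

With a record read through Bio::SeqIO, the same helper could be called on the object returned by next_seq().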

8 12.8 BioPerl: downloading files from the web
The Bio::DB::GenBank module allows us to download a specific record from the NCBI website:
    use Bio::DB::GenBank;
    my $gb = Bio::DB::GenBank->new;
    my $seqObj = $gb->get_Seq_by_acc("J00522");    # fetch the record by accession
    # or... request Fasta sequence
    $gb = Bio::DB::GenBank->new("-format" => "Fasta");
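A downloaded record is an ordinary sequence object, so it can be handed straight to Bio::SeqIO for saving. The following is a sketch only, not part of the original slides: the helper name fetch_and_save and the script name in the comment are hypothetical, the call needs network access to NCBI, and depending on the BioPerl version Bio::DB::GenBank may have to be installed separately.

```perl
use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::SeqIO;

# Hypothetical helper: download one record by accession and save it as Fasta.
# Returns the length of the downloaded sequence.
sub fetch_and_save {
    my ($accession, $out_file) = @_;
    my $gb     = Bio::DB::GenBank->new;
    my $seqObj = $gb->get_Seq_by_acc($accession);    # contacts NCBI
    die "No record returned for $accession\n" unless $seqObj;

    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => "Fasta");
    $out->write_seq($seqObj);
    $out->close();
    return $seqObj->length();
}

# Only contact NCBI when an accession and an output file are given, e.g.
#   perl fetch_record.pl J00522 J00522.fasta
if (@ARGV == 2) {
    my ($accession, $out_file) = @ARGV;
    my $length = fetch_and_save($accession, $out_file);
    print "Saved $accession ($length residues) to $out_file\n";
}
```

Run without arguments, the script does nothing, which keeps it safe to load on a machine without a network connection.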

9 12.9 BioPerl: reading BLAST output
First we need to have the BLAST results in a text file that BioPerl can read. Here is one way to achieve this:
[Screenshot not reproduced; visible labels: "Text", "Download"]

10 12.10 BioPerl: reading BLAST output

11 12.11 BioPerl: reading BLAST output

12 12.12 BioPerl: reading BLAST output
The Bio::SearchIO module can read and parse BLAST output:
    use Bio::SearchIO;
    my $blast_report = Bio::SearchIO->new("-format" => "blast", "-file" => "mice.blast");
    # One result per query, one hit per matching database sequence
    while (my $result = $blast_report->next_result) {
        print "Checking query ", $result->query_name, "\n";
        while (my $hit = $result->next_hit()) {
            print "Checking hit ", $hit->name(), "\n";
            my $hsp = $hit->next_hsp();    # first HSP of this hit
            # start and end of the match on the hit sequence
            print $hsp->hit->start(), " ", $hsp->hit->end(), "\n";
            # ...
        }
    }
(See the blast example in lesson 1)
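The loop above looks only at the first HSP of each hit. A hit can have several HSPs, and a common next step is to walk through all of them and keep only the significant ones. The sketch below shows one way this might look; it is not from the original slides. The helper name significant_hsps and the cutoff of 1e-5 are arbitrary choices for illustration, and the file name mice.blast is the one used on the slide.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Hypothetical helper: return one tab-separated line per HSP whose
# E-value is at or below the cutoff: query, hit, hit start, hit end, E-value.
sub significant_hsps {
    my ($blast_file, $max_evalue) = @_;
    my $blast_report = Bio::SearchIO->new(-format => "blast", -file => $blast_file);
    my @rows;
    while (my $result = $blast_report->next_result) {
        while (my $hit = $result->next_hit) {
            # Inner loop: visit every HSP of the hit, not just the first
            while (my $hsp = $hit->next_hsp) {
                next if $hsp->evalue > $max_evalue;
                push @rows, join("\t",
                    $result->query_name, $hit->name,
                    $hsp->start('hit'), $hsp->end('hit'), $hsp->evalue);
            }
        }
    }
    return @rows;
}

my $blast_file = "mice.blast";    # file name taken from the slide
if (-e $blast_file) {
    print "$_\n" for significant_hsps($blast_file, 1e-5);
}
```

Here $hsp->start('hit') and $hsp->end('hit') give the coordinates on the hit sequence; passing 'query' instead gives the coordinates on the query.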

