Converting DNA Sequence file formats with BioPython


Tolulope Perrin-Stowe
Program in Ecology, Evolution, and Conservation Biology

Why convert sequence files?

Many different bioinformatics programs can be used to run analyses on DNA sequences, and several are typically used in a single publication. These programs can require a variety of input file formats. The three file formats most often used in my research are FASTA, PHYLIP, and NEXUS. Converting between these formats often requires several additional programs when working in Windows, which can make it difficult to keep track of the many output files in different formats.

How to create a file conversion code

This code was written with BioPython, a set of Python tools for computational molecular biology. BioPython is freely available, along with the tutorials and packages that were used to write this code. The code takes either a FASTA file or a GenBank file (both common sequence file formats) as input. A FASTA file is first aligned with the program MUSCLE and can then be converted into either a NEXUS or a PHYLIP file. A GenBank file can be converted directly into a FASTA file.

Sequence file conversion code

Input: FASTA file or GenBank file
Code aligns sequence file using MUSCLE
Output: NEXUS file or PHYLIP file

Acknowledgements

I would like to thank Halie Rando, Kelsey Witt, Diana Byrne, Alya Stein, and Heidi Imker for their help in the completion of this project and for their work in the course.

Citations

Cock, P., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., and de Hoon, M. J. L. (2009). Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 25 (11): doi: /bioinformatics/btp163

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl. Acids Res. 32 (5): doi: /nar/gkh340

This work was part of a Focal Point grant funded by the Graduate College at the University of Illinois at Urbana-Champaign.
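
The poster does not reproduce the conversion code itself, so the following is only a minimal sketch of how the workflow it describes might look, not the author's implementation. The function names are hypothetical. It assumes Biopython 1.78 or later (for the molecule_type argument) and a MUSCLE binary available on the PATH; the -in/-out flags shown are the MUSCLE v3 syntax, whereas MUSCLE v5 uses -align/-output.

```python
import subprocess

from Bio import AlignIO, SeqIO


def genbank_to_fasta(genbank_path, fasta_path):
    """Convert a GenBank file directly to FASTA; returns the record count."""
    return SeqIO.convert(genbank_path, "genbank", fasta_path, "fasta")


def align_with_muscle(fasta_path, aligned_path, muscle_exe="muscle"):
    """Align a FASTA file with an external MUSCLE binary.

    Assumption: MUSCLE v3 command-line flags (-in/-out). For MUSCLE v5,
    replace them with -align/-output.
    """
    subprocess.run(
        [muscle_exe, "-in", fasta_path, "-out", aligned_path],
        check=True,  # raise CalledProcessError if MUSCLE fails
    )


def aligned_fasta_to(aligned_path, out_path, out_format):
    """Write an aligned FASTA file as NEXUS or PHYLIP.

    Returns the number of alignments written (1 for a single alignment).
    """
    if out_format not in ("nexus", "phylip"):
        raise ValueError("out_format must be 'nexus' or 'phylip'")
    # The NEXUS writer needs to know the molecule type, so it is set here.
    # Strict PHYLIP truncates sequence names to 10 characters.
    return AlignIO.convert(
        aligned_path, "fasta", out_path, out_format, molecule_type="DNA"
    )


def fasta_to_alignment_format(fasta_path, out_path, out_format):
    """FASTA -> MUSCLE alignment -> NEXUS or PHYLIP, as in the flowchart."""
    aligned_path = out_path + ".aligned.fasta"  # hypothetical intermediate name
    align_with_muscle(fasta_path, aligned_path)
    return aligned_fasta_to(aligned_path, out_path, out_format)
```

With MUSCLE installed, fasta_to_alignment_format("seqs.fasta", "seqs.nex", "nexus") would cover the FASTA branch of the flowchart, and genbank_to_fasta("seqs.gb", "seqs.fasta") the GenBank branch; the file names here are placeholders.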

