Python is Awesome! (and cooler than R). My Research.


Python is Awesome! (and cooler than R)

My Research

What I Do
– Generate DNA Sequences
– Align Sequences to Reference
– Find Deletions
– Plot On Secondary Structure
– Measure 3D Distances
– Plot On Tertiary Structure

Generate DNA Sequences: 320 Lines

Align Sequences to Reference: Bowtie

Find Deletions: 370 Lines

Three Dimensional Distances: 150 Lines

Plot On Secondary Structure: 650 Lines

Plot On 3D Structure: 50 Lines

Text Manipulation

Bad PDB File

Rows and Columns

The Code
– Separates by Rows
– Separates by Columns
– Execution: python editPDBFormat.py myPDBFile.pdb
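
The script editPDBFormat.py itself is not shown in the slides, so the following is only a minimal sketch of the rows-then-columns idea, not the original code. It assumes the "bad" file has whitespace-separated ATOM/HETATM rows with exactly twelve fields (record, serial, atom name, residue name, chain, residue number, x, y, z, occupancy, B-factor, element) and rewrites them into the fixed-width columns of the PDB format. The function names are hypothetical.

```python
def format_atom_row(fields):
    """Lay out twelve whitespace-separated fields in fixed-width PDB columns."""
    (record, serial, name, res_name, chain, res_seq,
     x, y, z, occupancy, b_factor, element) = fields
    # Atom names shorter than four characters conventionally start in column 14.
    name_col = name if len(name) == 4 else " " + name
    return (
        f"{record:<6}{int(serial):>5} {name_col:<4} {res_name:>3} {chain:1}"
        f"{int(res_seq):>4}    "
        f"{float(x):>8.3f}{float(y):>8.3f}{float(z):>8.3f}"
        f"{float(occupancy):>6.2f}{float(b_factor):>6.2f}"
        f"          {element:>2}"
    )


def reformat_pdb_text(text):
    """Separate by rows, then by columns; leave unrecognised rows untouched."""
    fixed_rows = []
    for row in text.splitlines():          # separate by rows
        fields = row.split()               # separate by columns
        # Assumption: only 12-field ATOM/HETATM rows need repair.
        if fields and fields[0] in ("ATOM", "HETATM") and len(fields) == 12:
            fixed_rows.append(format_atom_row(fields))
        else:
            fixed_rows.append(row)
    return "\n".join(fixed_rows)


bad = "ATOM 1 N MET A 1 27.340 24.430 2.614 1.00 9.67 N\nEND"
good = reformat_pdb_text(bad)
print(good)
```

To run it the way the slide does, the file named on the command line could be read, passed through reformat_pdb_text, and the result printed or written back out.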

Good PDB File

FASTA Files
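
Because FASTA is a regular format (a header line starting with ">" followed by one or more sequence lines), it is easy to read in code. The sketch below is illustrative only and is not taken from the talk; read_fasta is a hypothetical helper name, and it assumes plain-text input with no comment lines.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue                       # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []  # start a new record
        elif header is not None:
            chunks.append(line)            # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


example = [">seq1 test", "ACGT", "TTGA", "", ">seq2", "GG"]
records = list(read_fasta(example))
print(records)
```

The same function works on an open file handle, since a file is also an iterable of lines.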

FASTQ Files
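
FASTQ adds a quality string to each read. A minimal sketch of reading it, assuming the common four-lines-per-record layout and Phred+33 quality encoding (both are assumptions about the files, and read_fastq is a hypothetical helper, not code from the talk):

```python
def read_fastq(lines):
    """Yield (name, sequence, qualities) from four-line FASTQ records."""
    rows = [line.rstrip("\r\n") for line in lines]
    rows = [row for row in rows if row]    # drop blank lines
    if len(rows) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    for i in range(0, len(rows), 4):
        header, sequence, plus, quality = rows[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record at line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError(f"length mismatch in record {header[1:]}")
        # Assumption: Phred+33, so '!' is quality 0 and 'I' is quality 40.
        scores = [ord(char) - 33 for char in quality]
        yield header[1:], sequence, scores


example = ["@read1\n", "ACGT\n", "+\n", "II5!\n"]
reads = list(read_fastq(example))
print(reads)
```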

Alignment Files

BLAST Output

Moral
Formatted = Code-able
– NCBI
– ENSEMBL
– PDB
Automate!
– Don’t do anything more than 3 times