Trinity College Dublin, The University of Dublin A Brief Introduction to Scientific Programming with Python Karsten Hokamp, PhD TCD Bioinformatics Support.

Presentation transcript:

Trinity College Dublin, The University of Dublin A Brief Introduction to Scientific Programming with Python Karsten Hokamp, PhD TCD Bioinformatics Support Team TCD, 26/08/2015

Overview
- Programming
- First Python script/program
- Why Python?
- Bioinformatics examples
- Additional resources
- Outlook

What is programming and why bother?
- Data processing
- Automation
- Combination of programs for analysis pipelines
- More control and flexibility
- Better understanding of how programs work

Programming Concepts
- Become a very meticulous problem solver
- Break problems down into small details
- Keep it variable (use variables rather than fixed values)
- Give very precise instructions

Programming Concepts: "human" recipe

Programming Concepts: "computerised" recipe

Mac for Windows users

The main differences:
- cmd instead of ctrl (e.g. cmd-C for copying)
- right-click mouse: ctrl-click
- # character: alt-3
- switch between applications: cmd-tab
- Spotlight (top right) for finding files/programs
- Apple symbol (top left) for logging out

IDLE: Integrated DeveLopment Environment
- open through Spotlight
- alternatively: open through Finder
- interactive Python console
- simple Python statement
- user input, output
- try a few simple numeric operations
- repeat/combine previous commands by clicking into them and hitting return (use left/right arrows and delete to edit them)
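The slides do not preserve which numeric operations were typed into the console, so the following is only a sketch of the kind of statements one might try. It assumes Python 3 (where `/` is true division); the variable names are illustrative.

```python
# A few numeric operations of the kind one might type at the IDLE console.
# Assumption: Python 3, where / always performs true (floating-point) division.
total = 2 + 3 * 4      # multiplication binds tighter than addition
quotient = 7 / 2       # true division gives a float
whole = 7 // 2         # floor division discards the fractional part
remainder = 7 % 2      # modulo: what is left over after floor division
power = 2 ** 10        # exponentiation

print(total, quotient, whole, remainder, power)  # prints: 14 3.5 3 1 1024
```

In the interactive console the `print` call is optional: typing an expression such as `7 / 2` and hitting return displays the result directly.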

IDLE: Integrated DeveLopment Environment, Console vs Editor

Console                        Editor
interactive                    requires extra click for running
great for trying out code      additional IDLE functionality
not suited for long scripts    suited for long scripts
no saving of code              allows code to be saved

IDLE: Writing Python Scripts
- open a new file
- write some code
- run your code (shortcut: F5)
- save file first
- specify a file name
- write more code (IDLE provides help)
- save and run: cmd-S then F5
- make it personal
- keep going

Python vs Perl: the equivalent in Perl

Python vs Perl: compared with Perl, Python uses fewer special characters, enforces indentation, and has more user-friendly functions.

Why Python?
- easy to learn: great for beginners
- enforces clean coding: great for teachers
- comes with IDE: avoids command-line usage
- object-orientated: code reuse and recycling
- very popular: many peers
- BioPython: many bioinformatics modules

Simple Bioinformatics Example
- built-in function 'len'
- built-in function 'set'
- built-in functions 'sorted' and 'set'
- string method 'count'
- string method 'upper'
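The code shown on these slides is not preserved in the transcript. As a rough sketch of how the functions and methods named above might be applied to a DNA string, consider the following; the sequence and all variable names are made up for illustration.

```python
# Hypothetical example sequence in mixed case (not taken from the slides).
dna = "atgcgcGATTaca"

length = len(dna)                    # number of characters in the string
letters = set(dna)                   # distinct characters; case-sensitive
upper = dna.upper()                  # same sequence in upper case
alphabet = sorted(set(upper))        # distinct bases in alphabetical order
g_count = upper.count("G")           # occurrences of a single base
gc_fraction = (upper.count("G") + upper.count("C")) / len(upper)

print(length)      # prints: 13
print(alphabet)    # prints: ['A', 'C', 'G', 'T']
print(g_count)     # prints: 3
```

Converting to upper case before counting avoids missing bases that were typed in lower case, which is why `upper` is computed before `count` is called.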

- Basic sequence manipulation
- Fetch records from databases
- Multiple sequence alignment (Clustal, Muscle)
- Sequence similarity search (Blast)
- Working with motifs: MEME, Jaspar, Transfac
- Phylogenetics
- Clustering
- Visualisation

Parsing GenBank records:

    from Bio import SeqIO
    record = SeqIO.read("AE gb", "genbank")
    record.description
    => 'Salmonella enterica subsp. enterica serovar Typhi Ty2, complete genome.'
    len(record.features)
    => 9086

Parsing sequence records:

    from Bio import SeqIO
    for entry in SeqIO.parse("tlr4_protein.fa", "fasta"):
        print(entry.description)
        print(len(entry), 'bp')

Output:

    gi| |gb|AJR | TLR4 [Gallus gallus]
    843 bp
    gi| |gb|ABH | toll-like receptor 4 [Bos taurus]
    841 bp
    gi| |gb|AAF |AF177765_1 toll-like receptor 4 [Homo sapiens]
    839 bp
    …
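To illustrate what a FASTA parser does behind the scenes, here is a minimal pure-Python sketch that needs no extra modules. It is not BioPython's implementation; the function name `parse_fasta` and the two sample records are hypothetical and stand in for a real file.

```python
import io


def parse_fasta(handle):
    """Yield (description, sequence) pairs from a FASTA-formatted file handle."""
    description, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            # A new header: emit the record collected so far, if any.
            if description is not None:
                yield description, "".join(chunks)
            description, chunks = line[1:], []
        else:
            chunks.append(line)           # sequence may span several lines
    if description is not None:
        yield description, "".join(chunks)


# Made-up records held in memory instead of a file on disk.
example = io.StringIO(
    ">seq1 hypothetical protein A\nMKTLLV\nAAG\n"
    ">seq2 hypothetical protein B\nMRV\n"
)
records = list(parse_fasta(example))
for description, sequence in records:
    print(description)
    print(len(sequence), "residues")
```

With a real file, `open("some_file.fa")` could be passed in place of the in-memory example. For actual work, a tested library parser is preferable because it also handles malformed input and other formats.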

Graphics: Chromosomes colour-coded by GC content (Bioinformatics with Python Cookbook)

Graphics: Coloured phylogenetic tree from Ebola sequences (Bioinformatics with Python Cookbook)

Additional Resources

Visualisations with Matplotlib

Examples

Scikit-learn – Machine Learning in Python: PCA of Iris data set

Python Help

Online courses

Books

Conclusions
- You have been briefly introduced to Python and IDLE.
- You have learnt about programming concepts.
- You have seen examples of what can be accomplished through Python.

Topics of an extensive Python course:
- Coding in Python – variables, scope, functions…
- Bioinformatics with BioPython
- Automated biological data analysis – your interests!

Thank You!

Don't forget to log out!