why to become a Pyologist: Perl is for plumbers – Python is for biologists. Stefan Maetschke, Teasdale Group.

Presentation transcript:

1 why to become a Pyologist Perl is for plumbers – Python is for biologists Stefan Maetschke Teasdale Group

2 why
- Biologists suffer for no good reason
  - Perl is difficult to write and read
  - Perl gives weak error feedback
  - Perl obscures basic concepts
- Limited understanding of principles
- Low productivity
- Reduced research scope
- Perl is for plumbers - Python is for scientists
- I want to have an easy life
why, why, why …

3 plumbers and others
spectrum of tasks, tools and roles
- sys admin: plumbing, vi, awk/Perl, grep/diff
- SW developer: designing, Emacs/IDE, C/C++/Java, UML/Unit test
- scientist: Python

4 equals(, )
Shared: cross-platform, open-source, scripting language, multi-paradigm, dynamic typing, statement ratio: 6
- Python: Guido van Rossum; "There should be one way"; easy
- Perl: Larry Wall; "There's more than one way"; difficult

5 you must be joking!
Perl:
    @list = ('a', 'b', 'c');
    my %hash;
    $hash{'letters'} = \@list;
    print "@{$hash{'letters'}}";
Python:
    list = ['a', 'b', 'c']
    hash = {}
    hash['letters'] = list
    print(hash['letters'])
Perl:
    package Person;
    use strict;
    sub new {
        my $class = shift;
        my $age = shift or die "Must pass age";
        my $rSelf = {'age' => $age};
        bless ($rSelf, $class);
        return $rSelf;
    }
Python:
    class Person:
        def __init__(self, age):
            self.age = age
Perl:
    @list = ( ['a', 'b', 'c'], [1, 2, 3] );
    print "@{$list[0]}";
Python:
    list = [ ['a', 'b', 'c'], [1, 2, 3] ]
    print(list[0])
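
The Python side of this slide can be tried directly. The following is a minimal sketch in Python 3 syntax; the variable names `letters`, `table` and `nested` are illustrative choices (the slide itself uses `list` and `hash`, which shadow built-in names) and are not taken from the original deck.

```python
# Python 3 sketch of the slide's three Python snippets.
# Names are illustrative; the slide uses `list` and `hash`.

letters = ['a', 'b', 'c']
table = {}                    # a dict plays the role of Perl's %hash
table['letters'] = letters    # stores a reference to the same list object
print(table['letters'])


class Person:
    """Counterpart of the Perl package Person on the slide."""

    def __init__(self, age):
        # Calling Person() without an age raises TypeError automatically,
        # which covers the Perl `or die "Must pass age"` case.
        self.age = age


nested = [['a', 'b', 'c'], [1, 2, 3]]   # list of lists, no reference syntax
print(nested[0])
```

Calling `Person()` with no argument fails with a `TypeError`, so no explicit check is needed in the constructor.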

6 More Perl bashing…
Perl:
    sub add { $_[0] + $_[1]; }
    sub add { my ($a, $b) = @_; return $a + $b; }
    sub add { my $a = shift; my $b = shift; return $a + $b; }
    sub add($, $) { local ($a, $b) = @_; return $a + $b; }
Python:
    def add(a, b):
        return a + b
Perl:
    sub diff { my ($aref, $bref) = @_; my @a = @$aref; my @b = @$bref; return scalar(@a) - scalar(@b); }
Python:
    def diff(a, b):
        return len(a) - len(b)
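
As a small runnable illustration of the two Python functions on this slide, the sketch below also shows what happens when an argument is missing. The example inputs are made up for illustration.

```python
def add(a, b):
    """Return the sum of two values."""
    return a + b


def diff(a, b):
    """Return the difference in length between two sequences."""
    return len(a) - len(b)


print(add(2, 3))
print(diff([1, 2, 3], [1]))

# The interpreter checks the number of arguments at call time.
try:
    add(1)
except TypeError as err:
    print("rejected:", err)
```

Lists are passed as ordinary arguments, so `diff` needs no reference or dereference step.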

7 complexity wall
- simple scripts ≈ 100 lines => fun stops
- Higher order concepts: Data structures, Functions, Classes
=> Python allows you to break through the complexity wall
everything you can do in Python you can do in Perl but you don’t

8 googliness
Table of kilo-hits, May 2008, for the queries "X language", "X load file" and "X bioinformatics", with X = C, Java, C++, C#, Perl, Python, Ruby, Scala, Haskell. Only fragments of the values survive: C 53,000; Java 7,760; C++ 1,290.

9 and the winner is… <- without Psyco

10 damn lies and stats
SourceForge projects
- Perl declining, Python increasing?
- May 2008, keyword search: Perl 3474, Python 4063

11 see the light… classify Iris plants
Fisher, R.A. "The use of multiple measurements in taxonomic problems", Annals of Eugenics, 7, Part II (1936)
Three species: Iris setosa, Iris versicolor, Iris virginica
Four attributes: sepal length, sepal width, petal length, petal width

12 Iris – convert data
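
The code shown on this slide is not preserved in the transcript. As a rough idea of what such a conversion step could look like, here is a minimal sketch using only the standard library. It assumes a comma-separated layout of four measurements followed by the species name; the rows in `RAW` are a handful of sample records for illustration, and `load_iris` is a hypothetical helper name.

```python
import csv
import io

# A few sample records, assumed layout:
# sepal length, sepal width, petal length, petal width, species
RAW = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""


def load_iris(text):
    """Parse CSV text into (measurements, species) pairs."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:          # skip blank lines
            continue
        *values, species = record
        rows.append((tuple(float(v) for v in values), species))
    return rows


rows = load_iris(RAW)
for measurements, species in rows:
    print(species, measurements)
```

To read from a file instead, the same loop works on an open file object in place of `io.StringIO(text)`.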

13 Iris – correlation
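
The slide's own code is not preserved. As an illustration of the idea, the sketch below computes the Pearson correlation between petal length and petal width with plain Python. The slide may well have used a SciPy or NumPy routine instead; the sample values and the function name `pearson` are assumptions made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Sample petal measurements (three per species), for illustration only.
petal_length = [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9]
petal_width = [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1]

r = pearson(petal_length, petal_width)
print("correlation petal length / petal width:", round(r, 3))
```

A value close to 1 indicates that the two attributes rise together almost linearly in this sample.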

14 Iris – do stats
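
Since the slide's code is missing from the transcript, the following is only a sketch of what "doing stats" on the Iris data might involve: mean and standard deviation of petal length per species, using the standard-library `statistics` module. The sample values and the names `samples`, `by_species` and `summary` are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev

# (species, petal length) pairs; sample values for illustration only.
samples = [
    ("Iris-setosa", 1.4), ("Iris-setosa", 1.4), ("Iris-setosa", 1.3),
    ("Iris-versicolor", 4.7), ("Iris-versicolor", 4.5), ("Iris-versicolor", 4.9),
    ("Iris-virginica", 6.0), ("Iris-virginica", 5.1), ("Iris-virginica", 5.9),
]

# Group the measurements by species.
by_species = defaultdict(list)
for species, petal_length in samples:
    by_species[species].append(petal_length)

# Mean and sample standard deviation per species.
summary = {s: (mean(v), stdev(v)) for s, v in by_species.items()}

for species, (avg, sd) in summary.items():
    print(f"{species}: mean={avg:.2f} stdev={sd:.2f}")
```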

15 Iris – linear regression
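
The original code for this slide is not preserved. As a hedged sketch, an ordinary least-squares line through petal length and petal width can be fitted with a few lines of plain Python. The function name `fit_line` and the sample data are assumptions for illustration; a library routine would typically be used in practice.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Sample petal measurements, for illustration only.
petal_length = [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9]
petal_width = [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1]

slope, intercept = fit_line(petal_length, petal_width)
print(f"petal_width = {slope:.3f} * petal_length + {intercept:.3f}")
```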

16 Iris – plot data

17 libs for life science
- Scientific computing: SciPy, NumPy, matplotlib
- Bioinformatics: Biopython
- Phylogenetic trees: Mavric, Plone, P4, Newick
- Microarrays: SciGraph, CompClust
- Molecular modeling: MMTK, OpenBabel, CDK, RDKit, cinfony, mmLib
- Dynamic systems modeling: PyDSTool
- Protein structure visualization: PyMOL, UCSF Chimera
- Networks/Graphs: NetworkX, PyGraphViz
- Symbolic math: SymPy, Sage
- Wrapper for C/C++ code: SWIG, Pyrex, Cython
- R/SPlus interface: RSPython, RPy
- Java interface: Jython
- Fortran to Python: F2PY
- …
Also check out: and:

18 last words
- Perl perfect for plumbing
- Python excellent for scientific programming
  - Easy to learn, write and maintain
  - Suited for scripting and mid-size projects
  - Huge number of scientific libraries
- Python is an attractive alternative to Matlab/R
- Easy integration of Java, C/C++ or Fortran code

19 questions Interest: Python Course? isn’t Python lovely…

20

21 links
- Wikipedia – Python
- Instant Python
- How to think like a computer scientist
- Dive into Python
- Python course in bioinformatics
- Beginning Python for bioinformatics
- SciPy Cookbook
- Matplotlib Cookbook
- Biopython tutorial and cookbook
- Huge collection of Python tutorials
- What’s wrong with Perl
- 20 Stages of Perl to Python conversion
- Why Python

22 some papers
- Bassi S. (2007) A Primer on Python for Life Science Researchers. PLoS Comput Biol 3(11): e199. doi: /journal.pcbi
- Mangalam H. (2002) The Bio* toolkits--a brief overview. Brief Bioinform. 3(3):
- Fourment M., Gillings MR. (2008) A comparison of common programming languages used in bioinformatics. BMC Bioinformatics 9:82.

23 to whom it may concern
- NPs who don’t use Perl yet
- NPs who want to see the light
- NPs who want to give their code away without being rightfully ashamed
- Matlab aficionados
NP = Non-Programmer

24 one of ten Perl myths
“…Perl works the way you do…”
“…That's one, fairly natural way to think about it…”
Perl:
    while (<>) { s/(.*):(.*)/$2:$1/; print; }
Swap two sections of a string: “aaa:bbb” -> “bbb:aaa”
Python:
    for line in file:
        line = line.strip()
        first, second = line.split(':')
        print(second + ':' + first)
Perl:
    while (<>) { chomp; ($first, $second) = split /:/; print $second, ":", $first, "\n"; }
“…we can happily consign the idea that ‘Perl is hard’ to mythology.”
Python:
    from re import sub
    for line in file:
        print(sub('(.*):(.*)', r'\2:\1', line))
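
Both Python variants from this slide can be wrapped in functions and tried on the slide's own example string. This is a sketch in Python 3 syntax; the function names `swap_split` and `swap_regex` are illustrative and not from the deck.

```python
import re


def swap_split(line):
    """Swap the two sections around ':' using str.split."""
    first, second = line.strip().split(':')
    return second + ':' + first


def swap_regex(line):
    """Swap the two sections around ':' using a regular expression."""
    return re.sub(r'(.*):(.*)', r'\2:\1', line.strip())


for text in ["aaa:bbb\n"]:
    print(swap_split(text))
    print(swap_regex(text))
```

Note that `swap_split` raises a `ValueError` when a line does not contain exactly one colon, whereas the regex version leaves a line without a colon unchanged.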

25 camel chaos
- does not scale well
- complex syntax
- cryptic commands
- does not encourage clear code
- difficult to read/maintain
- hard to understand the principles
- error prone
  - no check of subroutine arguments
  - variables are global by default
  - …
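
For contrast with the point about variables being global by default, the sketch below shows how Python treats names assigned inside a function as local, and how reading a misspelled name fails loudly. All names here (`total`, `scaled`, `typo`) are made up for illustration.

```python
total = 10  # module-level variable


def scaled(values):
    # Assignment inside a function creates a local name;
    # the module-level `total` is not touched.
    total = sum(values)
    return total * 2


result = scaled([1, 2, 3])
print(result, total)


def typo():
    count = 1
    return coutn + 1   # misspelled name: raises NameError when executed


try:
    typo()
except NameError as err:
    print("caught:", err)
```

A function has to declare `global total` explicitly before it can rebind the module-level variable.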

26 why Python
- overcome the complexity wall
- many, excellent scientific libraries
- clear, easy to learn syntax
- hard to do it wrong
- does not require prior suffering/experience

27 my bias
- R&D: C/C++ -> applied ML in robotics, image processing, quality control
- SW Development: Java -> Speech Processing, Data Mining
- Computational Biology: Java, Python
- Other languages I played with: Ada, APL, Basic, MatLab, Modula, Pascal, Perl, Prolog, R, Groovy, Forth, Fortran, Scala, Assembly code