
why to become a Pyologist: Perl is for plumbers – Python is for biologists. Stefan Maetschke, Teasdale Group.


1 why to become a Pyologist
Perl is for plumbers – Python is for biologists
Stefan Maetschke, Teasdale Group

2 why
why, why, why …
• Biologists suffer for no good reason
  - Perl is difficult to write and read
  - Perl gives weak error feedback
  - Perl obscures basic concepts
• Limited understanding of principles
• Low productivity
• Reduced research scope
• Perl is for plumbers - Python is for scientists
• I want to have an easy life

3 plumbers and others
spectrum of tasks, tools and roles
• sys admin: plumbing, vi, awk/Perl, grep/diff
• scientist: Python
• SW developer: designing, Emacs/IDE, C/C++/Java, UML/Unit test

4 equals(Python, Perl)
Both: cross-platform, open-source, scripting language, multi-paradigm, dynamic typing, statement ratio: 6
• Python: "There should be one way", Guido van Rossum, 1991, easy
• Perl: "There's more than one way", Larry Wall, 1987, difficult

5 you must be joking!
http://www.strombergers.com/python/

Perl:
    my @list = ('a', 'b', 'c');
    my %hash;
    $hash{'letters'} = \@list;
    print "@{$hash{'letters'}}\n";
Python:
    list = ['a', 'b', 'c']
    hash = {}
    hash['letters'] = list
    print(hash['letters'])

Perl:
    package Person;
    use strict;
    sub new {
        my $class = shift;
        my $age = shift or die "Must pass age";
        my $rSelf = {'age' => $age};
        bless ($rSelf, $class);
        return $rSelf;
    }
Python:
    class Person:
        def __init__(self, age):
            self.age = age

Perl:
    @list = ( ['a', 'b', 'c'], [1, 2, 3] );
    print "@{$list[0]}\n";
Python:
    list = [ ['a', 'b', 'c'], [1, 2, 3] ]
    print(list[0])

6 More Perl bashing…
http://www.strombergers.com/python/

Python:
    def add(a, b):
        return a + b
Perl (several ways to write the same thing):
    sub add { $_[0] + $_[1]; }
    sub add { my ($a, $b) = @_; return $a + $b; }
    sub add { my $a = shift; my $b = shift; return $a + $b; }
    sub add($, $) { local ($a, $b) = @_; return $a + $b; }

Python:
    def diff(a, b):
        return len(a) - len(b)
Perl:
    sub diff {
        my ($aref, $bref) = @_;
        my (@a) = @$aref;
        my (@b) = @$bref;
        return scalar(@a) - scalar(@b);
    }

7 complexity wall
simple scripts ≈ 100 lines => fun stops
• Higher order concepts
  - Data structures
  - Functions
  - Classes
=> Python allows you to break through the complexity wall
everything you can do in Python you can do in Perl but you don’t

8 googliness
kilo-hits, May 2008 (X = language name)

X          "X language"   "X load file"   "X bioinformatics"
C          53,000         1,820           572
Java       7,760          2,890           320
C++        1,290          3,100           231
C#         1,020          794             161
Perl       1,150          685             101
Python     527            798             199
Ruby       470            806             186
Scala      394            354             69
Haskell    212            323             74

9 and the winner is…
[chart not preserved in this transcript; surviving annotation: "<- without Psyco"]
http://shootout.alioth.debian.org/

10 damn lies and stats
sourceforge projects (http://rengelink.textdriven.com/blog/)
• Perl declining, Python increasing?
• May 2008, keyword search: Perl 3474, Python 4063

11 see the light… classify Iris plants
Fisher, R.A. "The use of multiple measurements in taxonomic problems", Annals of Eugenics, 7, Part II, 179-188 (1936)
http://archive.ics.uci.edu/ml/datasets/Iris
Three species: Iris setosa, Iris versicolor, Iris virginica
Four attributes: sepal length, sepal width, petal length, petal width

12 Iris – convert data
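The code shown on this slide is not preserved in the transcript, so here is only a minimal sketch (Python 3, standard library) of what converting the data could look like. It assumes the comma-separated layout of the UCI `iris.data` file (four measurements followed by the species name). The field names and the `parse_iris` helper are hypothetical, and the inline sample is just the first three records of each species.

```python
import csv
import io

# First three records of each species from the UCI iris.data file:
# sepal length, sepal width, petal length, petal width, species.
SAMPLE = """\
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
"""

# Hypothetical field names for the four numeric columns.
FIELDS = ("sepal_length", "sepal_width", "petal_length", "petal_width")


def parse_iris(text):
    """Turn iris.data-style text into a list of dicts with float measurements."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        record = {name: float(value) for name, value in zip(FIELDS, row)}
        record["species"] = row[4]
        records.append(record)
    return records


records = parse_iris(SAMPLE)
print(len(records), "records, first one:", records[0])
```

To read the real file instead of the inline sample, something like `parse_iris(open("iris.data").read())` should do, assuming the file has been downloaded from the UCI page.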

13 Iris – correlation
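The slide's own code did not survive in the transcript. As a rough illustration only, the sketch below computes a Pearson correlation coefficient by hand in Python 3 for petal length against petal width. The `pearson` helper is hypothetical, and the data are the first three records of each species from the UCI file, not the full data set.

```python
from math import sqrt

# Petal length and petal width (cm) of the first three records of each
# species in the UCI iris.data file.
petal_length = [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9]
petal_width = [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(petal_length, petal_width)
print("petal length vs. petal width: r = %.3f" % r)
```

With NumPy (listed on slide 17) the same number should come out of `numpy.corrcoef`; the hand-written version is used here only to keep the sketch free of dependencies.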

14 Iris – do stats
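Which statistics the slide actually showed is not recoverable from the transcript. One plausible minimal example, sketched here in Python 3 with the standard `statistics` module, is a per-species mean and sample standard deviation of sepal length. The `summarize` helper is hypothetical and the data are only the first three records of each species.

```python
from collections import defaultdict
from statistics import mean, stdev

# (sepal length in cm, species) for the first three records of each
# species in the UCI iris.data file.
samples = [
    (5.1, "Iris-setosa"), (4.9, "Iris-setosa"), (4.7, "Iris-setosa"),
    (7.0, "Iris-versicolor"), (6.4, "Iris-versicolor"), (6.9, "Iris-versicolor"),
    (6.3, "Iris-virginica"), (5.8, "Iris-virginica"), (7.1, "Iris-virginica"),
]


def summarize(pairs):
    """Group values by label and return {label: (mean, sample std dev)}."""
    groups = defaultdict(list)
    for value, label in pairs:
        groups[label].append(value)
    return {label: (mean(values), stdev(values)) for label, values in groups.items()}


summary = summarize(samples)
for species, (m, s) in sorted(summary.items()):
    print("%-16s mean=%.2f sd=%.2f" % (species, m, s))
```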

15 Iris – linear regression
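The regression code from the slide is missing from the transcript. The following is a sketch under assumptions, not the author's version: an ordinary least-squares line fitted by hand in Python 3, predicting petal width from petal length. The `fit_line` helper is hypothetical and the data are the first three records of each species from the UCI file.

```python
# Petal length and petal width (cm) of the first three records of each
# species in the UCI iris.data file.
petal_length = [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9]
petal_width = [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(petal_length, petal_width)
print("petal_width ~ %.3f * petal_length + %.3f" % (slope, intercept))
```

SciPy (see slide 17) offers `scipy.stats.linregress` for the same job, which also reports the correlation and a p-value.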

16 Iris – plot data
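The plot and its code are not preserved in the transcript. A minimal sketch of a scatter plot, assuming matplotlib (listed on slide 17) is installed, might look like this in Python 3. The helper names `group_by_species` and `plot_iris`, the axis labels and the output file name are all hypothetical. The data are the first three records of each species from the UCI file.

```python
# (petal length, petal width, species) for the first three records of each
# species in the UCI iris.data file.
ROWS = [
    (1.4, 0.2, "Iris-setosa"), (1.4, 0.2, "Iris-setosa"), (1.3, 0.2, "Iris-setosa"),
    (4.7, 1.4, "Iris-versicolor"), (4.5, 1.5, "Iris-versicolor"), (4.9, 1.5, "Iris-versicolor"),
    (6.0, 2.5, "Iris-virginica"), (5.1, 1.9, "Iris-virginica"), (5.9, 2.1, "Iris-virginica"),
]


def group_by_species(rows):
    """Return {species: ([x values], [y values])}, ready for plotting."""
    groups = {}
    for x, y, species in rows:
        xs, ys = groups.setdefault(species, ([], []))
        xs.append(x)
        ys.append(y)
    return groups


def plot_iris(rows, path):
    """Save a scatter plot of petal width against petal length, one series per species."""
    # matplotlib is a third-party package; it is imported inside the function
    # so that the grouping helper above also works where it is not installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for species, (xs, ys) in sorted(group_by_species(rows).items()):
        ax.scatter(xs, ys, label=species)
    ax.set_xlabel("petal length (cm)")
    ax.set_ylabel("petal width (cm)")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
```

Calling `plot_iris(ROWS, "iris.png")` should then write the figure to a PNG file in the current directory.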

17 libs for life science
• Scientific computing: SciPy, NumPy, matplotlib
• Bioinformatics: BioPython
• Phylogenetic trees: Mavric, Plone, P4, Newick
• Microarrays: SciGraph, CompClust
• Molecular modeling: MMTK, OpenBabel, CDK, RDKit, cinfony, mmLib
• Dynamic systems modeling: PyDSTool
• Protein structure visualization: PyMol, UCSF Chimera
• Networks/Graphs: NetworkX, PyGraphViz
• Symbolic math: SymPy, Sage
• Wrapper for C/C++ code: SWIG, Pyrex, Cython
• R/SPlus interface: RSPython, RPy
• Java interface: Jython
• Fortran to Python: F2PY
• …
Also check out: http://www.scipy.org/Topical_Software and http://pypi.python.org/pypi

18 last words
• Perl perfect for plumbing
• Python excellent for scientific programming
  - Easy to learn, write and maintain
  - Suited for scripting and mid-size projects
  - Huge number of scientific libraries
• Python is an attractive alternative to Matlab/R
• Easy integration of Java, C/C++ or Fortran code

19 questions
Interest: Python Course?
isn’t Python lovely…


21 links
• Wikipedia – Python: http://en.wikipedia.org/wiki/Python
• Instant Python: http://hetland.org/writing/instant-python.html
• How to think like a computer scientist: http://openbookproject.net//thinkCSpy/
• Dive into Python: http://www.diveintopython.org/
• Python course in bioinformatics: http://www.pasteur.fr/recherche/unites/sis/formation/python/index.html
• Beginning Python for bioinformatics: http://www.onlamp.com/pub/a/python/2002/10/17/biopython.html
• SciPy Cookbook: http://www.scipy.org/Cookbook
• Matplotlib Cookbook: http://www.scipy.org/Cookbook/Matplotlib
• Biopython tutorial and cookbook: http://www.bioinformatics.org/bradstuff/bp/tut/Tutorial.html
• Huge collection of Python tutorials: http://www.awaretek.com/tutorials.html
• What’s wrong with Perl: http://www.garshol.priv.no/download/text/perl.html
• 20 Stages of Perl to Python conversion: http://aspn.activestate.com/ASPN/Mail/Message/python-list/1323993
• Why Python: http://www.linuxjournal.com/article/3882

22 some papers
• Bassi S. (2007) A Primer on Python for Life Science Researchers. PLoS Comput Biol 3(11): e199. doi:10.1371/journal.pcbi.0030199
• Mangalam H. (2002) The Bio* toolkits--a brief overview. Brief Bioinform. 3(3):296-302.
• Fourment M., Gillings MR. (2008) A comparison of common programming languages used in bioinformatics. BMC Bioinformatics 9:82.

23 to whom it may concern
NP = Non-Programmer
• NPs who don’t use Perl yet
• NPs who want to see the light
• NPs who want to give their code away without being rightfully ashamed
• Matlab aficionados

24 one of ten Perl myths
http://www.perl.com/pub/a/2000/01/10PerlMyths.html
"…Perl works the way you do…"
"…That's one, fairly natural way to think about it…"
"…we can happily consign the idea that 'Perl is hard' to mythology."

Swap two sections of a string: "aaa:bbb" -> "bbb:aaa"

Perl:
    while (<>) { s/(.*):(.*)/$2:$1/; print; }
Perl:
    while (<>) { chomp; ($first, $second) = split /:/; print $second, ":", $first, "\n"; }

Python:
    for line in file:
        line = line.strip()
        first, second = line.split(':')
        print(second + ':' + first)
Python:
    from re import sub
    for line in file:
        print(sub('(.*):(.*)', r'\2:\1', line))
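For anyone who wants to try the swap without setting up an input file, here is a self-contained Python 3 sketch of both variants. The function names `swap_split` and `swap_regex` are made up for this example; the logic mirrors the snippets on the slide.

```python
import re


def swap_split(line):
    """Swap the two ':'-separated sections by splitting at the first colon."""
    first, second = line.rstrip("\n").split(":", 1)
    return second + ":" + first


def swap_regex(line):
    """Same swap with a regular expression, mirroring the Perl one-liner."""
    return re.sub(r"(.*):(.*)", r"\2:\1", line.rstrip("\n"))


for text in ["aaa:bbb\n", "x:y\n"]:
    print(swap_split(text), swap_regex(text))
```

The two variants agree as long as a line contains exactly one colon. With several colons they differ: the greedy `(.*)` in the regular expression splits at the last colon, while `split(":", 1)` splits at the first.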

25 camel chaos
• does not scale well
• complex syntax
• cryptic commands
• does not encourage clear code
• difficult to read/maintain
• hard to understand the principles
• error prone
  - no check of subroutine arguments
  - variables are global by default
  - …
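The last two complaints (no check of subroutine arguments, variables global by default) are easy to contrast with Python's behaviour. The sketch below is an illustration in Python 3, not code from the talk. It reuses the `add` function from slide 6 and shows that a call with a missing argument is rejected, and that a variable assigned inside a function stays local to it.

```python
def add(a, b):
    total = a + b  # 'total' is local to add(); it does not leak out
    return total


# Wrong number of arguments: Python refuses the call with a TypeError.
argument_error = None
try:
    add(1)
except TypeError as err:
    argument_error = err
    print("TypeError:", err)

result = add(1, 2)

# The local variable of add() is not visible outside the function.
scope_error = None
try:
    total
except NameError as err:
    scope_error = err
    print("NameError:", err)
```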

26 why Python
• overcome the complexity wall
• many, excellent scientific libraries
• clear, easy to learn syntax
• hard to do it wrong
• does not require prior suffering/experience

27 my bias
• R&D: C/C++ -> applied ML in robotics, image processing, quality control
• SW Development: Java -> Speech Processing, Data Mining
• Computational Biology: Java, Python
• Other languages I played with: Ada, APL, Basic, MatLab, Modula, Pascal, Perl, Prolog, R, Groovy, Forth, Fortran, Scala, Assembly code

