Python – Essential characteristics

1 Python – Essential characteristics
Think Monty, not snakes!
Key advantages:
Open source and free (thank you, Guido van Rossum!)
Portable: works on Unix, Linux, Win32 and Win64, MacOS, etc.
Easy to learn and logically consistent
Lends itself to rapid development, so it is good for “quick and dirty” solutions and prototypes
But also suitable for full-fledged applications
Hides many low-level aspects of computer architecture
Elegant support for object orientation and data structures
Extensive library support, including a strong standard library
Dynamic “duck typing” paradigm is very flexible
Language is minimalistic: only 31 keywords

2 Python – Essential characteristics
Some disadvantages:
It's not very fast (but often better than PERL!)
Relatively inefficient for number crunching
Can have high memory overhead
Being “far from the metal” has disadvantages: systems or kernel programming is impractical
Dynamic typing can be both a blessing and a curse
Some key libraries are still developing (e.g. BioPython)
Version 3 breaks compatibility with prior versions
Some find the whitespace conventions annoying
Tends towards minimalism rather than expressiveness

3 Becoming a Pythonista
Windows and MacOS X installers are available at: www.python.org/getit
Note that BNFO602 will be using version 2.7.3, not the more recent 3.x distributions
Even if your machine supports 64-bit, a 32-bit install is generally a safer choice for compatibility
Linux users may need to download a source tarball and compile it themselves

4 A Python IDE for BNFO602
Windows, MacOS X, and Linux installers at: www.jetbrains.com/pycharm
We are using the free Community Edition
An IDE is an Integrated Development Environment
While not strictly required, IDEs ease the creation and management of larger programs
IDLE is the built-in IDE and is another option
Python can also be run interactively

5 Documents for Python
For version 2.x, official documentation and tutorials are here: docs.python.org/2
While a notable weakness of Python in the past, the online documentation and tutorials for Python are now quite good!
StackOverflow.com also has good information: stackoverflow.com/tags/python/info

6 The Building Blocks of Python - Hello World!
print "Hello World"
No semicolon!
(Slide callouts: Keywords, Function, Argument)
Python 2.7 has only 31 keywords in the language. It is minimalistic.

7 Hello World!
if True:
    print "Hello"
    print "World"
The two indented print lines form a statement block
Does NOT use curly brackets to delimit statement blocks! Use a colon after the conditional statement
If statements are the sentences of Python, then statement blocks are analogous to paragraphs.
Unlike PERL, Python is somewhat fussy about how we use whitespace (spaces, tabs, line breaks)...

8 Statement blocks are nested using whitespace
#Demo of nested blocks
print "Outer level"
if True:
    print "\tMiddle level #1"
    print "\t\tInner level"
    print "\tMiddle level #2"
    pass
print "Outer level #2"
Comments begin with #
\t is the escape sequence for “tab” (but there is no variable interpolation as w/ PERL)
pass is a dummy statement
Whitespace delimits statement blocks! Preferred practice is to use exactly four spaces
Don't use tabs unless your editor maps these to spaces!

9 Statement blocks can be nested
Output:
Outer level
    Middle level #1
        Inner level
    Middle level #2
Outer level #2
Yes, this is a trivial example.
Note: scoping within these simple blocks is a little different than in PERL, as there is no “my” statement for local variables

10 Data Types in Python
Some basic data types:
"Hello World!"    String (the quotes are the string delimiters)
42                Integer
3.1459            Floating point
2+6j              Complex
False, True       Boolean
None              Null
Some types, like strings, are fixed once created and cannot be changed in place! They are “immutable”

11 Data Types in Python
Some compound data types (note the delimiters):
["A", "C", "G", "T"]                     list
("A", "C", "G", "T")                     tuple
{"A":"T", "C":"G", "G":"C", "T":"A"}     dict
A tuple is essentially an immutable list, whereas a dict is like a PERL hash

12 Variables in Python
Variables in Python are NOT associated with a type
They are just identifiers that name some object
Identifiers begin with a letter or underscore
dna_sequence = "AGCTAGC"
seq_len = 9
symbols = ["A", "G", "C", "T"]
empty_dict = {}
symbols = {"A":"Adenine"}
Declaration and definition are usually coincident
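
A minimal sketch of what "not associated with a type" can look like in practice: the same identifier is rebound to objects of two different types. The variable names first_type and second_type are illustrative additions, and print is written with parentheses and a single argument so that the lines run under both Python 2.7 and 3.x.

```python
# The identifier `symbols` is only a name; it can be rebound to any object.
symbols = ["A", "G", "C", "T"]
first_type = type(symbols)       # the list type

symbols = {"A": "Adenine"}
second_type = type(symbols)      # the dict type

print(first_type is list)
print(second_type is dict)
```

The type belongs to the object, not to the name, so no declaration is needed before rebinding.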

13 Data Types and identifiers
A = [42, 32, 64]
print A
print "The answer is", A[0]
Index notation always uses square brackets, even for a tuple or a dict
Output:
[42, 32, 64]
The answer is 42
Data types are actually implemented as classes that know how to print their own instance objects. Later we'll see how to make our own classes and types

14 Operators, Operands & Expressions
var = 12 * 10
(Slide callouts: expression, subexpression, operators)
Expressions consist of valid combinations of operands and operators, and a sub-expression can act as an operand in an expression
Very similar to PERL, but some operators vary, especially the logical operators. Also, string concatenation uses "+", not "."

15 Expressions
Expressions can use the result of a function (or the result of a method of a class) as an operand
foo = somefunction(foo2)
foo = somefunc(foo2) * foo3
foo = somefunc(foo2) + somefunc2(foo3)
foo = somefunc(somefunc2(foo2))
All of the above are potentially legal Python expressions, depending on the functions

16 Some Python Operators
Common operators:
+    addition          4 + 2 = 6
-    subtraction       4 - 2 = 2
/    division          4 / 2 = 2
*    multiplication    4 * 2 = 8
+    concatenation     "4" + "2" = "42"
=    assignment        Does NOT denote equivalence. Use == for testing equivalence!
Operators follow a strict order of operations: e.g. multiplication is evaluated before addition, so 4 + 6 * 2 = 16
See documentation for complete details
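
The order of operations and the two meanings of "+" can be checked directly. This is a small sketch with illustrative variable names; it uses "//" for division because plain "/" on two integers floors in Python 2.7 but returns a float in 3.x.

```python
# Multiplication binds tighter than addition unless parentheses say otherwise.
no_parens = 4 + 6 * 2         # 6 * 2 is evaluated first
with_parens = (4 + 6) * 2     # parentheses force the addition first

# "+" concatenates strings rather than adding them numerically.
joined = "4" + "2"

# "//" is floor division in both Python 2.7 and 3.x.
floored = 7 // 2

print(no_parens)
print(with_parens)
print(joined)
print(floored)
```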

17 The Assignment Operator
Unlike in algebra, = does not imply that both sides of the equation are equal!
The following is a valid Python statement:
var = var + 1
This says “take the current value of var and add one to it, then store the result back in var”
This also does the same thing:
var += 1
*=, -=, /= all work the same way.

18 Incrementing and Decrementing
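The following are functionally equivalent statements:
var = var + 1
var += 1    (increment by the shown amount)
Similarly:
var = var - 1
var -= 1
But NOT: var++, ++var or var--, --var
No PERL-style autoincrement/decrement! (var++ and var-- are syntax errors; ++var and --var parse as repeated unary signs and leave var unchanged)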

19 The Equivalence Operator
Python does have an equivalence operator: ==
print "Is 2 equal to 4:", 2 == 4
print "Is 2 equal to 2:", 2 == 2
Output:
Is 2 equal to 4: False
Is 2 equal to 2: True
Python has a built-in Boolean type!
0, Boolean False, None, empty lists, null strings, and empty dicts are all evaluated as false
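
The truth-value rules listed here can be verified with the built-in bool(). A brief sketch; the list names falsy_values and truthy_values are illustrative.

```python
# Values that evaluate as false in a conditional expression.
falsy_values = [0, False, None, [], "", {}]
# Non-zero, non-empty counterparts evaluate as true.
truthy_values = [1, True, "ATG", ["A"], {"A": "T"}]

falsy_results = [bool(v) for v in falsy_values]
truthy_results = [bool(v) for v in truthy_values]

print(falsy_results)
print(truthy_results)
```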

20 Comparison Operators
The equivalence operator is just one of the comparison operators:
==          equal to
<           less than
>           greater than
<=          less than or equal to
>=          greater than or equal to
!= or <>    not equal to (<> exists only in Python 2)
The same comparison operators are used for all types
Use caution when testing floating point numbers, especially for exact equivalence!
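
The caution about floating point can be illustrated with a classic case. The sketch below assumes a tolerance-based comparison is acceptable for the task; nearly_equal and its default tolerance are hypothetical choices, not part of the course material.

```python
total = 0.1 + 0.2
exactly_equal = (total == 0.3)     # binary floats cannot represent 0.1 exactly

def nearly_equal(a, b, tolerance=1e-9):
    """Hypothetical helper: True if a and b differ by less than tolerance."""
    return abs(a - b) < tolerance

close_enough = nearly_equal(total, 0.3)

print(exactly_equal)
print(close_enough)
```

Choosing the tolerance depends on the magnitudes involved; a fixed absolute value like the one above is only a starting point.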

21 Flow Control – if, else and conditional expressions
Comparison operators enable program flow control
dna = "GATCTCTT"
dna2 = "GATCTCCC"
if dna == dna2:    # conditional expression; note the colon
    print "Sequences identical:", dna
else:
    print "Sequences different"
Output:
Sequences different

22 Flow Control – if, else and conditional expressions
Comparison operators at work #2
dna = "ATGCATC"
if dna:
    print "Sequence defined"
else:
    print "Sequence not defined"
Output:
Sequence defined
Non-None, non-zero, non-False, and non-empty results are logically “true”

23 Flow Control – if, else and conditional expressions
Comparison operators at work
dna = ""
if dna == "ATG":
    print "Sequence is ATG start codon"
else:
    print "Sequence not defined"
Output:
Sequence not defined
Remember, empty lists and null strings are logically equivalent to “false”

24 Multi-way branching using elif
dna = "ATG"
if dna == "GGG":
    print "All Gs"
elif dna == "AAA":
    print "All As"
elif dna == "TTT":
    print "All Ts"
elif dna == "CCC":
    print "All Cs"
else:
    print "Something else:", dna
Several elif blocks in a row is OK!
Output:
Something else: ATG

25 Loops with the while statement
dna = "ATGCATC"
while dna == "ATGCATC":    # conditional expression
    print "The sequence is still", dna
Output:
The sequence is still ATGCATC
etc…
while statements will execute their statement block forever unless the conditional expression becomes false. Therefore the variable tested in the conditional expression is normally manipulated within the statement block.

26 Loops with the while statement
dna = "ATGCATGC"
while len(dna):    # conditional expression; len returns the length of a string
    print "The sequence is:", dna
    dna = dna[0:-1]    # here we remove the last character of the string
print "done"
More on “slice notation” later when discussing lists.
Output:
The sequence is: ATGCATGC
The sequence is: ATGCATG
The sequence is: ATGCAT
The sequence is: ATGCA
The sequence is: ATGC
The sequence is: ATG
The sequence is: AT
The sequence is: A
done

27 Use break to simulate PERL until
dna = "A"
while True:
    if len(dna) > 3:    # len is one of several built-in functions
        break
    print "The sequence is:", dna
    dna += "A"    # string concatenation and assignment
print "done"
Output:
The sequence is: A
The sequence is: AA
The sequence is: AAA
done
There is no native “do-while” or “until” in Python. Python is minimalistic.

28 Loops with the for statement
nt_list = ("A", "C", "G", "T")
for nt in nt_list:
    print "The nt is:", nt
Output:
The nt is: A
The nt is: C
The nt is: G
The nt is: T
for loops iterate over list-like (“iterable”) data types and are similar to the PERL foreach, not the PERL or C for

29 Loops with the for statement
dna = ("A", "C", "G", "T")
for index in range(len(dna)):
    print "The nt is:", dna[index]
Caution! range in 2.x instantiates an actual list. Use xrange if the iteration is big
Output:
The nt is: A
The nt is: C
The nt is: G
The nt is: T
for loops can have a definite number of iterations, typically using the range or xrange built-in function
Try this example with a string instead of a list!

30 Data Types in Python - Strings
Strings are list-like iterables with a rich collection of methods for their manipulation
Some useful methods are: join, split, strip, upper, lower, count
dna = "ACGT"
dna2 = dna.lower()    # will give "acgt"; note the “attribute” notation!
These are methods specific to the string type, not of general utility like built-ins

31 Data Types in Python - Strings
dna = "AACGTA"
print dna.count("A")    # will give 3
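
The other string methods named above can be chained in a typical clean-up task. A short sketch; the input string raw and the variable names are made up for illustration.

```python
raw = "  ACGT,TTGA,ccgg \n"

cleaned = raw.strip()                   # drop leading/trailing whitespace
fragments = cleaned.split(",")          # break into a list at each comma
upper_fragments = [f.upper() for f in fragments]
rejoined = "-".join(upper_fragments)    # join is called on the separator
g_count = rejoined.count("G")           # count non-overlapping occurrences

print(rejoined)
print(g_count)
```

Because strings are immutable, each method returns a new string and leaves the original untouched.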

32 Data Types in Python - Lists
A list is simply a sequence of objects enclosed in square brackets that we can iterate through and access by index. They are array-like.
["A","G","C","T"]
Unlike PERL, pretty much anything can be put into a list, including other lists!! Mirabile dictu!
[42,"groovy", dna, 3.14, var1-var2, ["A", "G", "C", "T"]]
Try printing item 5 from the above list… how does this differ from the result you would get in PERL?

33 Data Types in Python - lists
A list is a powerful type for manipulating sequences of objects:
bases = ["A","G","C","T"]
No token to distinguish list variables!!
List elements can be accessed by an index:
index = 2
print bases[0], bases[index]
Note that the first element is index 0
Output:
A C
Assigning to a non-existent element raises an exception
There is no PERL-style “autovivification” (although we can fake this)
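
What happens on assignment to a non-existent element can be seen with a try/except block. A minimal sketch; the flag assignment_failed is an illustrative name.

```python
bases = ["A", "G", "C", "T"]

try:
    bases[4] = "N"               # index 4 does not exist yet
    assignment_failed = False
except IndexError:
    assignment_failed = True     # lists do not grow on assignment

bases.append("N")                # append is the usual way to extend a list

print(assignment_failed)
print(bases)
```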

34 Data Types in Python - Lists
Lists also have a rich collection of methods and operations
Some useful ones are: len, sort, reverse, in, max, min, count
Note that some are built-in functions while others use attribute notation
pi = 3.14
my_list = ["ACGT", 0, pi]
print min(my_list)    # will print 0 (in Python 2, numbers order before strings)
min and max are built-ins

35 Data Types in Python - Lists
my_list = ["A", "C", "G", "T"]
my_list.reverse()    # attribute notation
print my_list    # will print ["T", "G", "C", "A"]

36 Data Types in Python - Lists
my_list = ["A"] * 4    # init with 4 "A"s
print my_list.count("A")    # prints 4
my_list.append("C")
if "C" in my_list:
    print 'The list contained "C"\n'
Testing for inclusion with in is a common operation with all iterable types

37 Lists and slice notation
Slices allow us to specify subarrays
bases = ["A","G","C","T"]
size = len(bases)    # will be equal to four
var1, var2, var3, var4 = bases    # var1="A" & var2="G", etc.
Slice indices refer to the space between elements!
subarray = bases[0:2]    # subarray = ["A","G"]
subarray = bases[0:-1]    # subarray = ["A","G","C"]
subarray = bases[1:]    # subarray = ["G","C","T"]
subarray = bases[1:len(bases)]    # subarray = ["G","C","T"]
Array “slices” can be assigned to a subarray

38 Lists modification and methods
Some useful list methods are: append, insert, sort, remove, count, reverse, etc. (del is a statement rather than a method)
bases = ["A","G","C"]
bases.append("T")    # bases = ["A","G","C","T"]
bases.sort()    # bases = ["A","C","G","T"]
num_of_As = bases.count("A")    # num_of_As = 1
Slice notation can be used to modify a list!
bases[:0] = ["a","g","c","t"]
Try this on the previously defined bases list and see what happens
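
Slice assignment can also replace or remove a run of elements. The sketch below uses a separate, made-up list (codon) so as not to give away the exercise; after_replace and after_delete are illustrative snapshots.

```python
codon = ["A", "T", "G", "C"]

# Replace the two elements at positions 1 and 2 with three new ones.
codon[1:3] = ["x", "y", "z"]
after_replace = list(codon)      # copy, so later changes do not affect it

# Assigning an empty list to a slice removes those elements.
codon[1:4] = []
after_delete = list(codon)

print(after_replace)
print(after_delete)
```

The replacement does not have to be the same length as the slice it replaces; the list grows or shrinks to fit.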

39 Data Types in Python - dictionaries a.k.a. dicts
dicts are associative arrays similar to PERL hashes:
complement = {"A" : "T", "C" : "G", "G" : "C", "T" : "A"}
No PERL “%” token to distinguish hash identifiers!!
The left-hand side is the dict key and must be unique, “hashable”, and “immutable” (this will become clearer later)
The right-hand side is the associated value. It can be almost ANY type of object! Nice.

40 Working with Dicts
dicts are a preferred data type in Python
#A dict for complementing a DNA nucleotide
comp = {"A" : "T", "C" : "G", "G" : "C", "T" : "A"}
print "complement of A is:", comp["A"]
print "complement of C is:", comp["C"]
Output:
complement of A is: T
complement of C is: G
It's easy to add new pairs to the hash:
comp["g"] = "c"
Or to delete pairs in the hash:
del comp["g"]
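
Adding, testing for, and removing dict entries might look like the following sketch. Deletion uses the del statement on the key; pop and get are standard dict methods shown as alternatives, and the variable names are illustrative.

```python
comp = {"A": "T", "C": "G", "G": "C", "T": "A"}

comp["g"] = "c"                  # add a new key/value pair
had_g = "g" in comp              # membership test checks the keys

del comp["g"]                    # remove the pair again
removed = comp.pop("A")          # pop removes a key and returns its value
missing = comp.get("N", "?")     # get returns a default for unknown keys

print(had_g)
print(removed)
print(missing)
```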

41 Other dict methods
Some useful dict methods and operations are: keys, values, items, del, in, copy, etc.
#A hash for complementing a DNA nucleotide
comp = {"A" : "T", "C" : "G", "G" : "C", "T" : "A"}
print comp.keys()    # might return ["A","C","G","T"]
No assertion is made as to the order of key/value pairs!

42 Dicts are iterable
#Iterating over hashes
comp = {"A": "T", "C" : "G", "G" : "C", "T" : "A"}
for k, v in comp.items():
    print 'complement of', k, 'is', v
Iterate over both keys and values together! .items() returns a list of two-element (key, value) tuples, each of which is “unpacked” here into k and v
Output could be:
complement of A is T
complement of C is G
complement of G is C
complement of T is A
Or output could be:
complement of C is G
complement of A is T
complement of T is A
complement of G is C
The point is that dicts in Python 2.7 are unordered, and no guarantees are made!!
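
When a reproducible order matters, one common approach is to iterate over the sorted keys instead of relying on the dict's own order. A minimal sketch; collecting the lines in a list (lines) is an illustrative choice that makes the result easy to inspect.

```python
comp = {"A": "T", "C": "G", "G": "C", "T": "A"}

lines = []
for k in sorted(comp):           # sorted() returns the keys in a fixed order
    lines.append("complement of %s is %s" % (k, comp[k]))

for line in lines:
    print(line)
```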

43 Tuples are essentially immutable lists
In most read-only contexts, they work just like lists; you just can't change their value
nucleotides = ("A", "C", "G", "T")    # tuples are delimited by ()
for NT in nucleotides:
    print NT, "is a nucleotide symbol"
Packing and unpacking:
(one, two, three) = (1, 2, 3)
print one    # prints 1
Why tuples? The immutable nature of tuples means they do not need to support all list operations. They can therefore be implemented differently, and are consequently more efficient for certain operations. And only immutable objects can serve as hash keys
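
Both properties, immutability and suitability as a dict key, can be demonstrated by catching the errors Python raises. A brief sketch with illustrative flag names.

```python
nucleotides = ("A", "C", "G", "T")

try:
    nucleotides[0] = "U"         # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

try:
    bad = {["A", "T"]: 1}        # a list is mutable, hence unhashable
    list_key_ok = True
except TypeError:
    list_key_ok = False

pair_counts = {("A", "T"): 2}    # a tuple of strings works as a key

print(tuple_changed)
print(list_key_ok)
print(pair_counts[("A", "T")])
```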

44 Sparse matrices
An example of tuples as dict keys
Standard multidimensional array:
matrix = [ [3,0,-2,0],
           [0,9,0,0],
           [0,7,0,0],
           [0,0,0,-5] ]
print matrix[0][2]    # This will print -2
# Not very memory efficient if there are many zero valued
# elements in a very large matrix!!!
Sparse matrix representation:
matrix = { (0,0): 3, (0,2): -2, (1,1): 9, (2,1): 7, (3,3): -5 }
print matrix.get( (0,2), 0)    # prints -2
# The get method here returns 0 if the key is undefined
# Much more memory efficient, since zero values not stored
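
Converting between the two representations is a short loop. The helper to_sparse below is a hypothetical name introduced for illustration, assuming the dense form is a list of equal-length rows.

```python
def to_sparse(dense):
    """Hypothetical helper: keep only non-zero cells, keyed by (row, col)."""
    sparse = {}
    for i, row in enumerate(dense):
        for j, value in enumerate(row):
            if value != 0:
                sparse[(i, j)] = value
    return sparse

matrix = [[3, 0, -2, 0],
          [0, 9, 0, 0],
          [0, 7, 0, 0],
          [0, 0, 0, -5]]

sparse = to_sparse(matrix)
print(sparse.get((0, 2), 0))     # stored cell
print(sparse.get((1, 0), 0))     # absent cell falls back to the default 0
```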

45 Functions
Q: Why do we need functions?
A: Because we are lazy!
Functions are the foundation of reusable code
Repeatedly typing out the code for a chore that is used over and over again (or even only a few times) would be a waste of time and space, and makes the code hard to read
Functions in Python are akin to subroutines in PERL, as well as procedures in some other languages

46 Functions Defining a function
Minimally, all we need is a statement block of Python code that we have named
def I_dont_do_much():    # capital letters OK in names
    #any code you like!!
    pass
    return
A return value is optional; None is the default if no value is specified or there is no explicit final return statement
Once defined, a function is called (“invoked”) just by stating its name and passing any required arguments:
I_dont_do_much()

47 Functions
def expand_name(amino_acid):
    convert = {"R" : "Arg", "A" : "Ala", etc.}
    if amino_acid in convert:
        three_letter = convert[amino_acid]
    else:
        three_letter = "Ukn"
    return three_letter
print expand_name("R")
Output:
Arg
convert is local to the function (i.e. in lexical scope)
Note indentation: the last line is not part of the function definition, but rather is an invocation of the function
Python has several flexible ways to pass arguments to a function. This example is just the most basic way!
Warning! Python passes objects to functions by reference, never by copy. Changes to mutable objects in the function change the starting object!! No messy weirdness like in PERL
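
The warning about mutable arguments can be made concrete. In this sketch, add_stop_codon and rebind are hypothetical functions written only to contrast mutating an object in place with rebinding a local name.

```python
def add_stop_codon(codons):
    """Mutates the list it is given; the caller sees the change."""
    codons.append("TAA")

def rebind(codons):
    """Rebinds only the local name; the caller's list is unaffected."""
    codons = ["TAA"]
    return codons

gene = ["ATG", "GCC"]
add_stop_codon(gene)             # gene itself is now longer
rebind(gene)                     # gene is unchanged by this call

protected = ["ATG", "GCC"]
add_stop_codon(list(protected))  # pass a copy to protect the original

print(gene)
print(protected)
```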

48 Using external functions
In Python it's easy to use functions (or indeed other variables or objects) that are defined in some other file…
Python includes many useful libraries, or it can be code that you have written
Option 1:
import module_name
# use the module name when calling the function..
# i.e. module_name.function(arg)
Option 2:
from module_name import name1, name2, name3
# imports just the names you want
# no need to refer to module name when calling
Option 3:
from module_name import *
# imports all of the public names in a module
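
As a concrete instance of the first two options, the sketch below uses the standard math module; the variable names are illustrative.

```python
# Option 1: import the module and qualify each name with it.
import math
root_qualified = math.sqrt(16)

# Option 2: import selected names and use them directly.
from math import sqrt, pi
root_direct = sqrt(16)
circle_area = pi * 2 ** 2        # area of a circle with radius 2

print(root_qualified)
print(root_direct)
print(circle_area)
```

Option 1 keeps it obvious where each name comes from, which tends to matter more as programs grow.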

49 Putting it all together - An in-class challenge
Get Python up and running, try “Hello world!” then…
Write a program that:
Defines a function that generates random DNA sequences of some specified length, given a dict describing the probability distribution of A, C, G, T (should be familiar from BNFO601)
You'll need the random() function from the random module!!
This is a real-world chore that is frequently encountered in bioinformatics
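
One possible approach, offered as a sketch rather than the intended solution: draw a uniform random number for each position and walk the cumulative probabilities until it is exceeded. The function name random_dna, its parameters, and the example distribution are all assumptions made for illustration.

```python
import random

def random_dna(length, probs):
    """Hypothetical helper: build a random sequence of the given length.

    probs maps each nucleotide to its probability; the values are
    assumed to sum to 1.
    """
    sequence = []
    for _ in range(length):
        r = random.random()              # uniform float in [0.0, 1.0)
        cumulative = 0.0
        chosen = None
        for nt in sorted(probs):         # fixed order for reproducibility
            cumulative += probs[nt]
            chosen = nt
            if r < cumulative:
                break
        sequence.append(chosen)          # falls back to the last symbol
    return "".join(sequence)

random.seed(0)                           # seeding makes runs repeatable
example = random_dna(20, {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3})
print(example)
```

Walking the cumulative sum costs a few comparisons per base, which is fine for a four-letter alphabet.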

