Trinity College Dublin, The University of Dublin GE3M25: Computer Programming for Biologists Python, Class 4 Karsten Hokamp, PhD Genetics TCD, 01/12/2015.


1 Trinity College Dublin, The University of Dublin GE3M25: Computer Programming for Biologists Python, Class 4 Karsten Hokamp, PhD Genetics TCD, 01/12/2015

2 Overview
- List, tuple, set
- Exercises
- Weekly task
http://bioinf.gen.tcd.ie/GE3M25/

3 Recap
- Branching: if … : elif … : else :
- Comparisons: x > 1
- File I/O
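
As a reminder of the branching syntax recapped here, a minimal sketch follows. Note that Python spells the middle branch `elif`. The function name `describe_length` and the thresholds are made up for illustration and do not come from the slides.

```python
# Hypothetical helper: classify a sequence length with if / elif / else.
def describe_length(x):
    if x > 1000:
        return "long"
    elif x > 1:
        return "short"
    else:
        return "single base or empty"


print(describe_length(230218))
print(describe_length(12))
print(describe_length(1))
```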

4 Recap
Calculate sequence lengths from a file with FASTA-formatted DNA entries
Input: fasta_seqs.txt on website
Output as ID and length:
chrI 230218
chrII 813184
chrIII 316620
chrIV 1531933
chrIX 439888
...

5 Recap
>chrI
actcaactcatcacctcatcatcaactcatc
tgtgcggcgtaatacgatatagcactacgac
…
tgaagagccccacgactcagc
>chrII
tgggtagataatagagtatatatagtatata
…
ttagagccagtagtacgcagaccataca
>chrMito
ttagaatgcctaggccatcagcgcacatcat
…
gagatagga
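
One possible way to approach the recap task is sketched below. It is not the course's reference solution. The function name `fasta_lengths` is hypothetical, and the input is a made-up miniature example rather than the real fasta_seqs.txt.

```python
def fasta_lengths(lines):
    """Return a list of (id, length) pairs in the order the entries appear."""
    lengths = []
    seq_id = None
    length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header: store the previous entry, if there was one.
            if seq_id is not None:
                lengths.append((seq_id, length))
            header = line[1:].split()
            seq_id = header[0] if header else ""
            length = 0
        else:
            length += len(line)
    # Store the final entry once the input is exhausted.
    if seq_id is not None:
        lengths.append((seq_id, length))
    return lengths


# Made-up miniature input standing in for the real file.
example = [">chrI", "actcaact", "tgtg", ">chrII", "tgggta"]
for seq_id, length in fasta_lengths(example):
    print(seq_id, length)
```

Because the function only iterates over lines, an open file handle can be passed in the same way, for example `with open("fasta_seqs.txt") as handle: fasta_lengths(handle)`.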

6 Lists:
- Ordered collection of values: bases = list(seq)
- Indexed just like strings: bases[0] == seq[0]
- Explicit creation: columns = [1, 9, 10, 12] and nucleotides = ['a', 'c', 'g', 't']
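
The statements on this slide can be tried out directly. In this short sketch the value of `seq` is made up for illustration.

```python
seq = "actgaa"  # made-up DNA string

# Convert the string into a list with one element per character.
bases = list(seq)
print(bases)

# Lists are indexed just like strings, starting at 0.
print(bases[0] == seq[0])

# Explicit creation with square brackets.
columns = [1, 9, 10, 12]
nucleotides = ['a', 'c', 'g', 't']
print(columns[-1])       # negative indices count from the end
print(len(nucleotides))
```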

7 Lists vs strings:

8 Exercise:
1. Create an empty list called 'dna1'
2. What happens if you try to access dna1[0]?
3. Create a variable 'seq' containing a DNA string
4. Create a list 'dna2' from the DNA string
5. Print the first element from dna2
6. Print the last element from dna2

9 Modifying list elements:
- Special methods for list objects: 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort'
  → Find out more through help() function
- Accessing specific elements: elem = bases[2]
- Modifying specific elements: bases[2] = 'A'
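
A few of the list methods named on this slide are shown in action below, as a small sketch with made-up example values. `help(list)` describes the rest.

```python
bases = list("acgt")

elem = bases[2]          # access: 'g'
bases[2] = "A"           # modify in place: ['a', 'c', 'A', 't']

bases.append("n")        # append adds its argument as ONE element
bases.extend(["n", "n"]) # extend adds EACH element of its argument
last = bases.pop()       # pop removes and returns the last element

print(bases.count("n"))  # how many times 'n' occurs
print(bases.index("A"))  # position of the first 'A'

duplicate = bases.copy() # an independent copy of the list
duplicate.reverse()      # reverses the copy only
bases.sort()             # sorts in place; uppercase sorts before lowercase
print(bases)
print(duplicate)
```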

10 Exercise:
1. Copy dna2 into dna1
2. Change the last element of dna1 to an 'n'
3. What does dna1.clear() do?
4. What's the difference between append and extend?
5. Try to duplicate the list dna2
6. Add an 'n' to the end of dna2
7. Remove the last element from dna2
8. Sort dna2
9. Reverse the order of dna2

11 Built-in Functions for Lists:
all(), any(), len(), max(), min(), sorted(), sum(), zip()
→ Find out more using help()
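
To give a feel for these built-in functions, the sketch below applies them to the chromosome names and lengths from the recap output. The variable names are chosen for illustration only.

```python
names = ["chrI", "chrII", "chrIII", "chrIV", "chrIX"]
lengths = [230218, 813184, 316620, 1531933, 439888]

count = len(lengths)        # number of elements
total = sum(lengths)        # sum of all elements
longest = max(lengths)
shortest = min(lengths)
ordered = sorted(lengths)   # returns a NEW sorted list; 'lengths' is unchanged

# all() is True if every element passes, any() if at least one does.
all_positive = all(n > 0 for n in lengths)
any_over_million = any(n > 1000000 for n in lengths)

# zip() pairs up elements from several sequences, position by position.
pairs = list(zip(names, lengths))

print(count, total, longest, shortest)
print(ordered)
print(all_positive, any_over_million)
print(pairs[0])
```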

12 Exercise:
1. Create a list of numbers: nums = list(range(1,101))
2. Calculate the number of elements with len()
3. Calculate the sum of all elements with sum()
4. Print the largest and smallest element
5. Print the list in reverse order
6. What does zip do?

13 Tuples:
- Just like lists, but can't be changed after creation
- Created with round brackets, e.g. (1, 9, 10, 12), or from another sequence: bases = tuple(seq)
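
The difference from lists shows up as soon as an element is assigned to. This is a minimal sketch with a made-up sequence.

```python
seq = "acgt"  # made-up DNA string

bases = tuple(seq)       # ('a', 'c', 'g', 't')
columns = (1, 9, 10, 12) # explicit creation with round brackets

print(bases[2])          # reading an element works as for lists

try:
    bases[2] = "A"       # assignment is not allowed for tuples
except TypeError as err:
    print("Cannot modify a tuple:", err)
```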

14 Set:
- Similar to lists, but holds only unique elements, in no particular order
→ Compare the following:
seq = 'aatcgcgactacgcac'
bases1 = list(seq)
bases2 = set(seq)
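
Running the comparison from this slide makes the difference visible. The set is passed through `sorted()` before printing because a set has no defined order.

```python
seq = 'aatcgcgactacgcac'

bases1 = list(seq)  # keeps every character, in order
bases2 = set(seq)   # keeps each distinct character once

print(len(bases1))     # number of characters in the sequence
print(len(bases2))     # number of distinct characters
print(sorted(bases2))  # sorted() returns a list in a predictable order
```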

15 Exercise:
- Read in a DNA sequence in FASTA format from a file
- Print out a count of all the letters found in the sequence
Tip: use set()
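
One way the tip could be applied is sketched below. The FASTA lines are a made-up in-memory stand-in for the file, and the sketch assumes a single entry per file.

```python
# Made-up stand-in for the lines of a FASTA file with one entry.
fasta_lines = [">example", "aatcgcga", "ctacgcac"]

# Join all non-header lines into one sequence string.
seq = "".join(line.strip() for line in fasta_lines if not line.startswith(">"))

# set(seq) yields each distinct letter once; count each in the full sequence.
counts = {}
for base in sorted(set(seq)):
    counts[base] = seq.count(base)
    print(base, counts[base])
```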

16 Exercise:
- Read in a file with numbers
- Report the line count
- Report the min and max number
- Calculate the average of the numbers
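
A possible outline for this exercise is sketched below. It assumes one number per line and ignores blank lines. `io.StringIO` stands in for a real file so that the example runs without one, and the numbers are made up.

```python
import io

# Stand-in for: handle = open("numbers.txt")  (hypothetical file name)
handle = io.StringIO("3\n10\n4.5\n\n2\n")

# Convert every non-blank line to a number.
numbers = [float(line) for line in handle if line.strip()]

line_count = len(numbers)
smallest = min(numbers)
largest = max(numbers)
average = sum(numbers) / len(numbers)

print("lines:", line_count)
print("min:", smallest, "max:", largest)
print("average:", average)
```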

17 Exercise:
- Read in a file with probe ids, gene ids, fold-change and p-values, separated by tab
- Report the line count
- Report the min and max fold-change
- Calculate the average of the fold-change

18 Exercise:
- Read in a file with probe ids, gene ids, fold-change and p-values, separated by tab
- Print the header into a new file
- Print all the lines with absolute fold-change > 2 and p-value <= 0.05 into that file
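
The filtering step could look like the sketch below. The column order (probe id, gene id, fold-change, p-value) is taken from the exercise text, but the column names and all data values are made up. `io.StringIO` objects stand in for the input and output files.

```python
import io

# Made-up input standing in for a tab-separated file opened for reading.
infile = io.StringIO(
    "probe_id\tgene_id\tfold_change\tp_value\n"
    "p1\tgeneA\t2.5\t0.01\n"
    "p2\tgeneB\t-3.0\t0.20\n"
    "p3\tgeneC\t-4.2\t0.05\n"
    "p4\tgeneD\t1.1\t0.001\n"
)
# Stand-in for a file opened for writing.
outfile = io.StringIO()

# The first line is the header; copy it to the output unchanged.
header = next(infile)
outfile.write(header)

fold_changes = []
for line in infile:
    fields = line.rstrip("\n").split("\t")
    fold_change = float(fields[2])
    p_value = float(fields[3])
    fold_changes.append(fold_change)
    # abs() covers both strong up- and down-regulation.
    if abs(fold_change) > 2 and p_value <= 0.05:
        outfile.write(line)

print(outfile.getvalue(), end="")
print("min:", min(fold_changes), "max:", max(fold_changes))
```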

19 Weekly task:
- Read in a DNA sequence in FASTA format from a file
- Prompt the user for a short motif
- Split the sequence at the sites that match
- Print the fragment lengths in sorted order
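
The splitting and sorting steps are sketched below as a starting point, not a full solution. The function name `fragment_lengths` is hypothetical, and the sequence and motif are made up. The sketch assumes that the motif itself is dropped from the fragments, which is what `str.split` does. The task does not say whether that is intended.

```python
def fragment_lengths(seq, motif):
    """Split seq at every occurrence of motif; return sorted fragment lengths."""
    # str.split removes the motif and raises ValueError for an empty motif.
    fragments = seq.split(motif)
    return sorted(len(fragment) for fragment in fragments)


seq = "aatcgcgactacgcac"  # made-up sequence
motif = "cg"              # in the task: motif = input("Enter a motif: ")

print(seq.split(motif))
print(fragment_lengths(seq, motif))
```

Adjacent matches produce empty fragments of length 0, so a full solution may want to decide whether to report them.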

