
Matrices: A set of elements organized in a table (along rows and columns)


1 Matrices
A set of elements organized in a table (along rows and columns)
[Figure: Wikipedia image of a matrix]

2 Matrices
Python does not have direct support for matrix manipulation. For Bio/CS 251, matrices are provided through support.py:

    makeMatrix(rows, cols)    # creates a matrix with the given rows and cols
    randomMatrix(rows, cols)  # creates a matrix with the given rows and cols,
                              # with all cells set to random values
    getRows(M)                # returns the number of rows of the given matrix
    getCols(M)                # returns the number of cols of the given matrix
    M[r][c] = 5               # puts 5 in cell (r, c)
    score = M[r][c]           # puts value of cell (r, c) in score
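The slides do not show the contents of support.py, so the following is only a minimal sketch of how such helpers could be written on top of nested lists. The initial cell value (0), the random range (0 to 9), and the list-of-rows representation are all assumptions, and the sketch is written for Python 3 even though the slides use Python 2 print statements. The course version also appears to print matrices with `|` delimiters, which plain lists do not do.

```python
import random

# Hypothetical stand-in for the course's support.py (not shown in the slides).
# A matrix is modeled as a list of rows; each row is a list of cells.

def makeMatrix(rows, cols):
    """Create a rows x cols matrix. The initial value 0 is an assumption."""
    return [[0 for _ in range(cols)] for _ in range(rows)]

def randomMatrix(rows, cols):
    """Create a rows x cols matrix of random values. The range 0..9 is an assumption."""
    return [[random.randint(0, 9) for _ in range(cols)] for _ in range(rows)]

def getRows(M):
    """Return the number of rows."""
    return len(M)

def getCols(M):
    """Return the number of columns (0 for an empty matrix)."""
    return len(M[0]) if M else 0

M = makeMatrix(3, 5)
M[1][2] = 5          # puts 5 in cell (1, 2)
score = M[1][2]      # reads the value back
print(getRows(M), getCols(M), score)
```

Each row is built by its own list comprehension so that writing to one row never changes another, which would happen if the same row list were reused.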

3 Matrices
Indexing of rows and columns starts at 0

        0  1  2  3  4
    0   7
    1         4
    2               9

    >>> M = makeMatrix(3, 5)   # creates 3x5 matrix
    >>> rows = getRows(M)
    >>> print rows
    3
    >>> cols = getCols(M)
    >>> print cols
    5
    >>> M[0][0] = 7
    >>> M[2][4] = 9
    >>> M[1][2] = 4
    >>> total = M[0][0] + M[2][4] + M[1][2]
    >>> print total
    20

4 Matrix Processing
Fill all cells of a matrix with the number 9

To FILL each cell of a given matrix with the value 9:
    1. for each row index in the matrix:
    2.     for each column index in the matrix:
    3.         set cell of current row, col to 9

    def fillMatrix(M):
        for r in range(0, getRows(M)):
            for c in range(0, getCols(M)):
                M[r][c] = 9

    >>> D = makeMatrix(3, 5)
    >>> fillMatrix(D)
    >>> print D
    | 9 9 9 9 9 |
    | 9 9 9 9 9 |
    | 9 9 9 9 9 |

5 Matrix Processing
Add all the values in a matrix

To ADD all cells of a given matrix:
    1. set current total to 0
    2. for each row index in the matrix:
    3.     for each column index in the matrix:
    4.         update total with current cell value
    5. return total

    def addElements(M):
        total = 0
        for r in range(0, getRows(M)):
            for c in range(0, getCols(M)):
                total = total + M[r][c]
        return total

    >>> D = randomMatrix(3, 5)
    >>> print D
    [3x5 matrix of random values; cell contents not preserved]
    >>> total = addElements(D)
    >>> print total
    32

6 Sequence Similarity
Provides insight about the sequence under investigation: gene-coding regions (DNA), function (proteins)
Typically assessed via the process of "sequence alignment"
Standard sequence alignment algorithms
    Dot Plots
    Global Alignment
    Semiglobal Alignment
    Local Alignment
Standard software
    BLAST, FASTA: find high-scoring local alignments between a query and a target database

7 Dot Plots
The simplest method for identifying similarities between two sequences
Uses a 2-dimensional table
    one of the sequences labels the rows
    the other sequence labels the columns
    place a ● in each cell that has matching (row, column) labels
Example: Dot plot for "GATTACA" and "TACACATTG"

8 Dot Plots
[Figure: dot-plot grid labeled with the sequence symbols G, A, T, C; cells marked "?" are still to be filled in]

9 Dot Plots
[Figure: completed dot plot labeled with the sequence symbols G, A, T, C; annotated matching regions: ACA, ACATT, TACA, TAC, ATT]

10 Dot Plots
The simplest method for identifying similarities between two sequences
Diagonal lines indicate regions of similarity
    SE slope: similarity along the direction of the sequences
    SW slope: similarity along one sequence in reverse
Susceptible to noise, especially with DNA: since there are only 4 possible symbols, there will be a lot of "random hits"
Noise can be addressed using a sliding window
    consider fragments of length W in the two sequences
    place ● in each cell that is the "origin" of a sliding window whose two fragments match
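The sliding-window idea can be sketched as follows. This is an illustration rather than the course's implementation: the function names `windowDotPlot` and `countDots` are hypothetical, the matrix is a plain nested list of booleans, and cells whose window would run past the end of either sequence are assumed to get no dot.

```python
def windowDotPlot(seq1, seq2, w):
    """Return a len(seq1) x len(seq2) matrix of booleans; True marks a dot.

    Cell (r, c) gets a dot when the length-w fragments that start at
    seq1[r] and seq2[c] are identical. Cells too close to the end of a
    sequence to hold a full window get no dot (an assumption).
    """
    plot = [[False] * len(seq2) for _ in range(len(seq1))]
    for r in range(len(seq1) - w + 1):
        for c in range(len(seq2) - w + 1):
            if seq1[r:r + w] == seq2[c:c + w]:
                plot[r][c] = True
    return plot

def countDots(plot):
    """Count the True cells in the plot."""
    return sum(cell for row in plot for cell in row)

# Larger windows leave fewer dots, because isolated single-symbol hits drop out.
for w in (1, 2, 3):
    print(w, countDots(windowDotPlot("GATTACA", "TACACATTG", w)))
```

With W = 1 this reduces to the plain dot plot, so the same function covers both cases.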

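11 Dot Plots (W = 2)
[Figure: dot-plot grid labeled with the sequence symbols G, A, T, C; cells marked "?" are still to be filled in using a window of length 2]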

12 Dot Plots (W = 2)
[Figure: dot plot with W = 2, labeled with the sequence symbols G, A, T, C]
Compare with next slide with W = 1
    noise has disappeared
    one fewer dot per matching region
    in general, if a region has N matches, #dots = N - (W - 1)
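The relation #dots = N - (W - 1) can be checked on one matching region. The sketch below assumes the anchored window used in these examples and picks the region "TACA", which occupies rows 3 to 6 of "GATTACA" and columns 0 to 3 of "TACACATTG", so N = 4. The helper name `dotsOnDiagonal` is hypothetical.

```python
def dotsOnDiagonal(seq1, seq2, w, r0, c0):
    """Count dots along the diagonal that starts at cell (r0, c0).

    A cell gets a dot when the length-w fragments anchored at that cell
    match in full; cells without room for a full window are skipped.
    """
    count = 0
    r, c = r0, c0
    while r + w <= len(seq1) and c + w <= len(seq2):
        if seq1[r:r + w] == seq2[c:c + w]:
            count += 1
        r += 1
        c += 1
    return count

N = 4  # length of the matching region "TACA"
for W in range(1, N + 1):
    dots = dotsOnDiagonal("GATTACA", "TACACATTG", W, 3, 0)
    print(W, dots, N - (W - 1))  # the last two columns agree
```

Each extra symbol in the window removes one possible starting position inside the region, which is where the W - 1 term comes from.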

13 Dot Plots (W = 1)
[Figure: dot plot with W = 1, labeled with the sequence symbols G, A, T, C]
Compare with previous slide with W = 2

14 Self Alignment (W = 1)
In self alignment
    main diagonal is filled in completely
    matrix is symmetric about main diagonal
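Both properties follow from comparing a sequence with itself: seq[i] always equals seq[i], and seq[r] == seq[c] is the same test as seq[c] == seq[r]. A small check, using a hypothetical helper `selfDotPlot` and a nested list of booleans as the matrix:

```python
def selfDotPlot(seq):
    """Dot plot (W = 1) of a sequence against itself; True marks a dot."""
    n = len(seq)
    return [[seq[r] == seq[c] for c in range(n)] for r in range(n)]

plot = selfDotPlot("GATTACA")
n = len(plot)

# Every cell on the main diagonal holds a dot.
diagonalFilled = all(plot[i][i] for i in range(n))

# Cell (r, c) always agrees with cell (c, r).
symmetric = all(plot[r][c] == plot[c][r] for r in range(n) for c in range(n))

print(diagonalFilled, symmetric)  # prints: True True
```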

15 Dot Plots
Original paper
    Maizel JV and Lenk RP: Enhanced graphic matrix analysis of nucleic acid and protein sequences. Proc Natl Acad Sci USA 78:7665, 1981.
    Used a sliding window of odd length centered at the base
Our examples used a sliding window anchored at the base
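The difference between the two window placements can be sketched with exact matching only. This does not reproduce the scoring used in the original paper; the function names are hypothetical, and windows that do not fit inside both sequences are assumed to produce no dot.

```python
def anchoredDot(seq1, seq2, w, r, c):
    """Anchored window: compare the fragments that START at (r, c)."""
    if r < 0 or c < 0 or r + w > len(seq1) or c + w > len(seq2):
        return False
    return seq1[r:r + w] == seq2[c:c + w]

def centeredDot(seq1, seq2, w, r, c):
    """Centered window: compare odd-length fragments whose MIDDLE is at (r, c)."""
    if w % 2 == 0:
        raise ValueError("a centered window needs an odd length")
    h = w // 2  # number of symbols on each side of the center
    if r - h < 0 or c - h < 0 or r + h >= len(seq1) or c + h >= len(seq2):
        return False
    return seq1[r - h:r + h + 1] == seq2[c - h:c + h + 1]

s, t, w = "GATTACA", "TACACATTG", 3
h = w // 2

# "TAC" starts at row 3, column 0, so the anchored dot sits at (3, 0)
# and the centered dot sits one step down the diagonal at (4, 1).
print(anchoredDot(s, t, w, 3, 0), centeredDot(s, t, w, 4, 1))  # prints: True True

# The two plots hold the same dots, shifted by h cells along the diagonal.
sameUpToShift = all(
    centeredDot(s, t, w, r, c) == anchoredDot(s, t, w, r - h, c - h)
    for r in range(len(s)) for c in range(len(t))
)
print(sameUpToShift)  # prints: True
```

Under these assumptions the choice of placement moves each dot along its diagonal but does not change which regions are detected.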

16 Dot Plots in Python
Compute the dot plot matrix given two sequences

To MAKE a DOT PLOT given two sequences:
    1. Create a matrix with rows and columns equal to the length of the first and second sequence respectively
    2. for each row index in the matrix:
    3.     for each column index in the matrix:
    4.         if the symbol at the row index in the first sequence equals the symbol at the column index in the second sequence
    5.             place a dot at current cell
    6. return the matrix

    >>> M = makeDotPlot("GATTACA", "TACACATTG")
    >>> print M
    |                 * |
    |   *   *   *       |
    | *           * *   |
    | *           * *   |
    |   *   *   *       |
    |     *   *         |
    |   *   *   *       |
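The steps above can be sketched in Python as follows. The slides give only the pseudocode and the call, so this body is an assumption: it uses a nested list in place of the course's matrix helpers, stores "*" for a dot and " " for an empty cell, and adds a hypothetical `formatMatrix` helper that imitates the bar-delimited output style seen in the slides. It is written for Python 3.

```python
def makeDotPlot(seq1, seq2):
    """Build a dot plot with seq1 labeling the rows and seq2 the columns."""
    # Step 1: one row per symbol of seq1, one column per symbol of seq2.
    M = [[" " for _ in range(len(seq2))] for _ in range(len(seq1))]
    # Steps 2-5: mark every cell whose row and column symbols match.
    for r in range(len(seq1)):
        for c in range(len(seq2)):
            if seq1[r] == seq2[c]:
                M[r][c] = "*"
    # Step 6: return the filled matrix.
    return M

def formatMatrix(M):
    """Render each row between '|' bars (format modeled on the slides; an assumption)."""
    return "\n".join("| " + " ".join(row) + " |" for row in M)

M = makeDotPlot("GATTACA", "TACACATTG")
print(formatMatrix(M))
```

The first row (G) has a single dot in the last column, since "TACACATTG" contains G only at its final position.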

