Affine Gap Alignment
Marcus Bamberger
The Goal
  Write a global-align function
  Build off of the 6.1 lab
  Incorporate affine gaps
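Alignment Graph as 3 Levels
  How can we emulate this path in the 3-level graph?
  [Figure: path through the three-level alignment graph, edges labeled ε and σ]

$$\mathrm{lower}_{i,j} = \max \begin{cases} \mathrm{lower}_{i-1,j} - \epsilon \\ \mathrm{middle}_{i-1,j} - \sigma \end{cases}$$

$$\mathrm{upper}_{i,j} = \max \begin{cases} \mathrm{upper}_{i,j-1} - \epsilon \\ \mathrm{middle}_{i,j-1} - \sigma \end{cases}$$

$$\mathrm{middle}_{i,j} = \max \begin{cases} \mathrm{lower}_{i,j} \\ \mathrm{middle}_{i-1,j-1} + \mathrm{score}(v_i, w_j) \\ \mathrm{upper}_{i,j} \end{cases}$$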
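The three recurrences above can be sketched in Python as a score-only computation. This is a minimal sketch, not the code from the talk: the function name affine_gap_score is hypothetical, σ (sigma) and ε (epsilon) are assumed to be positive penalties for opening and extending a gap, and score(v_i, w_j) is assumed to be a simple match/mismatch score.

```python
NEG_INF = float("-inf")

def affine_gap_score(v, w, sigma, epsilon, match=1, mismatch=-1):
    """Best global alignment score of v and w using three score levels.

    Assumption: sigma and epsilon are positive penalties, so a gap of
    length k costs sigma + (k - 1) * epsilon.
    """
    n, m = len(v), len(w)
    lower = [[NEG_INF] * (m + 1) for _ in range(n + 1)]   # gaps in w
    middle = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # matches/mismatches
    upper = [[NEG_INF] * (m + 1) for _ in range(n + 1)]   # gaps in v
    middle[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0:
                # extend an open gap, or open a new one from the middle level
                lower[i][j] = max(lower[i - 1][j] - epsilon,
                                  middle[i - 1][j] - sigma)
            if j > 0:
                upper[i][j] = max(upper[i][j - 1] - epsilon,
                                  middle[i][j - 1] - sigma)
            diag = NEG_INF
            if i > 0 and j > 0:
                s = match if v[i - 1] == w[j - 1] else mismatch
                diag = middle[i - 1][j - 1] + s
            # closing a gap (returning to the middle level) is free
            middle[i][j] = max(lower[i][j], diag, upper[i][j])
    return middle[n][m]

print(affine_gap_score("GCAT", "GT", sigma=2, epsilon=1))  # -1
```

In the usage line, the best alignment is G--T under GCAT: two matches (+2) and one gap of length two (2 + 1 = 3), for a total of -1.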
Data used
  Toy sequences: repeated motif, interspersed with junk
    GCATGCATGCAT and ACACACGCATACACACGCATACACACGCATACACAC
  UCSC genome: human chromosome 1
    [Figure: Intron | Exon | Intron | Exon | Intron; Exon | Exon | Exon]
gapAlign procedure
  Inputs: string1, string2, gapStart, gapContinue, match, misMatch
  string1 must be at least as long as string2
  Backtrack tables and scoring tables are all constructed simultaneously
  Diagonal route, upper right to lower left
  Alignment is constructed according to the scoring results, following the backtrack table
  Alignment is printed and the score is given
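A procedure of this shape could look roughly like the following sketch. It is an illustration under assumptions, not the gapAlign code from the talk: the name gap_align is hypothetical, gapStart and gapContinue are assumed to be scores that are added (so they are negative or zero), the first character of a gap is charged gapStart and each further character gapContinue, and the traceback starts in the middle table at the final cell. It returns the result instead of printing it and does not enforce the length requirement on string1.

```python
NEG_INF = float("-inf")

def gap_align(string1, string2, gapStart=-1, gapContinue=-0.1,
              match=1, misMatch=-1):
    """Global alignment with affine gaps; returns (score, row1, row2)."""
    n, m = len(string1), len(string2)
    new = lambda fill: [[fill] * (m + 1) for _ in range(n + 1)]
    lower, middle, upper = new(NEG_INF), new(NEG_INF), new(NEG_INF)
    back_lower, back_middle, back_upper = new(None), new(None), new(None)
    middle[0][0] = 0
    # Score and backtrack tables are filled in the same pass.
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0:  # gap in string2: extend it or open a new one
                ext = lower[i - 1][j] + gapContinue
                opn = middle[i - 1][j] + gapStart
                lower[i][j], back_lower[i][j] = (
                    (ext, "lower") if ext > opn else (opn, "middle"))
            if j > 0:  # gap in string1: extend it or open a new one
                ext = upper[i][j - 1] + gapContinue
                opn = middle[i][j - 1] + gapStart
                upper[i][j], back_upper[i][j] = (
                    (ext, "upper") if ext > opn else (opn, "middle"))
            diag = NEG_INF
            if i > 0 and j > 0:
                s = match if string1[i - 1] == string2[j - 1] else misMatch
                diag = middle[i - 1][j - 1] + s
            best = max(diag, lower[i][j], upper[i][j])
            middle[i][j] = best
            if best == diag:
                back_middle[i][j] = "diag"
            elif best == lower[i][j]:
                back_middle[i][j] = "lower"
            else:
                back_middle[i][j] = "upper"
    # Follow the backtrack tables from the final cell to the origin.
    row1, row2 = [], []
    i, j, level = n, m, "middle"
    while i > 0 or j > 0:
        if level == "middle":
            move = back_middle[i][j]
            if move == "diag":
                row1.append(string1[i - 1])
                row2.append(string2[j - 1])
                i, j = i - 1, j - 1
            else:
                level = move  # free jump into a gap level at the same cell
        elif level == "lower":
            row1.append(string1[i - 1])
            row2.append("-")
            level = back_lower[i][j]
            i -= 1
        else:
            row1.append("-")
            row2.append(string2[j - 1])
            level = back_upper[i][j]
            j -= 1
    return middle[n][m], "".join(reversed(row1)), "".join(reversed(row2))

score, row1, row2 = gap_align("ACGCATAC", "GCAT", gapStart=-1, gapContinue=0)
print(row1)
print(row2)
print(score)
```

Keeping one backtrack table per level means the traceback always knows whether it is inside a gap, which is what lets it charge gapStart once per gap rather than once per gap character.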
Results
  gapContinue <= -0.1, all other scores 1/-1
  “Clumping” occurs when gap creation/extension is overly penalized
  Precise values depend on the gap length necessary for motif matching
  Setting gapContinue to 0 fixes this, assuming your string is constrained
  Compared to my 6.1 alignment:
CTCGAGTCTAGAGCATTCGAGTCTAGAAGCATTCGAGTCTAGAAGCATTCGAGTCTAGAAGCATTCGAGTCTAGAA
-------------------------GCAT------------------------GCAT-------------------------GCAT-------------------------GCAT------------------------
-------------------------G---------C---A------T------G----------C---A------T------G---------C---A------T------G----------C--A-------T-----
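The effect of the gap parameters on how gaps are distributed can be checked with a little arithmetic. The sketch below is only an illustration: gap_cost is a hypothetical helper, the numeric values are made up for readability rather than taken from the runs above, and it assumes the first character of each gap is charged gapStart and every further character gapContinue.

```python
def gap_cost(gap_lengths, gapStart, gapContinue):
    """Total score contribution of a set of gaps under affine scoring.

    Assumption: the first character of each gap costs gapStart and every
    further character in that gap costs gapContinue.
    """
    return sum(gapStart + (k - 1) * gapContinue for k in gap_lengths)

# The same 24 gap characters, spent as one long gap or four short ones.
one_long = gap_cost([24], gapStart=-10, gapContinue=-1)
four_short = gap_cost([6, 6, 6, 6], gapStart=-10, gapContinue=-1)
# With gapContinue set to 0, only the number of gaps matters.
free_extension = gap_cost([24], gapStart=-10, gapContinue=0)
print(one_long, four_short, free_extension)  # -33 -60 -10
```

Under these assumed values, splitting the same number of gap characters across more gaps is always more expensive, and with gapContinue at 0 the length of a gap no longer affects the score at all.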