Sabegh Singh Virdi ASC Processor Group Computer Science Department

1 Longest Common Subsequence Algorithm on ASC Processors using Coterie Network
Sabegh Singh Virdi ASC Processor Group Computer Science Department Kent State University

2 Presentation Outline
Introduction to String matching and its variations
Role of LCS in Molecular Biology
Overview of LCS
Brief introduction on Coterie Network
Longest Common Subsequence on Coterie Network: Exact match, Approximate match
Summary and Future work

3 String Matching
String matching is one of the most fundamental operations in computing. It compares two linear arrays of characters. Applications include bioinformatics, for example searching genetic databases. The strings involved are, however, enormous, so efficient string processing is a requirement.

4 String Matching Variations
Is an exact match the only solution? What if the pattern does not occur in the text? It still makes sense to find the longest subsequence that occurs in both the pattern and the text. This is the longest common subsequence (LCS) problem. Longest common subsequence, longest common substring, sequence alignment, and edit distance are all variations of the string matching problem.

6 Role of LCS in Molecular Biology
DNA sequences (genes) can be represented as sequences of four letters, A, C, G, and T, corresponding to the four submolecules forming DNA. When biologists find a new sequence, they typically want to know which other sequences it is most similar to. One way of computing how similar (homologous) two sequences are is to find the length of their longest common subsequence.

7 Role of LCS in Molecular Biology
This is a simplification: in the biological setting one would typically take into account not only the length of the LCS but also, for example, how gaps occur when the LCS is embedded in the two original sequences. An obvious measure of the closeness of two strings is the maximum number of identical symbols, preserving symbol order. This, by definition, is the length of the longest common subsequence of the strings.

8 Overview of LCS Algorithm
Given two strings, find the longest subsequence common to both.
Example:
    String 1: AGACTGAGGTA
    String 2: ACTGAG
List of possible alignments:
    AGACTGAGGTA
    --ACTGAG---
    --ACTGA-G--
    A--CTGA-G--
    A--CTGAG---
The time complexity of the standard dynamic-programming algorithm is O(nm), where n and m are the lengths of the two strings.

9 Overview of LCS Algorithm
This running time does not depend on the sequences u and v themselves, only on their lengths. The bottleneck in efficiently parallelizing the LCS problem is calculating the values of the diagonal elements: when a match is found, the value of element (i, j) depends on the previous diagonal element (i-1, j-1).
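One common way to work around this dependency, sketched here as an assumption and not as the method used in this talk, is to fill the table one anti-diagonal at a time. Every cell with the same i + j depends only on cells from earlier anti-diagonals, so the cells of one anti-diagonal could be updated in parallel. The function name `lcs_wavefront` is hypothetical.

```python
def lcs_wavefront(u: str, v: str) -> int:
    """LCS length of u and v, computed anti-diagonal by anti-diagonal."""
    n, m = len(u), len(v)
    # table[i][j] = LCS length of u[:i] and v[:j]; row 0 and column 0 stay 0.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for d in range(2, n + m + 1):  # d = i + j identifies one anti-diagonal
        # Cells on the same anti-diagonal do not depend on each other, so a
        # parallel machine could evaluate this inner loop in a single step.
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            if u[i - 1] == v[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]
```

This sequential version still does O(nm) work; only the order of evaluation changes, leaving n + m - 1 dependent steps.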

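10 Overview of LCS Algorithm
Two strings may have more than one LCS. To choose among the candidates, associate some parameters (scores) with matches, mismatches, and gaps. The Smith-Waterman algorithm builds on the same concept as the LCS algorithm but uses such scores to return an optimal local alignment.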
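A minimal sketch of Smith-Waterman scoring may help show how it relates to the LCS recurrence. The scoring values (match = 2, mismatch = -1, gap = -1), the linear gap penalty, and the function name are assumptions chosen for illustration; the slides do not specify them.

```python
def smith_waterman_score(u: str, v: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local-alignment score of u and v with a linear gap penalty."""
    prev = [0] * (len(v) + 1)  # scores for the previous row of the table
    best = 0
    for a in u:
        curr = [0]  # column 0 is always 0
        for j, b in enumerate(v, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            curr.append(max(0, diag, up, left))
            best = max(best, curr[-1])
        prev = curr
    return best
```

Compared with the LCS recurrence, the differences are the weighted scores and the clamp at zero.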

11 Overview of LCS Algorithm
LCS table for the text AGACTGAGGTA (columns) and the pattern ACTGAG (rows):
       A  G  A  C  T  G  A  G  G  T  A
    A  1  1  1  1  1  1  1  1  1  1  1
    C  1  1  1  2  2  2  2  2  2  2  2
    T  1  1  1  2  3  3  3  3  3  3  3
    G  1  2  2  2  3  4  4  4  4  4  4
    A  1  2  3  3  3  4  5  5  5  5  5
    G  1  2  3  3  3  4  5  6  6  6  6
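The table on this slide can be reproduced with the standard dynamic-programming recurrence. The following is a minimal sequential sketch; the function name `lcs_table` is hypothetical, and the code is not the ASC implementation.

```python
def lcs_table(pattern: str, text: str) -> list[list[int]]:
    """Return the LCS table with one row per pattern character."""
    rows, cols = len(pattern), len(text)
    # An extra row and column of zeros stand for the empty prefixes.
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if pattern[i - 1] == text[j - 1]:
                # A match extends the LCS of the two shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Drop the zero border so the result lines up with the slide.
    return [row[1:] for row in table[1:]]


table = lcs_table("ACTGAG", "AGACTGAGGTA")
for ch, row in zip("ACTGAG", table):
    print(ch, *row)
```

The bottom-right entry is the LCS length, here 6, because the whole pattern occurs in the text.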

12 Communication between PEs
In a 2D mesh network, communication among the PEs themselves takes place in two different ways: through the nearest-neighbor mesh interconnection network, or through a powerful variation on the nearest-neighbor mesh called the "Coterie network". The Coterie network was developed in response to the requirement for nonlocal communication, and its properties are significantly different from those of the usual mesh.

14 Coteries [Weems & Herbordt]
Coterie: "A small often selected group of persons who associate with one another frequently."
Features: related to other reconfigurable broadcast networks; describable using hypergraphs; dynamic in nature.
Advantages: propagation of information quickly over long distances at electrical speed; support for one-to-many communication within a coterie; reconfigurability of the coteries.

15 PEs form Coteries
Figure: 5 x 5 coterie network with switches shown in "arbitrary" settings. Shaded areas denote coteries (sets of PEs sharing the same circuit).

16 Coterie’s Physical Structure
In the physical implementation, each PE controls a set of switches. Four of these switches control access in the different directions (N, S, E, W). Two switches, H and V, are used to emulate horizontal and vertical buses. The last two switches, NE and NW, are used to create eight-way connected regions.
Figure: switch layout of a PE, labeled N, NE, NW, V, E, W, H, WS, ES, S.
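To make the idea of a coterie concrete, here is a simplified sketch that groups PEs into coteries from their switch settings. It models only the four directional switches and assumes two neighboring PEs share a circuit when both facing switches are closed; the H, V, NE, and NW switches are ignored. The `PE` class and the `coteries` function are hypothetical names, and the connection rule is an assumption, not a description of the actual hardware.

```python
from dataclasses import dataclass


@dataclass
class PE:
    # True means the switch toward that neighbor is closed (connected).
    n: bool = False
    s: bool = False
    e: bool = False
    w: bool = False


def coteries(grid: list[list[PE]]) -> list[set[tuple[int, int]]]:
    """Group PE coordinates into sets that share one circuit."""
    rows, cols = len(grid), len(grid[0])
    parent = {(r, c): (r, c) for r in range(rows) for c in range(cols)}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for r in range(rows):
        for c in range(cols):
            # East link: this PE's E switch and the neighbor's W switch.
            if c + 1 < cols and grid[r][c].e and grid[r][c + 1].w:
                parent[find((r, c))] = find((r, c + 1))
            # South link: this PE's S switch and the neighbor's N switch.
            if r + 1 < rows and grid[r][c].s and grid[r + 1][c].n:
                parent[find((r, c))] = find((r + 1, c))

    groups: dict[tuple[int, int], set[tuple[int, int]]] = {}
    for cell in parent:
        groups.setdefault(find(cell), set()).add(cell)
    return list(groups.values())


# 2 x 2 example: only the two PEs in the top row are connected.
grid = [[PE(e=True), PE(w=True)], [PE(), PE()]]
groups = coteries(grid)
```

Because the groups are recomputed from the switch settings, changing a single switch changes the coteries, which reflects the dynamic nature described earlier.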

18 LCS Algorithm on Coterie Network
A G A C T G A G G T A

19 LCS Algorithm on Coterie Network
A G A C T G A G G T A

20 LCS Algorithm on Coterie Network
    A G A C T G A G G T A
    A G A C T G A G G T A
    A G A C T G A G G T A
    A G A C T G A G G T A
    A G A C T G A G G T A
    A G A C T G A G G T A
Contents of each PE after the MULTICAST operation.

21 LCS Algorithm on Coterie Network

22 LCS Algorithm on Coterie Network

23 LCS Algorithm on Coterie Network
Contents of each PE after the MULTICAST operation.

24 LCS Algorithm on Coterie Network
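Match matrix for the text AGACTGAGGTA (columns) and the pattern ACTGAG (rows); 1 marks a PE whose text and pattern characters are equal:
       A  G  A  C  T  G  A  G  G  T  A
    A  1  .  1  .  .  .  1  .  .  .  1
    C  .  .  .  1  .  .  .  .  .  .  .
    T  .  .  .  .  1  .  .  .  .  1  .
    G  .  1  .  .  .  1  .  1  1  .  .
    A  1  .  1  .  .  .  1  .  .  .  1
    G  .  1  .  .  .  1  .  1  1  .  .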
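The 1 entries on this slide are consistent with a per-PE comparison: assuming PE (i, j) holds pattern character i and, after the MULTICAST, text character j, it records 1 when the two are equal. The sketch below simulates that comparison sequentially; `match_matrix` is a hypothetical name, and on the PE array every comparison would happen in the same step.

```python
def match_matrix(pattern: str, text: str) -> list[list[int]]:
    """Entry [i][j] is 1 when pattern[i] equals text[j], else 0."""
    return [[1 if p == t else 0 for t in text] for p in pattern]


matrix = match_matrix("ACTGAG", "AGACTGAGGTA")
for p, row in zip("ACTGAG", matrix):
    # Print "." for non-matching PEs so the matches stand out.
    print(p, " ".join("1" if x else "." for x in row))
```

For these strings the matrix contains 19 ones, the same count as on the slide.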

25 LCS Algorithm on Coterie Network
Figure: PE array with the text A G A C T G A G G T A along the top and the pattern down the side, with 1s marking PEs. Caption: Inject unique token.
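One plausible reading of the token step for exact matching, stated here as an assumption because the transcript does not spell it out: a token injected at a matching PE in the first row travels diagonally through matching PEs, and the pattern occurs in the text when a token reaches the last row. The sketch below simulates that check sequentially; `exact_match_offsets` is a hypothetical name, and the constant-time behavior claimed for the network is not reproduced here.

```python
def exact_match_offsets(pattern: str, text: str) -> list[int]:
    """Text offsets at which the whole pattern occurs."""
    offsets = []
    for start in range(len(text) - len(pattern) + 1):
        # Follow the diagonal (row i, column start + i) down from row 0,
        # continuing only while the PE on the diagonal holds a match.
        row = 0
        while row < len(pattern) and pattern[row] == text[start + row]:
            row += 1
        if row == len(pattern):  # the token reached the last row
            offsets.append(start)
    return offsets
```

For the text AGACTGAGGTA, the pattern ACTGAG is found at offset 2 (counting from 0), while ACTAAG yields no exact match, which is the case the approximate method addresses.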

26 LCS Algorithm on Coterie Network
We refine the algorithm to support approximate matching by making use of tokens. The next example demonstrates this for the strings:
Text: AGACTGAGGTA
Pattern: ACTAAG

28 LCS Algorithm on Coterie Network
Figure: PE array with the text A G A C T G A G G T A along the top and the pattern down the side, with 1s marking PEs. Caption: Inject unique token.

29 Token method
In this method, we explicitly close the W-S switch based on a condition. We inject unique token symbols, as shown in the next slide. Where two of these symbols intersect within a PE, we close that PE's W-S switch. This yields a path from the first row to the last row.

30 LCS Algorithm on Coterie Network
Figure: PE array with the text A G A C T G A G G T A along the top and the pattern down the side, with 1s marking PEs. Caption: Inject unique token.

32 Summary and Future work
We have presented two variations of the LCS algorithm. We have explored a new network for this problem. The exact-match algorithm runs in constant time. The running time of the approximate-match algorithm depends on the diameter of the network.

33 Summary and Future work
Optimize the algorithm for approximate match. Implement the algorithm on an FPGA model. Incorporate the don't-care symbol. Extend the idea to support sequence alignment. Conserve memory by using an encoding scheme. Use virtual simulation of PEs in case we run out of PEs.
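The slides do not say which encoding scheme is intended. As one illustration only, a DNA sequence over A, C, G, and T can be packed at two bits per base. The code assignment and the names `pack` and `unpack` below are assumptions.

```python
# Assumed code assignment: A=00, C=01, G=10, T=11.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> int:
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover a DNA string of the given length from its packed form."""
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 3])  # lowest two bits hold the last base
        value >>= 2
    return "".join(reversed(bases))
```

The length must be stored separately, because leading A bases pack to zero bits. A don't-care symbol would not fit in two bits and would need a wider code.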

34 Acknowledgements
Professor Walker, Professor Baker, Professor Weems, Professor Herbordt, Professor Piontkivska
Committee members for their time
Kevin Schaffer, Hong Wang, Shannon Steinfadt, Jalpesh Chitalia, and Michael Scherger

35 THANK YOU

36 Questions?

