1
Massively Parallel Solutions for Molecular Sequence Analysis
Bertil Schmidt, School of Computer Engineering, Nanyang Technological University, Singapore
Heiko Schröder, School of Computer Science and Information Technology, RMIT University, Melbourne, Australia
Manfred Schimmler, Institut für Datentechnik und Kommunikationsnetze, TU Braunschweig, Germany
2
Contents
- Motivation
- Smith-Waterman Algorithm
- Parallelization on the Hybrid Architecture
- Parallelization on the Fuzion 150
- Performance Evaluation
- Conclusion and Future Work
3
Motivation
- Genetic sequence databases are growing exponentially
- The growth rate will continue, since multiple concurrent genome projects have begun, with more to come
4
- Discovered sequences are analyzed by comparison with databases
- Complexity of sequence comparison is proportional to the product of query size and database size
- Analysis is too slow on sequential computers
- Two possible approaches:
  - Heuristics, e.g. BLAST, FastA: the more efficient the heuristic, the worse the quality of the results
  - Parallel processing: high-quality results in reasonable time
5
Full Genome Comparison
- Related organisms, but tuberculosis causes a disease: find common and different parts
- 16 × 10^6 pairwise sequence comparisons
- Many genome-genome comparisons will be required in the near future
Data sets compared: 3918 protein sequences (1,329,298 amino acids) and 4289 protein sequences (1,359,008 amino acids)
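The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that every sequence of one data set is compared with every sequence of the other; the variable names are illustrative only.

```python
# Sizes of the two protein data sets quoted on the slide.
seqs_a, residues_a = 3918, 1_329_298
seqs_b, residues_b = 4289, 1_359_008

# Assumption: every sequence of one set is compared with every sequence
# of the other set.
pairwise_comparisons = seqs_a * seqs_b

# Each comparison fills a DP matrix of (length_a x length_b) cells.
# Summed over all pairs, this equals the product of the total residue counts.
dp_cells = residues_a * residues_b

print(f"{pairwise_comparisons:,} pairwise comparisons")
print(f"{dp_cells:.2e} DP matrix cells in total")
```

The first number is what the slide rounds to 16 × 10^6; the second shows why the quadratic cost per comparison dominates the run time.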
6
Protein Sequence Alignment
- BLAST, FastA, Smith-Waterman

GGHSRLILSQLGEEG.RLLAIDRDPQAIAVAKT....IDDPRFSII
|||::::| : |::| ||:::||||:|:|||::    ::| |::::
GGHAERFL.E.GLPGLRLIGLDRDPTALDVARSRLVRFAD.RLTLV

[Figure: BLAST, FastA and Smith-Waterman compared by search speed (slower to faster) and data quality (lower to higher)]
7
Smith-Waterman Algorithm
- Optimal local alignment of two sequences
- Performs an exhaustive search for the optimal local alignment
  - Complexity O(n·m) for sequence lengths n and m
- Based on the dynamic programming (DP) algorithm
  - Fill the DP matrix using a substitution (mutation) matrix
  - Find the maximal value (score) in the matrix
  - Trace back from the score until a 0 value is reached
8
Smith-Waterman Algorithm
- Aligning S1 and S2 of lengths n and m using recurrences
- Calculate three possible ways to extend the alignment:
  - by one amino acid (AA) in each sequence
  - by one AA in the first sequence, aligned with a gap in the second
  - by one AA in the second sequence, aligned with a gap in the first
9
Smith-Waterman Algorithm
Align S1 = ATCTCGTATGATG and S2 = GTCTATCAC (α = 1, β = 1)

S2\S1   -   A   T   C   T   C   G   T   A   T   G   A   T   G
-       0   0   0   0   0   0   0   0   0   0   0   0   0   0
G       0   0   0   0   0   0   2   1   0   0   2   1   0   2
T       0   0   2   1   2   1   1   4   3   2   1   1   3   2
C       0   0   1   4   3   4   3   3   3   2   1   0   2   2
T       0   0   2   3   6   5   4   5   4   5   4   3   2   1
A       0   2   2   2   5   5   4   4   7   6   5   6   5   4
T       0   1   4   3   4   4   4   6   5   9   8   7   8   7
C       0   0   3   6   5   6   5   5   5   8   8   7   7   7
A       0   2   2   5   5   5   5   4   7   7   7  10   9   8
C       0   1   1   4   4   7   6   5   6   6   6   9   9   8

Traceback path from the maximum score: 10, 8, 9, 7, 5, 3, 4, 2, 0
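The three steps named earlier (fill the matrix, find the maximum, trace back) can be sketched as follows. The scoring scheme is an assumption, since the slides do not state it: +2 for a match, -1 for a mismatch and a linear gap penalty of 1. These values were chosen because they reproduce the maximum score of 10 for the two example sequences. The function name `smith_waterman` is illustrative.

```python
def smith_waterman(s1, s2, match=2, mismatch=-1, gap=1):
    """Return (best score, aligned part of s1, aligned part of s2).

    Scoring values are assumptions for this sketch (linear gap penalty).
    """
    n, m = len(s1), len(s2)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)

    def score(a, b):
        return match if a == b else mismatch

    # Steps 1 and 2: fill the DP matrix and remember the maximal value.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]),  # extend both
                H[i - 1][j] - gap,   # s1 character against a gap in s2
                H[i][j - 1] - gap,   # s2 character against a gap in s1
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Step 3: trace back from the maximum until a 0 value is reached.
    i, j = best_pos
    out1, out2 = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]):
            out1.append(s1[i - 1])
            out2.append(s2[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            out1.append(s1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(s2[j - 1])
            j -= 1
    return best, "".join(reversed(out1)), "".join(reversed(out2))


result = smith_waterman("ATCTCGTATGATG", "GTCTATCAC")
print(*result)  # 10 TCGTATGA TC-TATCA
```

A production implementation would take a substitution matrix such as BLOSUM instead of the two-valued `score` helper; the sketch keeps the simplest scheme that matches the example.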
10
Parallel Architectures for Bioinformatics
- Embedded massively parallel accelerators
  - Systola 1024: PC add-on board with 1024 processors (ISATEC, Germany)
  - Fuzion 150: 1536 processors on a single chip (Clearspeed Technology, UK)
11
Parallel Architectures for Bioinformatics
- Hybrid Computer
  - combines the SIMD and MIMD paradigms within a parallel architecture
  - supercomputer performance at low cost
[Figure: Hybrid Computer with Systola 1024 boards and a high-speed Myrinet switch]
12
Previous Applications
- Scientific Computing
- Volume Visualization
- Automatic Visual Quality Control
- Cryptography
- Computer Tomography
- Video Compression
- Range of Transforms (Fourier, Wavelet, Hough, Radon)
- Computer Graphics
13
Architecture of Systola 1024
- Instruction Systolic Array:
  - 32 × 32 mesh of processing elements
  - wavefront instruction execution
14
Instruction Systolic Array
- Wavefront instruction execution
  - fast accumulation operations (e.g. row sum, broadcast, ringshift)
[Figure: array of processing elements with row selectors, column selectors and an instruction stream of +, - and * operations]
15
Parallelization of Smith-Waterman
- Matrix cells along a single diagonal are computed in parallel
- Comparison is performed in A+B−1 steps on A PEs
[Figure: DP matrix for ATCTCGTATGATG (length A) and GTCTATCAC (length B), with the columns assigned to processing elements P1, P2, ..., P13 and the cells of one diagonal highlighted]
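The diagonal-wise schedule can be illustrated with a sequential sketch that visits the matrix one anti-diagonal at a time. All cells on one anti-diagonal depend only on the two previous diagonals, so the inner loop is what A processing elements would execute simultaneously. The scoring values (+2, -1, linear gap 1) and the function name are assumptions for illustration, not taken from the slides.

```python
def sw_score_wavefront(query, subject, match=2, mismatch=-1, gap=1):
    """Smith-Waterman score computed anti-diagonal by anti-diagonal.

    Returns (best score, number of wavefront steps). Scoring values are
    assumptions for this sketch.
    """
    a, b = len(query), len(subject)
    H = [[0] * (b + 1) for _ in range(a + 1)]
    best = 0
    steps = 0
    # Anti-diagonal d holds all cells (i, j) with i + j == d.
    for d in range(2, a + b + 1):
        # On a parallel machine, each i below would be handled by its own PE
        # in the same time step; here the loop simply runs sequentially.
        for i in range(max(1, d - b), min(a, d - 1) + 1):
            j = d - i
            sub = match if query[i - 1] == subject[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + sub,
                H[i - 1][j] - gap,
                H[i][j - 1] - gap,
            )
            best = max(best, H[i][j])
        steps += 1
    return best, steps


best, steps = sw_score_wavefront("ATCTCGTATGATG", "GTCTATCAC")
print(best, steps)  # 10 21
```

For the example, A = 13 and B = 9, so the loop runs A + B − 1 = 21 wavefront steps.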
16
Mapping onto Systola 1024
[Figure: layout of the query sequence a0 ... a1023 on the PE array, with subject sequences b (bk ... b1 b0) and c (... c1 c0) entering the array]
- a: query sequence (length equal to 1024); b: subject sequence
- Subject sequences can be pipelined with only 1 step delay
  - k steps for a subject sequence of length k
- Efficient routing on the ISA: row ringshift and broadcast
17
Performance Evaluation
- Scan times in seconds for TrEMBL 14 (351,834 protein sequences) for various query sequence lengths

Query sequence length                          256   512  1024  2048  4096
Systola 1024, scan time (s)                    294   577  1137  2241  4611
Systola 1024, speedup to PIII 850                5     6     6     6     6
Cluster of 16 Systolas, scan time (s)           20    38    73   142   290
Cluster of 16 Systolas, speedup to PIII 850     81    86    91    94    94

- Parallel implementation scales linearly with sequence length and number of PCs
- Computing time dominates data transfer time
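The claim that the implementation scales with the number of PCs can be checked against the scan times above. The sketch below computes the parallel efficiency of the 16-board cluster relative to a single board; the list names are illustrative.

```python
query_lengths = [256, 512, 1024, 2048, 4096]
single_board_s = [294, 577, 1137, 2241, 4611]  # Systola 1024 scan times
cluster16_s = [20, 38, 73, 142, 290]           # cluster of 16 Systolas
boards = 16

# Efficiency = (single-board time) / (number of boards * cluster time).
# A value of 1.0 would mean perfectly linear scaling.
efficiency = [
    t1 / (boards * t16) for t1, t16 in zip(single_board_s, cluster16_s)
]

for length, eff in zip(query_lengths, efficiency):
    print(f"query length {length:5d}: efficiency {eff:.2f}")
```

With the slide's figures, the efficiency stays above 0.9 for every query length and approaches 1.0 for the longest queries.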
18
Fuzion 150 Architecture
- 0.25-µm, single-chip SIMD architecture
- 1536 PEs @ 200 MHz: 300 GOPS
- 600 GB/s on-chip, 6.4 GB/s off-chip bandwidth
- Multithreading (control units interact via semaphores)
- Developed by Clearspeed Technology (UK) for graphics and network processing
[Figure: block diagram with linear SIMD array (1536 PEs, each with 2 Kbytes DRAM), FUZION bus, 32-bit EPU (ARC), video I/O, display, instruction fetch, SIMD controller, local memory with 1, 2 or 4 channels (6.4 GB/s), host, AGP, Rambus]
19
Fuzion 150 Architecture
[Figure: Block 0 to Block 5, each with PE (n,0), PE (n,1), ..., PE (n,255), connected to the Fuzion bus and local memory. PE detail: 8-bit ALU, 32-byte register file, 2 KByte DRAM PE memory, instructions, block I/O channel, links to the left and right PE]
20
Mapping onto the Fuzion 150
[Figure: query sequence a0 ... a1535 distributed over Block 0 to Block 5, with subject sequences b (bk ... b1 b0) and c (... c1 c0) entering the array]
- a: query sequence (length equal to 1536); b: subject sequence
- No fast global communication: 2-step local communication
- Subject sequence can be pipelined with only step delay
21
Mapping onto the Fuzion 150
- Reduce communication time
  - Assign 16 AAs to each PE: query lengths up to 24576 AAs can be processed within a single pass
- Partitioning for query lengths < 24576:
  - each subarray of corresponding size computes the alignment of the same query sequence with different subject sequences
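The partitioning rule can be made concrete with a small calculation. The sketch assumes that a subarray may start at any PE boundary and that leftover PEs stay idle; the real mapping may be restricted to block boundaries. The function name `partition` is hypothetical.

```python
PES = 1536        # processing elements on the Fuzion 150
AAS_PER_PE = 16   # amino acids of the query assigned to each PE


def partition(query_len):
    """Return (PEs per subarray, number of subarrays) for a query length.

    Assumption for this sketch: subarrays can be placed freely along the
    linear array, and PEs that do not fill a whole subarray stay idle.
    """
    capacity = PES * AAS_PER_PE
    if not 0 < query_len <= capacity:
        raise ValueError(f"query length must be in 1..{capacity}")
    pes_per_subarray = -(-query_len // AAS_PER_PE)  # ceiling division
    subarrays = PES // pes_per_subarray
    return pes_per_subarray, subarrays


for length in (256, 4096, 24576):
    print(length, partition(length))
```

Each subarray holds one copy of the query, so the number of subarrays is the number of subject sequences that can be aligned at the same time.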
22
Performance Evaluation
- Scan times in seconds for TrEMBL 14 (351,834 protein sequences) for various query sequence lengths

Query sequence length              256   512  1024  2048  4096
Fuzion 150, scan time (s)           12    22    42    82   162
Fuzion 150, speedup to PIII 850    136   151   157   163   165

- Parallel implementation scales linearly with sequence length
- Computing time dominates data transfer time
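The linear scaling with query length can be checked by fitting a straight line to the Fuzion 150 scan times. The sketch uses an ordinary least-squares fit written out by hand; the variable names are illustrative.

```python
lengths = [256, 512, 1024, 2048, 4096]
times_s = [12, 22, 42, 82, 162]  # Fuzion 150 scan times from the slide

n = len(lengths)
mean_x = sum(lengths) / n
mean_y = sum(times_s) / n

# Ordinary least-squares fit of time = intercept + slope * length.
slope = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(lengths, times_s)
) / sum((x - mean_x) ** 2 for x in lengths)
intercept = mean_y - slope * mean_x

# Largest deviation of any measurement from the fitted line.
max_residual = max(
    abs(y - (intercept + slope * x)) for x, y in zip(lengths, times_s)
)

print(f"slope: {slope * 256:.2f} s per 256 residues")
print(f"intercept: {intercept:.2f} s")
print(f"largest residual: {max_residual:.2e} s")
```

The five data points lie on a single line, with a small constant offset on top of the length-proportional part.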
23
Performance Evaluation
- Normalized time comparison for a 10 Mbase search on different parallel architectures with different query lengths
- 4 times faster than 16K-PE MasPar
- 6 times faster than Kestrel
- 5 times faster than SAMBA (special-purpose 3-board architecture)
24
Performance Evaluation for Full Genome Comparison
- Scan times for pairwise protein sequence comparison of Mycobacterium tuberculosis and Escherichia coli

System                     Scan time   Speedup to PIII 850
Cluster of Systola 1024    17 min      79
Fuzion 150                 11 min      133

- Comparison has to be performed for several parameters (substitution matrices, gap penalties)
- Mycobacterium smegmatis will be published later this year
- Results of the comparison will be interpreted with the Centre for Molecular Cell Biology, NUS, Singapore
25
Conclusions and Future Work
- Demonstrated how fine-grained parallel architectures can be applied efficiently for comparative genomics
- Significant runtime savings for genome comparisons and database searching: more discovery is possible at a good price-performance ratio
- Other computational biology applications of interest to us:
  - ClustalW
  - HMM
  - pattern matching algorithms, such as inverted repeats, short tandem repeats, etc.
- Availability of accelerators as a special resource in a Grid environment
26
Contents
- Protein Structure
- Protein Structure Prediction
- Approach based on Local Protein Structure
- Refinements
- Conclusions and Future Work
27
Protein Structure
- Proteins are large molecules composed of smaller molecules called amino acids
- There are 20 kinds of amino acids found in natural proteins
- All share a common structure
[Figure: common amino acid structure with alpha carbon (with attached hydrogen), amine group, carboxyl group and side chain R]
28
Protein Structure
29
From Primary to Tertiary Structure
- A protein's 3D shape is determined by its primary amino acid sequence (Anfinsen, 1963)
- Predicting tertiary structure from amino acid sequence is an unsolved problem
  - Difficult to model the energies that stabilize a protein molecule
  - Conformational search space is enormous
30
Prediction Methods
- Given an amino acid sequence:
  - search a set of known folds by aligning the sequence and a template fold representative
  - predict the fold that gets the best scoring alignment
[Figure: target amino acid sequence (YLAADTYK) matched against a template fold library; template amino acid sequence FISSETCNMEPSSYVTGLIRKN; target/template score: 7212]
31
Prediction Methods
- This method is very effective when target and template have >30% sequence identity
- Approximately 1/3 of protein sequences can be assigned folds and modeled this way
- Our aim is to help determine tertiary structures in cases where no matching sequence can be found
32
Local structure and prediction
- What is local structure?
  - describes the environment of an amino acid
  - an amino acid's relationship to its neighbors
- We use this information to predict structure from the primary sequence
33
Dihedral Angles
- The 6 atoms in each peptide unit lie in the same plane, and the units are free to rotate
- The structure of a protein is almost totally determined if all angles φ and ψ are known
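A dihedral angle is defined by four consecutive atoms. The sketch below computes it from Cartesian coordinates; under the usual convention, φ uses the atoms C(i-1), N(i), Cα(i), C(i) and ψ uses N(i), Cα(i), C(i), N(i+1). The function name and the test coordinates are illustrative and not taken from the slides.

```python
import math


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180..180] defined by four 3D points."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0],
        )

    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)

    # Unit vector along the central bond p1 -> p2.
    norm = math.sqrt(dot(b1, b1))
    b1 = tuple(x / norm for x in b1)

    # Components of the outer bonds perpendicular to the central bond.
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))

    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Central bond along the z axis; the last atom is rotated by +90 degrees.
angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 6))  # 90.0
```

Applied to every residue of a PDB structure, this yields the angle samples from which per-pair histograms can be built.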
34
Idea of our Approach
- Stiff vs. free angles
- Local predictability: database of sub-chain structures
- Reducing the number of degrees of freedom by 10 reduces the computation time significantly in combination with a global optimization algorithm (e.g. GA or SA)
[Figure: side chains and backbone, with the C-C and C-N bonds marked]
35
Classification of Dihedral Angles
[Figure: processing pipeline from selected PDB structures, through dihedral angle extraction, to a histogram for each amino acid pair, classified as stiff, multiple or flexible]
36
Classification of Dihedral Angles
[Figure: angle frequency histograms for the pairs ALA-ALA, LEU-ARG and GLY-ILE, illustrating the classes stiff, multiple and flexible]
37
Classification of Dihedral Angles
[Figure: processing pipeline from selected PDB structures, through dihedral angle extraction, to a histogram for each amino acid pair, classified as stiff, multiple or flexible]
- Stiff angles: determine the mean value
- Multiple angles: determine a sequence of mean values, one for each peak, in decreasing order of these peaks
- Flexible angles: determine the mean value and mark as flexible
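One possible way to turn a histogram into one of the three classes is sketched below. The slides do not give the actual criteria, so the bin width, the peak threshold and the use of bin centres in place of per-peak mean values are all assumptions made for illustration; `classify_angles` is a hypothetical name.

```python
def classify_angles(angles_deg, bin_width=10, peak_fraction=0.2):
    """Classify angle samples as 'stiff', 'multiple' or 'flexible'.

    Hypothetical rule: a peak is a histogram bin that is a local maximum
    (on the circular 0..360 axis) and holds at least `peak_fraction` of
    all samples. One peak -> stiff, several -> multiple, none -> flexible.
    Returns (label, peak centres in decreasing order of peak height).
    """
    n_bins = 360 // bin_width
    counts = [0] * n_bins
    for a in angles_deg:
        counts[int((a % 360) // bin_width) % n_bins] += 1

    total = len(angles_deg)
    peaks = []
    for k, c in enumerate(counts):
        left = counts[k - 1]               # wraps around for k == 0
        right = counts[(k + 1) % n_bins]
        if c >= peak_fraction * total and c > left and c >= right:
            peaks.append((c, k))

    peaks.sort(reverse=True)
    centres = [(k + 0.5) * bin_width for _, k in peaks]
    if len(peaks) == 1:
        return "stiff", centres
    if len(peaks) > 1:
        return "multiple", centres
    return "flexible", centres


stiff = classify_angles([-65, -64, -63, -62, -61] * 4)
multiple = classify_angles([-65] * 10 + [125] * 10)
flexible = classify_angles(list(range(-180, 180, 5)))
print(stiff[0], multiple[0], flexible[0])  # stiff multiple flexible
```

In practice the thresholds would have to be tuned against the histograms extracted from the selected PDB structures.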
38
Prediction based on Classification
- Given a sequence of amino acids, find the subsequences in which all angles are of type stiff
- Predict the structure of these subsequences, using the mean values of the corresponding histograms
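Finding the stiff subsequences amounts to scanning the chain for maximal runs of residue pairs whose class is stiff. The sketch below assumes a lookup table keyed by residue pairs; the table contents are made up for illustration and do not reflect real classification results.

```python
def stiff_segments(sequence, pair_class, min_len=2):
    """Return (start, end) index pairs, end exclusive, of maximal
    subsequences in which every consecutive residue pair is 'stiff'.

    `pair_class` maps (residue, next_residue) to a class label; pairs
    missing from the table are treated as not stiff.
    """
    segments = []
    start = 0
    for i in range(len(sequence) - 1):
        if pair_class.get((sequence[i], sequence[i + 1])) != "stiff":
            # The run of stiff pairs ends at residue i.
            if i + 1 - start >= min_len:
                segments.append((start, i + 1))
            start = i + 1
    if len(sequence) - start >= min_len:
        segments.append((start, len(sequence)))
    return segments


# Hypothetical class table, for illustration only.
example_classes = {
    ("ALA", "ALA"): "stiff",
    ("LEU", "ARG"): "multiple",
    ("GLY", "ILE"): "flexible",
}
chain = ["ALA", "ALA", "ALA", "GLY", "ILE", "ALA", "ALA"]
segments = stiff_segments(chain, example_classes)
print(segments)  # [(0, 3), (5, 7)]
```

Each returned segment can then be built from the mean angle values of the corresponding histograms.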
39
Prediction based on Classification
- Part of a protein predicted with this method (backbone of a helix; original structure on the left, predicted structure on the right)
- Successfully predicted certain stiff structures of subsequences up to a length of 15
40
Refinement of the method
- For multiple angles:
  - consider sequences of length 3 or 4: extract sequences (C,A,B,D) and determine the histogram of the angles φ and ψ related to the peptide chain between A and B
  - if the histogram for amino acids (A,B) is multiple, check whether the angle for (C,A,B,D) is stiff
    - with longer subsequences, the number of occurrences of these sequences drops dramatically
41
Refinement of the method
- For multiple angles:
  - if an amino acid sequence has only a small number of multiple edges, it is possible to try all combinations of possible peaks
  - many combinations lead to collisions in part of the protein, and thus can be eliminated
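Trying all peak combinations and discarding colliding ones can be sketched with an exhaustive enumeration. To keep the example self-contained, the chain is a toy 2D model with unit-length bonds and turning angles, which is a strong simplification of a real backbone; the function names and the distance threshold are hypothetical.

```python
import itertools
import math


def build_chain_2d(turn_angles_deg):
    """Toy stand-in for backbone construction: unit steps in the plane,
    turning by the given angle before each step."""
    x = y = heading = 0.0
    points = [(x, y)]
    for turn in turn_angles_deg:
        heading += math.radians(turn)
        x += math.cos(heading)
        y += math.sin(heading)
        points.append((x, y))
    return points


def has_collision(points, min_dist=0.5):
    """True if two non-adjacent chain points come closer than min_dist."""
    for i in range(len(points)):
        for j in range(i + 2, len(points)):
            if math.dist(points[i], points[j]) < min_dist:
                return True
    return False


def feasible_combinations(angle_options):
    """angle_options holds one list of candidate values per angle: a single
    value for a stiff angle, one value per peak for a multiple angle."""
    return [
        combo
        for combo in itertools.product(*angle_options)
        if not has_collision(build_chain_2d(combo))
    ]


# One stiff angle followed by three multiple angles with two peaks each.
options = [[0], [90, -90], [90, -90], [90, -90]]
feasible = feasible_combinations(options)
print(len(feasible), "of", 2 ** 3, "combinations remain")
```

The number of combinations grows exponentially with the number of multiple angles, which is why the approach is limited to sequences with only a few of them.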
42
Conclusion and Future Work
- Presented a method to predict stiff structures of subsequences up to a certain length
- Presented a refinement of the method to handle multiple angles
- How to handle flexible angles?
- Using the local prediction as an input for a global optimization method, e.g. based on Simulated Annealing