SOAP3-dp Workflow.

1 SOAP3-dp Workflow

2 SOAP3-dp workflow for paired-end alignment
Step 1: Use SOAP3 (2-mismatch) to align the paired-end reads.
  Paired-end reads -> paired alignments, e.g. chr 6, +4,059, -4,369; …
Step 2: For reads with one end mapped but the other not, use Default-DP to align the unmapped ends.
  Input: one end's alignments (e.g. chr 9, +49,538; …) plus the unmapped ends.
  The mapped region of one end (chr 9, +, 49,538) determines the candidate region for the unmapped end; use DP to align the unmapped end there.
  Output: paired alignments, e.g. chr 9, +49,538, -49,829; …
Step 3: For reads with both ends unaligned, use SOAP3 (1-mismatch) to align the seeds and then use Deep-DP to align both ends.
  Seed alignments of first end: chr 18, +349,683; …
  Seed alignments of second end: chr 18, -349,998; …
  Pair up the seed alignments: chr 18, +349,683, -349,998; …
  The paired seed alignments (349,683 + and 349,998 -) mark a candidate region on chr 18; use DP to align both ends there.
  Output: paired alignments, e.g. chr 18, +349,664, -349,923; …

3 Step 1: SOAP3
SOAP3 (2-mismatch) sorts each read pair into one of five cases:
  Both ends can be mapped and paired properly: report the alignments.
  Only one end can be mapped, with not too many hits (i.e. <= 30): store the read ID (of the aligned end) and hits in ARRAY A.
  Only one end can be mapped, with too many hits (i.e. > 30): store the read ID (of the aligned end) and hits in ARRAY B.
  Neither end can be mapped: store the read ID (of the first read of the pair) in ARRAY C.
  Both ends can be mapped but not paired properly: store the read ID and hits in ARRAY A or B (described on the next slide).
A read pair is paired properly if:
  Both ends are mapped within the insert size (i.e. the range of distances between the two ends supplied by the user).
  The ends are in proper orientation (for Illumina reads, the end aligned on the left is on the forward strand, while the end aligned on the right is on the reverse strand).
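The case split on this slide can be sketched in a few lines of Python. This is a minimal illustration, not SOAP3-dp's code: the `Hit` record, the function names, and the way the insert span is measured (outer distance from the left end's start to the right end's finish) are all assumptions made for the example. Only the threshold of 30 hits and the orientation rule come from the slide.

```python
from dataclasses import dataclass

MAX_HITS = 30  # slide: "too many hits" means more than 30


@dataclass(frozen=True)
class Hit:
    chrom: str
    pos: int     # leftmost reference coordinate (hypothetical convention)
    strand: str  # "+" or "-"


def properly_paired(a, b, min_insert, max_insert, read_len):
    """Left end on the forward strand, right end on the reverse strand,
    and the outer span inside the user-supplied insert-size range."""
    if a.chrom != b.chrom:
        return False
    left, right = (a, b) if a.pos <= b.pos else (b, a)
    if left.strand != "+" or right.strand != "-":
        return False
    span = right.pos + read_len - left.pos  # assumed definition of insert size
    return min_insert <= span <= max_insert


def classify_pair(hits1, hits2, min_insert, max_insert, read_len):
    """Return 'report', 'A', 'B', 'C', or 'unpaired' (handled on the next slide)."""
    if not hits1 and not hits2:
        return "C"
    if hits1 and hits2:
        paired = any(
            properly_paired(a, b, min_insert, max_insert, read_len)
            for a in hits1
            for b in hits2
        )
        return "report" if paired else "unpaired"
    mapped = hits1 or hits2  # exactly one end has hits
    return "A" if len(mapped) <= MAX_HITS else "B"
```

With the coordinates from the overview slide (chr 6, +4,059 and -4,369), a read length of 100 and an insert range of 200 to 500, `classify_pair` returns `"report"`.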

4 Step 1: SOAP3 -- Both ends can be mapped but not paired properly
Read 1 and read 2 are both mapped (YES, YES) but not paired properly.
Let x = # of all valid hits of read 1.
Let y = # of all valid hits of read 2.
If x > 30, retain only the best hits of read 1 and reset x = # of best hits of read 1.
If y > 30, retain only the best hits of read 2 and reset y = # of best hits of read 2.

Case              read 1   read 2   Stored in
a) x, y <= 30     YES      NO       ARRAY A
                  NO       YES      ARRAY A
b) x <= 30 < y    YES      NO       ARRAY A
c) y <= 30 < x    NO       YES      ARRAY A
d) 30 < x < y     YES      NO       ARRAY B
e) 30 < y <= x    NO       YES      ARRAY B

Store the read ID and hits of the end marked YES in ARRAY A or B.
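The routing rules for this case translate directly into a small function. The sketch below is illustrative only: the function name and the tuple layout are hypothetical, and it reads case (a) as storing both ends in ARRAY A, once with each end treated as the mapped one, which is one interpretation of the two YES/NO combinations shown for that case.

```python
MAX_HITS = 30  # threshold from the slide


def route_unpaired(hits1, hits2, best1, best2):
    """Route a both-mapped-but-unpaired read pair.

    hits1/hits2: all valid hits of read 1 / read 2.
    best1/best2: the best hits only, used when an end has more than 30 hits.
    Returns a list of (array, anchor_read, anchor_hits) entries.
    """
    if len(hits1) > MAX_HITS:
        hits1 = best1  # keep only the best hits and reset x
    if len(hits2) > MAX_HITS:
        hits2 = best2  # keep only the best hits and reset y
    x, y = len(hits1), len(hits2)

    if x <= MAX_HITS and y <= MAX_HITS:   # case a (assumed: both ends stored)
        return [("A", 1, hits1), ("A", 2, hits2)]
    if x <= MAX_HITS < y:                 # case b
        return [("A", 1, hits1)]
    if y <= MAX_HITS < x:                 # case c
        return [("A", 2, hits2)]
    if x < y:                             # case d: 30 < x < y
        return [("B", 1, hits1)]
    return [("B", 2, hits2)]              # case e: 30 < y <= x
```

In every case the end that is kept as the anchor is the one with fewer hits, so the later DP step has fewer candidate regions to examine.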

5 Step 2 and step 3: default DP and new default DP
Array A -> Default DP:
  Both ends can be mapped and paired properly: report the alignments.
  Otherwise: store the read ID of the first read of the pair in ARRAY C.
Array B -> New default DP:
  Both ends can be mapped and paired properly: report the alignments.
  Otherwise: store the read ID of the first read of the pair in ARRAY C.

6 Detailed picture of Default DP and New Default DP
Default-DP: for reads with one end mapped but the other not, AND the number of hits is not too many, use Default-DP to align the unmapped ends.
  Input: one end's alignments (chr 9, …) plus the unmapped ends.
  The mapped region of one end (chr 9, +, 49538) determines the candidate region for the unmapped end; use DP to align the unmapped end there.
  Output: paired alignments (chr 9, …).
New-Default-DP: for reads with one end mapped but the other not, AND the number of hits is too many, use New-Default-DP to align the unmapped ends.
  Input: one end's alignments (chr 18, …) plus the unmapped ends.
  SOAP3 (1-mismatch) aligns seeds of the unmapped ends, giving the seed alignments of the unmapped end (chr 18, …).
  Pair up the seed alignments with the alignments of the other end (chr 18, …).
  The paired hits (349683 + and 349998 -) give the mapped region of one end and a candidate region on chr 18; use DP to align the unmapped end there.
  Output: paired alignments (chr 18, …).
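To make the "candidate region for the unmapped end" concrete, here is a minimal sketch of how such a region could be derived from the mapped end and the user-supplied insert-size range. The function name, the half-open interval convention, and the outer-distance definition of insert size are assumptions for illustration; the slides do not specify how SOAP3-dp computes the region.

```python
def candidate_region(mapped_pos, mapped_strand, read_len,
                     min_insert, max_insert, chrom_len):
    """Return a half-open (start, end) reference interval for the mate.

    mapped_pos is the leftmost coordinate of the mapped end. The insert
    size is taken as the outer span of the fragment (an assumption).
    """
    if mapped_strand == "+":
        # Mapped end is the left end: the mate lies downstream.
        start = mapped_pos + min_insert - read_len
        end = mapped_pos + max_insert
    else:
        # Mapped end is the right end: the mate lies upstream.
        frag_end = mapped_pos + read_len
        start = frag_end - max_insert
        end = frag_end - min_insert + read_len
    # Clamp to the chromosome and never return a negative-length interval.
    start = max(0, start)
    end = max(start, min(chrom_len, end))
    return start, end
```

For a forward-strand end at 49538 with 100-base reads and an insert range of 300 to 500, the region is (49738, 50038), which contains the mate position 49,829 shown on the overview slide.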

7 Step 4: 2-level Deep DP
Input: ARRAY C.
ROUND 1 SEEDING for both ends
  Seed length: 26
  Sample rate: 1/13
  Max # of hits allowed: 100
  If there exist pairs of hits within the insert size: perform DP for those pairs of hits.
  If (1) there exists a seed with too many hits AND (2) there are no pairs of hits within the insert size: go to round 2.
ROUND 2 SEEDING for both ends
  Seed length: 30
  Sample rate: 1/15
  Max # of hits allowed: 1000
  If there exist pairs of hits within the insert size: perform DP for those pairs of hits.
Case 1: Valid paired alignments found: report the alignments.
Case 2: No valid paired alignment found: store the read ID of both ends in ARRAY D.
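The two seeding rounds differ only in their parameters, so they can be captured in one small record. In the sketch below, the names are hypothetical and "sample rate 1/13" is read as "one seed starting every 13 bases", which is an assumption; the slide does not define the term. The round-2 trigger follows the two conditions stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedRound:
    seed_len: int
    step: int      # assumed meaning of sample rate 1/step: one seed every `step` bases
    max_hits: int  # seeds with more hits than this count as "too many hits"


ROUND1 = SeedRound(seed_len=26, step=13, max_hits=100)
ROUND2 = SeedRound(seed_len=30, step=15, max_hits=1000)


def extract_seeds(read, rnd):
    """Return (offset, seed) pairs sampled along the read."""
    last_start = len(read) - rnd.seed_len
    return [(i, read[i:i + rnd.seed_len])
            for i in range(0, last_start + 1, rnd.step)]


def needs_round2(seed_hit_counts, pairs_within_insert):
    """Round 2 runs only if some round-1 seed had too many hits AND
    no pair of hits fell within the insert size."""
    too_many = any(c > ROUND1.max_hits for c in seed_hit_counts)
    return too_many and pairs_within_insert == 0
```

Under this reading, a 100-base read yields six overlapping 26-base seeds in round 1 and five 30-base seeds in round 2; the longer seeds are more specific, which fits the higher hit limit allowed in the second round.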

8 Step 5: Single DP
Array D -> Single DP:
  The end can be mapped: report the alignments.
  Otherwise: report that the end cannot be aligned.

9 Detailed picture of Single DP
SOAP3 (1-mismatch) aligns the seeds, giving the seed alignments (chr 18, +349,683; …).
Each seed alignment (349,683 +) marks a candidate region on chr 18; Single-DP uses DP to align the read there.
Report the alignments (chr 18, +349,664; …).
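The "use DP to align" step can be illustrated with a small semi-global alignment: the whole read is aligned against any substring of the candidate region. This is a simplified CPU sketch under stated assumptions, not the GPU implementation the slides describe: the scoring values are hypothetical and a linear gap penalty is used for brevity.

```python
def align_in_region(read, region, match=1, mismatch=-2, gap=-3):
    """Semi-global DP: align all of `read` to some substring of `region`.

    Returns (score, start, end), where region[start:end] is the aligned
    reference substring. Scoring parameters are illustrative only.
    """
    n = len(region)
    # prev[j] = (best score, start column) for the read prefix ending at column j.
    # Row 0: the alignment may start at any column of the region for free.
    prev = [(0, j) for j in range(n + 1)]
    for base in read:
        cur = [(prev[0][0] + gap, 0)]  # read base aligned before the region starts
        for j in range(1, n + 1):
            s = match if base == region[j - 1] else mismatch
            diag = (prev[j - 1][0] + s, prev[j - 1][1])   # match or mismatch
            up = (prev[j][0] + gap, prev[j][1])           # extra base in the read
            left = (cur[j - 1][0] + gap, cur[j - 1][1])   # skipped reference base
            cur.append(max(diag, up, left, key=lambda t: t[0]))
        prev = cur
    # The alignment may end at any column of the region for free.
    end = max(range(n + 1), key=lambda j: prev[j][0])
    score, start = prev[end]
    return score, start, end
```

The reported reference position is the region's own start coordinate plus `start`, which is how a seed hit at one position can lead to a full-read alignment at a slightly different one.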

10 Paired-end alignment (overall workflow)
1. Load 6M reads (3M pairs).
2. SOAP3 (2-mismatch). Create a new CPU thread to load the next 6M reads.
3. New default DP.
4. Default DP.
5. 2-level deep DP.
6. Single DP.
7. More reads to process? Yes: continue with the next 6M reads. No: END.
Note: New-default DP needs the 2BWT in the GPU, while default DP does not. New-default DP therefore runs before default DP, because after SOAP3 the 2BWT index is already in the GPU.
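The "create a new CPU thread to load the next 6M reads" step is a standard prefetching pattern: while one batch is being aligned, a worker thread reads the next one. The sketch below shows the pattern with Python's standard library; `run`, `batches`, and the `align_batch` callback are hypothetical names, and the alignment stages themselves are left to the caller.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

BATCH = 6_000_000  # reads per batch on the slide (3M pairs)


def batches(reads, size):
    """Yield consecutive lists of up to `size` reads."""
    it = iter(reads)
    while chunk := list(islice(it, size)):
        yield chunk


def run(reads, align_batch, size=BATCH):
    """Align batch by batch while a loader thread fetches the next batch."""
    results = []
    gen = batches(reads, size)
    with ThreadPoolExecutor(max_workers=1) as loader:
        pending = loader.submit(next, gen, None)  # load the first batch
        while (batch := pending.result()) is not None:
            # Start loading the next batch before aligning the current one.
            pending = loader.submit(next, gen, None)
            results.extend(align_batch(batch))
    return results
```

A single loader thread is enough here because only one batch is ever loaded ahead; the generator is never advanced by two threads at the same time.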

11 SOAP3 Architecture
Host (CPU) memory-resident data structures: 2BWT + SA.
Device (GPU) memory-resident data structures: 2BWT.
Execution (split between host and device, repeated batch after batch):
  Process 1M reads for round 1 and round 2 alignments.
  Process round 3 alignment and report results.
Notes: hard reads may be processed in multiple GPU rounds; there are many cases.

12 DP with seeding
Host (CPU) memory-resident data structures: 2BWT + SA.
Device (GPU) memory-resident data structures: 2BWT / DP tables.
Execution:
  Copy the 2BWT index to the GPU and extract seeds of the reads in Array C.
  SOAP3 (1-mismatch): process 1M seeds for round 1 and round 2 alignments, then process round 3 alignment; repeat for each further batch of 1M seeds.
  Pair up the seed alignments, clear the 2BWT index in the GPU, and create DP tables in the GPU.
  Perform DP between the reads and the candidate regions.
Notes: hard reads may be processed in multiple GPU rounds; there are many cases.

13 Default DP
Host (CPU) memory-resident data structures: 2BWT + SA.
Device (GPU) memory-resident data structures: DP tables.
Execution:
  Create DP tables in the GPU.
  Perform DP between the reads and the candidate regions.
Notes: hard reads may be processed in multiple GPU rounds; there are many cases.

14 Single-end alignment (overall workflow)
1. Load 6M single-end reads.
2. SOAP3 (2-mismatch). Create a new CPU thread to load the next 6M reads.
3. Single DP.
4. More reads to process? Yes: continue with the next 6M reads. No: END.

15 Paired-end alignment (For read length > 150)
1. Load 6M reads (3M pairs).
2. 2-level deep DP. Create a new CPU thread to load the next 6M reads.
3. Single DP.
4. More reads to process? Yes: continue with the next 6M reads. No: END.
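The three overall workflows in this deck (paired-end, single-end, and paired-end with read length > 150) differ only in which stages run and in what order. A minimal sketch of that selection follows; the function name and the stage labels as strings are illustrative, and the slides do not say what happens for single-end reads longer than 150, so that case simply follows the single-end workflow here.

```python
LONG_READ = 150  # paired-end reads longer than this skip the SOAP3 stage


def stages(paired, read_len):
    """Return the ordered list of stages run on each batch of reads."""
    if not paired:
        return ["SOAP3 (2-mismatch)", "single DP"]
    if read_len > LONG_READ:
        return ["2-level deep DP", "single DP"]
    return [
        "SOAP3 (2-mismatch)",
        "new default DP",   # runs first: the 2BWT index is still in the GPU
        "default DP",
        "2-level deep DP",
        "single DP",
    ]
```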

