Cross_genome: Assembly Scaffolding using Cross-species Synteny Zemin Ning High Performance Assembly.


1 Cross_genome: Assembly Scaffolding using Cross-species Synteny Zemin Ning High Performance Assembly

2 Can synteny help? And how? Contig gap closure; Scaffolding

3 RACA - Reference-assisted chromosome assembly

4 Lattice of Target - Reference: Q = scaff(i) * 2^32 + contig_loci(j). [Figure labels: Target sequence, Reference, Scaffold 1, Scaffold 2, Scaffold 3]
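The key on this slide can be sketched in Python. This is a minimal illustration, not the Cross_genome implementation: it reads the slide's multiplier as 2^32 (an assumption), treats `scaff(i)` as an integer scaffold index and `contig_loci(j)` as a base offset below 2^32, and uses the hypothetical names `pack_key` and `unpack_key`.

```python
SHIFT = 32                  # assumption: the slide's multiplier is 2**32
MASK = (1 << SHIFT) - 1     # largest locus that fits in the low 32 bits


def pack_key(scaffold_index: int, contig_locus: int) -> int:
    """Combine a scaffold index and a locus into one integer key Q."""
    if scaffold_index < 0:
        raise ValueError("scaffold_index must be non-negative")
    if not 0 <= contig_locus <= MASK:
        raise ValueError("contig_locus must fit in 32 bits")
    return scaffold_index * (1 << SHIFT) + contig_locus


def unpack_key(q: int) -> tuple[int, int]:
    """Recover (scaffold_index, contig_locus) from a packed key."""
    return q >> SHIFT, q & MASK


# Sorting packed keys orders hits by scaffold first, then by locus.
hits = [(2, 100), (1, 5000), (1, 20)]
ordered = [unpack_key(q) for q in sorted(pack_key(s, p) for s, p in hits)]
```

Packing both coordinates into one integer means a single sort places every hit in scaffold order and then in position order, which is one plausible reason for a key of this shape.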

5 After Noise Cleaning: Gap_size = Y - X. [Figure labels: Target sequence, Reference, Scaffold 1, Scaffold 2, Scaffold 3, X, Y]
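The relation Gap_size = Y - X can be illustrated with a short Python sketch. It assumes that X is the reference coordinate where one scaffold's alignment ends and Y is the coordinate where the next scaffold's alignment begins; the slide does not define them explicitly. The `RefHit` type and `gap_sizes` function are hypothetical names used only for this example.

```python
from typing import NamedTuple


class RefHit(NamedTuple):
    """Span of one target scaffold on the reference (hypothetical record)."""
    scaffold: str
    ref_start: int
    ref_end: int


def gap_sizes(hits: list[RefHit]) -> list[tuple[str, str, int]]:
    """Return (left, right, Y - X) for scaffolds adjacent on the reference."""
    ordered = sorted(hits, key=lambda h: h.ref_start)
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        x = left.ref_end      # assumed meaning of X
        y = right.ref_start   # assumed meaning of Y
        gaps.append((left.scaffold, right.scaffold, y - x))
    return gaps


hits = [
    RefHit("scaffold_2", 1500, 2400),
    RefHit("scaffold_1", 0, 1000),
    RefHit("scaffold_3", 2350, 3000),
]
result = gap_sizes(hits)
```

A negative value here means the two spans overlap on the reference; how such cases should be handled is not stated on the slide.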

6 Cases That Shouldn't Join. [Figure labels: Reference, Target, Scaffold 1, Scaffold 2, Gap_size]

7 GAGE: Human Chr14 and RACA using Orangutan
Columns: Assembler, N_bases, N_scaffs, N50 (Mb)
Allpaths-LG
  Original: 88.841881.6
  RACA: 86.8
  Cross_genome: 8922185.5
Bambus2
  Original: 78.614720.37
  RACA: 72.1
  Cross_genome: 78.6109413.7
CABOG
  Original: 86.54980.4
  RACA: 81.4
  Cross_genome: 86.34685.5
MSR-CA
  Original: 89.710940.88
  RACA: 83.4
  Cross_genome: 89.613.7
SGA
  Original: 94.7309750.075
  RACA: 57.4
  Cross_genome: 94.82966277.3
SOAPdenovo
  Original: 108384770.453
  RACA: 84.4
  Cross_genome: 102.81295578.9
Velvet
  Original: 143.8614550.84
  RACA: 123
  Cross_genome: 139.432788.71

8 Scaffold N50 for Other Genome Assemblies
Panda: Original 1.3 Mb; Cross_genome 25 Mb; References: Dog, Human
Tibetan Antelope: Original 2.6 Mb; Cross_genome 42 Mb; References: Cattle, Dog, Human
Tasmanian Devil: Original 1.8 Mb; Cross_genome 6.8 Mb; References: Opossum
Availability: ftp://ftp.sanger.ac.uk/pub/users/zn1/merge/cross_genome/

9 Improve gorilla assembly using human reference. [Diagram labels: Gorilla Assembly; Human Reference; Read Alignment (Pair-wise/Multiple); Combined Gorilla-Human Assembly; Contig Merge/Break; Variation correction; Contig gap size re-estimation; Final Gorilla Assembly]

10 Re-estimate Contig Gap Sizes from Reference. [Figure labels: Target sequence, Reference sequence, Gap size, New gap size, Read alignment and variation correction, Ref seq inserted]

11 Contig Consensus using Gap5: Target (query) aligned against Reference (Before)

12 Reference Sequence Replacement & Variation Correction: Target (query) aligned against Reference

13 Variations: 2 indels (4bp and 1bp) corrected

14 Original Contig (query) against New Assembly after Contig Break

15 Alignment Inconsistency

16 Original Contig (query) against New Assembly after Contig Break

17 Alignment Inconsistency

18 The Gorilla Assemblies
Total number of contigs: Original 464,875; New 285,139
N50 contig size: Original 11.7 kb; New 23.9 kb
Largest contig: Original 191,556; New 322,733
Averaged contig size: Original 6085; New 9928
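For readers unfamiliar with the statistics on this slide, the sketch below computes them from a list of contig lengths. It uses the common definition of N50: the length L such that contigs of length L or longer contain at least half of all assembled bases. The slide does not say which tool produced its figures, so the function names `n50` and `summarize` are hypothetical.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that contigs >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one contig length")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise AssertionError("unreachable: running total always reaches half")


def summarize(lengths: list[int]) -> dict:
    """Return the four statistics reported on the slide."""
    return {
        "count": len(lengths),
        "n50": n50(lengths),
        "largest": max(lengths),
        "mean": sum(lengths) / len(lengths),
    }


stats = summarize([10, 20, 30, 40])
```

Because the mean is total bases divided by contig count, merging contigs raises both the mean and the N50 while the total stays roughly constant, which matches the direction of change between the Original and New columns.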

19 Acknowledgements: Hannes Ponstingl; Frank Liu, Nanjing University of Information Technology (NUIT); Yan Li (NUIT); Gorilla genome sequencing data; BGI: Panda and Tibetan Antelope assemblies

