Spinner – a scaffolding tool
Spinner uses mate pair data to scaffold contigs. Contigs, and pairs of contigs connected by mate pairs, define a bi-directional graph. Using the expected insert size, an estimate of the gap size can be given for each pair of linked contigs.
ftp://ftp.sanger.ac.uk/pub/users/zn1/spinner/
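The gap estimate described above can be sketched as follows. This is a minimal illustration, not Spinner's actual implementation: the function name `estimate_gaps`, the link tuple layout, and the use of a median over all pairs linking two contigs are assumptions made for the example. It also assumes both contigs are in forward orientation, with the first read starting at `start_a` on contig A and its mate ending at `end_b` on contig B.

```python
from collections import defaultdict
from statistics import median


def estimate_gaps(contig_lengths, links, insert_size):
    """Estimate the gap for each linked contig pair (hypothetical helper).

    contig_lengths: dict mapping contig name -> length in bp
    links: iterable of (contig_a, start_a, contig_b, end_b), where start_a is
           the leftmost base of the read on contig A and end_b is the
           rightmost base of its mate on contig B (assumed layout)
    insert_size: expected insert size of the library in bp
    """
    per_edge = defaultdict(list)
    for a, start_a, b, end_b in links:
        # Part of the insert that lies on contig A, from the read start to A's end.
        tail_on_a = contig_lengths[a] - start_a
        # Whatever the insert does not spend on A or B is attributed to the gap.
        per_edge[(a, b)].append(insert_size - tail_on_a - end_b)
    # One estimate per graph edge; the median is an assumed robust summary.
    return {edge: median(gaps) for edge, gaps in per_edge.items()}


if __name__ == "__main__":
    lengths = {"c1": 1000, "c2": 800}
    pairs = [("c1", 700, "c2", 150), ("c1", 800, "c2", 260), ("c1", 900, "c2", 340)]
    print(estimate_gaps(lengths, pairs, insert_size=500))
```

A negative estimate under this model would suggest that the two contigs overlap rather than being separated by a gap.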
Spinner – walks through a loop
These techniques alone produce useful results. Further stages will resolve repeats using pairs that “jump over” repeats and graph flow concepts.
Bamboo Genome: Size Estimation
$$G_s = \frac{K_n - K_s}{D} = 1.97 \times 10^{9}$$
where $K_n = 80.5 \times 10^{9}$ is the total number of k-mer words, $K_s = 9.5 \times 10^{9}$ is the number of single-copy k-mer words, and $D = 36$ is the depth of k-mer occurrence.
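The arithmetic on this slide can be checked with a few lines of Python. The function name `genome_size` is hypothetical; the formula and the three input values are taken directly from the slide.

```python
def genome_size(total_kmers, single_copy_kmers, depth):
    """Genome size estimate G_s = (K_n - K_s) / D, as given on the slide."""
    return (total_kmers - single_copy_kmers) / depth


if __name__ == "__main__":
    g_s = genome_size(total_kmers=80.5e9, single_copy_kmers=9.5e9, depth=36)
    print(f"Estimated genome size: {g_s / 1e9:.2f} Gb")
```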
Bamboo Genome Assembly
Solexa reads:
  Number of read pairs:       877 million
  Finished genome size:       2.0 Gb
  Read length:                2x100 bp
  Estimated read coverage:    ~90X
  Insert size:                500/50-600 bp
  Mate pair data:             3k, 5k, 7k, 8k, 10k, 20k
  Number of reads clustered:  757 million
Assembly features (stats):
                              Contigs      Scaffolds
  Total number of contigs     744,286      277,278
  Total bases of contigs      1.86 Gb      2.05 Gb
  N50 contig size             11,622       328,698
  Largest contig              188,163      4,869,017
  Averaged contig size        2,500        7,400
  Contig coverage on genome   ~90%         >95%
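Two of the statistics on this slide are easy to reproduce. The sketch below shows one common definition of N50 (the contig length at which the cumulative length of contigs, sorted longest first, reaches half the total) and a rough coverage check from the read counts. The helper names `n50` and `read_coverage` are hypothetical, and the slide does not say how its ~90X figure was derived, so the coverage formula here is an assumption.

```python
def n50(lengths):
    """Return N50: the length of the contig at which the running total,
    taken over contigs sorted longest first, reaches half of all bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must not be empty")


def read_coverage(read_pairs, read_length, genome_size):
    """Approximate fold coverage: total sequenced bases / genome size."""
    return read_pairs * 2 * read_length / genome_size


if __name__ == "__main__":
    # Toy contig lengths, not the bamboo assembly.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
    # Values from the slide: 877 million pairs, 2x100 bp reads, 2.0 Gb genome.
    print(round(read_coverage(877e6, 100, 2.0e9), 1))
```

With the slide's numbers the coverage comes out at roughly 88X, in line with the stated ~90X.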
Bamboo Genome Assembly QC using Finished BACs
                                             Assemblies by       Assemblies by
                                             pure SOAPdenovo     SOAPdenovo & ABySS
  Rate of single-base difference (# per Kb)  2.28                0.43
  Rate of insertion and deletion (# per Kb)  0.82                0.19
  Coverage by initial contigs                0.76                0.85
  Coverage by supercontigs                   0.91                0.94
Acknowledgements:
Joe Henson, German Tischler, Andrew Whitwham
Chinese Academy of Agricultural Sciences: Jizeng Jia, Guangyue Zhao
National Gene Research Centre, Chinese Academy of Sciences: Han Bin, Hengyun Lu