Presentation on theme: "Sea Urchin Genome. Origin The sea urchins (Class Echinoidea) are one of five extant classes within the phylum Echinodermata. S. purpuratus belongs to."— Presentation transcript:
Sea Urchin Genome
Origin The sea urchins (Class Echinoidea) are one of five extant classes within the phylum Echinodermata. S. purpuratus belongs to a complex of 10 species that arose in the North Pacific during a recent burst of speciation and ecological diversification about Mya (1-3).
Deuterostome The sea urchin genome currently provides the only outgroup within the deuterostomes for comparison with chordate genomes, and thus, for defining deuterostome-specific, chordate-specific, and pan-bilaterian genes.
Bilateria The Bilateria include the Deuterostomia and other superclades: the Ecdysozoa (insects, nematodes and many lesser groups); the assemblage known as the Lophotrochozoa (mollusks, annelids, brachiopods and many other groups); and the basal acoel flatworms
Radially symmetric Adult: Body Plan As in all deuterostomes, the adult sea urchin has mesodermal sheets covering both gut and inner body wall; it has a gut divided into pharynx, esophagus, stomach and intestine; it has a large population of wandering immune effector cells, many macrophage-like but of other forms as well, within its coelom; it has an anus, gonopores, and a mouth with (five) teeth.
Anterior-posterior Axis the expression patterns of posterior hox genes, indicate that the anterior end of the sea urchin is the oral, ventral surface on which it crawls, and the anal or posterior end of the body plan is the uppermost, dorsal surface.
Bilateral Larvae The bilaterally symmetric embryo is more easily understood; its organization is literally transparent. The S. purpuratus embryo/larva, complete three days after fertilization, consists of only about 15 cell types, including neurons, skeletogenic cells, a few muscle cells, gut cells, and some mesenchymal mesoderm cells, all arranged in structures of single-cell thickness.
Metamorphosis On metamorphosis the larva settles down, most larval tissues are resorbed, and the juvenile emerges. Afterwards, the digestive tract is re-grown in a 4 new position defining the adult dorso-ventral axis, mouth down. The pentameral oral structures and the water vascular system of the adult develop within the rudiment before metamorphosis.
Intraspecific Variation Intraspecific variation in the unselected regions of the DNA sequence of this species was measured at 4-5% (compare to human at about 0.5%, let alone inbred strains of mice). Polymorphism posed enormous problems for the sequence assembly, and required a new strategy.
WGS (Whole Genome Shotgun Sequencing) When the WGS reads reached 3x coverage, an initial assembly was created (v 0.1; 9/2003).
Assembling WGS produce enough WGS sequence for 6-fold sequence coverage of the genome In the case of the sea urchin, it was already known that single-copy DNA varied as much as 4-5%
Atlas Genome Assembly
Atlas Genome Assembly Steps in the Atlas Assembly System. (1) Trim off vector and low-quality portions of reads. (2) Count k-mers in WGS reads, saving the overall distribution plus specific counts for k-mers with copy number above a threshold. (3) Align BAC and WGS reads sharing rare k-mers and save overlap edges for high-quality end-to-end alignments. (4) Enrich each BAC read set with overlapping WGS reads and their mates, assemble using Phrap, scaffold and check for consistent assembly. (5) Arrange BACs into contiguous sets (bactigs) and flag problem BACs for closer quality checking. (6) Assemble bactigs in waves designed to limit the number of BACs that are input to Phrap. (7) Treating each bactig scaffold as a unit, rescaffold to produce superbactigs, detecting problem joins and missed merges between bactigs. (8) Link superbactigs together into ultrabactigs based on remaining (single) mate-pair links, fingerprint contigs, markers and synteny with human and mouse genomes. (9) Format chromosome files with contigs separated by strings of Ns representing gaps. Quality-control feedback steps include (10) examining coassembly scores of problem BACs and removing foreign trays of reads; (11) resolving superbactig conflicts by modifying bactigs and possibly flagging BACs for closer checking; and (12) resolving ultrabactig and mapping conflicts in collaboration with research groups that generated FPC and marker information.
Assembling WGS Comparison of the assembled contigs and scaffolds to high quality assemblies of 25 BAC clones allowed for characterization of regions of the genome that appeared difficult to assemble (e.g., regions of dissimilarity between haplotypes) and to develop tools to identify such regions and merge them (e.g., Polyjoin). –simple altering of mismatch parameters was not sufficient
SNPs Once all the sequence was available and assembled, a comparison of all WGS reads within each region of the 3-fold WGS Assembly revealed at least one single nucleotide polymorphism (SNP) per 100 bases, and a comparable frequency of insertion/deletion (indel) changes.
SNPs With an 800-base sequencing read the assembler was faced with an average minimum of 16 mismatches per read, or 1 mismatch every 50 bases, in a region that was correctly assembled.
Final WGS The final WGS assembly including the full 6-fold sequence coverage was submitted to GenBank (accession number AAGJ ; 4/2005).
BAC fingerprint map generation Concurrent with the WGS sequencing effort, the BAC library was fingerprinted using Eco RI at the Genome Sciences Centre in Vancouver, Canada (www.bcgsc.ca/lab/mapping/).
BAC fingerprint map generation The 85,820 fingerprinted clones (16.8 fold clone coverage of an 800Mb genome) were analyzed for shared restriction fragments and found to form 5,788 contigs leaving 17,378 singletons. The generated minimal tiling path of 8,248 clones suggested a genome size of approximately 1,015Mb
Clone-array pooled shotgun sequencing (CAPSS) of BAC clones First application Previously, individual DNA preparations and shotgun libraries were made for each BAC clone in the tiling path
Clone-array pooled shotgun sequencing (CAPSS) of BAC clones the CAPSS strategy, the minimal tiling path clones were separated into groups of 576 non-overlapping clones. Sequenced and assembled with Atlas
Probability of a Base Not Sequenced
High Density Picoliter Reactors a, Genomic DNA is isolated, fragmented, ligated to adapters and separated into single strands (top left). Fragments are bound to beads under conditions that favour one fragment per bead, the beads are captured in the droplets of a PCR- reaction-mixture-in-oil emulsion and PCR amplification occurs within each droplet, resulting in beads each carrying ten million copies of a unique DNA template (top right). The emulsion is broken, the DNA strands are denatured, and beads carrying single-stranded DNA clones are deposited into wells of a fibre-optic slide (bottom right). Smaller beads carrying immobilized enzymes required for pyrophosphate sequencing are deposited into each well (bottom left). b, Microscope photograph of emulsion showing droplets containing a bead and empty droplets. The thin arrow points to a 28-m bead; the thick arrow points to an approximately 100-m droplet. c, Scanning electron micrograph of a portion of a fibre-optic slide, showing fibre-optic cladding and wells before bead deposition.
Sequencing The sequencing instrument consists of the following major subsystems: a fluidic assembly (a), a flow chamber that includes the well-containing fibre-optic slide (b), a CCD camera-based imaging assembly (c), and a computer that provides the necessary user interface and instrument control. Nucleotide incorporation is detected by the associated release of inorganic pyrophosphate and the generation of photons5, 125, 12
Pyrosequencing A PCR-amplified ssDNA template is hybridized to a sequencing primer and incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase and apyrase, and with the substrates adenosine 5´ phosphosulfate (APS) and luciferin. The addition of one of the four deoxynucleotide triphosphates (dNTPs) initiates the second step. DNA polymerase incorporate complementary dNTPs onto the template. This incorporation releases pyrophosphate (PPi) stoichiometrically. ATP sulfurylase quantitatively converts PPi to ATP in the presence of adenosine 5´ phosphosulfate. This ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin that generates visible light in amounts that are proportional to the amount of ATP. The light produced in the luciferase-catalyzed reaction is detected by a charge coupled device (CCD) camera and this can be analyzed in a program. Each light signal is proportional to the number of nucleotides incorporated. Nucleotide degradation by apyrase removes dNTPs from the solution. New nucleotides can be added and a new cycle can start.