1 Use of NGS to identify the causal variant associated with a complex phenotype
J. B. Cole, Animal Improvement Programs Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705-2350, USA (john.cole@ars.usda.gov)

2 Overview
Animal Sciences Group, Wageningen UR Livestock Research, The Netherlands, 29 May 2013
- Why are we sequencing?
- How did we select the animals to sequence?
- What are the steps involved in the process?
- What do you do with the reads once you have them?
- Where are we now?

3 Introduction
- Several studies (Kuhn et al., 2003; Cole et al., 2007; Seidenspinner et al., 2009) have reported QTL on BTA 18 associated with dystocia
- Bioinformatic analysis using SNP data has not identified the causal variant
- Next-generation sequencing (NGS) has recently been used to find causal variants for novel recessive disorders

4 Chromosome 18 is different
- Markers on chromosome 18 have large effects on several traits:
  - Dystocia and stillbirth: sire and daughter calving ease and sire stillbirth
  - Conformation: rump width, stature, strength, and body depth
  - Efficiency: longevity and net merit
- Large calves contribute to reduced lifetimes and decreased profitability

5 Marker effects for dystocia complex
[Figure: marker effects, with ARS-BFGL-NGS-109285 labelled]
Cole et al., 2009 (J. Dairy Sci. 92:2931–2946)

6 Correlations in dystocia complex

7 The QTL also affects gestation length
Maltecca et al. 2011. Animal Genetics, 42:6, 585-591.

8 Overview of the dystocia complex
- The key marker is ARS-BFGL-NGS-109285 (rs109478645) at 57,585,121 bp on BTA18
- Intronic to SIGLEC12 (sialic acid binding Ig-like lectin 12)
- Recent results indicate effects on gestation length (Maltecca et al., 2011) and calf birth weight (Cole et al., unpublished data)

9 This is a gene-rich region
http://useast.ensembl.org/Bos_taurus/Location/View?r=18%3A57583000-57587000
http://www.ncbi.nlm.nih.gov/gene?cmd=Retrieve&dopt=Graphics&list_uids=618463

10 Copy number variants are present
- ARS-BFGL-NGS-109285 is flanked by CNV
  - There's a loss and a gain to the left (8 SNP region)
  - There's a gain to the right (10 SNP region)
- This can result in assembly problems
Hou et al. 2011. Genomic characteristics of cattle copy number variations. BMC Genomics. 12:127. http://www.biomedcentral.com/1471-2164/12/127

11 Where did this problem come from?
http://aipl.arsusda.gov/CF-queries/Bull_Chromosomal_EBV/bull_chromosomal_ebv.cfm?
40,803 daughters

12 What if we look at a different trait?
- Cole et al. (2007) proposed the following mechanism:
  - SIGLEC12 may sequester circulating leptin
  - This increases gestation length
  - Calf birth weight (BW) is higher because of increased gestation length
  - Higher BW is associated with dystocia

13 We don't have birth weight data
- Birth weights are not routinely recorded in the US
- Collaborated with Hermann Swalve's group to develop a selection index prediction of BW PTA
- Performed GWAS and gene set enrichment analysis to search for interesting associations

14 GWAS for birth weight PTA
Cole et al. (2013), unpublished data

15 Are we measuring anything new?
- Identified a SNP intronic to LHX4, which is associated with cow body weight and length (Ren et al., 2010, Mol. Bio. Reprod., 37:417-422)
- Four SNP in the QTL region on BTA 18 had large effects
- Several other SNP with large effects are intronic or adjacent to genes with unknown functions

16 KEGG pathways for birth weight
What does regulation of the actin cytoskeleton have to do with birth weight in cattle? That is, do these results make sense?
Maybe… these pathways may be involved in establishment and maintenance of pregnancy, as well as coordination of growth and development.
Cole et al. (2013), unpublished data

17 Sequencing is becoming very affordable

18 Sequencing successes at AIPL/BFGL
- Simple loss-of-function mutations
  - APAF1: spontaneous abortions in Holstein cattle (Adams et al., 2012)
  - CWC15: early embryonic death in Jersey cattle (Sonstegard et al., 2013)
  - Weaver syndrome: neurological degeneration and death in Brown Swiss cattle (McClure et al., 2013)

19 Original pedigree-based design
[Pedigree diagram; entries listed in transcript order, tree layout not recoverable]
- Bull A (1968) AA, SCE: 8
- Bull B (1962) AA, SCE: 7 (MGS)
- Bull H (1989) Aa, SCE: 14
- Bull I (1994) Aa, SCE: 18
- Bull E (1982) Aa, SCE: 8
- Bull F (1987) Aa, SCE: 15
- Bull C (1975) AA, SCE: 8
- δ = 10
- Bull D (1968) ??, SCE: 7 (MGS)
- Bull E (1974) Aa, SCE: 10 (MGS)

20 Modified pedigree & haplotype design
[Pedigree diagram; entries listed in transcript order, tree layout not recoverable]
- Bull A (1968) AA, SCE: 8
- Bull B (1962) AA, SCE: 7 (MGS)
- Bull H (1989) Aa, SCE: 14
- Bull I (1994) Aa, SCE: 18
- Bull E (1982) Aa, SCE: 8
- Bull F (1987) Aa, SCE: 15
- Bull C (1975) AA, SCE: 8
- δ = 10
- Bull E (1974) Aa, SCE: 10 (MGS)
- These bulls carry the haplotype with the largest, negative effect on SCE:
  - Bull J (2002) Aa, SCE: 6
  - Bull K (2002) Aa, SCE: 15
  - Bull J (2002) aa, SCE: 15
- Couldn't obtain DNA:
  - Bull D (1968) ??, SCE: 7

21 Molecular prep
Sample Collection -> DNA Extraction -> DNA Quality Control -> Library Construction -> Library Quality Control

22 Sample preparation time is substantial
- DNA Extraction: ~12 hours (30 mins)
- DNA QC: ~1-2 hours (1-2 hours)
- Library Construction: 48 hours (12 hours)
- Library QC: ~2-4 hours (1 hour)
- Total: 3-4 days (15.5 hours)
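
The totals on this slide can be checked with a few lines of Python. This is only a sketch: the slide does not label the figures in parentheses, so treating them as hands-on time (versus elapsed time for the first figure) is an assumption, as are the names `steps` and `total`.

```python
# Times in hours as listed on the slide, stored as (low, high) ranges.
# First pair: the main figure (assumed elapsed time).
# Second pair: the figure in parentheses (assumed hands-on time).
steps = {
    "DNA extraction": ((12, 12), (0.5, 0.5)),
    "DNA QC": ((1, 2), (1, 2)),
    "Library construction": ((48, 48), (12, 12)),
    "Library QC": ((2, 4), (1, 1)),
}


def total(index):
    """Sum the low and high ends of one of the two time columns."""
    low = sum(times[index][0] for times in steps.values())
    high = sum(times[index][1] for times in steps.values())
    return low, high


elapsed_low, elapsed_high = total(0)
hands_on_low, hands_on_high = total(1)
print(f"Elapsed: {elapsed_low}-{elapsed_high} hours")
print(f"In parentheses: {hands_on_low}-{hands_on_high} hours")
```

The upper end of the parenthesised column sums to 15.5 hours, matching the slide. The main column sums to 63-66 hours of continuous time, which is consistent with the stated 3-4 days once the work is spread over working days.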

23 DNA quality

24 Library quality

25 Sequencing stage
- Illumina cBot:
  - Preps DNA for sequencing
  - Takes 4-5 hours
  - Must be done 48 hours before
- Illumina HiSeq 2000:
  - Does the sequencing
  - Takes ~10-14 days for 100 x 100
  - Minimal hands-on time

26 Anatomy of a flow cell
- 8 lanes per flow cell
  - 3 columns per lane
    - 96 tiles per column
- Each tile imaged 8 times
  - 1 from upper surface, 1 from lower
- Approximately 300 Gb of sequence per flow cell
http://www.qbi.uq.edu.au/images/genomics/genomics1.jpg
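
To put the flow cell yield in context, here is a rough back-of-the-envelope sketch in Python. The 300 Gb and 8 lanes come from the slide; the bovine genome size of about 2.7 Gb is an assumption not stated in the talk, and the calculation ignores duplicates, low-quality reads, and unmapped reads, so real coverage would be lower.

```python
FLOW_CELL_YIELD_GB = 300   # from the slide: ~300 Gb of sequence per flow cell
LANES_PER_FLOW_CELL = 8    # from the slide
GENOME_SIZE_GB = 2.7       # ASSUMPTION: approximate bovine genome size in Gb


def yield_per_lane_gb(flow_cell_yield_gb=FLOW_CELL_YIELD_GB,
                      lanes=LANES_PER_FLOW_CELL):
    """Average raw yield of a single lane, in gigabases."""
    return flow_cell_yield_gb / lanes


def raw_coverage(yield_gb, genome_size_gb=GENOME_SIZE_GB):
    """Nominal fold coverage: bases sequenced divided by genome size."""
    return yield_gb / genome_size_gb


per_lane = yield_per_lane_gb()
print(f"Yield per lane: {per_lane} Gb")
print(f"Nominal coverage, one animal per lane: {raw_coverage(per_lane):.1f}x")
```

Under these assumptions a lane gives 37.5 Gb, or roughly 14x nominal coverage if one animal is sequenced per lane.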

27 Sequencing by synthesis
https://www.broadinstitute.org/files/shared/illuminavids/sequencingSlides.pdf

28 How many scientists does it take…

29 Flowcell 1: Cluster densities
Cluster densities from current HiSeq run finished 30 April 2013 (unpublished data)

30 Flowcell 2: Cluster densities
Cluster densities from current HiSeq run started 22 May 2013 (unpublished data)

31 The Aftermath
- Total time (sample to sequence):
  - 3 weeks
  - That's assuming nothing went wrong!
  - More realistic: months
- Resulting data
  - Large text files
  - ~300 gigabytes compressed
- Analysis
  - Often underestimated
  - Can take months as well

32 Variant detection
Raw Sequencer Output -> Alignment to the Genome -> Variant Detection
- Alignment against a reference genome
- Analysis is very disk I/O-intensive.
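
The three stages on this slide could be scripted roughly as follows. The talk does not say which tools were used, so the choice of BWA, samtools, and bcftools here is an assumption, and the file names and the `build_commands` helper are hypothetical. The sketch only builds the shell commands and prints them; it does not run anything.

```python
import shlex


def build_commands(reference, reads_1, reads_2, sample):
    """Return shell commands for one paired-end sample, in run order.

    ASSUMPTION: BWA, samtools and bcftools are installed and the
    reference has already been indexed with `bwa index`.
    """
    ref, r1, r2 = (shlex.quote(p) for p in (reference, reads_1, reads_2))
    bam = shlex.quote(f"{sample}.sorted.bam")
    vcf = shlex.quote(f"{sample}.vcf.gz")
    return [
        # Raw sequencer output -> alignment to the genome, sorted by coordinate
        f"bwa mem {ref} {r1} {r2} | samtools sort -o {bam} -",
        # Index the sorted alignments so they can be accessed by region
        f"samtools index {bam}",
        # Alignment -> variant detection (variant sites only, compressed VCF)
        f"bcftools mpileup -f {ref} {bam} | bcftools call -mv -Oz -o {vcf}",
    ]


# Hypothetical file names, for illustration only.
commands = build_commands("bosTau.fa", "bull_A_R1.fastq.gz",
                          "bull_A_R2.fastq.gz", "bull_A")
for command in commands:
    print(command)
```

Piping the aligner straight into the sorter avoids writing an uncompressed intermediate file, which matters given how disk I/O-intensive the analysis is.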

33 Computational Logistics
- Desktop computers
  - Viable for single lanes
  - Long computation time
- Servers are better
  - >100 GB RAM and >16 processor cores
- Cloud
  - Amazon Web Services
  - iAnimal/iPlant

34 Storage considerations
- What to save?
  - Raw data?
  - Processed results?
- How much workspace?
- Suggestions:
  - Workspace: 10x the size of the compressed files
  - Save alignments
  - Back up REGULARLY!!!
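
As a quick worked example of the 10x rule of thumb, the sketch below sizes the workspace for the ~300 gigabytes of compressed data mentioned earlier in the talk. The function name `workspace_gb` is hypothetical, and applying the multiplier to a whole run at once is an assumption.

```python
def workspace_gb(compressed_gb, multiplier=10):
    """Suggested scratch space: a multiple of the compressed data size."""
    return compressed_gb * multiplier


needed = workspace_gb(300)  # ~300 GB compressed, as quoted in the talk
print(f"Suggested workspace: {needed} GB (about {needed / 1000:g} TB)")
```

So a single run of that size would call for about 3 TB of working space, before any backups are counted.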

35 Why should you use a pipeline?
- Automates analysis
- Maximizes resource consumption
- Because post-docs aren't cheap

36 Many options for analysis pipelines
- Galaxy server
- NextGene
- Custom pipeline
  - Scripting languages
  - Open-source tools

37 Challenges
- Annotation
  - This is a mess in the cow
  - The reference assembly may not be representative of all taurine cows
- Validation
  - Doing functional genomics with large mammals is expensive: who pays?
  - When have we proven something?

38 Conclusions
- Sequencing is powerful, but presents many challenges
- Computational requirements are substantial
- We're learning how much we don't know about functional genomics in the cow
- Validation remains a problem

39 Acknowledgments
- AIPL: Derek Bickhart, Dan Null, Paul VanRaden
- BFGL: Reuben Anderson, Steve Schroeder, Tad Sonstegard, Curt Van Tassell

40 Questions?
http://gigaom.com/2012/05/31/t-mobile-pits-its-math-against-verizons-the-loser-common-sense/shutterstock_76826245/

