Delivery – Plant Science PBMASS (Pedigree Based Marker Assisted Selection System) David Rodgers.

1 Delivery – Plant Science PBMASS (Pedigree Based Marker Assisted Selection System) David Rodgers

2 PBMASS (Pedigree Based Marker Assisted Selection System)
Developed by DPI&F in conjunction with GRDC projects
AWCMMP Project ET8 II
Pedigree-based genome mapping for marker assisted selection and recurrent parent recovery in wheat and barley
Emma Mace, Phillip Banks, David Butler, Bert Collard, Mandy Christopher, Ian DeLacy, Mark Dieters, David Douglass, Jerry Franckowiak, David Jordan, Paul McGowan, Julie McKavanagh, Greg Platz, David Poulsen, David Rodgers, Tracey Shatte and John Shepherd

3 PBMASS
Desktop tool
  – Not a data repository
Integrate
  – Pedigree
      Parse / generate Purdy-style pedigree strings
      Manage aliases
      Graphical pedigree
  – Marker
      Graphical genotype
      Colour coded for IBD or IBS
      Infer missing data
  – Analysed phenotypic data
      Chart on Year x Site x Trial Type x Genotype
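The slides do not show how PBMASS builds its pedigree strings. As a rough illustration of what generating a Purdy-style string involves, the sketch below assumes a pedigree held as nested (female, male) tuples and uses the usual Purdy separators ("/" for the first cross, "//" for the second, "/3/" and so on for later ones). The function name `purdy` and the tuple representation are hypothetical, and backcross notation is not handled.

```python
from typing import Tuple, Union

# A pedigree node is either a named line or a (female, male) cross of two nodes.
Node = Union[str, Tuple["Node", "Node"]]


def purdy(node: Node) -> str:
    """Return a Purdy-style pedigree string for a nested (female, male) tuple."""

    def walk(n: Node) -> Tuple[str, int]:
        if isinstance(n, str):
            return n, 0  # a named line has cross order 0
        left, left_order = walk(n[0])
        right, right_order = walk(n[1])
        order = max(left_order, right_order) + 1
        if order == 1:
            sep = "/"
        elif order == 2:
            sep = "//"
        else:
            sep = f"/{order}/"
        return f"{left}{sep}{right}", order

    return walk(node)[0]


print(purdy((("A", "B"), "C")))
print(purdy(((("A", "B"), "C"), "D")))
```

Parsing goes the other way: split on the highest-order separator first, then recurse into each side.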

4 PBMASS

5 PBMASS
Molecular marker concepts based on work done by Dr David Jordan, DPI&F principal sorghum breeder
Freely available on request
DPI&F breeding programs
  – Sorghum (224,000 genotypes)
  – Barley (56,400 genotypes)
  – Mungbean (2,200 genotypes)
  – Chickpea (2,800 genotypes)
  – Wheat (18,600 genotypes)
PBMASS has been extremely effective in standardising and correcting genotype names and managing pedigrees in each of these breeding programs

6 PBMASS
Technical
  – C++ GUI
      Borland C++ Builder IDE
  – Microsoft Access backend
  – MS Windows
Survey

7 PBMASS Methodologies

8 PBMASS
IBD vs IBS
Pedigree inference of missing state data
Flanking marker IBD inference
Naming / crossing tool
Data volume handling

9 Identity By Descent (IBD) vs Identity By State (IBS)
IBS: individuals are assigned an identity based on allele size (state)
  – The same state may have resulted from separate mutational events
IBD
  – Individuals are assigned an ancestral identity based on a combination of pedigree and IBS
  – Expressed as the ancestral genotype determined to be the source of the allele
  – Recursive process
      A specified number of generations limits the depth of recursion
  – Reliant on density of data
  – More positive association between the marker and the trait
  – Coupling phase linkage

10 IBD
Both parents have the same IBS as the genotype
  – Calculate the IBD for the parents
  – If both parents have the same IBD, assign this IBD to the offspring
  – Otherwise we cannot assign an IBD to the offspring; its IBD is itself
  – Includes the case where the IBS of one parent is unknown
[Diagrams: IBS and IBD values for FP, MP and G]

11 IBD
Genotype has the same IBS as one parent and the parents are different
  – Assign the IBD of the matching parent to the genotype
Genotype does not match either parent
  – IBD genotype = genotype
[Diagrams: IBS and IBD values for FP, MP and G]
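The IBD rules on the two slides above can be sketched as a short recursive function. This is an illustration only, not the PBMASS source. It assumes one allele state per genotype at a single marker, reads FP and MP as female and male parent, and treats an unknown parental IBS as the "cannot assign" case. The `Genotype` class, the `ibd` function and the `max_depth` parameter are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Genotype:
    name: str
    ibs: Optional[str]            # allele state at one marker, None if unknown
    female: Optional[str] = None  # name of FP (assumed female parent), None for a founder
    male: Optional[str] = None    # name of MP (assumed male parent), None for a founder


def ibd(name: str, pedigree: Dict[str, Genotype], max_depth: int) -> Optional[str]:
    """Return the name of the ancestor assigned as the source of the allele."""
    g = pedigree[name]
    if g.ibs is None:
        return None  # nothing to trace without a state
    fp = pedigree.get(g.female) if g.female else None
    mp = pedigree.get(g.male) if g.male else None
    # Founders, or the recursion limit: the genotype is its own IBD.
    if max_depth <= 0 or fp is None or mp is None:
        return name
    # Assumption: an unknown parental IBS means no IBD can be assigned.
    if fp.ibs is None or mp.ibs is None:
        return name
    fp_match = fp.ibs == g.ibs
    mp_match = mp.ibs == g.ibs
    if fp_match and mp_match:
        fp_ibd = ibd(fp.name, pedigree, max_depth - 1)
        mp_ibd = ibd(mp.name, pedigree, max_depth - 1)
        return fp_ibd if fp_ibd == mp_ibd else name
    if fp_match:
        return ibd(fp.name, pedigree, max_depth - 1)
    if mp_match:
        return ibd(mp.name, pedigree, max_depth - 1)
    return name  # matches neither parent


pedigree = {
    "A": Genotype("A", "150"),
    "B": Genotype("B", "152"),
    "D": Genotype("D", "150"),
    "C": Genotype("C", "150", female="A", male="B"),  # matches A only
    "E": Genotype("E", "150", female="C", male="D"),  # matches both, IBDs differ
    "F": Genotype("F", "150", female="C", male="B"),  # matches C, traced back to A
}
print(ibd("C", pedigree, 5), ibd("E", pedigree, 5), ibd("F", pedigree, 5))
```

Lowering `max_depth` stops the trace earlier: with a depth of 1, genotype F is assigned to C rather than A.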

12 PBMASS IBD verification

13 Pedigree Inference
Try to infer genotype G from its parents
  – IBS FP = IBS MP => IBS G = IBS FP
[Diagram: G with parents FP and MP; G crossed with second parents P2, giving offspring O1, O2, O3 ... On for each P2]

14 Pedigree Inference
Progeny != P2 => IBS G = IBS progeny, where progeny != P2
[Diagram: G with parents FP and MP; G crossed with second parents P2, giving offspring O1, O2, O3 ... On for each P2]

15 Pedigree Inference
Progeny all the same as P2 and parents unknown
  – The probability that G = P2 increases with the number of progeny
  – $P(G = P2) = 1 - 0.5^{n}$, where $n$ = number of offspring
[Diagram: G with parents FP and MP; G crossed with second parents P2, giving offspring O1, O2, O3 ... On for each P2]

16 Pedigree Inference
Progeny all the same as P2 and parents differ
  – One parent is the same as P2
  – Calculate for each P2 and select the highest probability
  – $P(G = P2) = 1 - (0.5)(0.5^{n})$, where $n$ = number of offspring, assuming no selection
[Diagram: G with parents FP and MP; G crossed with second parents P2, giving offspring O1, O2, O3, O4 for each P2]
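The four pedigree inference cases above can be combined into one function. The sketch below is a hypothetical arrangement, not the PBMASS implementation. It assumes a single allele state per genotype, applies the rules in slide order, and uses the two probability formulas exactly as given on the slides. The name `infer_ibs` and the `(P2 state, offspring states)` input layout are illustrative. Treating "one parent unknown" the same as "both parents unknown" is also an assumption.

```python
from typing import List, Optional, Tuple

# One cross of G with a second parent: (IBS of P2, IBS of each offspring).
Cross = Tuple[Optional[str], List[Optional[str]]]


def infer_ibs(fp: Optional[str], mp: Optional[str],
              crosses: List[Cross]) -> Tuple[Optional[str], float]:
    """Infer the missing IBS of G; returns (state, probability)."""
    # Slide 13: both parents share a state, so G must carry it.
    if fp is not None and fp == mp:
        return fp, 1.0
    # Slide 14: an offspring that differs from its P2 got that state from G.
    for p2, offspring in crosses:
        if p2 is None:
            continue
        for child in offspring:
            if child is not None and child != p2:
                return child, 1.0
    # Slides 15 and 16: every scored offspring matches its P2.
    parents_known = fp is not None and mp is not None
    best: Tuple[Optional[str], float] = (None, 0.0)
    for p2, offspring in crosses:
        n = sum(1 for child in offspring if child is not None)
        if p2 is None or n == 0:
            continue
        if not parents_known:
            p = 1 - 0.5 ** n          # slide 15
        elif p2 in (fp, mp):
            p = 1 - 0.5 * 0.5 ** n    # slide 16
        else:
            continue                  # P2 state is not present in either parent
        if p > best[1]:
            best = (p2, p)
    return best


print(infer_ibs(None, None, [("b", ["b", "b", "b"])]))
print(infer_ibs("a", "b", [("b", ["b", "b", "b", "b"])]))
```

With three matching offspring and unknown parents the first call gives 0.875. With four matching offspring and one parent equal to P2 the second gives 0.96875.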

17 Virtual Genotypes
Flanking marker inference
  – Generate large amounts of marker data from a small number of real data points.
  – By strategically choosing the markers to be genotyped, good genome coverage can be obtained at a greatly reduced cost.
  – Using a consensus map, virtual genotypes can be created across marker types, e.g. DArT markers can be used to infer SSR markers.

18 Virtual Genotypes
Flanking marker inference
  – NRP IS3614 population
      1262 markers x 46 genotypes -> about 58,000 possible values (1262 x 46 = 58,052)
      Produced 35,000 IBD values from 12,000 real data points
      Increasing the linkage distance (5 cM) and flanking marker distance (40 cM) parameters infers more values at lower confidence
  – MKY/BNS population
      Inferred 81,432 IBD values from 8,256 data points


21 Virtual Genotypes
Flanking marker inference
  – Find the previous informative marker (one that has an IBD)
  – Find the next informative marker
  – If the flanking markers within a specified distance have the same IBD
      Infer the unknown IBD to be the same as that of the flanking markers
  – Or the closest informative marker is within the linkage distance
  – Distance is currently set by the user
      Need to calculate the probability of a cross-over occurring between the unknown and the flanking marker
      If the flanking markers differ for IBD, use the one with the lowest probability

22 Flanking Marker Inference
[Diagrams: an unknown marker between the previous and next informative markers; an unknown marker whose next informative marker is less than the linkage distance away]
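The flanking marker procedure can be sketched for one genotype on one chromosome as follows. This is an illustration under stated assumptions, not the PBMASS code. Markers are `(position in cM, IBD or None)` pairs sorted by position. The "specified distance" is read as the span between the two flanking markers. Physical closeness stands in for the cross-over probability that the slides list as still to be calculated. The defaults of 5 cM and 40 cM are the parameter values quoted for the NRP IS3614 population, and the function name `infer_flanking` is hypothetical.

```python
from typing import List, Optional, Tuple

Marker = Tuple[float, Optional[str]]  # (map position in cM, IBD or None if unknown)


def infer_flanking(markers: List[Marker],
                   linkage_cm: float = 5.0,
                   flank_cm: float = 40.0) -> List[Optional[str]]:
    """Return the IBD per marker, filling unknowns from informative neighbours."""
    result: List[Optional[str]] = []
    for i, (pos, value) in enumerate(markers):
        if value is not None:
            result.append(value)
            continue
        # Only real (non-inferred) values are used as evidence.
        prev = next((m for m in reversed(markers[:i]) if m[1] is not None), None)
        nxt = next((m for m in markers[i + 1:] if m[1] is not None), None)
        inferred = None
        if (prev is not None and nxt is not None
                and prev[1] == nxt[1] and nxt[0] - prev[0] <= flank_cm):
            # Both flanking markers agree and are close enough together.
            inferred = prev[1]
        else:
            # Otherwise fall back to the closest informative marker,
            # provided it lies within the linkage distance.
            near = [(abs(pos - m[0]), m[1]) for m in (prev, nxt) if m is not None]
            near = [c for c in near if c[0] <= linkage_cm]
            if near:
                inferred = min(near, key=lambda c: c[0])[1]
        result.append(inferred)
    return result


chromosome = [(0.0, "A"), (10.0, None), (30.0, "A"),
              (33.0, None), (60.0, "B"), (90.0, None)]
print(infer_flanking(chromosome))
```

Raising either distance fills more gaps at lower confidence, which is the trade-off the slides note for these parameters.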

23 Data Volume Issues
Marker data
  – DArT data files
      Combined netCDF file and relational database
      Transfer from service provider to client
  – csv/text files
  – Currently a single data point per marker x genotype
  – Expecting multiple data points per marker x genotype
MS Excel
  – 2003: 65,536 rows by 256 columns
  – 2007: 1,048,576 rows by 16,384 columns
Pedigree data
  – Thousands of crosses generated every year
  – Current Sorghum PBMASS: 222,000 genotypes

24 NetCDF (Network Common Data Form)
ICIT project
  – 1 million marker data points -> 3 million by end of project, plus phenotypic data
UCAR www.unidata.ucar.edu
  – University Corporation for Atmospheric Research
Fast array-based data access
Combination of relational database and netCDF
  – Use the relational database to index into the netCDF array

25 NetCDF
Relational database
  – Unknown number of plates
  – Unknown number of markers applied to each well
  – Score, quality and type for each well x marker
NetCDF file
  – Wells: unlimited
  – Markers: unlimited
  – Score, quality, type
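The idea of using a relational database to index into an array store can be sketched with the Python standard library alone. In this illustration an in-memory SQLite database holds the well and marker metadata, and a plain nested list stands in for the netCDF score variable, so no netCDF library is called. The table names, column names, sample rows and the `scores_for` function are all hypothetical.

```python
import sqlite3


def build_index() -> sqlite3.Connection:
    """Create the relational side: one row per well and one row per marker."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE well (idx INTEGER PRIMARY KEY, plate TEXT,
                           position TEXT, genotype TEXT);
        CREATE TABLE marker (idx INTEGER PRIMARY KEY, name TEXT UNIQUE);
    """)
    db.executemany("INSERT INTO well VALUES (?, ?, ?, ?)", [
        (0, "plate1", "A1", "G001"),
        (1, "plate1", "A2", "G002"),
        (2, "plate2", "A1", "G001"),  # same genotype scored on a second plate
    ])
    db.executemany("INSERT INTO marker VALUES (?, ?)", [(0, "m1"), (1, "m2")])
    return db


# Stand-in for the array file: SCORE[well index][marker index].
SCORE = [
    [1, 0],
    [0, 0],
    [1, 1],
]


def scores_for(db: sqlite3.Connection, genotype: str, marker_name: str) -> list:
    """Look up array indices in the database, then read the array directly."""
    rows = db.execute(
        "SELECT w.idx, m.idx FROM well AS w, marker AS m "
        "WHERE w.genotype = ? AND m.name = ? ORDER BY w.idx",
        (genotype, marker_name),
    ).fetchall()
    return [SCORE[well_idx][marker_idx] for well_idx, marker_idx in rows]


db = build_index()
print(scores_for(db, "G001", "m2"))
```

Because a genotype can appear in more than one well, a single query returns several data points per marker x genotype, which is the case the data volume slide anticipates.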

26 Current Development
Algorithm optimizations to enhance performance
QTL overlay
Verification of existing pedigrees
Prediction of possible corrections to pedigree errors
Marker confidence level calculations
Crossing tool

27 Database Systems Overview

28 Naming/Crossing Tool
The key to reducing nomenclature errors
Standardised naming convention
Maintain traceability
Automated recording of
  – Filial generation
  – Cross information
  – Location
  – Source
Eliminate human intervention
  – Typos
  – Excelisms
  – Intentional name mangling

29 Naming/Crossing Tool
Manage cross information
  – Year, filial generation, breeding method, origin, location, program etc.
Generate genotype names internally
  – Fully configurable naming format(s)
      Combinations of database fields and text
  – Filial generation, location, cross number, origin, year etc.
      E.g. {C}YY.NNN>FFSS -> C07.005>F301
Automatic source tracking
  – Seed packet and/or plot/pot
Generate diallel and factorial crosses
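A configurable name format of this kind can be approximated with an ordinary format template. The sketch below reproduces the example name C07.005>F301 under some assumptions: YY is the two-digit year, NNN the cross number, FF the letter F plus the filial generation, and SS a two-digit sequence number whose meaning the slide does not state. The template syntax and the name `genotype_name` are hypothetical, not the tool's own format language.

```python
# Literal text and fields are mixed in one template, as on the slide.
NAME_TEMPLATE = "C{year:02d}.{cross:03d}>F{generation}{sequence:02d}"


def genotype_name(year: int, cross: int, generation: int, sequence: int,
                  template: str = NAME_TEMPLATE) -> str:
    """Build a genotype name from cross fields using a format template."""
    return template.format(
        year=year % 100,        # YY: two-digit year
        cross=cross,            # NNN: cross number within the year
        generation=generation,  # FF: filial generation, e.g. F3
        sequence=sequence,      # SS: assumed to be a sequence number
    )


print(genotype_name(2007, 5, 3, 1))
```

Generating names from stored fields, rather than typing them, is what removes the typo and spreadsheet-reformatting errors listed on the previous slide.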

30 SeedManagement
Barcoded seed inventory system
  – Weight and location of seed
  – Store user-defined data for each barcode
  – Powerful query manager
Links to crossing tool and PBMASS
  – Generate barcoded labels for
      Existing genotypes
      New crosses: update status of cross when weight is recorded

31 References
Jordan, D.R. 1999. Application of molecular markers to Australian sorghum breeding programs.
Jordan, D.R., Tao, Y., Godwin, I.D., Henzell, R.G., Cooper, M. & McIntyre, C.L. 1999. Marker based pedigree analysis in grain sorghum I. Comparison of identity by descent and identity by state for detecting genetic regions under selection.
Jordan, D.R., Tao, Y.Z., Godwin, I.D., Henzell, R.G., Cooper, M. & McIntyre, C.L. 2004. Comparison of identity by descent and identity by state for detecting genetic regions under selection in a sorghum breeding program.

32 Conclusion
Thanks

