Tutorial #5 by Ma’ayan Fishelson

Input Format of Superlink
There are 2 input files:
– The locus file describes the loci being analyzed and parameters for the different analyzing programs.
– The pedigree file describes the pedigrees being analyzed.
Locus File: The first 3 lines describe some general parameters of the analysis being performed. Following are lines that describe each locus. The end of the file provides recombination information and program-specific information.
Pedigree File: Each line in this file describes an individual in one of the pedigrees.

Locus File
[Figure: annotated example locus file] Labels in the figure:
– # of loci
– Chromosome order of the loci
– Affection Status locus (disease locus)
– Numbered Alleles locus (marker)
– Recombination values
– Program-specific Parameters
– Program code
– # of disease loci (0 or 2)
– Number of alleles for 1st locus
– Gene frequencies for 2nd locus
– Number of penetrance classes for Affection Status locus
– Penetrances

Pedigree File
Fields of each line:
– Pedigree Number
– Individual’s ID
– Father’s ID
– Mother’s ID
– First child’s ID
– Next paternal sibling’s ID
– Next maternal sibling’s ID
– Sex: 1=male, 2=female
– Disease Status: 0=unknown, 1=unaffected, 2=affected
– Penetrance Class
– Marker Alleles (2 alleles per locus): 1st marker, 2nd marker
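To make the field list concrete, here is a minimal Python sketch of reading one pedigree line. It assumes whitespace-separated fields in exactly the order listed on the slide; the class and function names are hypothetical, and real Superlink input may carry columns that the slide does not show.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PedigreeRecord:
    # Field order follows the slide; this layout is an assumption.
    pedigree: str
    individual: str
    father: str
    mother: str
    first_child: str
    next_paternal_sib: str
    next_maternal_sib: str
    sex: int               # 1 = male, 2 = female
    disease_status: int    # 0 = unknown, 1 = unaffected, 2 = affected
    penetrance_class: int
    markers: List[Tuple[int, int]]  # one allele pair per marker locus


def parse_pedigree_line(line: str) -> PedigreeRecord:
    fields = line.split()
    if len(fields) < 10:
        raise ValueError("expected at least 10 fields before the marker alleles")
    alleles = [int(x) for x in fields[10:]]
    if len(alleles) % 2 != 0:
        raise ValueError("marker alleles must come in pairs")
    # Group the trailing alleles two by two, one pair per marker.
    markers = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return PedigreeRecord(*fields[:7], int(fields[7]), int(fields[8]),
                          int(fields[9]), markers)


# Made-up line: individual 3 of pedigree 1, typed at two markers.
record = parse_pedigree_line("1 3 1 2 0 4 4 1 2 1 1 2 3 3")
```

Checking that the alleles come in pairs catches one of the input errors discussed later: a mismatch between the number of loci and the number of genotypes on a line.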

For More information on Superlink visit:

Possible Input Errors
– Incompatibility between the 2 input files (in the number of loci, in the order of specification of the loci, …).
– Errors in the locus file: probabilities don’t sum to 1, impossible values for recombination fractions or other probabilities, incompatibility between the number of loci and the number of locus descriptions, …
– Errors in the pedigree file: no correspondence between child and parent, pointer problems, genotyping errors, …

Genotyping Errors
Can be divided into 2 types:
1. Errors that can be detected when observing one marker.
2. Errors that can be detected only when observing several adjacent markers.

PedCheck (Jeffrey O’Connell and Daniel Weeks)
A program for identification of genotype incompatibilities in linkage analysis. Genotype incompatibilities are detected in 4 stages:
1. Level 1: performs checks on the nuclear family level.
2. Level 2: uses the Lange-Goradia algorithm to perform genotype elimination.
3. Level 3: determines “critical genotypes”.
4. Level 4: determines alternative typing for the critical genotypes, and finds the most likely person to be mistyped.

Example 1a – Level 1 errors
[Figure: pedigree with marker genotypes]
List the errors. Assume there are 6 alleles at this marker.

Example 1b – Level 1 errors
[Figure: pedigree with marker genotypes]
List the errors here.

Level 1 Errors
– Incompatibility between a child and a parent’s alleles.
– A person is half-typed.
– More than 4 alleles in a sibship.
– More than 3 alleles in a sibship when there is a homozygous child.
– More than 2 alleles in a sibship when there are 2 different homozygous children.
– The allele is out of bounds.
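The checks listed above can be sketched for a single nuclear family as follows. This is an illustrative Python version, not PedCheck’s code: the function name, the use of `None` for an untyped person and of allele `0` for a missing allele are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Genotype = Optional[Tuple[int, int]]  # None = untyped; allele 0 = missing


def is_typed(g: Genotype) -> bool:
    return g is not None and 0 not in g


def level1_errors(father: Genotype, mother: Genotype,
                  children: List[Genotype], n_alleles: int) -> List[str]:
    errors = []
    people = [("father", father), ("mother", mother)]
    people += [(f"child {i + 1}", c) for i, c in enumerate(children)]

    # Per-person checks: half-typed genotypes and out-of-bounds alleles.
    for name, g in people:
        if g is None:
            continue
        if (g[0] == 0) != (g[1] == 0):
            errors.append(f"{name} is half-typed")
        for allele in g:
            if allele < 0 or allele > n_alleles:
                errors.append(f"{name}: allele {allele} is out of bounds")

    def can_give(parent: Genotype, allele: int) -> bool:
        # An untyped parent can transmit any allele.
        return not is_typed(parent) or allele in parent

    sibship_alleles, homozygous = set(), set()
    for i, child in enumerate(children):
        if not is_typed(child):
            continue
        a, b = child
        # One allele must come from each parent, in either order.
        if not ((can_give(father, a) and can_give(mother, b)) or
                (can_give(father, b) and can_give(mother, a))):
            errors.append(f"child {i + 1} is incompatible with the parents")
        sibship_alleles.update(child)
        if a == b:
            homozygous.add(a)

    # Two parents carry at most 4 alleles; homozygous children lower the bound.
    limit = 4 if not homozygous else (3 if len(homozygous) == 1 else 2)
    if len(sibship_alleles) > limit:
        errors.append(f"{len(sibship_alleles)} alleles in the sibship, "
                      f"at most {limit} possible")
    return errors


example = level1_errors((1, 2), (3, 3), [(1, 3), (2, 4)], n_alleles=6)
```

The sibship bounds follow from counting parental alleles: a homozygous child forces both parents to share an allele, and two different homozygous children force both parents to carry the same two alleles.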

Level 2 Errors
– Performs genotype elimination via an extended version of the Lange-Goradia algorithm for set-recoded genotypes.
– This algorithm recursively uses the nuclear-family relationships to eliminate invalid genotypes in the pedigree.
– Continues until no more genotypes can be eliminated.
– For each pedigree and locus: identifies the first nuclear family with an error that hasn’t been detected in level 1, and outputs the inferred genotype lists.

Example 2 – Level 2 errors
[Figure: pedigree with marker genotypes]

Genotype Elimination Algorithm
A. For each pedigree member, save only ordered genotypes compatible with his/her phenotype.
B. For each nuclear family:
1. Consider each mother-father genotype pair:
a. Determine which zygotes can arise from this pair.
b. If each child in the nuclear family has one or more of these zygote genotypes among his or her current genotype list, then save the parental genotypes and any child genotype matching one of the created zygote genotypes.
c. If any child has none of these zygote genotypes among his/her genotype list, then don’t save any genotypes.
2. For each person in the nuclear family, exclude any genotypes not saved during step (1).
C. Repeat part (B) until no more genotypes can be excluded.
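Steps A to C can be sketched in Python as below. This is a plain version of the nuclear-family elimination loop described on the slide, not PedCheck’s extended set-recoded variant; the data layout (a dictionary of ordered genotype sets and a list of `(father, mother, children)` families) and the helper names are assumptions.

```python
from itertools import product
from typing import Dict, List, Optional, Set, Tuple

Genotype = Tuple[int, int]            # ordered: (paternal, maternal) allele
Family = Tuple[str, str, List[str]]   # (father, mother, children)


def ordered_genotypes(typing: Optional[Tuple[int, int]],
                      alleles: List[int]) -> Set[Genotype]:
    # Step A: ordered genotypes compatible with the observed typing.
    if typing is None:
        return set(product(alleles, repeat=2))
    a, b = typing
    return {(a, b), (b, a)}


def eliminate(genos: Dict[str, Set[Genotype]],
              families: List[Family]) -> Dict[str, Set[Genotype]]:
    genos = {person: set(g) for person, g in genos.items()}
    changed = True
    while changed:                                  # step C
        changed = False
        for father, mother, children in families:   # step B
            saved = {p: set() for p in (father, mother, *children)}
            for gf, gm in product(genos[father], genos[mother]):
                # Zygotes: paternal allele from the father, maternal from the mother.
                zygotes = {(a, b) for a in gf for b in gm}
                matches = [genos[c] & zygotes for c in children]
                if all(matches):                     # step B.1.b
                    saved[father].add(gf)
                    saved[mother].add(gm)
                    for child, m in zip(children, matches):
                        saved[child] |= m
            for person, keep in saved.items():       # step B.2
                if keep != genos[person]:
                    genos[person] = keep
                    changed = True
    return genos


alleles = [1, 2, 3]
start = {
    "father": ordered_genotypes((1, 2), alleles),
    "mother": ordered_genotypes((3, 3), alleles),
    "child": ordered_genotypes(None, alleles),
}
result = eliminate(start, [("father", "mother", ["child"])])
```

In this made-up trio the untyped child’s list shrinks to the two ordered genotypes the parents can produce. An empty list for any person after elimination signals an inconsistency.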

Genotype Elimination Example
[Figure: pedigree with phenotypes O, O and A]

Complete Genotype-Elimination Algorithm
– A genotype elimination algorithm is complete if it detects every case in which the set of given genotypes violates Mendelian laws of inheritance.
– If a complete genotype elimination algorithm finds no errors, then the genotypes are consistent with Mendelian laws of inheritance.

Genotype Elimination - Another Example
[Figure: pedigree with marker genotypes]
Is the presented genotype elimination algorithm complete?

Additional Problems
The inferred genotype lists don’t always permit easy identification of the source of the problem:
– The genotype lists may be long.
– More than one individual may be the error source.
– The error may not be in the nuclear family reported.

Critical Genotypes
– A critical genotype is a genotype of an individual that eliminates the pedigree inconsistency when removed from the data (i.e., treated as unknown).
– Note: a critical genotype isn’t necessarily erroneous.
– Degree n critical genotypes: an n-tuple of genotypes of typed individuals such that treating all of them as unknown simultaneously eliminates the inconsistency.
– The set of erroneous genotypes is a subset of the critical genotypes.

Critical-Genotype Algorithm (Level 3)
– Attempts to identify the critical genotypes, if any, in the pedigree.
– “Untypes” one typed individual at a time, and applies the genotype-elimination algorithm to determine if the inconsistency has been eliminated.
– There may be one or more critical genotypes or there may be none. If there are none, higher-degree critical genotypes can be investigated at a higher cost.
– If only one critical genotype is found, then this genotype represents the error.
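The “untype one individual at a time” loop can be sketched as follows. To keep the example short it is restricted to one nuclear family and uses a brute-force consistency check in place of the genotype-elimination algorithm; that substitution, the three-allele marker and all names are assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

Typing = Optional[Tuple[int, int]]   # None = untyped
ALLELES = (1, 2, 3)                  # hypothetical marker with 3 alleles


def candidates(g: Typing) -> List[Tuple[int, int]]:
    # An untyped person may carry any unordered genotype.
    if g is None:
        return [(a, b) for a in ALLELES for b in ALLELES if a <= b]
    return [tuple(sorted(g))]


def family_consistent(genos: Dict[str, Typing]) -> bool:
    # Brute force: look for one parental pair that explains every child.
    kids = [k for k in genos if k not in ("father", "mother")]
    for gf, gm in product(candidates(genos["father"]),
                          candidates(genos["mother"])):
        zygotes = {tuple(sorted((a, b))) for a in gf for b in gm}
        if all(any(c in zygotes for c in candidates(genos[k])) for k in kids):
            return True
    return False


def critical_genotypes(genos: Dict[str, Typing],
                       is_consistent: Callable[[Dict[str, Typing]], bool]
                       ) -> List[str]:
    critical = []
    for person, typing in genos.items():
        if typing is None:
            continue
        trial = dict(genos)
        trial[person] = None          # "untype" this individual only
        if is_consistent(trial):
            critical.append(person)
    return critical


family = {"father": (1, 1), "mother": (2, 2), "child": (1, 1)}
critical = critical_genotypes(family, family_consistent)
```

In this made-up trio both the mother and the child come out as critical, although at most one of them needs to be mistyped, which illustrates why a critical genotype is not necessarily erroneous.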

Example 3 – Level 3 errors
[Figure: pedigree with marker genotypes]

Dilemma…
– Several critical genotypes have been identified at a locus.
– There’s no way of deciding a priori which one is most likely to be erroneous.

Odds-Ratio Algorithm (Level 4)
Helps distinguish between alternative critical genotypes. Based on single-locus likelihoods of the pedigree.
Algorithm Outline:
1. For each individual with a critical genotype, identify valid typings that eliminate the inconsistency.
2. Compute the likelihood L of the pedigree data for each alternative typing at each critical genotype, holding all other critical genotypes at their original value.
3. Let L_max be the largest likelihood obtained. For each alternative genotype compute the odds ratio L_max / L.
4. Return each alternative typing together with its odds ratio.
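Steps 3 and 4 of the outline amount to a simple normalization, sketched here in Python. The likelihood values are made up for illustration, and the function assumes the single-locus pedigree likelihoods from step 2 have already been computed by some other routine.

```python
from typing import Dict, Tuple

Alternative = Tuple[str, str]  # (individual ID, alternative typing)


def odds_ratios(likelihoods: Dict[Alternative, float]) -> Dict[Alternative, float]:
    # Step 3: L_max is the largest likelihood over all alternative typings.
    l_max = max(likelihoods.values())
    # Step 4: pair each alternative with its odds ratio L_max / L.
    return {alt: l_max / l for alt, l in likelihoods.items() if l > 0}


# Hypothetical likelihoods for three alternative typings of two individuals.
likelihoods = {
    ("3", "1/2"): 0.25,
    ("4", "1/2"): 0.125,
    ("4", "2/2"): 0.03125,
}
ratios = odds_ratios(likelihoods)
```

The most likely alternative gets an odds ratio of 1, and larger ratios mark typings that are less likely relative to it.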

Example 3 – Level 4
[Figure: pedigree with marker genotypes]
– Only one consistent alternative typing: 1/2
– Two consistent alternative typings: 1/2 & 2/2

Odds-Ratio Algorithm (allele frequencies)
There are 3 variations:
1. User-defined allele frequencies.
2. Assume all alleles are equally frequent.
3. Estimate allele frequencies from typed individuals (leads to a bigger spread in odds ratio).

2nd Type of Genotyping Errors
– The pedigree data indicates a recombination event in an interval where θ = 0.
– The pedigree data indicates more (or fewer) recombination events than expected according to the specified recombination fractions.

Error Detection in Merlin
1. Calculate L(G | θ) and L(G | θ = 0.5).
2. For each genotype g:
– Mark it as unknown.
– Calculate L(G\g | θ) and L(G\g | θ = 0.5).
– Compute the ratio r_linked = L(G\g | θ) / L(G | θ).
– Compute the ratio r_unlinked = L(G\g | θ = 0.5) / L(G | θ = 0.5).
– Compute the statistic r = r_linked / r_unlinked.
Genotypes that cause inconsistency with neighboring markers result in large values of r.
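The statistic itself is a ratio of two likelihood ratios, as the short Python sketch below shows. It is not Merlin’s implementation: the function name is hypothetical, the four likelihoods are assumed to come from a separate multipoint computation, and the numbers are invented.

```python
def error_statistic(l_full_linked: float, l_drop_linked: float,
                    l_full_unlinked: float, l_drop_unlinked: float) -> float:
    """Return r = r_linked / r_unlinked for one genotype g.

    l_full_*  : L(G | theta) and L(G | theta = 0.5)
    l_drop_*  : L(G\\g | theta) and L(G\\g | theta = 0.5), with g marked unknown
    """
    r_linked = l_drop_linked / l_full_linked
    r_unlinked = l_drop_unlinked / l_full_unlinked
    return r_linked / r_unlinked


# Invented likelihoods: dropping g helps far more under linkage than
# under free recombination, so g conflicts with the neighboring markers.
r = error_statistic(
    l_full_linked=2.0 ** -20, l_drop_linked=2.0 ** -10,
    l_full_unlinked=2.0 ** -16, l_drop_unlinked=2.0 ** -14,
)
```

Dividing by r_unlinked removes the part of the likelihood gain that comes merely from having one observation fewer, leaving the part attributable to the flanking markers.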

Genotype Elimination in Superlink
Superlink’s algorithm combines 2 traversal algorithms:
– Downward traversal algorithm, in which the children are updated according to the parents.
– Upward traversal algorithm, in which the parents are updated according to the children.
Genotypes are stored as 2 lists of alleles:
– Possible paternal alleles.
– Possible maternal alleles.

Downward Traversal Algorithm
Traverses the pedigree in such a manner that a child is updated by his parent only after the parent has been updated. The update is performed as follows:
– If nothing is known about the child’s genotype, add all the possible alleles of the parent to the child’s relevant allele list.
– Else, check for each possible allele of the child if it is possible according to the parent.

Example: Downward Update
[Figure: pedigree with ordered genotypes]
The child (3) can only receive alleles 1 or 2 from his father (2).
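A single downward update step might look like the Python sketch below. The `Person` class, the use of `None` for an unknown allele list and the function names are assumptions for illustration; Superlink’s actual data structures are not shown in the slides.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Person:
    paternal: Optional[Set[int]] = None  # possible paternal alleles; None = unknown
    maternal: Optional[Set[int]] = None  # possible maternal alleles; None = unknown


def transmissible(parent: Person) -> Optional[Set[int]]:
    # A parent can pass on any allele from either of his or her own lists.
    if parent.paternal is None or parent.maternal is None:
        return None  # at least one list is unknown, so nothing can be excluded
    return parent.paternal | parent.maternal


def downward_update(child: Person, parent: Person, parent_is_father: bool) -> None:
    alleles = transmissible(parent)
    if alleles is None:
        return
    current = child.paternal if parent_is_father else child.maternal
    # Unknown list: copy the parent's alleles. Known list: keep only
    # the alleles the parent could actually have transmitted.
    updated = set(alleles) if current is None else current & alleles
    if parent_is_father:
        child.paternal = updated
    else:
        child.maternal = updated


# Mirrors the slide: the father carries alleles 1 and 2, the child is untyped.
father = Person(paternal={1}, maternal={2})
child = Person()
downward_update(child, father, parent_is_father=True)
```

After the call the child’s paternal list is {1, 2}, while the maternal list stays unknown until the mother is processed.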

Upward Traversal Algorithm
Traverses the pedigree in such a manner that a parent is updated by his child only after the child has been updated. The update is performed as follows:
– All the alleles that a child got from the parent for certain are marked.
– If two alleles have been marked as certain, the rest of the alleles are erased (the genotype has been determined).
– Sometimes the genotype is determined including phase.

Example: Upward Update
[Figure: pedigree with ordered genotypes]
– The father (1) must have transmitted alleles 3 & 4 to the children.
– The mother (2) could only have transmitted allele 1 to the children (3 & 4).
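The upward update can be sketched in the same style. As before, the `Person` class and function name are hypothetical; the sketch treats a child’s allele as received “for certain” when the relevant allele list holds exactly one allele, and it does not try to resolve phase.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Person:
    paternal: Optional[Set[int]] = None  # possible paternal alleles; None = unknown
    maternal: Optional[Set[int]] = None  # possible maternal alleles; None = unknown


def upward_update(parent: Person, children: List[Person],
                  parent_is_father: bool) -> Set[int]:
    # Mark the alleles that some child certainly received from this parent.
    certain: Set[int] = set()
    for child in children:
        received = child.paternal if parent_is_father else child.maternal
        if received is not None and len(received) == 1:
            certain |= received
    if len(certain) > 2:
        raise ValueError("a parent cannot transmit more than 2 distinct alleles")
    if len(certain) == 2:
        # Genotype determined: erase every other allele from both lists.
        for side in ("paternal", "maternal"):
            current = getattr(parent, side)
            setattr(parent, side,
                    set(certain) if current is None else current & certain)
    return certain


# Mirrors the slide: the two children received alleles 3 and 4 from the father.
father = Person()
kids = [Person(paternal={3}, maternal={1}), Person(paternal={4}, maternal={1})]
marked = upward_update(father, kids, parent_is_father=True)
```

Here the father’s genotype is fixed as 3/4 without phase. For the mother only allele 1 is marked, so her lists are left unchanged.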