Published by Nichole Kellow. Modified over 2 years ago.

1
Biplot Analysis of Multi-Environment Trial Data
Weikai Yan, May 2006

2
Multi-Environment Trials (MET)
MET are essential
MET are expensive
MET data are valuable
MET data are not fully used

3
Why biplot analysis?
Biplot analysis can help understand MET data:
– Graphically
– Effectively
– Conveniently

4
Outline
Multi-environment trial (MET) data
Basics of biplot analysis
Biplot analysis of G-by-E data
Biplot analysis of G-by-T data
Better understanding of MET data
Conclusions

5
Multi-environment trial data

6
MET data form a genotype-environment-trait (G-E-T) 3-way table
Multiple genotypes
Multiple environments
Multiple traits

7
A G-E-T 3-way table contains many 2-way tables
G by E: for each trait
G by T (trait): in each environment; across environments
E by T: for each genotype; across genotypes
G-E-T data >> G-E data

8
A G-E-T 3-way table is an extended 2-way table
G by V:
– each E-T combination as a variable (V)
P by T:
– each G-E combination as a phenotype (P)

9
A G-E-T 3-way table implies informative 2-way tables
Association-by-environment 2-way tables
– Associations: among traits; between traits and genetic markers

10
Goals of MET data analysis
Short-term goals:
– Variety evaluation: response to the environment (G x E); trait profiles (G x T)
Long-term goals:
– To understand the target environment (G x E), the test environments (G x E), the crop (G x T), and the genotype x environment interaction (A x T)

11
Basics of biplot analysis
Most two-way tables can be visually studied using biplots

12
Origin of the biplot
Gabriel (1971)
One of the most important advances in data analysis in recent decades
Currently:
– > 50,000 web pages
– Numerous academic publications
– Included in most statistical analysis packages
– Still a very new technique to most scientists
Prof. Ruben Gabriel, the founder of the biplot (courtesy of Prof. Purificación Galindo, University of Salamanca, Spain)

13
What is a biplot?
Biplot = bi + plot
– plot: a scatter plot of two rows OR of two columns, or a scatter plot summarizing the rows OR the columns
– bi: BOTH rows AND columns
1 biplot >> 2 plots

14
Mathematical definition of a biplot
Graphical display of matrix multiplication
Inner-product property
– $P_{ij} = |OA_i| \times |OB_j| \times \cos\alpha_{ij}$
– Implies the product matrix
Matrix multiplication: A(4, 2) × B(2, 3) = P(4, 3)
[Figure: biplot of row points A1, A2, A3, A4 and column points B1, B2, B3]
Example: $P_{11} = 5 \times 4.472 \times \cos\alpha_{11} = 20$
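
The inner-product property can be checked numerically. The sketch below is only an illustration: the coordinates of A1 and B1 are hypothetical, chosen so that the vector lengths match the 5 and 4.472 quoted on the slide, and it compares the geometric product (length × length × cosine) with the ordinary row-by-column product.

```python
import numpy as np

# Hypothetical coordinates (not from the slide), chosen so that
# |OA1| = 5 and |OB1| = sqrt(20), which is about 4.472.
a1 = np.array([4.0, 3.0])  # row point A1
b1 = np.array([2.0, 4.0])  # column point B1

len_a = np.linalg.norm(a1)
len_b = np.linalg.norm(b1)

# Angle between the two vectors, obtained from their directions in the plane.
angle = np.arctan2(b1[1], b1[0]) - np.arctan2(a1[1], a1[0])

p11_geometric = len_a * len_b * np.cos(angle)  # lengths times cosine of the angle
p11_algebraic = a1 @ b1                        # row of A times column of B
```

Both quantities agree, which is why reading vector lengths and angles off a biplot recovers the elements of the product matrix.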

15
Practical definition of a biplot
Any two-way table can be analyzed using a 2D biplot as long as it can be sufficiently approximated by a rank-2 matrix (Gabriel, 1971).
Matrix decomposition of a G-by-E table: P(4, 3) ≈ G(4, 2) × E(2, 3)
[Figure: biplot of genotypes G1, G2, G3, G4 and environments E1, E2, E3]
(Now 3D biplots are also possible…)

16
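Singular Value Decomposition (SVD) and Singular Value Partitioning (SVP)
SVD = PCA?
SVD: $Y_{ij} = \sum_{k=1}^{r} \lambda_k \xi_{ik} \eta_{jk}$
SVP: $Y_{ij} = \sum_{k=1}^{r} (\lambda_k^{f} \xi_{ik})(\lambda_k^{1-f} \eta_{jk})$, with $0 \le f \le 1$
– $\lambda_k$: singular values
– $\xi_{ik}$: matrix characterising the rows
– $\eta_{jk}$: matrix characterising the columns
– $r$: the rank of Y, i.e., the minimum number of PCs required to fully represent Y
– $\lambda_k^{f} \xi_{ik}$: row scores; $\lambda_k^{1-f} \eta_{jk}$: column scores; both are plotted in the biplot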
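
As a rough sketch of how SVD and singular value partitioning produce biplot coordinates, the following uses NumPy on a random table. The function name `biplot_scores` and the data are assumptions made for illustration; this is not the GGEbiplot implementation.

```python
import numpy as np

def biplot_scores(y, f=0.5, n_axes=2):
    """Row and column scores of a two-way table y for partitioning factor f."""
    u, s, vt = np.linalg.svd(y, full_matrices=False)
    row_scores = u[:, :n_axes] * s[:n_axes] ** f            # singular values ** f go to the rows
    col_scores = vt[:n_axes].T * s[:n_axes] ** (1.0 - f)    # the remainder goes to the columns
    return row_scores, col_scores

# Hypothetical 6 x 4 table, column-centred before decomposition.
rng = np.random.default_rng(0)
y = rng.normal(size=(6, 4))
y -= y.mean(axis=0)

rows0, cols0 = biplot_scores(y, f=0.0)
rows1, cols1 = biplot_scores(y, f=1.0)

# Whatever f is used, row scores times column scores give the same rank-2 fit.
same_fit = np.allclose(rows0 @ cols0.T, rows1 @ cols1.T)
```

The choice of f changes how the points are spread in the plot, not the fitted values that the biplot represents.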

17
Biplot interpretations
Inner-product property
Interpretations based on biplots with f = 1
– approximates $YY^T$, the distance matrix
– similarity/dissimilarity among row (genotype) factors
Interpretations based on biplots with f = 0
– approximates $Y^TY$, the variance matrix
– similarity/dissimilarity among column (environment) factors
Combined use of f = 0 and f = 1
(Gabriel, 2002, Biometrika; Yan, 2002, Agron J; built into the GGEbiplot software)
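
The two partitioning choices can be illustrated on a table that is exactly rank 2, so that the approximations become equalities. This is a minimal sketch with made-up data, assuming f = 1 assigns the singular values entirely to the rows and f = 0 entirely to the columns.

```python
import numpy as np

# Hypothetical table built to be exactly rank 2 (5 rows, 3 columns).
rng = np.random.default_rng(1)
y = rng.normal(size=(5, 2)) @ rng.normal(size=(2, 3))

u, s, vt = np.linalg.svd(y, full_matrices=False)
u2, s2, v2 = u[:, :2], s[:2], vt[:2].T

rows_f1 = u2 * s2   # f = 1: singular values assigned to the rows
cols_f0 = v2 * s2   # f = 0: singular values assigned to the columns

# Inner products among row points reproduce Y Y^T; among column points, Y^T Y.
row_match = np.allclose(rows_f1 @ rows_f1.T, y @ y.T)
col_match = np.allclose(cols_f0 @ cols_f0.T, y.T @ y)
```

For real data that are not exactly rank 2, the same relations hold only approximately, to the extent indicated by the goodness of fit.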

18
Biplot analysis is…
to use biplots to display
– the two-way data per se ($Y$),
– its distance matrix ($YY^T$), and
– its variance matrix ($Y^TY$)
so that
– relationships among rows,
– relationships among columns, and
– interactions between rows and columns
can be graphically visualized.

19
Data centering prior to biplot analysis
The general linear model for a G-by-E data set (P):
– P = M + G + E + GE
Possible two-way tables (Y):
– Y = P = M + G + E + GE (original data: QQE biplot)
– Y = P - M = G + E + GE (global-centered: PCA)
– Y = P - M - E = G + GE (column-centered: GGE biplot)
– Y = P - M - G = E + GE (row-centered)
– Y = P - M - G - E = GE (double-centered: GE biplot)
All models are useful, depending on the research objectives (built into GGEbiplot)
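
The centering options can be written out in a few lines of NumPy. This is an illustrative sketch only: the function name `center`, the model labels, and the small table are assumptions, with genotypes as rows and environments as columns.

```python
import numpy as np

def center(p, model):
    """Return the two-way table Y for one of the centering models."""
    m = p.mean()                                # grand mean M
    g = p.mean(axis=1, keepdims=True) - m       # genotype (row) main effects G
    e = p.mean(axis=0, keepdims=True) - m       # environment (column) main effects E
    if model == "none":
        return p.copy()                         # Y = M + G + E + GE
    if model == "global":
        return p - m                            # Y = G + E + GE
    if model == "column":
        return p - m - e                        # Y = G + GE
    if model == "row":
        return p - m - g                        # Y = E + GE
    if model == "double":
        return p - m - g - e                    # Y = GE
    raise ValueError(f"unknown model: {model}")

# Hypothetical 3 x 3 genotype-by-environment table.
p = np.array([[5.0, 7.0, 6.0],
              [4.0, 8.0, 9.0],
              [6.0, 6.0, 3.0]])

y_gge = center(p, "column")
y_ge = center(p, "double")
```

After column centering every environment has mean zero; after double centering both the row and the column means are zero, leaving only the interaction.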

20
Data scaling prior to biplot analysis
Different GGE biplots: $Y_{ij} = (G_i + GE_{ij}) / s_j$
– $s_j = 1$: no scaling
– $s_j = (\text{s.d.})_j$: all environments are equally important
– $s_j = (\text{s.e.})_j$: heterogeneity among environments is removed
(built into GGEbiplot)
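
A small sketch of the first two scaling options follows, using a hypothetical table in which one environment has much larger values than the others. Scaling by the within-environment standard error needs replicate-level data and is therefore not shown.

```python
import numpy as np

# Hypothetical genotype-by-environment means (rows: genotypes, columns: environments).
p = np.array([[5.0, 70.0, 6.0],
              [4.0, 80.0, 9.0],
              [6.0, 60.0, 3.0],
              [3.0, 90.0, 8.0]])

y = p - p.mean(axis=0)             # column-centred table: G + GE

s_none = np.ones(p.shape[1])       # s_j = 1: no scaling
s_sd = y.std(axis=0, ddof=1)       # s_j = standard deviation of environment j

y_unscaled = y / s_none
y_sd_scaled = y / s_sd             # every environment now has unit standard deviation
```

Without scaling, the second environment would dominate the biplot simply because of its larger spread; dividing by the standard deviation gives each environment equal weight.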

21
Four questions must be asked before trying to interpret a biplot
1. What is the model? How were the data centered and scaled? What are we looking at?
2. What is the goodness of fit? How confident are we about what we see? What if the data are fitted poorly?
3. How are the singular values partitioned? What questions can be asked?
4. Are the axes drawn to scale? Are the patterns artifacts?
(All are addressed explicitly in GGEbiplot)

22
Biplot analysis of G-by-E data
Mega-environment analysis
Test environment evaluation
Genotype evaluation

23
Sample G-by-E data
(Yield data of 18 genotypes in 9 environments, 1993, Ontario, Canada)

24
Before trying to interpret a biplot…
1. Model selection? Centering = 2 (G + GE); scaling = 0
2. Goodness of fit? 78%
3. Singular value partitioning? SVP = 2 (environment-metric)
4. Draw to scale? Yes
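
Goodness of fit for a 2D biplot is commonly computed as the share of the total sum of squares captured by the first two principal components. The sketch below assumes that definition and uses random numbers in place of the 18 × 9 yield table, so it does not reproduce the 78% quoted above.

```python
import numpy as np

def goodness_of_fit(y, n_axes=2):
    """Proportion of the sum of squares of y captured by the first n_axes components."""
    s = np.linalg.svd(y, compute_uv=False)
    return float((s[:n_axes] ** 2).sum() / (s ** 2).sum())

# Hypothetical stand-in for an 18-genotype by 9-environment table.
rng = np.random.default_rng(2)
p = rng.normal(size=(18, 9))
y = p - p.mean(axis=0)             # column-centred (G + GE), no scaling

fit = goodness_of_fit(y)
```

A table that is exactly rank 2 gives a value of 1; the lower the value, the more cautiously the patterns in the biplot should be read.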

25
G-by-E data analysis
Mega-environment analysis
Test environment evaluation
Genotype evaluation
A mega-environment is a group of geographical locations that share the same (set of) best genotypes consistently across years.

26
Relationships among environments
The environment-vector view
Angle vs. correlation
The angles among test environments
Environment grouping
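
The link between angles and correlations can be sketched numerically. The example assumes column-centred data and environment-metric partitioning (f = 0), and uses a made-up table that is exactly rank 2 so that the cosine of the angle between two environment vectors equals their correlation; with real data the match is approximate.

```python
import numpy as np

# Hypothetical table: 8 genotypes x 4 environments, rank 2, column-centred.
rng = np.random.default_rng(3)
a = rng.normal(size=(8, 2))
a -= a.mean(axis=0)
y = a @ rng.normal(size=(2, 4))

u, s, vt = np.linalg.svd(y, full_matrices=False)
env = vt[:2].T * s[:2]             # environment vectors with f = 0

unit = env / np.linalg.norm(env, axis=1, keepdims=True)
cosines = unit @ unit.T            # cosine of the angle between each pair of vectors
angles_deg = np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0)))

correlations = np.corrcoef(y, rowvar=False)   # correlations among environments
```

Acute angles therefore indicate positively correlated environments, right angles indicate no correlation, and obtuse angles indicate negative correlation.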

27
Which-won-where
(Crossover GE is GE that causes genotype rank changes and different winners in different test environments)
[Figure: which-won-where biplot; genotypes labelled in the figure: G12, G7, G18, G8, G13]

28
Are there meaningful crossover GE?
The which-won-where view
(Crossover GE is GE that causes genotype rank changes and different winners in different test environments)
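
The which-won-where question can also be answered numerically from the values fitted by the biplot. The sketch below uses a small hypothetical table (not the Ontario data) and simply takes, for each environment, the genotype with the highest fitted value; it does not draw the polygon itself.

```python
import numpy as np

genotypes = ["G1", "G2", "G3", "G4"]
environments = ["E1", "E2", "E3"]

# Hypothetical yields (rows: genotypes, columns: environments).
p = np.array([[8.0, 8.0, 3.0],
              [6.0, 6.0, 4.0],
              [4.0, 4.0, 9.0],
              [5.0, 5.0, 5.0]])

y = p - p.mean(axis=0)                           # column-centred: G + GE
u, s, vt = np.linalg.svd(y, full_matrices=False)
fitted = (u[:, :2] * s[:2]) @ vt[:2]             # values represented by the 2D biplot

# Winner in each environment according to the biplot.
winners = {env: genotypes[i] for env, i in zip(environments, fitted.argmax(axis=0))}

# Environments that share a winner are candidates for one mega-environment.
groups = {}
for env, winner in winners.items():
    groups.setdefault(winner, []).append(env)
```

In this toy table E1 and E2 share a winner and E3 has a different one, which is a crossover pattern; whether such groups are true mega-environments still depends on repeatability across years.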

29
Are the crossover patterns* repeatable?
If YES:
– The target environment can be divided into multiple mega-environments
– GE can be exploited by selecting for each mega-environment
– GE → G
If NO:
– The target environment CANNOT be divided into multiple mega-environments
– GE CANNOT be exploited
– GE must be avoided by testing across locations and years
*Not the environment-grouping patterns
A mega-environment is a group of geographical locations that share the same (set of) best genotypes consistently across years.
Multi-year data are needed

30
Classify your target environment into one of three categories
(1) No crossover GE: single simple ME. A single test location in a single year suffices to select a single best variety.
(2) Crossover GE, repeatable: multiple MEs. Select specifically adapted genotypes for each ME.
(3) Crossover GE, not repeatable: single complex ME. Select generally adapted genotypes across the whole region and across multiple years.
ME: mega-environment

31
G-by-E data analysis
Mega-environment analysis
Test environment evaluation
Genotype evaluation

32
Discriminating ability and representativeness
Vector length: discriminating ability
Angle to the AE: representativeness
[Figure labels: average-environment axis; average environment]
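
Both measures can be computed directly from the environment coordinates of the biplot. In this sketch the data are hypothetical, the average environment is taken as the mean of the environment coordinates, and environment-metric partitioning (f = 0) is assumed.

```python
import numpy as np

environments = ["E1", "E2", "E3"]

# Hypothetical yields (rows: genotypes, columns: environments).
p = np.array([[8.0, 7.0, 3.0],
              [6.0, 6.0, 4.0],
              [4.0, 5.0, 9.0],
              [5.0, 4.0, 5.0]])

y = p - p.mean(axis=0)                           # column-centred: G + GE
u, s, vt = np.linalg.svd(y, full_matrices=False)
env = vt[:2].T * s[:2]                           # environment vectors with f = 0

average_env = env.mean(axis=0)                   # the "average environment" point
ae_unit = average_env / np.linalg.norm(average_env)

vector_length = np.linalg.norm(env, axis=1)      # discriminating ability
cos_to_ae = (env @ ae_unit) / vector_length
angle_to_ae = np.degrees(np.arccos(np.clip(cos_to_ae, -1.0, 1.0)))  # representativeness

most_discriminating = environments[int(np.argmax(vector_length))]
most_representative = environments[int(np.argmin(angle_to_ae))]
```

A long vector with a small angle to the average-environment axis corresponds to a test environment that is both discriminating and representative.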

33
Ideal test environments: discriminating and representative
[Figure label: ideal test environment]

34
Classify each test environment into one of three categories
(1) Not discriminative: useless
(2) Discriminative and representative: good for selecting (more important)
(3) Discriminative but not representative: useful for culling (less important)
For each good or useful test environment: is it essential?

35
Vector length = discrimination = GE = GE1 + GE2
[Figure labels: contribution to proportionate GE; contribution to non-proportionate GE]

36
G-by-E data analysis
Mega-environment analysis
Test environment evaluation
Genotype evaluation

37
Vector length = GGE = G + GE
[Figure labels: contribution to GE (instability); contribution to G (mean performance)]

38
Mean vs. stability

39
Genotype ranking on both MEAN and STABILITY
[Figure label: the ideal genotype]
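
The mean-versus-stability reading of a GGE biplot amounts to projecting each genotype onto the average-environment axis and onto the direction perpendicular to it. The sketch below uses hypothetical data and assumes environment-metric partitioning (f = 0), with the average environment taken as the mean of the environment coordinates.

```python
import numpy as np

genotypes = ["G1", "G2", "G3", "G4"]

# Hypothetical yields (rows: genotypes, columns: environments).
p = np.array([[8.0, 8.0, 3.0],
              [6.0, 6.0, 4.0],
              [4.0, 4.0, 9.0],
              [5.0, 5.0, 5.0]])

y = p - p.mean(axis=0)                           # column-centred: G + GE
u, s, vt = np.linalg.svd(y, full_matrices=False)
geno = u[:, :2]                                  # genotype coordinates with f = 0
env = vt[:2].T * s[:2]                           # environment coordinates with f = 0

average_env = env.mean(axis=0)
ae_unit = average_env / np.linalg.norm(average_env)   # direction of the AE axis
perp = np.array([-ae_unit[1], ae_unit[0]])             # direction perpendicular to it

mean_coord = geno @ ae_unit                      # projection on the AE axis: mean performance
instability = np.abs(geno @ perp)                # distance from the AE axis: instability

ranking = [genotypes[i] for i in np.argsort(-mean_coord)]
```

Genotypes far along the axis in the positive direction have high mean performance, and those lying close to the axis are the more stable ones; a genotype combining both is closest to the ideal.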

40
Genotype classification by mean and stability
High mean performance, high stability: generally adapted (VERY GOOD)
High mean performance, low stability: specifically adapted (GOOD)
Low mean performance, high stability: bad everywhere (VERY BAD)
Low mean performance, low stability: bad somewhere (BAD)
Are there stability genes?!

41
G x E data analysis summary
1) Mega-environment analysis
2) Test environment evaluation
3) Genotype evaluation
Important comments:
– (2) and (3) are meaningful only for a single mega-environment
– Any stability analysis is meaningful only for a single mega-environment
– Any stability index can be used only as a modifier to the ranking based on mean performance

42
Other ways to view a GGE biplot

43
Inner-product property

44
Ranking on a single environment

45
Ranking on two environments

46
Relative adaptation of a genotype

47
Compare any two genotypes

48
Biplot analysis of genotype-by-trait data

49
Objectives of G-by-T data analysis
Genotype evaluation based on trait profiles
Relationships among breeding objectives

50
Data of 4 traits for 19 covered oat varieties (Ontario 2004)
(Background info: high yield, high groat, high protein, and low oil are desirable for milling oats)

51
Relationships among traits

52
Trait profile of each genotype

53
Trait profile of a genotype

54
Trait profile comparison between two genotypes

55
Genotype ranking based on a trait

56
Parent selection based on trait profiles

57
Independent culling

58
Fuller understanding of MET data
MET data are more informative than you thought

59
A G-E-T 3-way dataset contains various 2-way tables
G by E data
G by T data
E by T data:
– for each genotype; all genotypes
G by V data:
– each E-T as a variable (V)
P by T data:
– each G-E as a phenotype (P)
Genetic association by environment data
Trait association by environment data

60
Genetic-covariate by environment biplot (QTL by environment biplot)
Barley genomics data

61
Trait-association by environment biplot
Oat MET data

62
Four-way data analysis
Year…

63
Conclusions

64
Conclusion (1)
GGE biplot analysis is an effective tool for G-by-E data analysis, giving an understanding of:
1. the target environment,
2. the test environments, and
3. the genotypes.
4. Stability analysis is useful only within a single mega-environment.

65
Conclusion (2)
GGE biplot analysis is an effective tool for G-by-T data analysis, giving an understanding of:
1. the interconnected plant system,
2. positively correlated traits,
3. negatively correlated traits, and
4. the strengths and weaknesses of the genotypes.

66
Conclusion (3)
Biplot analysis is an effective tool for the analysis of other two-way tables:
– Marker by environment
– QTL by environment
– Gene by treatment
– Diallel cross
– …

67
Conclusion (4)
Biplot analysis can be VERY EASY…
– From reading data to displaying the biplot: 2 seconds
– Displaying any of the perspectives of a biplot and changing from one to another: 1 second
– Displaying the biplot for any subset: 1 second
– Learning how to use the software and interpret biplots: 30 minutes
– Everything can be just one mouse click away

68
Thank you
Contact: Weikai Yan
