GG 313, beginning Chapter 5: Sequences and Time Series Analysis. Sequences and Markov Chains. Lecture 21, Nov. 12, 2005

Ordered Sequences. Many geological problems have ordered sequences for data. Such data have observations whose time of occurrence or location is important, as opposed to sequences where the time or location does not matter. Many such sequences occur in pairs of a time and an observation, such as earthquake occurrences, volcanic eruptions, seismometer data, magnetic data, temperatures, etc. Other data may have an observation tied to a location, such as bathymetric data and any map information. These data are often called TIME SERIES, whether the independent variable is a location or a time.

Analysis of sequential data is aimed at addressing questions such as:

1) Are the data random, or do they contain a pattern or trend?
2) If there is a trend, what form does it have?
3) Are there any periodicities in the data?
4) What can be estimated or predicted from the data?

Methods for comparing two or more sequences are broadly grouped into two classes. In the first, the exact location of an event matters. Good examples of this class are X-ray diffraction data and mass spectrometer data: two peaks at different locations are not related, and each gives us specific information.

The second class compares sequences where absolute location is not important, such as different types of earthquakes and other characteristic events. These sequences are compared by cross-correlation, and we will cover them beginning in the next class.

Some data form sequences that do not fit well into the time series classification; a good example is stratigraphic sequences. It is not a simple matter to relate a particular layer or sequence of layers to time, since compaction and variations in deposition rate do not allow a simple relationship between layer thickness and time. Figure 5.1 shows a stratigraphic sequence. If we sample this sequence 62 times, we can record 61 transitions from one type of rock to another.

In this example, we have four different rock types: (A) sandstone, (B) limestone, (C) shale, and (D) coal. We would like to know whether some transitions are favored over random transitions. For example, is a change from limestone to sandstone more likely than limestone to coal? Since there are four different states, or rock types, there are 16 possible types of transitions. By looking at the change from observation n to observation n+1 for the sequence in Figure 5.1, we can set up a transition frequency matrix:

Transition frequency matrix:

From/To         A    B    C    D   Row total
A              17    0    5    0      22
B               0    5    2    0       7
C               5    2   17    3      27
D               0    0    3    2       5
Column total   22    7   27    5      61

Thus, there are 17 cases where A remains A from one measurement to the next, and 5 cases where A changes to C, but no cases where A changes to D.
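The counting of n to n+1 pairs described above is easy to automate. Here is a minimal sketch in Python with numpy; the short lithology string is a hypothetical stand-in for the 62 samples of Figure 5.1, which are not reproduced in the transcript:

```python
import numpy as np

states = "ABCD"                  # A=sandstone, B=limestone, C=shale, D=coal
seq = "AAACCCBBCCCADDC"          # hypothetical stand-in for the 62 samples

# Count transitions from observation n to observation n+1.
F = np.zeros((4, 4), dtype=int)
for a, b in zip(seq, seq[1:]):
    F[states.index(a), states.index(b)] += 1
print(F)
```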

Is it necessary that there be as many cases of A changing to C as there are of C changing to A? Does the matrix need to be symmetric? Paul says it should be. We can more easily quantify the tendency to change from one state to another by converting the counts in the matrix above to fractions, or probabilities. Dividing each row by its row total, so that each row sums to 1.0, gives the transition probability matrix:

From/To     A      B      C      D
A         0.77   0.00   0.23   0.00
B         0.00   0.71   0.29   0.00
C         0.19   0.07   0.63   0.11
D         0.00   0.00   0.60   0.40

The probability of D changing to C is 0.6.

If we divide the row totals in the transition frequency matrix by the total number of transitions, we get the probability of each state; in other words, we get the proportions of each lithology. This is called the marginal or fixed probability vector:

f = [22/61  7/61  27/61  5/61] = [0.36  0.11  0.44  0.08]    (5.1)
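With the actual counts from the table above, the row normalization and the fixed vector (5.1) are one line each. A sketch, again with numpy (state order A, B, C, D as before):

```python
import numpy as np

# Transition frequency matrix tabulated above (rows = from, columns = to).
F = np.array([[17, 0,  5, 0],
              [ 0, 5,  2, 0],
              [ 5, 2, 17, 3],
              [ 0, 0,  3, 2]], dtype=float)

row_totals = F.sum(axis=1)         # [22, 7, 27, 5]
P = F / row_totals[:, None]        # transition probability matrix; rows sum to 1
f = row_totals / F.sum()           # fixed (marginal) probability vector (5.1)
print(np.round(P, 2))
print(np.round(f, 2))              # [0.36 0.11 0.44 0.08]
```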

We can describe these properties in a cyclic diagram to show the probabilities involved. A similar diagram could be drawn for the hydrologic cycle and similar phenomena.

Recall from Chapter 2 that the probability that two events A and B will both occur (the joint probability) equals the probability of B given that A has occurred, times the probability of A occurring:

P(A,B) = P(B|A) P(A)    (5.2)

which we rearrange to:

P(B|A) = P(A,B) / P(A)    (5.3)

In our example, this is the probability that B occurs after A. If all events are independent, that is, if the probability distribution is uniform and no event depends on the previous event, then:

P(B|A) = P(B)    (5.4)

If all events are independent, then we can make a matrix showing the probability of each transition, given only the relative abundance of each state. Each row is the same, and the sum of the probabilities in each row equals 1; the numbers in each row are the fixed probability vector (5.1):

From/To     A      B      C      D
A (SS)    0.36   0.11   0.44   0.08
B (LS)    0.36   0.11   0.44   0.08
C (SH)    0.36   0.11   0.44   0.08
D (CO)    0.36   0.11   0.44   0.08

These are the EXPECTED transition probabilities given that each transition from one lithology to another is independent.

We are now in a position to test whether the observed transitions are random (independent) or not. Our hypothesis is that the transitions are not random, and our null hypothesis is that the transitions occur randomly. We first change our probability matrix above back to frequencies by multiplying each row by the corresponding row total from the transition frequency matrix:

From/To     A      B      C      D
A         7.93   2.52   9.74   1.80
B         2.52   0.80   3.10   0.57
C         9.74   3.10  11.95   2.21
D         1.80   0.57   2.21   0.41    (5.6)
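Continuing the sketch (F, row_totals, and f as defined above), the expected frequency matrix (5.6) is just an outer product:

```python
# Expected transition frequencies under independence (5.6):
# the fixed vector f scaled by each row's observed total.
E = np.outer(row_totals, f)
print(np.round(E, 2))
```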

and test with:

χ² = Σ (O_i − E_i)² / E_i    (5.7)

The O_i's are from the data frequency matrix and the E_i's are from the predicted frequency matrix (5.6). The degrees of freedom are ν = (m−1)·(m−1), where m = 4 in this example. One degree of freedom is lost from each row and column because "all must add up to 1". (I see where the rows add up to 1, but where do the columns add to 1?) The χ² test is only valid if the expected values are greater than 5; otherwise the error is too large for a valid test. We can get around this by combining some categories to raise the expected values. Since we are only testing for independence, this is OK.

Some expected transitions are larger than 5 anyhow (A→A, A→C, C→A, and C→C), and we combine the others to form a total of 7 new categories. χ² is then: (5.8)

We haven't lost any degrees of freedom by combining categories, so ν = (m−1)·(m−1) = 9, and the critical χ² value from the tables is 16.92 at 95% confidence. Since the critical value is smaller than the observed value (5.8), we can reject the null hypothesis that the transitions are independent; thus there is a statistical dependence in the transitions from one lithology to the next. Geologically, this is to be expected, since limestones most often represent deep-water depositional environments, and coals are subaerial. Sequences that are partially (statistically) dependent on the previous state are called Markov chains. Sequences that are completely determined by the previous value are called deterministic; for example, the sequence [ ] is deterministic, or fully predictable. We can also have sequences that are completely random.
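The χ² arithmetic can be checked the same way, continuing from the arrays above. scipy.stats.chi2 supplies the critical value; note that this illustrative version sums over all 16 cells and skips the category combining described on the slide, so its statistic differs somewhat from (5.8):

```python
from scipy.stats import chi2

chi2_obs = ((F - E) ** 2 / E).sum()   # eqn (5.7), summed cell by cell
dof = (4 - 1) * (4 - 1)               # nu = (m-1)(m-1) = 9
chi2_crit = chi2.ppf(0.95, dof)       # about 16.92
print(chi2_obs > chi2_crit)           # True -> reject independence
```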

Markov chains where values depend only on the previous value are said to have first-order Markov properties. Those that also depend on the value before that have 2nd-order Markov properties, etc. We can use the transition probability matrix to predict the likely lithology 2 feet above a point. This might be necessary to fill in a missing point, for example, with the most probable value. If we have B (limestone) at some depth, what is the most likely lithology 2 feet (two steps) above? For a single step, from the transition probability matrix:

B→A (SS): 0%    B→B (LS): 71%    B→C (SH): 29%    B→D (CO): 0%

So from B we can only get to B or to C. If the transition is to C, then the probabilities of the NEXT transition are:

C→A (SS): 19%    C→B (LS): 7%    C→C (SH): 63%    C→D (CO): 11%

We can see that the probability of each two-step path back to B is:

P(B→C)·P(C→B) = 0.29 × 0.07 = 2%
P(B→B)·P(B→B) = 0.71 × 0.71 = 50%

so P(B→?→B) = P(B→B→B) + P(B→C→B) = 50% + 2% = 52%.

This process gets more complex as we ask for predictions higher in the chain (higher-order Markov properties), but there is an easy way: we just square the probability matrix to get the 2nd-order Markov properties, cube it to get 3rd-order, and so on. IN CLASS: what is the probability of having a shale three feet above a limestone in this example? Embedded Markov chains: The choice of sampling interval is arbitrary and important. If we sample too sparsely, we will likely miss information completely. If we sample too closely, then the diagonal elements of the probability matrix will approach 1 and the off-diagonal elements will approach zero.
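Powers of P give these multi-step probabilities directly, so both the 52% hand calculation and the in-class question can be checked. A sketch continuing the numpy example (state order A, B, C, D):

```python
P2 = np.linalg.matrix_power(P, 2)   # two-step transition probabilities
P3 = np.linalg.matrix_power(P, 3)   # three-step transition probabilities

A, B, C, D = 0, 1, 2, 3
print(round(P2[B, B], 2))           # cf. the ~52% hand calculation above
print(round(P3[B, C], 2))           # shale three steps (feet) above a limestone
```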

What if we only sample at points of "real" transitions, and ignore points where the two states are the same? In this case, the transition frequency matrix will have zeroes along the diagonal. In this example we have five lithologies: A (Ms, mudstone), B (Sh), C (Ss, siltstone), D (Ss), and E (Co).

The fixed probability vector is found by dividing the row totals by the grand total: f = [ ]. To test the Markov properties, we would like to do a χ² test, but we cannot use the fixed vector to estimate the transition frequency matrix, because the diagonal terms would be non-zero. If we did NOT ignore the repeated states, then the frequency matrix would have identical numbers except along the diagonal. If we raised this matrix to a higher power, then we could discard the diagonal terms, adjust the off-diagonal terms to sum to 1, and get our results. Since we don't know the number of repeated states, we look for the diagonal terms by trial and error.

We iterate (try over and over) to find these terms as follows (see the sketch after this list):

1) Put arbitrary large estimates (like 1000) into the diagonal positions of the observation matrix.
2) Divide the row totals by the grand total to get the diagonal probabilities.
3) Calculate new diagonal estimates by multiplying the diagonal probabilities from step 2 by the latest row sums.
4) Repeat steps 2 and 3 until the diagonal elements remain unchanged, typically after a modest number of iterations.
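A minimal sketch of that iteration, in the same numpy style. The off-diagonal counts below are hypothetical placeholders (the slide's actual five-lithology matrix does not survive in the transcript); only the loop structure follows the recipe above:

```python
import numpy as np

# Hypothetical off-diagonal transition counts for five states (A..E);
# the diagonal is zero because repeated states were ignored.
F_emb = np.array([[0, 4, 2, 1, 0],
                  [3, 0, 5, 2, 1],
                  [2, 6, 0, 4, 2],
                  [1, 2, 3, 0, 1],
                  [0, 2, 1, 1, 0]], dtype=float)

diag = np.full(5, 1000.0)                  # step 1: large initial estimates
for _ in range(100):                       # steps 2-4: iterate to convergence
    row_sums = F_emb.sum(axis=1) + diag
    p = row_sums / row_sums.sum()          # step 2: diagonal probabilities
    new_diag = p * row_sums                # step 3: new diagonal estimates
    if np.allclose(new_diag, diag):
        break
    diag = new_diag

print(np.round(diag, 1))                   # estimated repeated-state counts
```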

For our comparison, we test against independent states, where the probability that state j will follow state i is P(i→j) = P(i)·P(j). From this we construct our expected probability matrix, Pc.

We now zero out the diagonal elements and use the off-diagonal counts to calculate the χ² value for our data from eqn (5.7). We get χ² = 172, which is much larger than the critical value for ν = (m−1)² − m = 11 degrees of freedom, indicating a strong dependence in the transitions: a strong 1st-order Markov character.