Tom.h.wilson Department of Geology and Geography West Virginia University Morgantown, WV.

tom.h.wilson, wilson@geo.wvu.edu, Department of Geology and Geography, West Virginia University, Morgantown, WV

Some final comments on principal components analysis and factor analysis

[Figure panels: 2 dimensions; 13 dimensions]

The results of all this data acquisition reveal something which, fortunately, appears to be related to the problem we are trying to solve. How can we differentiate between productive and non-productive mining areas?

Given the elemental concentrations below for observation 43, we compute its principal component scores using the coefficients listed at lower left.

Computing the principal component scores:

$$\mathrm{PC1} = L_{Ti}\,Con_{Ti} + L_{Mn}\,Con_{Mn} + \dots + L_{Au}\,Con_{Au}$$

where $L$ represents the loading, coefficient, or weighting factor for the elemental concentrations $Con_{Ti}$, $Con_{Mn}$, etc.

$$\mathrm{PC1} = 0.381 \times 12200 + 0.271 \times 5200 + 0.126 \times 1.5 + \dots + 0.180 \times 0.01$$

$$\mathrm{PC1}(43) = 2821.5$$

PC1 is the x location on the above plot, and PC2 is the y location. PC2 is calculated using the appropriate coefficients for the 2nd principal component.

$$\mathrm{PC2} = -0.147 \times 12200 - 0.237 \times 5200 + 0.236 \times 1.5 + \dots + 0.007 \times 0.01$$

$$\mathrm{PC2}(43) = -456.1$$
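A score is just a weighted sum: each concentration times its loading, added up. The sketch below evaluates only the four terms written out explicitly on the slide (Ti, Mn, Ag, Au). The other nine concentrations for observation 43 are not given here, so the result is a partial sum and is not expected to match PC1(43) or PC2(43). The names `weighted_sum`, `partial_pc1`, and `partial_pc2` are illustrative.

```python
# Loadings and concentrations for the four terms shown explicitly on the slide.
# The remaining nine elements are omitted because their concentrations
# for observation 43 are not listed.
loadings_pc1 = {"Ti": 0.381, "Mn": 0.271, "Ag": 0.126, "Au": 0.180}
loadings_pc2 = {"Ti": -0.147, "Mn": -0.237, "Ag": 0.236, "Au": 0.007}
concentrations = {"Ti": 12200, "Mn": 5200, "Ag": 1.5, "Au": 0.01}


def weighted_sum(loadings, conc):
    """Sum of loading * concentration over the elements in `loadings`."""
    return sum(loadings[element] * conc[element] for element in loadings)


# Partial sums only: four of the thirteen terms.
partial_pc1 = weighted_sum(loadings_pc1, concentrations)
partial_pc2 = weighted_sum(loadings_pc2, concentrations)

print("partial PC1 (4 of 13 terms):", partial_pc1)
print("partial PC2 (4 of 13 terms):", partial_pc2)
```

With all thirteen concentrations in the `concentrations` dictionary and all thirteen loadings in each loadings dictionary, the same function would return the full scores.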

Variable    PC1      PC2
Ti          0.381   -0.147
Mn          0.271   -0.237
Ag          0.126    0.236
Ba          0.384    0.070
Co          0.294   -0.371
Cr          0.271    0.452
Cu         -0.247    0.209
Ni          0.310    0.313
Pb          0.358    0.228
Sr         -0.314   -0.081
V           0.107   -0.479
Zn          0.174   -0.299
Au          0.180    0.007

What would we get if we ran principal components analysis using concentrations for only the 5 elements with the highest loadings?

[Figure labels: Productive; Non-productive]

We are able to obtain similar groupings of our data into productive and non-productive areas. We also obtain similar associations for our unknowns. We might be able to save money by limiting our assay to Ti, Ba, Ni, Pb, and Sr.
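The five elements named here are the ones with the largest-magnitude PC1 loadings in the table. The sketch below ranks by absolute value of the PC1 loading, which is an assumption about how "highest loadings" was meant (Sr qualifies only because its negative loading is large in magnitude). The names `pc1_loadings`, `ranked`, and `top5` are illustrative.

```python
# PC1 loadings copied from the table of coefficients.
pc1_loadings = {
    "Ti": 0.381, "Mn": 0.271, "Ag": 0.126, "Ba": 0.384, "Co": 0.294,
    "Cr": 0.271, "Cu": -0.247, "Ni": 0.310, "Pb": 0.358, "Sr": -0.314,
    "V": 0.107, "Zn": 0.174, "Au": 0.180,
}

# Assumption: "highest loading" means largest absolute value, so a strongly
# negative loading counts as much as a strongly positive one.
ranked = sorted(pc1_loadings, key=lambda e: abs(pc1_loadings[e]), reverse=True)
top5 = ranked[:5]

print(top5)
```

Re-running the principal components analysis on just these five columns is a separate step; this only shows how the reduced element list can be chosen.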

A “noise-free” data set and its autocorrelation: this simulated data set is composed of two periodic components. The presence of the two components is easily seen in either the raw data or its autocorrelation.

In the presence of other influences (measurement error or a process influenced by many variables but controlled by only a few as in our multivariate analysis) our data may not be so easily interpretable. The autocorrelation helps clean it up and reveal the presence of dominant cyclical components.
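A minimal sketch of this idea, assuming NumPy is available. The periods (50 and 20 samples), amplitudes, noise level, and record length are made-up values for illustration, not the ones used in the slides. The function `autocorrelation` is a hypothetical helper built on `np.correlate`.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Normalized autocorrelation for lags 0..max_lag (lag 0 equals 1)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                        # remove the mean first
    full = np.correlate(x, x, mode="full")  # lags -(n-1) .. +(n-1)
    zero_lag = len(x) - 1                   # index of lag 0 in `full`
    acf = full[zero_lag:zero_lag + max_lag + 1]
    return acf / acf[0]


# Assumed test signal: two periodic components buried in random noise.
rng = np.random.default_rng(0)
t = np.arange(1000)
signal = np.cos(2 * np.pi * t / 50) + 0.5 * np.cos(2 * np.pi * t / 20)
noisy = signal + rng.normal(0.0, 1.0, t.size)

acf = autocorrelation(noisy, max_lag=200)
print("lag 25 :", round(acf[25], 2))    # half of the 50-sample period
print("lag 100:", round(acf[100], 2))   # whole number of both periods
```

Uncorrelated noise contributes mainly at lag 0, so the periodic components stand out at the other lags: the autocorrelation is negative near lag 25 and positive again near lag 100, where both cycles line up.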

The amplitudes of the different frequency components are represented in the upper plot. The relative phase shifts imposed on the set of cosine waves are defined by the second plot from the top. We noted that time and spatial views of our data can actually be constructed from a sum of cosine and/or sine waves (in time or space).
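As a sketch, each component is a cosine with its own amplitude, frequency, and phase shift, and the data are the sum. NumPy is assumed; the function name `sum_of_cosines` and the example amplitudes, frequencies, and phases are illustrative, not values from the plots.

```python
import numpy as np


def sum_of_cosines(t, amplitudes, frequencies, phases):
    """Return sum over k of A_k * cos(2*pi*f_k*t + phi_k)."""
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t)
    for a, f, phi in zip(amplitudes, frequencies, phases):
        total += a * np.cos(2.0 * np.pi * f * t + phi)
    return total


# Assumed example: three components with different amplitudes and phases.
t = np.linspace(0.0, 1.0, 500, endpoint=False)
y = sum_of_cosines(
    t,
    amplitudes=[1.0, 0.5, 0.25],
    frequencies=[2.0, 5.0, 11.0],     # cycles per unit of t
    phases=[0.0, np.pi / 4, np.pi / 2],
)
print(y[:5])
```

A phase of pi/2 turns a cosine into a (negative) sine, which is why a set of phase-shifted cosines covers the "cosines and/or sines" case.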

The data you are looking at can range from simple to complex, but they can usually be broken down into a series of individual spectral components.

Even when our data have abrupt changes in value, it is still possible to replicate these details using a sum of sines and cosines. A data set depicting the amplitude and frequency of the different sines and cosines used to create the temporal or spatial features in your data is referred to as the amplitude spectrum.
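A square wave is a standard example of data with abrupt changes. The sketch below, assuming NumPy, sums the first few odd sine harmonics of the textbook square-wave Fourier series and reports how the misfit shrinks as terms are added. The function name `square_wave_partial_sum` and the term counts are illustrative choices.

```python
import numpy as np


def square_wave_partial_sum(t, n_terms, period=1.0):
    """Fourier partial sum: (4/pi) * sum of sin(2*pi*k*t/T)/k over odd k."""
    t = np.asarray(t, dtype=float)
    k = np.arange(1, 2 * n_terms, 2)                  # 1, 3, 5, ...
    harmonics = np.sin(2.0 * np.pi * np.outer(k, t) / period) / k[:, None]
    return (4.0 / np.pi) * harmonics.sum(axis=0)


t = np.linspace(0.0, 1.0, 1000, endpoint=False)
target = np.sign(np.sin(2.0 * np.pi * t))             # ideal square wave

# RMS misfit between the ideal square wave and each partial sum.
rms_errors = []
for n_terms in (1, 5, 50):
    approx = square_wave_partial_sum(t, n_terms)
    rms_errors.append(np.sqrt(np.mean((approx - target) ** 2)))
    print(n_terms, "terms, RMS misfit:", round(rms_errors[-1], 3))
```

The amplitudes 4/(pi k) attached to each harmonic k are exactly what an amplitude spectrum of the square wave would display.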

Given more complicated data sets like the ones we were analyzing before, the autocorrelation and cross correlation give us some idea of the frequency or wavelength of embedded cyclical components. We would guess that the amplitude spectrum should reveal certain prominent frequencies.

We also examined oxygen isotope data from the Caribbean and Mediterranean using autocorrelation and cross correlation methods and found indications of pronounced cyclical variation through time.

The autocorrelation and amplitude spectrum of the Caribbean Sea δ18O variations. [Figure annotation: 125,000 ?]

Three components representing an ideal model of the “Milankovitch” cycles, with periods of 100,000 years, 41,000 years, and 21,000 years. The real world is not that simple. [Figure: the superposition of all influences over a 500,000-year period of time.]

Variations in orbital parameters computed over 5 million and 1 million year time frames.

Summation of these responses over the past 800,000 years yields a complicated function that might be viewed as controlling Earth's climate.

The composite response calculated over the past 5 million years and its amplitude spectrum. The astronomical components show up as separate peaks in the amplitude spectrum, and the outcome is a little more complicated than the simple three-component forcing model.

Anyone recall what the Nyquist frequency is? Recall, this frequency is related to the sampling interval. What is the maximum frequency you can see when sampling at a given sample interval $\Delta t$?

$$f_{Ny} = \frac{1}{2\,\Delta t}$$
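A short sketch of what the Nyquist limit means in practice, assuming NumPy. A cosine above the Nyquist frequency, once sampled, is indistinguishable from a lower-frequency "alias". The sample interval and the 0.8 cycles-per-sample test frequency are arbitrary illustration values.

```python
import numpy as np


def nyquist_frequency(dt):
    """Highest frequency resolvable at sample interval dt: 1 / (2 * dt)."""
    return 1.0 / (2.0 * dt)


dt = 1.0                          # sample interval (arbitrary units)
f_ny = nyquist_frequency(dt)      # 0.5 cycles per unit

# A frequency above Nyquist and the lower frequency it aliases to.
f_high = 0.8
f_alias = abs(f_high - 1.0 / dt)  # folds back to about 0.2

n = np.arange(50)
samples_high = np.cos(2.0 * np.pi * f_high * n * dt)
samples_alias = np.cos(2.0 * np.pi * f_alias * n * dt)

# At the sample times the two cosines take the same values.
same = bool(np.allclose(samples_high, samples_alias))
print("Nyquist frequency:", f_ny)
print("samples identical:", same)
```

For example, data sampled every 1,000 years have a Nyquist frequency of 0.0005 cycles per year, so no cycle shorter than 2,000 years can be seen.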

In today’s lab exercise you’ll simulate noisy climate data containing “hidden” Milankovitch cycles and then compute its amplitude spectrum.
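One possible way to set this up in Python with NumPy is sketched below; it is not the lab's own procedure or software. The 100, 41, and 21 thousand-year periods come from the idealized model above, while the component amplitudes, noise level, 1,000-year sample interval, and 2-million-year record length are assumptions. The helper `amplitude_spectrum` is a hypothetical name.

```python
import numpy as np


def amplitude_spectrum(x, dt):
    """One-sided amplitude spectrum of x sampled at interval dt."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x - x.mean())    # remove the mean (zero frequency)
    freqs = np.fft.rfftfreq(n, d=dt)
    # Factor 2/n recovers cosine amplitudes (not exact at the Nyquist bin).
    amps = 2.0 * np.abs(spectrum) / n
    return freqs, amps


# Assumed simulation settings; time is in thousands of years (kyr).
rng = np.random.default_rng(0)
dt = 1.0
t = np.arange(2000) * dt
periods = [100.0, 41.0, 21.0]               # kyr, idealized model
amplitudes = [1.0, 0.6, 0.4]                # assumed relative strengths
signal = sum(a * np.cos(2.0 * np.pi * t / p)
             for a, p in zip(amplitudes, periods))
noisy = signal + rng.normal(0.0, 1.0, t.size)

freqs, amps = amplitude_spectrum(noisy, dt)

# Skip the zero-frequency bin when looking for the strongest peak.
k = 1 + np.argmax(amps[1:])
dominant_period = 1.0 / freqs[k]
print("dominant period (kyr):", dominant_period)
```

The cycles are hard to pick out by eye in `noisy`, but they appear as separate peaks in `amps` near frequencies of 1/100, 1/41, and 1/21 cycles per kyr.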
