Data Mining Using Eigenpattern Analysis in Simulations and Observed Data Woodblock Print, from “Thirty-Six Views of Mt. Fuji”, by K. Hokusai, ca. 1830.

Presentation transcript:

Data Mining Using Eigenpattern Analysis in Simulations and Observed Data
John B. Rundle, Department of Physics and Colorado Center for Chaos & Complexity, University of Colorado, Boulder, CO
Presented at the GEM/ACES Workshop, Maui, HI, July 30, 2001
Woodblock print from “Thirty-Six Views of Mt. Fuji”, by K. Hokusai, ca. 1830.

Activity Correlation Operators
Let $y(x_i,t)$ be the number of earthquakes per unit time at location $x_i$ and time $t$. Standardize each time series by subtracting its mean and dividing by its standard deviation: $y(x_i,t) \rightarrow z(x_i,t)$, where $z(x_i,t)$ is the centered time series. Define two correlation operators, a static correlation operator $C(x_i,x_j)$ and a rate correlation operator $K(x_i,x_j)$:
Static: $C(x_i,x_j) = \int z(x_i,t)\, z(x_j,t)\, dt$
Rate: $K(x_i,x_j) = (2\pi)^2 \int \frac{\partial z(x_i,t)}{\partial t}\, \frac{\partial z(x_j,t)}{\partial t}\, dt$
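A discrete version of the two operators might look like the sketch below. It is an illustration under stated assumptions, not the authors' code: the activity array is assumed to have shape (N sites, Q time steps), the time integrals are approximated by sums with a uniform step `dt`, the time derivative by a first difference, and the rate prefactor is exposed as a parameter that defaults to (2π)². The function names are hypothetical.

```python
import numpy as np

def standardize(y):
    """Subtract each site's mean and divide by its standard deviation.

    y has shape (N, Q): N spatial boxes, Q time steps.
    Sites with zero variance are left at zero to avoid dividing by zero.
    """
    y = np.asarray(y, dtype=float)
    mean = y.mean(axis=1, keepdims=True)
    std = y.std(axis=1, keepdims=True)
    safe_std = np.where(std > 0, std, 1.0)
    return (y - mean) / safe_std

def correlation_operators(y, dt=1.0, rate_prefactor=(2 * np.pi) ** 2):
    """Return (C, K): static and rate correlation matrices, each N x N."""
    z = standardize(y)
    # Static operator: sum over time of z_i(t) * z_j(t), times dt.
    C = (z @ z.T) * dt
    # Rate operator: same construction applied to the finite-difference
    # time derivative of z (an assumption; any derivative estimate would do).
    dz = np.diff(z, axis=1) / dt
    K = rate_prefactor * (dz @ dz.T) * dt
    return C, K
```

With this layout, both matrices are symmetric by construction, and each diagonal entry of C equals Q·dt because every standardized series has unit variance.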

Diagonalize the Correlation Operators
$C(x_i,x_j)$ and $K(x_i,x_j)$ are both symmetric, square, and positive definite matrix operators. We can therefore apply singular value decomposition to find the eigenvectors and eigenvalues:
$C(x_i,x_j) = \Phi \Lambda^2 \Phi^T$
$K(x_i,x_j) = \Psi \Omega^2 \Psi^T$
where $T$ denotes the transpose. $\Phi$ is a matrix of static eigenpatterns $\phi_n(x_i)$, $\Lambda^2$ is a diagonal matrix of eigenprobabilities $\lambda_i^2$, $\Psi$ is a matrix of rate eigenpatterns $\psi_n(x_i)$, and $\Omega^2$ is a diagonal matrix of eigenfrequencies $\omega_i^2$.
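As a rough sketch of the diagonalization step, the snippet below uses NumPy's symmetric eigensolver `numpy.linalg.eigh` in place of a singular value decomposition; for a symmetric, positive semi-definite matrix the two give the same patterns. The function name `eigenpatterns` is hypothetical, and the descending sort order is a convenience choice rather than something stated in the slides.

```python
import numpy as np

def eigenpatterns(M):
    """Diagonalize a symmetric correlation matrix.

    Returns (patterns, eigenvalues): the columns of `patterns` are
    orthonormal eigenvectors, ordered from largest to smallest eigenvalue.
    """
    M = np.asarray(M, dtype=float)
    # Symmetrize to suppress any round-off asymmetry before calling eigh.
    M = 0.5 * (M + M.T)
    eigvals, eigvecs = np.linalg.eigh(M)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order], eigvals[order]

# Small illustrative matrix (made-up numbers, not data from the talk).
example = np.array([[2.0, 1.0],
                    [1.0, 2.0]])
patterns, values = eigenpatterns(example)
```

Applied to the static operator, the columns of `patterns` would play the role of the static eigenpatterns; applied to the rate operator, the rate eigenpatterns. The overall sign of each eigenvector is arbitrary, so only the relative signs within a pattern carry meaning.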

Comparison of Eigenpatterns 1,2 for   0 (Top) with Eigenpatterns 1,2 for  = 0 (Bottom) Positively correlated: (red - red) & (blue - blue). Negatively correlated: (red - blue). Uncorrelated: (red - green) & (blue - green). JBR et al, Phys. Rev. E., v 61, 2000, & AGU Monograph “GeoComplexity & the Physics of Earthquakes”

Patterns of Earthquakes in Southern California Earthquakes in southern California have been systematically recorded since The rate at which these events occur can be used to define activity time series in 10 km x 10 km spatial boxes that can be used to find the spatial patterns. Above is a map of the relative intensity of seismic activity in southern California, This can be considered to be a seismic “hazard map”. Below is a map of the first PCA mode, which we call the “Hazard Mode”. Red areas tend to be active or inactive at the same time. Above is a map of the second PCA mode, which we call the “Landers Mode”. Red areas tend to be inactive when blue areas are active & vice versa. All sites in a blue or red area tend to be active (or inactive) at the same time. Figures courtesy KF Tiampo

Comparison of Log Likelihoods for PDPC from 500 random catalogs of seismic activity in Southern California with occurrence of future events (M > 5) with Log Likelihoods of hazard map & actual catalog via PDPC.
Actual catalog: PDPC for 1978-Dec 31, 1991.
Histogram: Log Likelihoods for 500 random catalogs.
RSV: Use hazard map as predictor.
Actual PDPC: Plot at left.
Example of a PDPC arising from a catalog that has been randomized in space and time.

Earthquake Forecasting via the Mathematics of Quantum Mechanics
Pattern techniques suggest a new approach to forecasting earthquakes. The idea is to view the patterns in the context of PHASE DYNAMICAL SYSTEMS, whose mathematics can be mapped into the mathematics of QUANTUM MECHANICS. See JB Rundle et al. (2000); KF Tiampo et al. (2000).
In the PDPC method, intensity of seismic activity is mapped to a “wave function ψ(x,t)”.
Using this new technique, one can compute the Phase Dynamical Probability Change (PDPC) anomalies that develop during the years Our retrospective studies indicate that colored anomalies can be regarded as indicating high probability for current and future major earthquakes (M > 6) over the period ~ , and have considerable forecast skill.

Testing the Forecast
One way to test the forecast for events from is to plot all events with M > 4.0 that have occurred since Jan 1, 2000, superposed upon the colored forecast anomalies. These events are the small circles at right. Note that our method should really only forecast events with M > 6.0.

Space-Time Patterns in Complex Multi-Scale Earthquake Fault Systems
Since much of the dynamics is not accessible to direct observation, we must focus on learning about the system through analysis of the observable patterns.
Space-time patterns in the system are mathematical expressions of the strong statistical correlations between various parts of the system.
The system state vector characterizes the current state of the system; it has an amplitude and a phase angle.

Mapping Earthquake Dynamics into the Mathematics of Quantum Mechanics (or “Phase Dynamics”) (JB Rundle et al., Phys. Rev. E, v61, 2416, 2000)
This new technique can be regarded as a novel data-mining method.
Quantum mechanical systems are strongly correlated systems (QM is a nonlocal theory).
The mathematics of QM describes systems with periodic and quasiperiodic observables, as well as hidden variables.
Relative probabilities are well-defined quantities in QM.
Normalized system state vectors are actually “WAVE FUNCTIONS” that describe earthquake probability amplitudes.

An Earthquake Forecast?
Using our technique, we can compute the PDPC anomalies that develop during the years Our retrospective studies indicate that these anomalies can be regarded as forecasts for major earthquakes (M > 6) over the period ~

Earthquake Fault System Dynamics are Strongly Correlated in Space and Time and Lead to Patterns Data from Last Tuesday PDPC Forecast for ~

Summary & Future Directions
The methods described here can be used to understand many classes of driven threshold systems.
Network dynamics are determined in large part by the network connectivity, as well as by the details of the nonlinear threshold process.
Meanfield threshold systems appear to have locally ergodic behavior.
Space-time patterns of observable failures (earthquakes) can be used to understand many facets of the underlying, unobservable dynamics (physical state variables).

Boolean Correlation Operators and Space-Time Patterns
We can define a set of basis patterns of earthquake activity using Boolean correlation operators. To do so, we need to define a Boolean activity time series $y(x_i,t)$.
As a first step, we coarse grain the domain in space and time: we divide the region of interest into N boxes (say, ~10 km on a side) and time into a series of Q short intervals (say, 8 hours).
If an earthquake occurs in the spatial box centered at $x_i$ during the interval at time $t$, we set $y(x_i,t) = 1$; otherwise $y(x_i,t) = 0$.
We therefore have a set of N time series, each Q elements long: $y(x_i,t)$ = 0,0,0,1,0,0,0,0,0,0,1,0,0,0,0… etc.
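The coarse-graining step can be sketched as follows. This is a minimal illustration with assumptions that are not in the slides: events are given as hypothetical `(x_km, y_km, t_hours)` tuples in a flat, planar coordinate system, the region is a rectangular grid of `nx` by `ny` boxes, and the function and parameter names are invented for the example.

```python
import numpy as np

def boolean_activity(events, x_min, y_min, box_km, nx, ny,
                     t_start, dt_hours, n_steps):
    """Build the Boolean activity array y with shape (N, Q).

    N = nx * ny spatial boxes, Q = n_steps time intervals.
    y[i, q] is 1 if at least one event fell in box i during interval q,
    and 0 otherwise. Events outside the grid or time window are ignored.
    """
    activity = np.zeros((nx * ny, n_steps), dtype=np.uint8)
    for x, y, t in events:
        ix = int((x - x_min) // box_km)
        iy = int((y - y_min) // box_km)
        it = int((t - t_start) // dt_hours)
        if 0 <= ix < nx and 0 <= iy < ny and 0 <= it < n_steps:
            activity[iy * nx + ix, it] = 1
    return activity

# Made-up events on a 2 x 1 grid of 10 km boxes with 8-hour intervals.
events = [(5.0, 5.0, 1.0), (15.0, 5.0, 9.0), (5.0, 5.0, 1.5), (-3.0, 0.0, 0.0)]
y_bool = boolean_activity(events, x_min=0.0, y_min=0.0, box_km=10.0,
                          nx=2, ny=1, t_start=0.0, dt_hours=8.0, n_steps=2)
```

Each row of the result is one site's time series of zeros and ones. Two events in the same box and interval still produce a single 1, which is what makes the series Boolean rather than a count.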

Boolean Activity Eigenpatterns from Simulations
Here we show Static or Activity Eigenpatterns from the simulation. These constitute one possible basis set for all possible space-time patterns displayed by the system.
Key to Correlation Patterns:
Red sites are positively correlated with red (and blue with blue).
Red sites are negatively correlated with blue.
Red sites & blue sites are uncorrelated with green.
The Activity Eigenpatterns are RELATIVE PROBABILITY AMPLITUDES.
(JBR et al, Phys. Rev. E., v 61, 2000, & AGU Monograph “GeoComplexity & the Physics of Earthquakes”)

Summary & Future Directions
Numerical simulations (“Third Leg of Science”) are now being used to understand many classes of driven threshold systems (systems with many scales of length and time).
Network dynamics of these complex systems are determined in large part by the network connectivity, as well as by the details of the nonlinear threshold process.
Meanfield threshold systems have dynamics that demonstrate first and second order (phase) transitions.
Threshold systems are capable of universal computation such as that which occurs in the human brain.
Space-time patterns of observable failures (earthquakes) can be used to understand many facets of the underlying, unobservable dynamics (physical state variables).