Published by Ally Hibbs
© Crown copyright 2007
Cluster analysis of mean sea level pressure fields and multidecadal variability
David Fereday, Jeff Knight, Adam Scaife, Chris Folland, Andreas Philipp
13 March 2007
Introduction
- Use cluster analysis to examine circulation variability
- Are genuine clusters present in MSLP data?
- Stability of different numbers of clusters
- Multidecadal variability and links with SST
Data
- EMSLP dataset – daily mean MSLP fields
- NAE region – 25°N-70°N, 70°W-50°E
- 5 degree x 5 degree resolution
Methods
- Divide data into two month seasons
- Seasonally varying climatology removed
- Apply cluster analysis to fields in each season separately
- Aim is to characterise daily variability – no low pass filtering applied
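As a rough illustration of removing a seasonally varying climatology, the sketch below subtracts a per-calendar-day mean from a daily series at one grid point. The slides do not say how the climatology was computed, so the per-calendar-day mean, the function name `remove_climatology` and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict


def remove_climatology(days, values):
    """Subtract the mean for each calendar day from every value.

    days   -- calendar-day keys, e.g. (month, day) tuples, one per sample
    values -- MSLP values at one grid point, same length as days
    Assumption: the climatology is a plain mean per calendar day.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for d, v in zip(days, values):
        sums[d] += v
        counts[d] += 1
    climatology = {d: sums[d] / counts[d] for d in sums}
    # Anomaly = observed value minus the climatological value for that day
    return [v - climatology[d] for d, v in zip(days, values)]


# Hypothetical two-year sample for 1 and 2 January (values in hPa)
days = [(1, 1), (1, 2), (1, 1), (1, 2)]
mslp = [1010.0, 1000.0, 1020.0, 1004.0]
anomalies = remove_climatology(days, mslp)
print(anomalies)
```

In practice the same subtraction would be applied at every grid point before clustering, so that the clusters describe departures from the seasonal cycle rather than the cycle itself.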
Cluster algorithm
- Variant of k-means
- Specify number of clusters beforehand
- Each field belongs to one cluster
- Random initial allocation
- Minimise within cluster variance by exchanging fields
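The exchange idea can be sketched as follows: start from a random allocation, then move one field at a time to another cluster whenever that lowers the total within cluster variance. This is a minimal greedy sketch, not the authors' implementation; the function names, the tiny two-dimensional "fields" and the stopping rule are assumptions.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def total_variance(fields, labels, k):
    """Sum of squared distances from each field to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [f for f, l in zip(fields, labels) if l == c]
        if members:
            mu = centroid(members)
            total += sum(sq_dist(f, mu) for f in members)
    return total


def exchange_kmeans(fields, k, seed=0):
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in fields]  # random initial allocation
    best = total_variance(fields, labels, k)
    improved = True
    while improved:  # stop when no single move lowers the variance
        improved = False
        for i in range(len(fields)):
            keep = labels[i]
            for c in range(k):
                if c == keep:
                    continue
                labels[i] = c  # try moving field i to cluster c
                trial = total_variance(fields, labels, k)
                if trial < best - 1e-12:
                    best, keep, improved = trial, c, True
            labels[i] = keep  # settle on the best cluster found for field i
    return labels, best


# Hypothetical data: two well separated groups of 2-D "fields"
fields = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, best = exchange_kmeans(fields, k=2)
print(labels, best)
```

A greedy search like this stops at the first allocation that no single move can improve, which may be a local rather than the global minimum.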
Simulated annealing
- Aim to avoid local minima
[Figure: total variance against alternative clusters, marking a local minimum and the global minimum, with k-means and simulated annealing labelled]
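One common way to apply simulated annealing to this problem is to accept a variance-increasing move with probability exp(-delta / T) and to lower the temperature T gradually. The slides give no schedule or acceptance rule, so the Metropolis rule, the parameters `t_start`, `cooling` and `sweeps`, and the sample data below are all assumptions.

```python
import math
import random


def within_variance(fields, labels, k):
    """Total squared distance of fields to their cluster centroids."""
    total = 0.0
    for c in range(k):
        members = [f for f, l in zip(fields, labels) if l == c]
        if members:
            mu = [sum(col) / len(members) for col in zip(*members)]
            total += sum(sum((x - m) ** 2 for x, m in zip(f, mu))
                         for f in members)
    return total


def anneal_clusters(fields, k, t_start=1.0, cooling=0.95, sweeps=200, seed=0):
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in fields]  # random initial allocation
    current = within_variance(fields, labels, k)
    best_labels, best = labels[:], current
    temp = t_start
    for _ in range(sweeps):
        for i in range(len(fields)):
            old = labels[i]
            new = rng.randrange(k)
            if new == old:
                continue
            labels[i] = new
            trial = within_variance(fields, labels, k)
            delta = trial - current
            # Always accept improvements; sometimes accept worse moves
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current = trial
                if current < best:
                    best, best_labels = current, labels[:]
            else:
                labels[i] = old  # reject the move
        temp *= cooling  # cool down so uphill moves become rarer
    return best_labels, best


# Hypothetical data: two well separated groups of 2-D "fields"
fields = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
sa_labels, sa_best = anneal_clusters(fields, k=2)
print(sa_labels, sa_best)
```

Accepting occasional uphill moves early on is what lets the search leave a local minimum that a purely greedy exchange would be stuck in.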
Are there clusters in MSLP fields?
- Algorithm produces clusters whether any present or not
- If clusters are present, there must be a fixed number of them
- Number of clusters is specified beforehand – how is this number decided?
- Try to find local minima of total within cluster variance
- For all but small numbers of clusters, many different alternatives
[Figure: local minima and the global minimum of total within cluster variance]
Pie slices not clusters
Cluster centroids don’t match
Cluster stability
- Best estimate of global minimum variance
- Clusters stable to removal of data?
Cluster stability method - schematic
- Start with full set of data
- Form clusters
- Go back to full data set
- Remove half of the data
- Form clusters
- Pair up clusters with originals
- Count the days that match up
Stability measure
- Repeat analysis 100 times
- Ratio of days that match to total days
- Stability change with number of clusters
- Optimum number?
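The matching ratio for a single repeat might be computed as in the sketch below: given the cluster label of each retained day from the full-data run and from the half-data run, pair the clusters so that the number of agreeing days is largest, then divide by the number of days compared. The slides do not state how clusters were paired, so the brute-force pairing, the function name `stability_ratio` and the sample labels are assumptions.

```python
from itertools import permutations


def stability_ratio(full_labels, half_labels, k):
    """Fraction of days whose clusters agree after the best pairing.

    full_labels -- cluster of each retained day in the full-data clustering
    half_labels -- cluster of the same days in the half-data clustering
    """
    # overlap[a][b] = days in full-data cluster a and half-data cluster b
    overlap = [[0] * k for _ in range(k)]
    for a, b in zip(full_labels, half_labels):
        overlap[a][b] += 1
    # Try every one-to-one pairing of clusters and keep the best match count
    best = max(sum(overlap[a][perm[a]] for a in range(k))
               for perm in permutations(range(k)))
    return best / len(full_labels)


# Hypothetical labels for eight retained days and three clusters
full = [0, 0, 1, 1, 2, 2, 2, 0]
half = [2, 2, 0, 0, 1, 1, 0, 2]
ratio = stability_ratio(full, half, 3)
print(ratio)  # 0.875
```

Averaging this ratio over the 100 repeats would give one stability value per number of clusters. Trying every permutation is only practical for small numbers of clusters; an assignment algorithm would be the usual replacement for larger ones.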
JF cluster stability
[Figure legend: JF (blue); label of the (red) series not recoverable]
Cluster conclusions
- Many local minima - no strong clustering
- Stability reduced as clusters increase
- No optimum number of clusters
- Choice of number of clusters is subjective
- Clusters are nevertheless useful!
Multidecadal variability
- 10 clusters per season
- Circulation variability - frequency time series
- Variability on many different timescales
- Low pass filter (25 year half power)
- SST links via regression analysis
- HadISST from month before MSLP season
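A cluster frequency time series and a simple regression against an SST value can be sketched as below. The low pass filtering step is left out because the slides give only its half power period, not the filter itself. The function names, the yearly grouping, the single SST index and all sample numbers are assumptions for illustration.

```python
from collections import Counter


def cluster_frequency(years, labels, cluster):
    """Fraction of days in each year that fall in the given cluster."""
    days_per_year = Counter(years)
    hits = Counter(y for y, l in zip(years, labels) if l == cluster)
    return {y: hits[y] / days_per_year[y] for y in sorted(days_per_year)}


def regression_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical daily cluster labels for three seasons
years = [2000, 2000, 2001, 2001, 2002, 2002]
labels = [0, 1, 0, 0, 1, 1]
freq = cluster_frequency(years, labels, cluster=0)

# Hypothetical SST anomaly (one value per year) from the month before
sst = [0.1, 0.3, -0.1]
slope = regression_slope(sst, [freq[y] for y in sorted(freq)])
print(freq, slope)
```

Repeating the regression at each SST grid point, rather than for a single index, would give a map of the kind the slides describe.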
Multidecadal variability in time series
July / August – summer NAO / AMO links
[Figure panels: Negative summer NAO; Positive summer NAO]
Clusters match JA EOF1 series (black)
AMO index / cluster frequencies
November / December – links to IPO?
Conclusions
- No genuine clusters, but clusters still useful
- Clusters relate to EOF time series
- Reproduce known relationships with SST
- Many results – hint at new SST links