Published by Alberto Sport
Modified about 1 year ago
© NERC All rights reserved

Improving Operational Geomagnetic Index Forecasting
Laurence Billingham, Gemma Kelly
British Geological Survey, West Mains Road, Edinburgh, UK

1. Introduction
The interest in space weather has never been greater, with society becoming ever more reliant upon technology and infrastructure which are potentially at risk. Geomagnetic storms are potentially damaging to power grids, communication systems and oil and gas operations.

Geomagnetic indices
Capture magnetic storm severity by summarising lots of data
Have become ubiquitous parameterisations of storm-time magnetic conditions
Required as inputs by a variety of models
The ap index:
  captures the amplitude of the disturbance in the horizontal part of the field (see e.g. [1] for more detail)
  tracks disturbances within a 3-hour interval
  indicates the global level of disturbance

2. Data
Sample times span ~15 years of geomagnetic and solar wind data
Storms are rare but important
Balance the dataset, otherwise storms look like noise
Features selected like ...
Split: training set, validation set, test set
Training set scaled; same scaling applied to the other sets
Some algorithms require ... use Principal Component Analysis to decompose ...

3. Techniques
Machine Learning
A branch of statistics
We use regression algorithms here
Data laid out as for matrix inversion (a little like finding a best-fit line with 2D data)
Many algorithms (see [2] for an excellent introduction); some are like linear regression (LR), e.g.
  LR + L2 penalty = Ridge
  LR + L1 penalty = Lasso
  LR + Lasso + Ridge = ElasticNet
Workflow:
  Training: get coefficients from the training set
  Tune model parameters against the validation set
  Test and score the model with the test set
  Predict new ap from unseen data

ARIMA
Auto-regressive integrated moving average
A linear regression over a windowed average of ap
Only input is the ap timeline
Currently operational: used here as a baseline for quality comparison

Metrics
rms: root-mean-square error
% within ±N: percentage of predicted values within ±N of the observed value
HitRate: how well do we predict the storms?
  1 = predicted every single storm
  0 = missed every storm
HSS: Heidke skill score; measures the fractional improvement of the forecast over a forecast by random chance
  HSS = 2(ad - bc) / [(a + c)(c + d) + (a + b)(b + d)]
  1 = highly skilled
  0 = no skill
  <0 = worse than random chance
FAR: false alarm rate of storm prediction
  0 = no false alarms
  1 = all false alarms

Contingency table used for HitRate, HSS and FAR:

                   Storm observed
Event forecast     Yes      No       Forecast Σ
Yes                a        b        a + b
No                 c        d        c + d
Observed Σ         a + c    b + d    a + b + c + d = n

4. Results
Initial dataset with 205 samples (small set)
Some models are much better at identifying storms than others
Large range in rms values and in the percentage of predictions which are close to the true value
We then increased the total dataset size to 1000 samples (large set) and tested the best-performing models
Again a range of rms values
All the machine learning models outperform the ARIMA model in terms of rms, HitRate and skill (HSS)
Positive results: worth pursuing for a production system
[Figure panels: Small set, Large set]

5. Summary and Future Work
Scoping study results are positive: there is value in the predictions, so proceed to an operational system
Here we only predict 1 ap interval into the future
  Some models are easily configured to predict multiple intervals
  Others need new train, validate, test cycles
Classification not regression, e.g. G1, ..., G5
  More useful aid to a human forecaster
  Potentially easier computation
  Up-weight storm categories: balance the dataset
More features per sample
  Models converge with few training samples (see fig): models are powerful enough
  Data mine human forecasts, coronagraph data...
Science potential in 'white-box' models: which features give useful info?
This work is powered by Python scikit-learn [3]

References
[1] McPherron, Magnetospheric Dynamics, in Introduction to Space Physics, edited by Kivelson and Russell, Cambridge University Press
[2] Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2009
[3] Pedregosa et al., Scikit-learn: Machine Learning in Python, JMLR 12
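The metrics above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code. The function names, the example values and the storm threshold of 40 are all assumptions made for the example. The poster gives no formula for HitRate or FAR, so the sketch assumes HitRate = a / (a + c) and FAR = b / (a + b), using the a, b, c, d cells of the contingency table. The HSS formula is the one quoted on the poster.

```python
import math


def rms_error(predicted, observed):
    """Root-mean-square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def percent_within(predicted, observed, n_units):
    """Percentage of predictions within +/- n_units of the observation."""
    close = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= n_units)
    return 100.0 * close / len(observed)


def contingency(predicted, observed, storm_threshold):
    """Count the a, b, c, d cells of the storm contingency table.

    a = storm forecast and observed, b = forecast but not observed,
    c = observed but not forecast,  d = neither forecast nor observed.
    storm_threshold is a hypothetical cut-off, not taken from the poster.
    """
    a = b = c = d = 0
    for p, o in zip(predicted, observed):
        forecast_storm = p >= storm_threshold
        observed_storm = o >= storm_threshold
        if forecast_storm and observed_storm:
            a += 1
        elif forecast_storm:
            b += 1
        elif observed_storm:
            c += 1
        else:
            d += 1
    return a, b, c, d


def hit_rate(a, b, c, d):
    """Assumed definition: fraction of observed storms that were forecast."""
    return a / (a + c)


def false_alarm_rate(a, b, c, d):
    """Assumed definition: fraction of storm forecasts that were false alarms."""
    return b / (a + b)


def heidke_skill_score(a, b, c, d):
    """HSS = 2(ad - bc) / [(a + c)(c + d) + (a + b)(b + d)]."""
    return 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))


# Made-up index values, purely for illustration.
observed = [5, 60, 12, 80, 7, 55, 9, 4]
predicted = [6, 50, 45, 70, 8, 20, 10, 5]

table = contingency(predicted, observed, storm_threshold=40)
print("a, b, c, d =", table)
print("rms        =", rms_error(predicted, observed))
print("% within 10 =", percent_within(predicted, observed, 10))
print("HitRate    =", hit_rate(*table))
print("FAR        =", false_alarm_rate(*table))
print("HSS        =", heidke_skill_score(*table))
```

The ratio functions divide by zero for degenerate tables (for example, when no storm is ever forecast), so a production version would need to handle those cases explicitly.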