Time Series Data Analysis - I. Yaji Sripada, Dept. of Computing Science, University of Aberdeen.

Presentation transcript:

Time Series Data Analysis - I Yaji Sripada

In this lecture you learn
- What are time series?
- How to analyse time series?
  - Pre-processing
  - Trend analysis
  - Pattern analysis

Introduction
- What are time series?
  - Values of a variable measured at different time points
- Why are time series important?
  - Many domains produce large volumes of time series
    - Meteorology: weather simulations predict values of dozens of weather parameters, such as temperature and rainfall, at hourly intervals
    - Gas turbines carry hundreds of sensors that measure parameters such as fuel intake and rotor temperature every second
    - Neonatal Intensive Care Units (NICU) measure physiological data such as blood pressure and heart rate every second
  - Time series reveal the temporal behaviour of the underlying mechanism that produced the data

Example (Gas Turbine)
- A time series is a sequence of
  - Values and
  - Their corresponding timestamps (the times at which the values hold)

Time Series Autocorrelation
- Autocorrelation is a special property of time series
  - Each value of a time series is correlated with older values from the same series
  - This means that the measurements in a time series are not independent
  - The periodic patterns seen on the gas turbine plot in the previous slide are a result of autocorrelation
- Time series analysis is special because of this temporal dependency among the values of a series
  - A time series exhibits internal structure
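The slides do not give a formula for autocorrelation. As a minimal sketch, assuming the usual sample autocorrelation at a fixed lag, the dependency between a series and its own older values can be measured as below. The function name `autocorrelation` and the example series are illustrative only.

```python
from statistics import fmean


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at a positive integer lag."""
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = fmean(values)
    deviations = [v - mean for v in values]
    variance_sum = sum(d * d for d in deviations)
    if variance_sum == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Covariance between the series and a copy of itself shifted by `lag`
    covariance_sum = sum(deviations[i] * deviations[i + lag] for i in range(n - lag))
    return covariance_sum / variance_sum


# Hypothetical series that repeats every 4 samples
series = [0, 1, 0, -1] * 5
print(autocorrelation(series, 4))  # strongly positive at the period
print(autocorrelation(series, 2))  # negative at half the period
```

A value near zero at every lag would suggest the measurements behave as if independent. Large values at particular lags point to the internal structure the slide describes.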

Analysis of Time Series
- Three main steps
  - Pre-processing
  - Trend analysis
  - Pattern analysis
- Not all applications require all three steps
  - Knowledge acquisition studies provide the guidance needed to determine the required steps
- Pre-processing
  - The raw input series may be noisy
    - Due to errors in measurement or observation
  - The data needs to be smoothed to remove noise
  - There are many noise removal techniques, also known as filters, such as
    - Moving averages or mean filter
    - Median filter

Example Series
[Table with columns Time and X; the values were not preserved in this transcript.]

Rate of change is sensitive to noise
[Table with columns Time, X and Rate of change; the values were not preserved in this transcript.]

Mean Filter
- There are many versions
- Our version (weighted average method)
  - Assume a time window of size T for the filter
  - dT is the difference in time between two successive values
  - For each value in the series, compute
    current smoothed value = ((previous smoothed value * T) + (current value * dT)) / (T + dT)
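The weighted average update above can be sketched in a few lines. The slide does not say how the first smoothed value is chosen, so this sketch assumes it is simply the first raw value. The function name `weighted_mean_filter` is illustrative.

```python
def weighted_mean_filter(times, values, window):
    """Smooth `values` with the weighted average update from the slide.

    `window` is the window size T; dT is taken from successive timestamps.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if not values:
        return []
    # Assumption (not stated in the slides): seed with the first raw value
    smoothed = [values[0]]
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        previous = smoothed[-1]
        smoothed.append((previous * window + values[i] * dt) / (window + dt))
    return smoothed


# Hypothetical data: a jump from 10 to 20 is followed only gradually
result = weighted_mean_filter([0, 1, 2], [10, 20, 20], window=1)
print(result)  # [10, 15.0, 17.5]
```

A larger T gives more weight to the previous smoothed value and so produces a smoother, slower-reacting series.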

Smoothing
[Table with columns Time, X, Smoothed X and Rate of change; the values were not preserved in this transcript.]

Median Filter
- The idea is similar to the mean filter
- Instead of the mean of the selected values, we use their median
- Note: in our version of the mean filter we did not compute a simple mean (average) of the selected values
  - We used a weighted average
- The median filter is known to perform better in the presence of outliers
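The slides do not specify how values are selected for the median. The sketch below assumes a centred window with an odd number of samples that is shortened at the ends of the series. The name `median_filter` and the example data are illustrative.

```python
from statistics import median


def median_filter(values, window_size=3):
    """Replace each value by the median of a centred window around it."""
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size must be a positive odd number")
    half = window_size // 2
    smoothed = []
    for i in range(len(values)):
        # Assumption: the window is truncated at the series boundaries
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


# Hypothetical series with a single outlier (100)
print(median_filter([1, 2, 100, 3, 4], window_size=3))  # [1.5, 2, 3, 4, 3.5]
```

The outlier disappears from the output entirely, whereas any mean-based filter would spread part of it over neighbouring values.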

Trend Analysis
- Trends can be established using
  - line fitting techniques for linear data
  - curve fitting techniques for non-linear data
- Line fitting techniques for time series are more popularly called segmentation techniques
- Many segmentation algorithms
  - Sliding window
  - Top-down
  - Bottom-up and
  - Others (genetic algorithms, wavelets, etc.)
- All segmentation algorithms have different flavours of implementation within the main method
  - We only learn the main method
- Segmentation in general can be viewed as a search
  - for the best possible combination of segments
  - in the space of all possible segments

Segmentation
- The curve at the top shows the original time series
- The next graphic is the piecewise linear representation, or segmented version, of it
- The segmented version of the time series is an approximation of the original series
- In other words, segmentation may involve loss of information in addition to the loss of noise

Error Tolerance Value
- One important parameter controlling the segmentation process is the error tolerance value (ETV)
- It is the amount of error that can be allowed in the segmented representation
  - Corresponds to the allowed information loss
- If the ETV is zero, segmentation returns a segmented representation without any information loss
- A large enough ETV makes segmentation return a single segment, losing all the information contained in the original signal
- Specification of the ETV is linked to the distinction between information and noise
  - In a particular context
  - For a particular task

Cost Computation
- All segmentation algorithms need a method to compute the cost of segmentation
- Several possible techniques:
  - Simply take the maximum error in a segment
  - Compute the total error in a segment
  - Compute the least square error
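The three cost measures can be sketched as follows. The slides do not say which line represents a segment, so this sketch assumes the straight line joining the segment's first and last points. A least-squares regression line could be substituted in `residuals` without changing the cost functions. All function names are illustrative.

```python
def residuals(times, values):
    """Errors between the values and the line joining the segment's end points."""
    t0, t1 = times[0], times[-1]
    v0, v1 = values[0], values[-1]
    if t1 == t0:
        # A single-point segment is represented exactly
        return [0.0 for _ in values]
    slope = (v1 - v0) / (t1 - t0)
    return [v - (v0 + slope * (t - t0)) for t, v in zip(times, values)]


def max_error_cost(times, values):
    """Maximum absolute error in the segment."""
    return max(abs(r) for r in residuals(times, values))


def total_error_cost(times, values):
    """Sum of absolute errors in the segment."""
    return sum(abs(r) for r in residuals(times, values))


def squared_error_cost(times, values):
    """Sum of squared errors in the segment."""
    return sum(r * r for r in residuals(times, values))


# Hypothetical segment: the middle point lies 2 units above the joining line
times, values = [0, 1, 2], [0, 3, 2]
print(max_error_cost(times, values))      # 2.0
print(total_error_cost(times, values))    # 2.0
print(squared_error_cost(times, values))  # 4.0
```

The maximum error bounds the worst deviation at any point, while the summed measures grow with segment length, so the choice interacts with how the error tolerance value is set.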

Sliding Window Segmentation
- This algorithm is suitable for segmenting time series obtained in real time (streaming time series)
- Requirements
  - Develop a method for computing the cost of a segment
  - Select two parameters
    - an appropriate window size and
    - an error tolerance value
- The method
  1. Form a segment with the values of the input series falling in the window
  2. Compute the cost of the segment
  3. While the cost of the segment is below the error tolerance value, grow the segment by moving the window forward in the series
  4. When a segment cannot grow any more, store it in the segmented representation and continue at step 1 with a new segment
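The four steps above can be sketched as below. Several details are assumptions rather than part of the slides: samples are evenly spaced, the cost is the maximum error against the line joining a segment's end points, a segment grows one point at a time, growth is allowed while the cost is at or below the tolerance, and each new segment starts at the last point of the previous one. The names `max_error` and `sliding_window_segments` are illustrative.

```python
def max_error(values):
    """Maximum error against the line joining the first and last values."""
    n = len(values)
    if n < 3:
        return 0.0
    slope = (values[-1] - values[0]) / (n - 1)
    return max(abs(v - (values[0] + slope * i)) for i, v in enumerate(values))


def sliding_window_segments(values, window_size, tolerance, cost=max_error):
    """Return segments as (start_index, end_index) pairs, end inclusive."""
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    n = len(values)
    segments = []
    start = 0
    while start < n - 1:
        # Step 1: form a segment from the values falling in the window
        end = min(start + window_size - 1, n - 1)
        # Steps 2-3: grow while the grown segment stays within the tolerance
        while end + 1 < n and cost(values[start:end + 2]) <= tolerance:
            end += 1
        # Step 4: store the segment and start a new one at its last point
        segments.append((start, end))
        start = end
    return segments


# Hypothetical series: a linear rise followed by a linear fall
series = [0, 1, 2, 3, 2, 1, 0]
print(sliding_window_segments(series, window_size=2, tolerance=0.1))
# [(0, 3), (3, 6)]
```

Because each decision uses only the points seen so far, the same loop can run on a stream, which is why the slide recommends this method for real-time data.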

Bottom-up Segmentation
- Empirical evaluation studies of segmentation algorithms suggest that the bottom-up algorithm is the best
  - Because it provides a globally optimised segmented representation
- Requirements
  - Develop a method for computing the cost of merging adjacent segments
  - Select an appropriate error tolerance value
- Bottom-up approach to segmentation
  - Begin by creating n/2 segments, each joining two adjacent points of an n-length time series
  - Compute the cost of merging each pair of adjacent segments
  - Iteratively merge the lowest cost pair until a stopping criterion is met
    - The stopping criterion is based on the error tolerance value
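A minimal sketch of the bottom-up approach follows. It assumes evenly spaced samples, uses the maximum error against the line joining a merged segment's end points as the merge cost, and stops when the cheapest merge would exceed the tolerance. For clarity it recomputes every merge cost on each pass, which a practical implementation would avoid. The names `max_error` and `bottom_up_segments` are illustrative.

```python
def max_error(values):
    """Maximum error against the line joining the first and last values."""
    n = len(values)
    if n < 3:
        return 0.0
    slope = (values[-1] - values[0]) / (n - 1)
    return max(abs(v - (values[0] + slope * i)) for i, v in enumerate(values))


def bottom_up_segments(values, tolerance, cost=max_error):
    """Return segments as (start_index, end_index) pairs, end inclusive."""
    n = len(values)
    # Finest segmentation: about n/2 segments of two adjacent points each
    segments = [(i, min(i + 1, n - 1)) for i in range(0, n, 2)]
    while len(segments) > 1:
        # Cost of merging each pair of adjacent segments
        merge_costs = [
            cost(values[segments[i][0]:segments[i + 1][1] + 1])
            for i in range(len(segments) - 1)
        ]
        best = min(range(len(merge_costs)), key=merge_costs.__getitem__)
        # Stopping criterion: the cheapest merge exceeds the tolerance
        if merge_costs[best] > tolerance:
            break
        merged = (segments[best][0], segments[best + 1][1])
        segments[best:best + 2] = [merged]
    return segments


# Hypothetical series: a linear rise followed by a linear fall
series = [0, 1, 2, 3, 2, 1, 0, -1]
print(bottom_up_segments(series, tolerance=0.1))  # [(0, 3), (4, 7)]
```

Unlike the sliding window method, every merge decision here compares candidates across the whole series, so the full series must be available before segmentation starts.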

Wind Prediction Data
[Table with columns Hour and Wind Speed; the values were not preserved in this transcript.]

Segmentation of wind prediction data
[Figure not preserved in this transcript.]

Pattern Analysis
- What is a pattern?
  - A portion of the series that can be identified as a unit rather than as an enumeration of all the values in that portion
  - Some patterns may be periodic: they repeat at regular time intervals (autocorrelation)
- Users are interested in patterns occurring in time series
  - E.g. spikes and oscillations in gas turbine data
- Mainly two steps
  - Pattern location
  - Pattern classification

Pattern Classification and Time Scale
- Most patterns are classified based on the visual shape of the pattern
  - E.g. a step pattern looks like a step
- When the time scale changes, the visual shape of a pattern changes
- Pattern classification is therefore sensitive to the time scale at which the visualization is shown
[Figure panels: Normal time scale; Lower time scale]

Symbolic Representations of Time Series
- Latest trend in mining time series
  - Convert a numerical time series into an equivalent symbolic representation
- Symbolic Aggregate Approximation (SAX) is a well-known representation
- Efficient algorithms are available for doing this transformation
- Once a time series is available in string form
  - String analysis techniques can be used for analysing time series data
- Example string from the slide figure: baabccbc
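The slides name SAX but do not describe the transformation. The sketch below follows the commonly described procedure, which is an assumption here rather than something taken from the lecture: z-normalise the series, average it over equal-length frames (piecewise aggregate approximation), then map each frame mean to a letter using breakpoints that split a standard normal distribution into equally probable regions. The function name `sax` and its parameters are illustrative.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def sax(values, n_segments, alphabet="abc"):
    """Convert a numerical series into a string of len(n_segments) symbols."""
    n = len(values)
    if not 0 < n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(values)")
    mean, std = fmean(values), pstdev(values)
    if std == 0:
        raise ValueError("cannot normalise a constant series")
    normalised = [(v - mean) / std for v in values]
    # Piecewise aggregate approximation: one mean per frame
    frame_means = [
        fmean(normalised[k * n // n_segments:(k + 1) * n // n_segments])
        for k in range(n_segments)
    ]
    # Breakpoints giving equally probable regions under N(0, 1)
    size = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / size) for j in range(1, size)]
    return "".join(alphabet[bisect_right(breakpoints, m)] for m in frame_means)


# Hypothetical series with a low, a middle and a high plateau
print(sax([1, 1, 5, 5, 9, 9], n_segments=3))  # abc
```

Once the series is a string, ordinary string techniques such as substring search or n-gram counting can be applied to locate repeated shapes.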

Summary
- Time series are ubiquitous!
- Three main data analysis steps
  - Pre-processing
    - Smoothing
  - Trend analysis
    - Line fitting
  - Pattern analysis
    - Location and classification
    - Issues due to time scale