Migration Motif: A Spatial-Temporal Pattern Mining Approach for Financial Markets
Xiaoxi Du, Ruoming Jin, Liang Ding, Victor E. Lee, John H. Thornton Jr.


Migration Motif: A Spatial-Temporal Pattern Mining Approach for Financial Markets
Xiaoxi Du, Ruoming Jin, Liang Ding, Victor E. Lee, John H. Thornton Jr.
Presented by: Xiaoxi Du
Department of Computer Science, Kent State University

Do we yet fully understand financial market risks?
- To describe frequent behaviors of individual companies
- To describe the relationship between stock market changes over time and stock returns

Example: Trajectories on a Financial Grid
Financial Grid
- SIZE: market capitalization = (share price × number of shares)
- P/B: price-to-book ratio = (current price per share / book value per share)
[Figure panels: Company Trajectory; Compact Trajectory]
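A minimal sketch of how one company observation could be placed on the financial grid, using the two formulas above. The rank-based binning and every function name here are assumptions for illustration; the slides do not say how the axes are discretized.

```python
from bisect import bisect_left


def market_cap(share_price, shares_outstanding):
    """SIZE axis: share price x number of shares."""
    return share_price * shares_outstanding


def price_to_book(share_price, book_value_per_share):
    """P/B axis: current price per share / book value per share."""
    return share_price / book_value_per_share


def grid_index(value, all_values, g):
    """Map a value to a bin in 0..g-1 by its rank among all_values.

    Assumption: equal-population (rank-based) bins; the slides do not
    specify the binning rule.
    """
    ordered = sorted(all_values)
    rank = bisect_left(ordered, value)  # count of values strictly smaller
    return min(g - 1, rank * g // len(ordered))


def grid_cell(size, pb, all_sizes, all_pbs, g=10):
    """Return the (P/B bin, SIZE bin) cell of one company observation."""
    return (grid_index(pb, all_pbs, g), grid_index(size, all_sizes, g))
```

Under these assumptions, a company trajectory is the sequence of `grid_cell` results for one company across consecutive periods.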

Spatial and Temporal Constraint
- Spatial Constraint (ε): sub-trajectories are guaranteed to follow a bounded path
- Temporal Constraint (U): an upper bound on the time span (short-term)
[Figure: grid with SIZE and P/B axes, time points T1 and T2]

Migration Motif
A migration motif (pattern) corresponds to a collection of sub-trajectories that follow a similar path.
Properties:
- Pair-wise similarity: the distance between any two sub-trajectories is ≤ ε
- Maximal: adding any other sub-trajectory would violate pair-wise similarity
- Frequent: the sub-trajectories come from at least θ different trajectories
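The three properties above can be checked directly on a candidate group of sub-trajectories. This is a hedged sketch: the distance function (largest per-step Chebyshev distance between grid cells) is an assumption, since the slides do not define it, and all names are hypothetical.

```python
from itertools import combinations


def distance(a, b):
    """Assumed distance between two equal-length cell sequences:
    the largest per-step Chebyshev distance."""
    return max(max(abs(x1 - x2), abs(y1 - y2))
               for (x1, y1), (x2, y2) in zip(a, b))


def pairwise_similar(paths, eps):
    """Pair-wise similarity: every pair of paths is within eps."""
    return all(distance(a, b) <= eps for a, b in combinations(paths, 2))


def is_maximal(members, candidates, eps):
    """Maximal: no outside candidate is within eps of every member.

    members and candidates are lists of (company, path) tuples.
    """
    paths = [path for _, path in members]
    for cand in candidates:
        if cand in members:
            continue
        if all(distance(cand[1], p) <= eps for p in paths):
            return False
    return True


def is_frequent(members, theta):
    """Frequent: members come from at least theta different companies."""
    return len({company for company, _ in members}) >= theta


def is_motif(members, candidates, eps, theta):
    """True when the group satisfies all three properties."""
    paths = [path for _, path in members]
    return (pairwise_similar(paths, eps)
            and is_maximal(members, candidates, eps)
            and is_frequent(members, theta))
```

Counting distinct companies rather than sub-trajectories in `is_frequent` follows the "at least θ different trajectories" wording, so two sub-trajectories from the same company count once.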

Algorithm
Goal: to extract migration motifs efficiently
Pipeline: Trajectories (company) → 2-Length Sub-Trajectories → Similarity Graph → Frequent 2-Length Migration Motif → Frequent K-Length Migration Motif
Slide annotations: Compact Trajectory, Pattern representation, Graph theoretical, Maximal Clique, Apriori Property
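The similarity-graph and maximal-clique stage of the pipeline could look roughly like the sketch below. It is not the authors' implementation: the distance function, the plain Bron-Kerbosch enumeration, and all names are assumptions, and the temporal constraint U and the extension to K-length motifs are left out.

```python
def path_distance(a, b):
    """Assumed distance: largest per-step Chebyshev distance between cells."""
    return max(max(abs(x1 - x2), abs(y1 - y2))
               for (x1, y1), (x2, y2) in zip(a, b))


def similarity_graph(subs, eps):
    """Connect two sub-trajectories when their paths are within eps.

    subs is a list of (company, path) tuples; nodes are list indices.
    """
    adj = {i: set() for i in range(len(subs))}
    for i in range(len(subs)):
        for j in range(i + 1, len(subs)):
            if path_distance(subs[i][1], subs[j][1]) <= eps:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def frequent_motifs(subs, eps, theta):
    """Keep maximal cliques whose members span at least theta companies."""
    motifs = []
    for clique in maximal_cliques(similarity_graph(subs, eps)):
        companies = {subs[i][0] for i in clique}
        if len(companies) >= theta:
            motifs.append(sorted(clique))
    return sorted(motifs)
```

A maximal clique in the similarity graph is a group that is pair-wise similar and cannot be extended, which matches the first two motif properties; the support filter supplies the third.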

Characteristics of the Datasets
Data Source: The Center for Research in Security Prices (CRSP) and Compustat databases
Time Period: 1964 to 2007
Parameters:
- Temporal Constraint: U = {3, 4, 5}
- Spatial Constraint: ε = {0, 1, 2}
- Minimum Support Level: θ = {10, 15, 20}
- Grid Dimensions: g = {10×10, 20×20, 50×50, 100×100}
Stock Exchanges and Description:
- NYSE: 1717 (relatively large)
- NASDAQ: 2675 (smaller)
- AMEX: 825 (mostly smaller)
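For orientation, the parameter values listed above can be enumerated and labeled in the shorthand used on the result slides (for example 10g/U3/ε1/θ10). This sketch only builds labels; whether every combination was actually run is not stated in the slides, and the function names are hypothetical.

```python
from itertools import product

GRID_SIZES = [10, 20, 50, 100]   # g: 10x10 ... 100x100 grids
TEMPORAL_U = [3, 4, 5]           # temporal constraint U
SPATIAL_EPS = [0, 1, 2]          # spatial constraint epsilon
MIN_SUPPORT = [10, 15, 20]       # minimum support level theta


def setting_label(g, u, eps, theta):
    """Format one parameter setting in the slides' shorthand."""
    return f"{g}g/U{u}/ε{eps}/θ{theta}"


def all_settings():
    """Labels for every combination of the listed parameter values."""
    return [setting_label(g, u, eps, theta)
            for g, u, eps, theta in product(GRID_SIZES, TEMPORAL_U,
                                            SPATIAL_EPS, MIN_SUPPORT)]
```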

Motif Sensitivity to Parameters
Result: NYSE
[Figure panels: NYSE Motifs (10g/U3/ε1/θ10); NYSE Motifs (20g/U3/ε1/θ10)]

Motif Sensitivity to Parameters
Result: NASDAQ
[Figure panel: NASDAQ Motifs (50g/U3/ε1/θ10)]

Statistical Significance of Motifs
- The randomized data contains many 2-length motifs (M2); however, random motifs longer than 2 are quite rare
- Risk factor migration in the stock market is not random and should not be neglected

Oscillation Motif Patterns
NYSE Motifs: (10g/U3/ε1/θ10)
- Value oscillation (horizontal)
- Size oscillation (vertical)

Distribution of Motifs
[Figure panels: NYSE Motifs (10g/U3/ε1/θ10); NASDAQ Motifs (50g/U3/ε1/θ10)]

Motif Timing
- Average Starting Time: the point at which a company's migration pattern is first captured by a motif
- Maturity
- Average Staying Time
- Long term vs. short term
- Loser and Winner Portfolios

Motif Company Time Span
- To list membership information for typical motifs
- To provide each company's ticker and time span
- M5-45: time spans are highly concentrated for the value oscillation path
- M6-1: significant jumps
- M4-50: no clear clustering of starting years for the vertical oscillation path

Conclusion
- We introduce two new algorithms to discover migration motifs in the financial grid
- Our work is the first attempt to find multi-year migration patterns in financial datasets
- We are the first to find long oscillation patterns in P/B value