Discovery of Hidden Structure in High-Dimensional Data


1 Discovery of Hidden Structure in High-Dimensional Data
Aapo Hyvärinen
Senior Research Scientist
University of Helsinki
Helsinki University of Technology

2 Discovery of Hidden Structure in High-Dimensional Data
Science produces enormous data sets, often with hidden structure
This theme: typically continuous data and/or probabilistic models
Tasks
    Parsimonious models
    Decomposition into dependent components
    Non-Gaussian Bayesian networks
    Spatiotemporal models
Applications
    Neuroinformatics: imaging data analysis, functional modelling
    Bioinformatics: genome structure, metabolic models, gene regulation
    Telecom, linguistics, forestry, ecology, atmospheric data, etc.
Main teams
    Hyvärinen & Mannila

3 Highlight/Background: Independent Component Analysis
Linear decomposition of multivariate data
Finds hidden directions (green), in contrast to classic PCA (red)
Application in blind source separation, e.g. in brain imaging data
FastICA: the most popular ICA algorithm (Hyvärinen. IEEE Trans. NN, 1999)
Standard reference book on the theory: Independent Component Analysis. Hyvärinen, Karhunen, Oja. Wiley, 2001.
[Figure: Mixing process]
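
The blind source separation idea on this slide can be illustrated with a small numerical sketch. The code below is not the reference FastICA implementation; it is a minimal symmetric fixed-point iteration with a tanh nonlinearity, written with NumPy only. The function names (`whiten`, `sym_decorrelate`, `fastica_sketch`), the two synthetic sources, and the mixing matrix `A` are all assumptions made for illustration.

```python
import numpy as np

def whiten(X):
    """Center X (n_signals x n_samples) and transform it to identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc))
    V = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return V @ Xc

def sym_decorrelate(W):
    """Symmetric orthogonalization: W <- (W W^T)^(-1/2) W."""
    eigvals, eigvecs = np.linalg.eigh(W @ W.T)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T @ W

def fastica_sketch(X, n_iter=200, tol=1e-8, seed=0):
    """Estimate independent components of X with a tanh fixed-point iteration."""
    Z = whiten(X)
    n_signals, n_samples = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((n_signals, n_signals)))
    for _ in range(n_iter):
        Y = W @ Z
        G = np.tanh(Y)            # nonlinearity g applied to current estimates
        G_prime = 1.0 - G ** 2    # derivative g'
        # Row-wise update: w <- E{z g(w^T z)} - E{g'(w^T z)} w
        W_new = (G @ Z.T) / n_samples - np.diag(G_prime.mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when every row is unchanged up to sign
        converged = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return W @ Z

# Synthetic demo (assumed data): two non-Gaussian sources, linearly mixed.
rng = np.random.default_rng(1)
S = np.vstack([
    rng.uniform(-1.0, 1.0, 5000),   # sub-Gaussian source
    rng.laplace(0.0, 1.0, 5000),    # super-Gaussian source
])
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])           # hypothetical mixing matrix
X = A @ S                            # observed mixtures
recovered = fastica_sketch(X)

# Absolute correlation between each true source and each estimate
cross_corr = np.abs(np.corrcoef(S, recovered)[:2, 2:])
print(np.round(cross_corr, 3))
```

ICA leaves the order, sign, and scale of the components undetermined, so the estimates are compared with the true sources through absolute correlations rather than directly.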

4 Task: Decomposition into Dependent Components
Components are not independent in general
Extend to dependent components (Hyvärinen team)
    Grouping and visual ordering (Hyvärinen et al. Neural Comput & 2001)
    Separation with some dependency (Hyvärinen and Hurri. Signal Proc., 2004)
Future: More general dependency structures
Related to nonlinear decompositions
Extend to binary data (teams Mannila, Toivonen)
Analyze stability of components (Himberg et al., NeuroImage, 2004)

5 Highlight and Task: Non-Gaussian Bayesian Networks
Non-gaussianity enables learning network structure and weights in basic linear DAG case (Shimizu, Hoyer, Hyvärinen, Kerminen. J. Mach. Learn. Res., 2006)
Another example of the power of non-gaussianity
Enables inference on the direction of causality
Currently extending to, e.g.:
    hidden confounding variables (Hyvärinen team)
    nonlinearities (Hyvärinen team)
    (partly) binary data (Mannila & Hyvärinen teams)
Applications to gene networks, brain imaging data to be explored
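
Why non-gaussianity helps with the direction of causality can be seen in a two-variable toy case. The sketch below is not the algorithm of the cited paper; it is a simplified pairwise check, assuming a linear model with a non-Gaussian cause and independent non-Gaussian noise. In the true direction the regression residual is independent of the regressor, while in the reverse direction it is only uncorrelated. The dependence score (correlation of squared values) and the names `residual_dependence` and `infer_direction` are assumptions chosen for illustration.

```python
import numpy as np

def residual_dependence(cause, effect):
    """Score how strongly the least-squares residual depends on the regressor.

    Uses the absolute correlation between squared regressor and squared
    residual, a crude stand-in for a proper independence test.
    """
    cause = cause - cause.mean()
    effect = effect - effect.mean()
    slope = (cause @ effect) / (cause @ cause)
    residual = effect - slope * cause
    return abs(np.corrcoef(cause ** 2, residual ** 2)[0, 1])

def infer_direction(x, y):
    """Pick the direction whose residual looks more independent of its regressor."""
    forward = residual_dependence(x, y)    # hypothesis: x causes y
    backward = residual_dependence(y, x)   # hypothesis: y causes x
    return "x -> y" if forward < backward else "y -> x"

# Synthetic demo (assumed data): x causes y, both disturbances uniform.
rng = np.random.default_rng(0)
n_samples = 5000
x = rng.uniform(-1.0, 1.0, n_samples)
noise = rng.uniform(-1.0, 1.0, n_samples)
y = 0.8 * x + noise

print("forward score :", residual_dependence(x, y))
print("backward score:", residual_dependence(y, x))
print("inferred      :", infer_direction(x, y))
```

With Gaussian variables both scores would be close to zero and the two directions could not be told apart, which is the sense in which non-gaussianity is what makes the direction identifiable in this linear setting.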

6 Future Vision
Probabilistic methods with emphasis on algorithmic aspects
    Interface between computer science and statistics
    Combine expertise on algorithms and multivariate statistics
Discovery of hidden components, clusters, or connections
    Continuous data: nongaussianity a nonclassic yet central tool
    Discrete data: e.g., covering approaches
Applications in many different fields of science
    Our special competence in neuro- and bioinformatics

