Published by Roman Unsworth. Modified about 1 year ago.

1
The History of Keysteps of Computational Statistics
Wilfried Grossmann, University of Vienna, Austria
Michael G. Schimek, Medical University of Graz, Austria
Peter Paul Sint, Austrian Academy of Sciences, Vienna
30 Years

2
Department of Statistics and Informatics, University of Vienna Peter Paul, a „senior“ Assistant Professor

3
Department of Statistics and Informatics, University of Vienna, 1974. A few years after: Wilfried, a „junior“ Assistant Professor

4
A few years after: Michael, a first-year student, University of Vienna. Gerhart Bruckmann

5
Outline of Presentation
The Beginning of COMPSTAT
  Early statistical computing
  The institutional environment
  The first symposium and the Compstat Society
Developments in Computational Statistics (CS)
  CS and statistical theory
  CS and algorithms
  CS and computer science
  CS and application
The COMPSTAT Symposia

6
The Beginning of COMPSTAT

7
Early Computational Statistics
The Beginnings in Vienna
– Institute of Statistics
  Part of the Law Faculty - S. Sagoroff - Leipzig/Sofia/USA/Berlin/Vienna - Energy Balances
  First computer: first-generation machine
– Paid for by Rockefeller Foundation, 1960
– Arrival of the ‚Electronic Brain‘ (1st generation)
  » Never again similar enthusiasm
Institute of Advanced Studies - Ford Institute
– Statistical machines - card counting -> 2nd generation
  Replaced by IBM / rd gen. SSP / SPSS
– Computing Center

8
Statistics-Computational
One year Biostatistics department, Oxford University
Still: Not strongly integrated in international statistical community - Main contacts ISI: Central Statistical Office, Sagoroff
1973 ISI session in Vienna - emphasis on applications - computational methods rare
Bring statisticians with our interests to Vienna
Encouragement by publisher Arnulf Liebing /Physica/
What is specific to our department?
Concept of Computational Statistics - Johannes Gordesch (Math) - Peter Paul Sint (Physics)

9
First COMPSTAT Call
COMPSTAT
- Gerhart Bruckmann - Local fame as analyst of voting results during election nights
- Leopold Schmetterer (successor of Sagoroff) - Internationally known Mathematical Statistician
(Franz Ferschl, incoming professor of statistics, new editor of Metrika - added as an editor by the publisher)

10
S. Sagoroff and M. Tantilov

11
First COMPSTAT Editors

12
Preface of the first Proceedings

13
Logic of the Logo

14
J. Gordesch at Compstat76 Berlin

15
Coming of Age
International from the start
Compstat Society since Berlin
Leiden NL 1978
Integration into IASC
Edinburgh GB 1980
Toulouse F 1982
Eastern Europe needed
Politics ISI-IASC
Local projects redirected:
Prague 1984
Rome I 1986
Copenhagen DK 1988
Dubrovnik YU 1990
Neuchâtel CH 1992

16
Prague 1984

17
Developments in Computational Statistics

18
Computational Statistics
What is Computational Statistics?
– A question raised many times within the community at the end of the 1980s and the beginning of the 1990s

19
Computational Statistics
Working definition (A. Westlake)
Computational Statistics is related to the advance of statistical theory and methods through the use of computational methods. This includes both the use of computation to explore the impact of theories and methods, and development of algorithms to make these ideas available to users.

20
Computational Statistics
[Diagram labels: Computational Statistics, Statistical Theory, Algorithms, Applications, Computer Science, Numerical Analysis, Statistical Software, Modelling, Seminumerical Algorithms]

21
Computational Statistics and Statistical Theory
The statistical journey in the 20th century
The Theory Era
The Methodology Era

22
Computational Statistics and Statistical Theory
The statistical journey in the 20th century
– B. Efron: Statistics in the 20th century is a journey between three poles: Applications, Mathematics, Computation

23
Computational Statistics and Statistical Theory
The Theory Era (Pearson, Neyman, Fisher, Wald)
– From models for solving practical problems towards a mathematical decision-theoretic framework
– Based on optimality principles
– Application is based on computations feasible for paper and pencil or mechanical computing devices

24
Computational Statistics and Statistical Theory
Modelling Era (1)
– Tukey’s paper about the future of data analysis (1962) as a turning point from mathematics towards computation
  Confirmatory versus exploratory analysis
  Dynamics of data analysis
  “Robustness”
  Importance of Graphics

25
Computational Statistics and Statistical Theory
Modelling Era (2)
– Important developments in the modelling era
  Nonparametric and Robust Methods
  Kaplan-Meier and Proportional Hazards
  Logistic Regression and GLM
  Jackknife and Bootstrap
  EM and MCMC
  Empirical Bayes and James-Stein Estimation

26
Computational Statistics and Statistical Theory
Modelling Era (3)
– The modelling era is characterized by a strong interplay between statistical theory and computational statistics
– The computer as a workbench for statistical experiments (going back to v. Neumann and S. Ulam)
  Passive usage: Studying the feasibility of statistical theory by simulation
  Active usage: Obtaining results which cannot be computed by conventional numerical algorithms
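As an illustration of the "active usage" named on this slide, here is a minimal sketch of the bootstrap (listed among the developments of the modelling era) used to estimate the standard error of a sample median, a quantity with no simple closed-form formula. The function name `bootstrap_se_median`, its parameters, and the data values are assumptions made up for this example; they do not come from the presentation.

```python
import random
import statistics


def bootstrap_se_median(sample, n_resamples=2000, seed=0):
    """Bootstrap estimate of the standard error of the sample median.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    medians = []
    for _ in range(n_resamples):
        # Draw n observations from the sample with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        medians.append(statistics.median(resample))
    # The spread of the resampled medians approximates the standard error.
    return statistics.stdev(medians)


# Made-up illustrative values, not data from the talk.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]
se = bootstrap_se_median(data)
print(f"median = {statistics.median(data):.2f}, bootstrap SE = {se:.3f}")
```

The same resampling loop works for any statistic by swapping `statistics.median` for another function, which is what makes the computer a workbench here rather than just a calculator.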

27
Computational Statistics and Statistical Theory
COMPSTAT was probably not always at the frontier of these developments, but the programs and the proceedings reflect quite well the dynamics of the subject in the Modelling Era

28
Computational Statistics and Algorithms
Numerical Algorithms
– Matrix Computation, Optimization
Random Numbers / Monte Carlo
Semi-numerical Algorithms
– Sorting, Searching, Combinatorial Methods, Graph Theoretic Algorithms, …
Graphical Algorithms
Symbolic Computation (?)
Mathematical vs. Statistical Modelling

29
Computational Statistics and Algorithms
Statistics and Numerical Algorithms (1)
– Fast Fourier Transform (Tukey)
– Recursive Algorithms and Filtering (Kalman Filter)
(Neither topic seems to be a core topic in computational statistics)

30
Computational Statistics and Algorithms
Statistics and Numerical Algorithms (2)
– Adaptation of optimization techniques (e.g. scoring methods)
– Behaviour of optimization methods in statistical context (numerical convergence vs. stochastic convergence concepts)
Implicit consideration at COMPSTAT

31
Computational Statistics and Algorithms
Statistics and Random Numbers / Monte Carlo
– Generation of random numbers was (and is) probably more a topic of mathematics (number theory) and computer science
  In the beginning of COMPSTAT there was also some connection to simulation
– Genuine application of Monte Carlo methods in connection with new developments of statistical theory (e.g. MCMC)

32
Computational Statistics and Algorithms
Statistics and semi-numerical algorithms
– Applications in the context of nonparametric statistics and analysis of tabular data
  Feasibility of conditional inference for logistic models
– New developments on the borderline between statistics and computer science
  Data Mining as a new statistical modelling paradigm
COMPSTAT was open towards these developments and integrated them into the program

33
Computational Statistics and Algorithms
Statistics and Graphical Algorithms
– Development rather complementary to the developments of computer science
– Important issues (L. Wilkinson):
  Graphics are not only a tool for displaying results but rather a tool for perceiving relationships
  Dynamic graphics as an important tool for data analysis
  Graphics are a means of model formalization reflecting quantitative and qualitative traits of its variables
Represented quite well at COMPSTAT

34
Computational Statistics and Algorithms
Mathematical vs. Statistical Modelling
– Emphasis on different methods (e.g. Differential Equations)
– Different modelling environments (J. Nelder)
  Data structures in statistics
  Exploratory nature of statistical analysis (statistical analysis cycle)
  Competence of users

35
Computational Statistics and Computer Science
Developments in Statistical Software
Development of Statistical Languages
Developments in Statistical Database Management

36
Computational Statistics and Computer Science
Developments in Statistical Software (1)
– From numerical subroutines towards statistical packages
– Main goals:
  Taking into account the peculiarities of statistical data analysis
  Usage of current hardware developments

37
Computational Statistics and Computer Science
Developments in Statistical Software (2)
– From the beginning, COMPSTAT was an important forum for the development of statistical software
  The proceedings from the early 1980s show numerous software developments for specific statistical models
  There was always some tension between the presentation of commercial software developments and the scientific character of the conference

38
Computational Statistics and Computer Science
Development of Statistical Languages (1)
– GLIM was probably the first genuine statistical modelling language
  Present at COMPSTAT from the very beginning

39
Computational Statistics and Computer Science
Development of Statistical Languages (2)
– The S language set up a new paradigm for computing which is also of interest outside statistical applications
  Contribution to computer science honoured by the ACM Software System Award for J. Chambers
  Although it started as early as 1976, it took a long time to enter the COMPSTAT community

40
Computational Statistics and Computer Science
Development of Statistical Languages (3)
– R became popular rather quickly within COMPSTAT due to its free availability and the effective organisation of CRAN
– Omegahat: An umbrella for open source projects in computational statistics, covering not only statistical computation but also other important aspects of distributed computing

41
Computational Statistics and Computer Science
Development of Statistical Languages (4)
– XLISP-Stat as proof of concept (in particular for animated graphics)
– XploRe as Java-based production system

42
Computational Statistics and Computer Science
Statistical Data Base Management
– Main challenge is appropriate usage of the developments in database technology in a statistical context
  Combination of statistical data structures and statistical processing activities with conceptual data models
  Representation of tabular data
  Metadata as a tool to capture the complexity of statistical data
A small but active group inside the COMPSTAT community from the very beginning

43
Computational Statistics and Applications
Challenges for Computational Statistics
Rather independent from application area
– Data
  Data capture
  Data structures
  Data size
– Analysis Process
  Analysis strategies
  The role of the statistician in the computer age

44
Computational Statistics and Applications
Data challenges (1)
– Contributions towards data challenges occur occasionally at COMPSTAT
Current problems
– Data capture
  Data capture tools are rather a side branch of computational statistics and more connected to official statistics
  Data streams are a new challenge that has so far attracted little attention in the computational statistics community

45
Computational Statistics and Applications
Data challenges (2)
– Data structures
  New problems (e.g. in connection with data mining) raise questions with respect to the applicability of the basic statistical analysis paradigm (population, sample, measurement process)
– Data size
  Handling huge datasets
At the moment, none of these challenges seems to be a core topic of computational statistics

46
Computational Statistics and Applications
Analysis process
– Analysis strategies
  The question of formalizing analysis strategies was a hot topic at the COMPSTAT conferences at the end of the 1980s, but there was limited success
– The role of statisticians in the computer age
  Is progress in computational statistics an enabler for statisticians, or does it lead towards a de-skilling of the statistical profession?

47
The COMPSTAT Symposia

48
A full set of COMPSTAT proceedings (one statistical outlier removed). Do you see the CSDA volumes in the background? Here they are!

49
The COMPSTAT Symposia I
Symposium, Year; Organizers; # Submissions; # Papers I/C; # Participants
Vienna, 1974; Sint; 50; 100
Berlin, 1976; Gordesch, Naeve
Leiden, 1978; Corsten, Hermans
Edinburgh, 1980; Barrit, Wishart; 250; 4/82; 750
Toulouse, 1982; Caussinus, Ettinger, Tomassone; 250; 15/60; 500

50
The COMPSTAT Symposia II
Symposium, Year; Organizers; # Submissions; # Papers I/C; # Participants
Prague, 1984; Havranek, Sidak, Novak; 300; 7/65; ???
Rome, 1986; De Antoni, Lauro, Rizzi; 300; 14/60; 900
Copenhagen, 1988; Edwards, Raun; 300; 9/51; 800
Dubrovnik, 1990; Momirovic; 115; 6/43; 180
Neuchâtel, 1992; Dodge, Whittaker; 115; 11/115; 200

51
COMPSTAT 1994 Vienna and Satellite Meeting on Smoothing, Semmering (World Cultural Heritage)
Randy Eubank
Andrew Westlake, Allmut Hörmann, Wolfgang Härdle

52
On the track from Vienna to Semmering in the Austrian Alps (historical train)
The organizer

53
Satellite Meeting on Smoothing
We finally arrived at the mountain spa Semmering
Antoine de Falguerolles and the organizer at the opening

54
The COMPSTAT Symposia III
Symposium, Year; Organizers; # Submissions; # Papers I/C; # Participants
Vienna, Semmering (Satellite), 1994; Dutter, Grossmann, Schimek; /60 7/
Barcelona, 1996; Prat; 250; 13/56; 300
Bristol, 1998; Payne, Green; 180; 12/58; 370
Utrecht, 2000; Van der Heijden, Bethlehem; 250; 15/60; 220
Berlin, 2002; Härdle; 220; 9/90; 260

55
The COMPSTAT proceedings from the Vienna and Semmering meetings
Model of Vienna University
Kastalia Fountain
