The History of Keysteps of Computational Statistics Wilfried Grossmann, University of Vienna, Austria Michael G. Schimek, Medical University of Graz, Austria.


1 The History of Keysteps of Computational Statistics Wilfried Grossmann, University of Vienna, Austria Michael G. Schimek, Medical University of Graz, Austria Peter Paul Sint, Austrian Academy of Sciences, Vienna 30 Years

2 Department of Statistics and Informatics, University of Vienna Peter Paul, a „senior“ Assistant Professor

3 Department of Statistics and Informatics, University of Vienna, 1974. A few years after: Wilfried, a „junior“ Assistant Professor

4 A few years after: Michael, a first-year student, University of Vienna; Gerhart Bruckmann

5 5 Outline of Presentation The Beginning of COMPSTAT Early statistical computing The institutional environment The first symposium and the Compstat Society Developments in Computational Statistics (CS) CS and statistical theory CS and algorithms CS and computer science CS and application The COMPSTAT Symposia

6 6 The Beginning of COMPSTAT

7 Early Computational Statistics. The Beginnings in Vienna: –Institute of Statistics: part of the Law Faculty; S. Sagoroff (Leipzig/Sofia/USA/Berlin/Vienna); energy balances; first computer: first generation machine –Paid for by Rockefeller Foundation, 1960 –Arrival of the ‚Electronic Brain‘ (1st generation) »Never again similar enthusiasm. Institute of Advanced Studies (Ford Institute): –Statistical machines, card counting, then 2nd generation; replaced by IBM / rd gen.; SSP / SPSS –Computing Center

8 Statistics-Computational: One year, Biostatistics department, Oxford University. Still not strongly integrated in the international statistical community; main contacts ISI: Central Statistical Office, Sagoroff. 1973 ISI session in Vienna: emphasis on applications, computational methods rare. Bring statisticians with our interests to Vienna. Encouragement by publisher Arnulf Liebing (Physica). What is specific to our department? Concept of Computational Statistics: Johannes Gordesch (Math), Peter Paul Sint (Physics)

9 First COMPSTAT Call. COMPSTAT: Gerhart Bruckmann (local fame as analyst of voting results during election nights); Leopold Schmetterer (successor of Sagoroff, internationally known mathematical statistician); Franz Ferschl (incoming professor of statistics, new editor of Metrika, added as an editor by the publisher)

10 10 S. Sagoroff and M. Tantilov

11 11 First COMPSTAT Editors

12 12 Preface of the first Proceedings

13 13 Logic of the Logo

14 14 J. Gordesch at Compstat76 Berlin

15 Coming of Age: International from the start; Compstat Society since Berlin; Leiden (NL) 1978; integration into IASC; Edinburgh (GB); Toulouse (F) 1982; Eastern Europe needed; politics ISI-IASC; local projects redirected: Prague 1984; Rome (I); Copenhagen (DK) 1988; Dubrovnik (YU); Neuchâtel (CH) 1992

16 16 Prague 1984

17 17 Developments in Computational Statistics

18 Computational Statistics What is Computational Statistics? –A question raised many times inside the community at the end of the 1980s and the beginning of the 1990s

19 19 Computational Statistics Working definition (A. Westlake) Computational Statistics is related to the advance of statistical theory and methods through the use of computational methods. This includes both the use of computation to explore the impact of theories and methods, and development of algorithms to make these ideas available to users

20 20 Computational Statistics Computational Statistics Statistical Theory Algorithms Applications Computer Science Numerical Analysis Statistical Software Modelling Seminumerical Algorithms

21 21 Computational Statistics and Statistical Theory The statistical journey in the 20th century The Theory Era The Methodology Era

22 22 Computational Statistics and Statistical Theory The statistical journey in the 20th century –B. Efron: Statistics in the 20th century is a journey between three poles: Applications Mathematics Computation

23 23 Computational Statistics and Statistical Theory The Theory Era (Pearson, Neyman, Fisher, Wald) –From models for solving practical problems towards a mathematical decision theoretic framework –Based on optimality principles –Application is based on computations feasible for paper and pencil or mechanical computing devices

24 Computational Statistics and Statistical Theory Modelling Era (1) –Tukey’s paper about the future of data analysis (1962) as a turning point from mathematics towards computation: Confirmatory versus exploratory analysis; Dynamics of data analysis; “Robustness”; Importance of Graphics

25 25 Computational Statistics and Statistical Theory Modelling Era (2) –Important developments in the modelling era Nonparametric and Robust Methods Kaplan-Meier and Proportional Hazards Logistic Regression and GLM Jackknife and Bootstrap EM and MCMC Empirical Bayes and James-Stein Estimation

26 Computational Statistics and Statistical Theory Modelling Era (3) –The modelling era is characterized by a strong interplay between statistical theory and computational statistics –The computer as a workbench for statistical experiments (going back to J. von Neumann and S. Ulam) Passive usage: Studying feasibility of statistical theory by simulation Active usage: Obtain results which cannot be computed by conventional numerical algorithms
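
The "active usage" idea on this slide can be illustrated with the bootstrap, one of the developments listed for the Modelling Era. The following is only a minimal sketch under stated assumptions: the data values, the function name `bootstrap_se_median`, and the number of resamples are made up for illustration and do not come from the presentation. It estimates the standard error of a sample median, a quantity with no simple closed-form expression, purely by resampling.

```python
import random
import statistics


def bootstrap_se_median(sample, n_resamples=2000, seed=0):
    """Estimate the standard error of the sample median by resampling.

    Hypothetical helper for illustration; not taken from the presentation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    medians = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the observed sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        medians.append(statistics.median(resample))
    # The spread of the resampled medians approximates the standard error.
    return statistics.stdev(medians)


# Made-up data, used only to exercise the function.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]
se = bootstrap_se_median(data)
print(f"bootstrap SE of the median: {se:.3f}")
```

The same loop serves "passive usage" as well: replacing the observed sample by draws from a known distribution turns it into a simulation study of how an estimator behaves.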

27 Computational Statistics and Statistical Theory COMPSTAT was probably not always at the frontier of these developments, but the programs and the proceedings reflect quite well the dynamics of the subject in the Modelling Era

28 28 Computational Statistics and Algorithms Numerical Algorithms –Matrix Computation, Optimization Random Numbers / Monte Carlo Semi-numerical Algorithms –Sorting, Searching, Combinatorial Methods, Graph Theoretic Algorithms,… Graphical Algorithms Symbolic Computation (?) Mathematical vs. Statistical Modelling

29 Computational Statistics and Algorithms Statistics and Numerical Algorithms (1) –Fast Fourier Transform (Tukey) –Recursive Algorithms and Filtering (Kalman Filter) (Neither topic seems to be a core topic in computational statistics)

30 Computational Statistics and Algorithms Statistics and Numerical Algorithms (2) –Adaptation of optimization techniques (e.g. scoring methods) –Behaviour of optimization methods in statistical context (numerical convergence vs. stochastic convergence concepts) Implicit consideration at COMPSTAT
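
As an illustration of the scoring methods mentioned on this slide, here is a minimal sketch of Fisher scoring for a logistic regression with an intercept and one slope. The data and the function name `fisher_scoring_logistic` are assumptions made for this example only. For the logistic model with its canonical link, Fisher scoring coincides with Newton-Raphson, so each step solves a small linear system built from the score vector and the Fisher information.

```python
import math


def fisher_scoring_logistic(x, y, n_iter=25, tol=1e-10):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b*x))) by Fisher scoring.

    Hypothetical helper for illustration; returns the estimates (a, b).
    """
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        s_a = s_b = 0.0            # score vector (gradient of log-likelihood)
        i_aa = i_ab = i_bb = 0.0   # entries of the 2x2 Fisher information
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            s_a += yi - p
            s_b += (yi - p) * xi
            i_aa += w
            i_ab += w * xi
            i_bb += w * xi * xi
        # Solve information * step = score for the 2x2 case by Cramer's rule.
        det = i_aa * i_bb - i_ab * i_ab
        da = (i_bb * s_a - i_ab * s_b) / det
        db = (i_aa * s_b - i_ab * s_a) / det
        a, b = a + da, b + db
        # Numerical convergence criterion: stop when the step is tiny.
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Made-up, non-separable data so that the maximum likelihood estimate exists.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
a, b = fisher_scoring_logistic(x, y)
print(f"intercept = {a:.4f}, slope = {b:.4f}")
```

The tolerance `tol` is a purely numerical stopping rule. It says nothing about the sampling variability of the estimates, which is the distinction between numerical and stochastic convergence raised on the slide.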

31 31 Computational Statistics and Algorithms Statistics and Random Numbers / Monte Carlo –Generation of Random numbers was (and is) probably more a topic of mathematics (number theory) and computer science In the beginning of COMPSTAT there was also some connection to simulation –Genuine application of Monte Carlo Methods in connection with new developments of statistical theory (e.g. MCMC)
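
To make the reference to MCMC concrete, the following is a minimal sketch of a random-walk Metropolis sampler. The target (a standard normal distribution), the step size, the seed, and the function name `metropolis_standard_normal` are all assumptions chosen for illustration. The sampler only needs the target density up to a constant, which is what makes the approach useful when normalizing constants are not available.

```python
import math
import random


def metropolis_standard_normal(n_samples=20000, step=1.0, seed=1):
    """Random-walk Metropolis sampler targeting a standard normal density.

    Hypothetical helper for illustration; returns the list of chain states.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    x = 0.0
    samples = []
    for _ in range(n_samples):
        # Symmetric proposal: uniform perturbation of the current state.
        proposal = x + rng.uniform(-step, step)
        # Log of the density ratio; the target is proportional to exp(-x^2 / 2).
        log_ratio = 0.5 * (x * x - proposal * proposal)
        # Accept with probability min(1, ratio); otherwise keep the old state.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        samples.append(x)
    return samples


draws = metropolis_standard_normal()
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

Successive draws are correlated, so the chain carries less information than the same number of independent random numbers would.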

32 Computational Statistics and Algorithms Statistics and semi-numerical algorithms –Applications in context of nonparametric statistics and analysis of tabular data Feasibility of conditional inference for logistic models –New developments on the borderline between statistics and computer science Data Mining as a new statistical modelling paradigm COMPSTAT was open towards these developments and integrated them into the program

33 33 Computational Statistics and Algorithms Statistics and Graphical Algorithms –Development rather complementary to the developments of computer science, –Important issues (L. Wilkinson): Graphics are not only a tool for displaying results but rather a tool for perceiving relationships Dynamic graphics as important tool for data analysis Graphics are a means of model formalization reflecting quantitative and qualitative traits of its variables Represented quite well at COMPSTAT

34 34 Computational Statistics and Algorithms Mathematical vs. Statistical Modelling –Emphasis on different methods (e.g. Differential Equations) –Different modelling environments (J. Nelder) Data structures in statistics Exploratory nature of statistical analysis (statistical analysis cycle) Competence of users

35 35 Computational Statistics and Computer Science Developments in Statistical Software Development of Statistical Languages Developments in Statistical Database Management

36 Computational Statistics and Computer Science Developments in Statistical Software (1) –From numerical subroutines towards statistical packages –Main goals: Taking into account the peculiarities of statistical data analysis; Usage of current hardware developments

37 Computational Statistics and Computer Science Developments in Statistical Software (2) –From the beginning, COMPSTAT was an important forum for the development of statistical software. The proceedings from the early 1980s show numerous software developments for specific statistical models. There was always some tension between the presentation of commercial software developments and the scientific character of the conference

38 38 Computational Statistics and Computer Science Development of Statistical Languages (1) –GLIM was probably the first genuine statistical modelling language Present at COMPSTAT from the very beginning

39 Computational Statistics and Computer Science Development of Statistical Languages (2) –The S language set up a new paradigm for computing which is of interest also outside statistical applications. Contribution in Computer Science honoured by the ACM Software System Award for J. Chambers. Although it started as early as 1976, it took a long time to enter the COMPSTAT community

40 Computational Statistics and Computer Science Development of Statistical Languages (3) –R gained popularity rather quickly inside COMPSTAT due to free availability and effective organisation of CRAN –Omegahat: An umbrella for open source projects in computational statistics covering not only statistical computation but also other important aspects in distributed computing

41 41 Computational Statistics and Computer Science Development of Statistical Languages (4) –XLISP-Stat as proof of concept (in particular for animated graphics) –XploRe as Java based production system

42 42 Computational Statistics and Computer Science Statistical Data Base Management –Main challenge is appropriate usage of the developments in database technology in statistical context Combination of statistical data structures and statistical processing activities with conceptual data models Representation of tabular data Metadata as a tool to capture the complexity of statistical data A small but active group inside the COMPSTAT community from the very beginning

43 43 Computational Statistics and Applications Challenges for Computational Statistics Rather independent from application area –Data Data capture Data structures Data size –Analysis Process Analysis strategies The role of the statistician in the computer age

44 Computational Statistics and Applications Data challenges (1) –Contributions towards data challenges occur occasionally at COMPSTAT. Current problems: –Data capture: Data capture tools are rather a side branch of computational statistics and more connected to official statistics. Data streams are a new challenge which has so far attracted little attention in the computational statistics community

45 Computational Statistics and Applications Data challenges (2) –Data structures New problems (e.g. in connection with data mining) raise questions with respect to the applicability of the basic statistical analysis paradigm (population, sample, measurement process) –Data size Handling huge datasets. All these challenges do not seem to be core topics of computational statistics at the moment

46 Computational Statistics and Applications Analysis process –Analysis strategies The question of formalization of analysis strategies was a hot topic at the COMPSTAT conferences at the end of the 1980s, but there was limited success –The role of statisticians in the computer age Is progress in computational statistics an enabler for statisticians, or does it lead towards a de-skilling of the statistical profession?

47 47 The COMPSTAT Symposia

48 A full set of COMPSTAT proceedings (one statistical outlier removed). Do you see the CSDA volumes in the background? Here they are!

49 The COMPSTAT Symposia I
Symposium | Year | Organizers | # Submissions | # Papers I/C | # Participants
Vienna | 1974 | Sint | 50100
Berlin | 1976 | Gordesch, Naeve | | |
Leiden | 1978 | Corsten, Hermans | | |
Edinburgh | 1980 | Barrit, Wishart | 250 | 4/82 | 750
Toulouse | 1982 | Caussinus, Ettinger, Tomassone | 250 | 15/60 | 500

50 The COMPSTAT Symposia II
Symposium | Year | Organizers | # Submissions | # Papers I/C | # Participants
Prague | 1984 | Havranek, Sidak, Novak | 300 | 7/65 | ???
Rome | 1986 | De Antoni, Lauro, Rizzi | 300 | 14/60 | 900
Copenhagen | 1988 | Edwards, Raun | 300 | 9/51 | 800
Dubrovnik | 1990 | Momirovic | 115 | 6/43 | 180
Neuchâtel | 1992 | Dodge, Whittaker | 115 | 11/115 | 200

51 51 COMPSTAT 1994 Vienna and Satellite Meeting on Smoothing Semmering (World Cultural Heritage) Randy Eubank Andrew Westlake, Allmut Hörmann, Wolfgang Härdle

52 52 On the track from Vienna to Semmering in the Austrian Alps (historical train) The organizer

53 53 Satellite Meeting on Smoothing We finally arrived at the mountain spa Semmering Antoine de Falguerolles and the organizer at the opening

54 The COMPSTAT Symposia III
Symposium | Year | Organizers | # Submissions | # Papers I/C | # Participants
Vienna, Semmering (Satellite) | 1994 | Dutter, Grossmann, Schimek | /60 7/
Barcelona | 1996 | Prat | 250 | 13/56 | 300
Bristol | 1998 | Payne, Green | 180 | 12/58 | 370
Utrecht | 2000 | Van der Heijden, Bethlehem | 250 | 15/60 | 220
Berlin | 2002 | Härdle | 220 | 9/90 | 260

55 55 The COMPSTAT proceedings from the Vienna and Semmering meetings Model of Vienna University Kastalia Fountain

