Towards Objective Ranking of Project Proposals Miroslav Kárný Department of Adaptive Systems Institute of Information Theory and Automation Academy of.


Towards Objective Ranking of Project Proposals Miroslav Kárný Department of Adaptive Systems Institute of Information Theory and Automation Academy of Sciences of the Czech Republic

… nickname for Institute of Information Theory and Automation
… speaker’s home institute
Cybernetics ≡ Communication & Control in Machines & Animals
Cybernetics is the speaker’s research domain and has led to applications in:
- adaptive control of paper machines, rolling mills, drum boilers, …
- nuclear medicine modeling & decision making (DM), dynamic image studies, …
- support of operators of complex systems (FET)
- traffic control in cities, optimization of financial strategies
- multiple participants’ DM and e-democracy, …
…? …!
Bayesian DM: single horse on a decades-lasting trip with a good team

FET organizes a review process …
… to select the best proposals p among all submitted proposals
- An expert e assigns marks ${}^e m_p \in \{0, \dots, M\}$ to several proposals within a small group ${}^e p$ of proposals
- A small group of experts ${}^p e$, reviewing the proposal p, harmonizes the final mark $m_p$ via discussion
- Assembly of all experts completely ranks all proposals
- EC supports top proposals up to a budget-implied border-line

Addressed problem
Procedure is good & fair …
- An expert e assigns marks ${}^e m_p \in \{0, \dots, M\}$ to several proposals within a small group ${}^e p$ of proposals
- A small group of experts ${}^p e$, reviewing the proposal p, harmonizes the final mark $m_p$ via discussion
… up to the extremely disturbing step:
- Assembly of all experts completely ranks all proposals
In this step:
- each expert e has studied a tiny portion of all proposals
- experts’ marks ${}^e m_p$ are subjectively scaled
- discrete-valued marks cause many coincidences
- the time slot of the assembly is strongly limited
⇒ errors, manipulation, expenses

Aims
… of the research:
- to test the belief that Bayesian DM is an (almost) universal tool relying on proper modeling only
- to test a promising negotiation methodology needed in other contexts, too
- to help FET to be fair and cost-efficient
- to help proposing researchers to be treated fairly
… of the talk:
- to share fun (?) from the conclusions

Basic idea
- Project proposal p has its objective rank $r_p$!
- Experts serve as rank-measuring devices
- Ranking ≡ estimation of rank $r_p$ from marks ${}^e m_p$, which are noise-corrupted observations of the objective rank

Guide
- Experts as measuring devices
- Prior knowledge
- MAP estimate
- Experimental results
- Discussion

Experts as measuring devices
${}^e m_p = r_p + {}^e b + {}^e\epsilon$
- ${}^e m_p$ … mark of proposal p by the expert e
- $r_p$ … objective rank of proposal p
- ${}^e b + {}^e\epsilon$ … personal error
- ${}^e b$ … bias
- ${}^e\epsilon$ … personal fluctuations with variance ${}^e v$
Assumptions:
- simplicity & maximum entropy ⇒ ${}^e\epsilon$ assumed to be Gaussian
- experts try to be fair ⇒ mark ${}^e m_p$ proportional to $r_p$, ${}^e\epsilon$ independent of p
- interpretation of marks: top M … Nobel Prize, top M … flawless
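The measurement model on this slide (mark = objective rank + expert bias + Gaussian fluctuation) can be illustrated by a small simulation. This is only a sketch: the function name `simulate_marks`, the dictionary layout and the toy numbers are assumptions made for illustration, and the marks are left continuous rather than rounded to the discrete scale {0, …, M}.

```python
import random

def simulate_marks(ranks, biases, variances, assignments, seed=0):
    """Draw marks m[(e, p)] = r_p + b_e + noise, with noise ~ N(0, v_e).

    ranks:       {proposal: objective rank r_p}
    biases:      {expert: bias b_e}
    variances:   {expert: fluctuation variance v_e}
    assignments: {expert: list of proposals that expert reviews}
    """
    rng = random.Random(seed)
    marks = {}
    for expert, proposals in assignments.items():
        sigma = variances[expert] ** 0.5
        for proposal in proposals:
            noise = rng.gauss(0.0, sigma)
            marks[(expert, proposal)] = ranks[proposal] + biases[expert] + noise
    return marks

# Hypothetical toy setting: three proposals, two experts with overlapping groups.
ranks = {"p1": 18.0, "p2": 24.0, "p3": 12.0}
biases = {"e1": -1.5, "e2": 2.0}
variances = {"e1": 1.0, "e2": 4.0}
assignments = {"e1": ["p1", "p2"], "e2": ["p2", "p3"]}

marks = simulate_marks(ranks, biases, variances, assignments)
```

Each expert sees only a small group of proposals, so the simulated data set has the same sparse expert-by-proposal structure the talk describes.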

Prior knowledge
Needed:
- ${}^e m_p = r_p + {}^e b + {}^e\epsilon = (r_p - C) + ({}^e b + C) + {}^e\epsilon$, for any C
- (number of data) / (number of unknowns) ≈ 1 – 2
Available:
- bias ${}^e b \in [-M, M]$
- noise variance ${}^e v \in [0, M^2]$
- rank ∈ [0, largest mark] ⇒ $r_p \in [0, M]$
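The identity with the arbitrary constant C says that the marks alone cannot separate a common shift of all ranks from the opposite shift of all biases. A minimal check of this, with made-up numbers and a hypothetical helper `residuals`, might look like:

```python
def residuals(marks, ranks, biases):
    """Return m - r_p - b_e for every (expert, proposal) pair."""
    return {(e, p): m - ranks[p] - biases[e] for (e, p), m in marks.items()}

# Hypothetical toy data: two experts, three proposals.
marks = {("e1", "p1"): 12.0, ("e1", "p2"): 20.5,
         ("e2", "p2"): 24.0, ("e2", "p3"): 9.5}
ranks = {"p1": 11.0, "p2": 21.0, "p3": 8.0}
biases = {"e1": 0.5, "e2": 2.0}

# Shift all ranks down by C and all biases up by C.
C = 3.0
shifted_ranks = {p: r - C for p, r in ranks.items()}
shifted_biases = {e: b + C for e, b in biases.items()}

# The residuals, and hence the Gaussian likelihood, are unchanged by the shift.
same = residuals(marks, ranks, biases) == residuals(marks, shifted_ranks, shifted_biases)
```

Because the data cannot fix C, the bounded ranges listed on the slide are what restricts the admissible shifts: a large enough C pushes some rank outside [0, M] or some bias outside [-M, M].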

MAP estimate
Posterior log-likelihood function is:
- smoothly dependent on the estimated r, b, v
- concave in the estimated r, b, v
- defined on a convex domain
⇒ unique maximum
- harmonised domain and data range
⇒ maximum in the interior
Evaluation:
- conditions for the extreme are solved by successive approximations
… fast, simple and reliable … can be used “on-line”
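The slide does not spell out the iteration, so the following is only one plausible reading of "successive approximations", not the author's algorithm. It assumes flat priors on the boxes r ∈ [0, M], b ∈ [-M, M], v ∈ [0, M²], so that the stationarity conditions of the Gaussian log-likelihood can be cycled through in turn. The function name `map_estimate`, the variance floor `v_floor` and the fixed iteration count are assumptions added for the sketch.

```python
def map_estimate(marks, M, n_iter=200, v_floor=1e-3):
    """Cycle through the stationarity conditions for ranks r, biases b, variances v.

    marks: {(expert, proposal): mark}. Flat priors on bounded boxes are ASSUMED,
    and v_floor is a hypothetical guard that keeps variances away from zero.
    """
    experts = sorted({e for e, _ in marks})
    proposals = sorted({p for _, p in marks})

    def clip(x, lo, hi):
        return max(lo, min(hi, x))

    r = {p: 0.0 for p in proposals}
    b = {e: 0.0 for e in experts}
    v = {e: 1.0 for e in experts}
    for _ in range(n_iter):
        # Rank: precision-weighted mean of the bias-corrected marks.
        for p in proposals:
            obs = [(m - b[e], 1.0 / v[e]) for (e, q), m in marks.items() if q == p]
            r[p] = clip(sum(x * w for x, w in obs) / sum(w for _, w in obs), 0.0, M)
        # Bias and variance: mean and mean square of each expert's residuals.
        for e in experts:
            res = [m - r[q] for (f, q), m in marks.items() if f == e]
            b[e] = clip(sum(res) / len(res), -M, M)
            v[e] = clip(sum((x - b[e]) ** 2 for x in res) / len(res), v_floor, M * M)
    return r, b, v

# Hypothetical toy data: expert e2 marks about 5 points higher than e1.
toy = {("e1", "p1"): 10.0, ("e1", "p2"): 20.0,
       ("e2", "p2"): 25.0, ("e2", "p3"): 15.0}
r, b, v = map_estimate(toy, M=30)
```

In this toy case the raw marks suggest that p3 beats p1, while the estimate attributes the difference to the experts' biases and places the two proposals at the same rank.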

Experiments - proposals’ viewpoint
Processed marks m ∈ {0, 0.5, …, 30}; Assembly (A) ranking available
[Table; numerical entries not recoverable. Rows:]
- #Proposals
- #Experts
- acceptance Threshold (T)
- #proposals above T by A
- #proposals above T by us
- #proposals chosen by A and us
- #common acceptance / A-one [%]
Extreme cases:
- typical numbers
- prior does not spoil results with a few data
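The table rows compare the set of proposals accepted by the assembly with the set accepted by the estimated ranks. A hedged sketch of how such counts could be computed is given below; the function name `acceptance_overlap`, the reading of "A-one" as the assembly's own count, and the toy numbers are assumptions, not values from the talk.

```python
def acceptance_overlap(assembly_rank, estimated_rank, threshold):
    """Count proposals above the threshold by each ranking and their overlap.

    Returns (#above T by A, #above T by us, #chosen by both,
             common acceptance as a percentage of the assembly's count).
    """
    by_assembly = {p for p, rank in assembly_rank.items() if rank > threshold}
    by_estimate = {p for p, rank in estimated_rank.items() if rank > threshold}
    common = by_assembly & by_estimate
    if by_assembly:
        percent = 100.0 * len(common) / len(by_assembly)
    else:
        percent = float("nan")  # no accepted proposals, ratio undefined
    return len(by_assembly), len(by_estimate), len(common), percent

# Hypothetical ranks on the 0 to 30 mark scale.
assembly = {"p1": 26.0, "p2": 23.5, "p3": 21.0, "p4": 27.5}
estimated = {"p1": 25.1, "p2": 21.4, "p3": 22.6, "p4": 28.0}
counts = acceptance_overlap(assembly, estimated, threshold=22.0)
```

Here both rankings accept three proposals, two of which coincide; the disagreement concerns proposals close to the threshold.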

Histogram of rank estimates
… box width about 2% of the mark range!
- #(r > T ≈ 22) = 11
- #(r > T ≈ 25) = 57

Experiments - experts’ viewpoint
[Table; only some numerical entries survive. Rows:]
- mean (bias) / Threshold [%]: 6 4
- minimum (bias) / T
- maximum (bias) / T
- mean (std. dev.) / T
- minimum (std. dev.) / T: 10 7
- maximum (std. dev.) / T
Box width containing a significant number of proposals ≈ 3 % of T!

Individual results – small file

Individual top results – extensive file

Discussion
Operational aspects:
- it works
- it exhibits fast and reliable convergence
- it is reasonably robust to variations of prior statistics
Evaluation aspects:
- it can substitute or at least support assembly ranking
- it allows continuous-valued marking
- it avoids the need to harmonize marks within ${}^p e$
- it makes ranking less sensitive to experts’ biases & variations
- it suppresses lottery-type results for gray-zone-ranked proposals (those with the rank around the threshold)
- it makes evaluation more objective

Discussion
Quality assurance aspects:
- it checks reliability of experts, using their biases & variances: [%] experts are o.k., but the unreliable or cheating rest still forms a significant portion
- it allows tracking of “bad” experts
- it opens a way to relate prior & posterior ranking, i.e., the achieved results of supported projects
Methodological aspects:
- it can be tailored to other problems
- it can serve as a tool supporting negotiation
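Checking experts by their estimated biases and variances could be done along the following lines. The cut-off fractions `max_bias_frac` and `max_std_frac`, the function name `flag_experts` and the example numbers are hypothetical choices; the talk does not state which limits were used.

```python
import math

def flag_experts(biases, variances, threshold, max_bias_frac=0.2, max_std_frac=0.3):
    """Flag experts whose |bias| or standard deviation is large relative to T.

    The two cut-off fractions are illustrative assumptions, not values from the talk.
    Returns {expert: list of reasons}.
    """
    flagged = {}
    for expert in biases:
        reasons = []
        if abs(biases[expert]) / threshold > max_bias_frac:
            reasons.append("bias")
        if math.sqrt(variances[expert]) / threshold > max_std_frac:
            reasons.append("variance")
        if reasons:
            flagged[expert] = reasons
    return flagged

# Hypothetical estimates for three experts, acceptance threshold T = 22.
biases = {"e1": 1.0, "e2": -6.5, "e3": 0.5}
variances = {"e1": 1.0, "e2": 4.0, "e3": 100.0}
flagged = flag_experts(biases, variances, threshold=22.0)
```

With these numbers e2 is flagged for its bias and e3 for its variance, while e1 passes both checks.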

Future
- alternative models of experts, e.g., non-normal, Markov-chain type
- comparison of prior and posterior ranking
- application to other negotiation-type processes
- application to individual marks & thresholds
- quality assurance of the evaluation including experts’ competence!