1
The use of indicators for evaluation and policy making
Giorgio Sirilli, Research Associate
Moscow, April 27, 2016
2
2 Outline of the presentation
- STI indicators
- Science and technology policy
- Evaluation: the need, the socio-political dimension, cost, the experience of Italy, the impact on universities
- Concluding remarks
3
3 Indicators
Statistic: A numerical fact or datum, i.e. one computed from a sample.
Statistical data: Data from a survey or administrative source used to produce statistics.
Statistical indicator: A statistic, or combination of statistics, providing information on some aspect of the state of a system or of its change over time. (For example, gross domestic product (GDP) provides information on the level of value added in the economy, and its change over time is an indicator of the economic state of the nation.)
4
4 Indicators
Indicators are a technology, a product, which:
- governs behaviour
- is modified by users (outside the producer community)
- develops in response to user needs
Data sources: surveys, administrative data, private files, case studies. Data collection is informed by manuals.
Data populate statistics, which can be indicators.
Decisions are taken on the basis of indicators.
5
5 Indicators
S&T indicators are defined as “a series of data which measures and reflects the science and technology endeavor of a country, demonstrates its strengths and weaknesses and follows its changing character notably with the aim of providing early warning of events and trends which might impair its capability to meet the country’s needs”. Indicators can help “to shape lines of argument and policy reasoning. They can serve as checks, they are only part of what is needed”. (OECD, 1976)
6
6 Indicators
- Research and development (R&D)
- Innovation statistics
- Intangible investment
- Patents
- Technology balance of payments
- Trade of high-tech products
- Human resources
- Venture capital
- Bibliometrics
- Public perception of science and technology
- …
7
7 Keith Pavitt: “One would think that the political agenda determines the collection and analysis of indicators. In reality it is the other way round: it is the availability of indicators which steers the political discourse.”
8
8 Fred Gault: “Policy analysts should be both literate and numerate, able to put a case using innovation indicators. Not only should the analysts have such a skill set, but they also require some knowledge of the subject. It is in this environment that monitoring, benchmarking and evaluation lead to policy learning and to more effective policies.”
9
9 The user of indicators: an acrobat
10
10 The producer of indicators: “The sorcerer's apprentice” (колдун: sorcerer). “An irresponsible person who instigates a process or project which he is unable to control, risking irreversible damage.” (Johann Wolfgang Goethe)
11
11 S&T statisticians: arms producers? “I have no regret. Others are responsible for the bloodshed caused by the AK-47 machine gun. It is the politicians’ fault that they are unable to find appropriate solutions and resort to violence instead.” (M. Kalashnikov)
12
12 A brief history of S&T indicators
- The first attempt to measure S&T in 1957
- Frascati Manual (1963)
- The Frascati Manual “family”
- A continuous process of broadening and deepening: from macro to micro, from public to private
- The role of international organisations
- The dialogue between producers and users
13
13 R&D resources OECD Science, Technology and Industry Scoreboard, 2015
14
14 The world is changing. [Chart comparing EU28, US, BRIICS, China and Japan] Source: OECD, STI Outlook 2014
15
15 A changing global R&D landscape GERD, million USD 2005 PPP, 2000-12 and projections to 2024 Source: OECD estimates based on OECD MSTI database, June 2014.
16
16 Public funding to R&D and innovation
17
17 Public funding to R&D and innovation
18
18 R&D intensity Source: OECD. STI Outlook 2014
19
19 The mystique of ranking
GERD is used for target setting: from descriptive to prescriptive.
“The American GERD/GDP ratio of the early 1960s, that is 3%, as mentioned in the first paragraphs of the first edition of the Frascati Manual, became the ideal to which member countries would aim, and which the OECD would implicitly promote” (Godin)
Lisbon EU target: 3% (2% business, 1% public sector), as formalised below.
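As a reminder of the arithmetic behind these targets, the ratio in question is simply gross domestic expenditure on R&D (GERD) over GDP; the decomposition of the Lisbon target by funding sector follows the figures quoted on this slide.

```latex
% R&D intensity, the indicator used for target setting:
\[ \text{R\&D intensity} = \frac{\mathrm{GERD}}{\mathrm{GDP}} \times 100\% \]
% Lisbon decomposition by funding sector, as quoted on the slide:
\[ \underbrace{3\%}_{\text{EU target}} = \underbrace{2\%}_{\text{business}} + \underbrace{1\%}_{\text{public sector}} \]
```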
20
20 The R&D/GDP objective. Source: OECD, estimates based on OECD MSTI database, June 2014. Objectives of R&D expenditure and differences with present levels, 2014
21
21 A rhetorical device: a plethora of figures and graphs. “Secure a quantitative statement of the critical elements in an official’s problem, draw it up in concise form, illuminate the tables with a chart or two, bind the memorandum in an attractive cover tied with a neat bow-knot (…). The data must be simple enough to be sent by telegraph and compiled overnight” (Mitchell, 1919)
22
22 Is technological progress slowing down?
23
23 R&D in universities and public research agencies
24
24 Science and technology policy Science policy is an area of public policy which is concerned with the policies that affect the conduct of the science and research enterprise, including the funding of science, often in pursuance of other national policy goals such as technological innovation to promote commercial product development, weapons development, health care and environmental monitoring. (Wikipedia)
25
25 A brief history of science and technology policy
- Patronage of rulers (e.g. in the Renaissance)
- Industrial revolution
- Between the First and the Second World Wars: rockets, nuclear energy, operations research, DDT
- After the Second World War: science and technology policy of governments
26
26 Science and technology policy. Vannevar Bush, “Science, the Endless Frontier”, 1945
27
27 “Science, the Endless Frontier”
Issues to be addressed through science:
- defence
- health
Solution: science policy
28
28 NABS objectives
- Exploration and exploitation of the earth
- Environment
- Exploration and exploitation of space
- Transport, telecommunication and other infrastructures
- Energy
- Industrial production and technology
- Health
- Agriculture
- Education
- Culture, recreation, religion and mass media
- Political and social systems, structures and processes
- General advancement of knowledge: R&D financed from general university funds
- General advancement of knowledge: R&D financed from sources other than GUF
- Defence
29
29
30
30 Evaluation
Evaluation may be defined as an objective process aimed at the critical analysis of the relevance, efficiency and effectiveness of policies, programmes, projects, institutions, groups and individual researchers in the pursuit of their stated objectives. Evaluation consists of a set of coordinated activities of a comparative nature, based on formalised methods and techniques applied through codified procedures, aimed at formulating an assessment of intentional interventions with reference to their implementation and to their effectiveness.
Internal/external evaluation
31
31 A question (doubt)
The scientific community has always been evaluated (mostly from within).
When you measure a system, you change the system.
32
32 In the UK
- Research Assessment Exercise (RAE)
- Research Excellence Framework (REF) (impact)
“The REF will over time doubtless become more sophisticated and burdensome. In short, we are creating a Frankenstein monster” (Ben Martin)
33
33 Why do we need evaluation?
Need for a coherent strategy with clear priorities.
Need for new/improved tools to help determine:
- which current activities are worth keeping
- which to cut back to allow new things to emerge
- effective (evidence-based) policies for basic/strategic research
34
34 Types of decisions in science policy
- Distribution between sciences (e.g. physics, social sciences)
- Distribution between specialties (e.g. high-energy physics, optical physics)
- Distribution between different types of activity (e.g. university research, postgraduates, central labs)
- Distribution between centres, groups, individuals
35
35 Scope and object of evaluation
Type of research, e.g.:
- academic research vs targeted research
- international big-science programmes
Level and object of the evaluation:
- individual researcher
- research group
- project-centred
- programme
- whole discipline
36
36 Criteria for evaluation
Criteria vary according to the scope and purpose of evaluation; they range from criteria for identifying the quality/impact of research to criteria for identifying value for money.
Four main aspects are often distinguished:
- quantity
- quality
- impact
- utility
Criteria can be:
- internal: likely impact on the advance of knowledge
- external: likely impact on other S&T fields, the economy and society
37
37 Research evaluation
Evaluation of what: research, education, the “third mission” of universities and research agencies (consultancy, support to local authorities, etc.)
Evaluation by whom: experts, peers
Evaluation at what level: organisations (departments, universities, schools), programmes, projects, individuals (professors, researchers, students)
Evaluation when: ex-ante, in-itinere, ex-post
38
38 Evaluation in Italy
ANVUR, established in 2011: a government agency, not an authority.
The relationship with MIUR (Ministry of Education, Universities and Research).
ANVUR activities:
1. Evaluation of the Quality of Research (EQR)
2. National Scientific Qualification (NSQ)
3. Accreditation of universities (AVA)
39
39 Evaluation of the Quality of Research by ANVUR
Model: Research Assessment Exercise (RAE)
Objective: evaluation of Areas, Research structures and Departments (not of researchers)
Reference period: 2004-2010; start: 2011; finish: 2014
Actors:
- ANVUR (National Agency for the Evaluation of Universities and Research Institutes)
- GEV (Evaluation Groups) (#14) (450 experts involved plus referees)
- Research structures (universities, research agencies)
- Departments
Subjects evaluated: researchers (university teachers and PRA researchers)
40
40 Evaluation of the Quality of Research by ANVUR
Researchers’ products to be evaluated:
- journal articles
- books and book chapters
- patents
- designs, exhibitions, software, manufactured items, prototypes, etc.
University teachers: 3 “products” over the period 2004-2010
Public Research Agencies researchers: 6 “products” over the period 2004-2010
Scores: from 1 (excellent) to -1 (missing)
41
41 Evaluation of the Quality of Research by ANVUR
Attention is basically focused here! Indicators linked to research (weights in brackets):
- quality (0.5)
- ability to attract resources (0.1)
- mobility (0.1)
- internationalisation (0.1)
- high-level education (0.1)
- own resources (0.05)
- improvement (0.05)
42
42 Evaluation of the Quality of Research by ANVUR
Indicators of the “third mission” (weights in brackets):
- fund raising (0.2)
- patents (0.1)
- spin-offs (0.1)
- incubators (0.1)
- consortia (0.1)
- archaeological sites (0.1)
- museums (0.1)
- other activities (0.2)
A methodological failure
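To make the aggregation behind these two lists concrete, here is a minimal sketch of a weighted composite indicator using the research weights from the previous slide. The sub-scores and their normalisation to a 0-1 scale are hypothetical illustrations, not ANVUR's actual procedure.

```python
# Minimal sketch of a weighted composite indicator, using the research
# weights listed on slide 41. The sub-scores below are hypothetical and
# assumed to be normalised to [0, 1]; this is not ANVUR's actual procedure.

RESEARCH_WEIGHTS = {
    "quality": 0.50,
    "attraction_of_resources": 0.10,
    "mobility": 0.10,
    "internationalisation": 0.10,
    "high_level_education": 0.10,
    "own_resources": 0.05,
    "improvement": 0.05,
}

def composite_score(sub_scores, weights):
    """Weighted sum of normalised sub-indicators (weights sum to 1)."""
    return sum(weights[name] * sub_scores.get(name, 0.0) for name in weights)

# Hypothetical department: strong on quality, weak on internationalisation.
example = {
    "quality": 0.8,
    "attraction_of_resources": 0.6,
    "mobility": 0.5,
    "internationalisation": 0.3,
    "high_level_education": 0.7,
    "own_resources": 0.4,
    "improvement": 0.5,
}

print(round(composite_score(example, RESEARCH_WEIGHTS), 3))  # 0.655
```

The third-mission weights on this slide can be plugged into the same function; the point is simply that the final ranking is a linear combination fixed by the chosen weights.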
43
43 From science to innovation Source: WIPO, World Intellectual Property Report, 2015 Average adoption lags have declined markedly over the past 200 years
44
44 Impact of evaluation in Italy
Percentage of the General University Fund linked to increases in quality, efficiency and effectiveness, based on the Evaluation of the Quality of Research (VQR) (for three fifths) and on recruitment policy (one fifth):
- 2014: 16.0%
- 2015: 18.0%
- 2017: 20.0%
- later: up to 30%
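A worked example for one year, using only the shares quoted above (the slide does not specify how the remaining fifth of the performance-based portion is allocated): in 2015, 18% of the fund is performance-based, so

```latex
% 2015: 18% of the fund is performance-based; three fifths via VQR,
% one fifth via recruitment policy (remaining fifth not specified here).
\[ 0.18 \times \tfrac{3}{5} = 10.8\%\ \text{(VQR)}, \qquad
   0.18 \times \tfrac{1}{5} = 3.6\%\ \text{(recruitment policy)} \]
```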
45
45 The impact of VQR on professors/researchers. [Cartoon caption: “Before you go… would you mind filling this out for me?”]
46
46 Lessons from research evaluation in Italy
- Evaluation should enhance efficiency and effectiveness
- Pro-active evaluation vs punitive evaluation
- Evaluation is a difficult and expensive process
- When a system is measured it is changed (opportunistic behaviour)
- Peer review vs bibliometrics
- NSE vs SSH
- Competition vs cooperation of scientists
- The myth of excellence
- The split of the academic community (the good and the bad guys)
- The equilibrium among teaching, research and the third mission
- Bureaucratisation
- Used by the ministry to assign resources
- Evaluation in times of crisis
47
47 Evaluation is an expensive exercise
Rule of thumb: less than 1% of the R&D budget devoted to its evaluation.
- Evaluation of the Quality of Research (VQR): 300 million Euro (180,000 “products”); 182 million Euro
- Research Assessment Exercise (RAE): 540 million Euro
- Research Excellence Framework (REF): 1 million Pounds (500 million)
48
48 The new catchwords
- New public management
- Value for money
- Accountability
- Relevance
- Excellence
49
49 Evaluation
- Effectiveness: whether the objectives were achieved, and whether their achievement was sufficient to change the original problem situation
- Value for money: the extent to which benefits exceed costs
- Efficiency: the cost at which objectives were achieved
- Appropriateness: whether a policy or programme was suitable for the problem situation
50
50 Encyclical letter. Pope Francis Economic powers continue to justify the current global system where priority tends to be given to speculation and the pursuit of financial gain with its search for immediate interest, which fail to take the context into account, let alone the effects on human dignity and the natural environment. The technocratic paradigm tends to dominate economic and political life. The economy accepts every advance in technology with a view to profit, without concern for its potentially negative impact on human beings. Finance overwhelms the real economy. Our politics are subject to technology and finance.
51
51 How artists look at reality (Museum of Modern Art)
52
52
53
53
54
54 Some indicators
- Number of publications
- Number of citations
- Impact factor
- h-index
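Of the indicators listed, the h-index is the only one with a slightly non-obvious definition: the largest h such that the researcher has at least h papers each cited at least h times. A minimal sketch follows; the citation counts are invented for illustration.

```python
# h-index: the largest h such that h papers have at least h citations each.

def h_index(citations):
    """Compute the h-index from a list of per-paper citation counts."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # this paper still supports a larger h
        else:
            break      # remaining papers are cited fewer than `rank` times
    return h

print(h_index([10, 8, 5, 4, 3]))  # 4: four papers with at least 4 citations each
print(h_index([25, 8, 5, 3, 3]))  # 3: only the top three papers have >= 3 citations
```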
55
55 Scientific publishing: a big business
- 10 billion $ budget (55% in the USA, 28% in Europe)
- 8 million authors
- 110,000 employees (40% in Europe)
- 30,000 journals
- 2.5 million articles published yearly
- Misconduct (falsification, plagiarism, fraud, etc.)
- Retraction
56
56 Over recent years, the Journal Impact Factor (JIF) has become the most prominent indicator of a journal's standing, bringing intense pressure on journal editors to do what they can to increase it. What approaches do journal editors employ to maximise it? Some would seem completely acceptable, others are in clear breach of the conventions on academic behaviour, but a number fall somewhere in between. Over time, editors have devised ingenious ways of enhancing their JIF without apparently breaching any rules. The editorial draws three conclusions. First, in the light of ever more devious ruses of editors, the JIF indicator has now lost most of its credibility. Secondly, where the rules are unclear or absent, the only way of determining whether particular editorial behaviour is appropriate or not is to expose it to public scrutiny. Thirdly, editors who engage in dubious behaviour thereby risk forfeiting their authority to police misconduct among authors. Editorial in Research Policy by Ben Martin
57
57 Ranking of universities
Four major sources of rankings:
- ARWU (Shanghai Jiao Tong University)
- QS World University Ranking
- THE University Ranking (Times Higher Education)
- US News & World Report (Best Global Universities)
58
Academic Ranking of World Universities (Shanghai Jiao Tong University): “Starting from 2003, ARWU has been presenting the world Top 500 universities annually based on a set of objective indicators and third-party data. ARWU has been recognized as the precursor of global university rankings and the most trustworthy league table. ARWU adopts six objective indicators to rank world universities, including:
- the number of alumni and staff winning Nobel Prizes and Fields Medals,
- the number of Highly Cited Researchers,
- the number of articles published in journals of Nature and Science,
- the number of articles indexed in Science Citation Index - Expanded and Social Sciences Citation Index, and
- per capita performance.
More than 1200 universities are actually ranked by ARWU every year and the best 500 universities are published.”
59
59 Global rankings cover less than 3-5% of the world's universities
60
60 Ranking of universities: the case of Italy
(ARWU Shanghai, QS World University Ranking, THE University Ranking, US News & World Report)
- ARWU Shanghai: Bologna 173, Milano 186, Padova 188, Pisa 190, Sapienza 191
- QS World University Ranking: Bologna 182, Sapienza 202, Politecnico Milano 229
- World University Ranking SA: Sapienza 95, Bologna 99, Pisa 184, Milano 193
- US News & World Report: Sapienza 139, Bologna 146, Padova 146, Milano 155
61
61 The rank-ism (De Nicolao)
62
62 San Francisco Declaration on Research Assessment
General Recommendation: Do not use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions.
63
63 San Francisco Declaration on Research Assessment The Journal Impact Factor, as calculated by Thomson Reuters, was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article. With that in mind, it is critical to understand that the Journal Impact Factor has a number of well-documented deficiencies as a tool for research assessment. These limitations include: A) citation distributions within journals are highly skewed; B) the properties of the Journal Impact Factor are field-specific: it is a composite of multiple, highly diverse article types, including primary research papers and reviews; C) Journal Impact Factors can be manipulated (or “gamed”) by editorial policy; and D) data used to calculate the Journal Impact Factors are neither transparent nor openly available to the public.
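For reference, the two-year impact factor these criticisms target is a simple ratio; this is the conventional Garfield/Thomson Reuters definition, not something specific to DORA.

```latex
% Conventional two-year Journal Impact Factor for journal J in year Y:
\[
\mathrm{JIF}_Y(J) =
\frac{\text{citations received in year } Y \text{ by items } J \text{ published in } Y-1 \text{ and } Y-2}
     {\text{number of citable items published in } J \text{ in } Y-1 \text{ and } Y-2}
\]
```

The skewness problem in point A follows directly from this form: a handful of highly cited papers can dominate the numerator while most articles in the journal are cited far less.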
64
64 The Leiden manifesto on bibliometrics
65
65 Bibliometrics under fire
66
66 The Leiden Manifesto for research metrics
“Data are increasingly used to govern science. Research evaluations that were once bespoke and performed by peers are now routine and reliant on metrics. The problem is that evaluation is now led by the data rather than by judgement. Metrics have proliferated: usually well intentioned, not always well informed, often ill applied. We risk damaging the system with the very tools designed to improve it, as evaluation is increasingly implemented by organizations without knowledge of, or advice on, good practice and interpretation.”
67
67 The Leiden principles for evaluation in the making
1. Metrics should not substitute for judgment (no use without thinking)
2. Should match institutional mission (horses for courses)
3. Should be parsimonious and transparent
4. Underlying data should be accessible and verifiable by those evaluated
5. Should take into account field and country differences/contexts
6. Individual researchers should be assessed with both multidimensional metrics and qualitative data
7. Intended and unintended effects of metrics should be reflected upon before use
(Hicks, Wouters et al., 2014)
68
68 The impact of evaluation on higher education
The scientific community has always been evaluated (mostly from within).
When you measure a system, you change the system.
69
69 Changes in university life
The university is now at the mercy of:
- increasing bibliometric measurement
- quality standards
- blind refereeing (someone sees you but you do not see him)
- bibliometric medians
- journal classifications (A, B, C, …)
- opportunistic citing
- academic tourism
- administrative burden
- …….
70
70 The epistemic consequences of bibliometrics-based evaluation
Interviews with Italian researchers (40-65 years old). Main results:
- A drastic change in researchers’ attitudes due to the introduction of bibliometrics-based evaluation
- Bibliometrics-based evaluation has an extremely strong normative function on scientific practices, which deeply impacts the epistemic status of the disciplines
(T. Castellani, E. Pontecorvo, A. Valente, Epistemological consequences of bibliometrics: Insights from the scientific community, Social Epistemology Review and Reply Collective, vol. 3, no. 11, 2014)
71
71 The epistemic consequences of bibliometrics-based evaluation: results
1. Bibliometrics-based evaluation criteria changed the way in which scientists choose the topic of their research:
- choosing a fashionable theme
- placing the article in the tail of an important discovery (bandwagon effect)
- choosing short empirical papers
2. The hurry
3. Interdisciplinary topics are hindered. Bibliometric evaluative systems encourage researchers not to change topic during their career
4. Repetition of experiments is discouraged. Only new results are considered interesting
(T. Castellani, E. Pontecorvo, A. Valente, Epistemological consequences of bibliometrics: Insights from the scientific community, Social Epistemology Review and Reply Collective, vol. 3, no. 11, 2014)
72
72 “Higher education’s silent killer” (Mark Spooner) New public management The audit culture “As a direct result of successive governments’ chronic underfunding of post-secondary education, the traditional university is being transformed away from an accessible institution dedicated to fostering critical, creative, and engaged citizens while generating public-interest research, to an entrepreneurial training centre churning out atomized workers and corporate-directed “R&D.”
73
73 The new catchwords
74
74 The new catchwords
75
75 “Higher education’s silent killer” (Mark Spooner) A complex incentive scheme was introduced, with the collaboration of the universities, to simulate market competition but in reality it looked more like Soviet planning. Just as the Soviet planners had to decide how to measure the output of their factories, how to develop measures of plan fulfilment, so now universities have to develop elaborate indices of output, KPIs (key performance indicators), reducing research to publications, and publications to refereed journals, and refereed journals to their impact factors. Just as Soviet planning produced absurd distortions, heating that could not be switched off, shoes that were supposed to suit everyone, tractors that were too heavy because targets were in tons, or glass that was too thick because targets were in volume, so now the monitoring of higher education is replete with parallel distortions that obstruct production (research), dissemination (publication) and transmission (teaching) of knowledge.
76
76 Publish or perish
Science today is tormented by perverse incentives:
- Researchers judge one another not by the quality of their science — who has time to read all that? — but by the pedigree of their journal publications.
- High-profile journals pursue flashy results, many of which won’t pan out on further scrutiny.
- Universities reward researchers on those publication records.
- Financing agencies, reliant on peer review, direct their grant money back toward those same winners.
- Graduate students, dependent on their advisers and neglected by their universities, receive minimal, ad hoc training on proper experimental design, believing the system of rewards is how it always has been and how it always will be.
Francis Collins, Amid a sea of false findings. The NIH tries reform, Chronicle of Higher Education, 2015
77
77 Call for Papers for Philosophy and Technology’s special issue: Toward a Philosophy of Impact There was a time when serendipity played a central role in knowledge policy. Scientific advancement was viewed as essential for social progress, but this was paired with the assumption that it was generally impossible to steer research directly toward desired outcomes. Attempts to guide the course of research or predict its societal impacts were seen as impeding the advancement of science and thus of social welfare. Driven in part by budgetary constraints, and in part by ideology, the age of serendipity is being eclipsed by the age of accountability. Society increasingly requires academics to give an account of the value of their research. The ‘audit culture’ now permeates the university from STEM (science, technology, engineering, and math) through HASS (humanities, arts, and social sciences). Academics are being asked to consider not just how their work influences their disciplines, but also other disciplines and society more generally.
78
78 Against the ideology of evaluation. How ANVUR is killing the university
A model of firm management based on the principles of competitiveness and customer satisfaction (the market).
The catchwords: competitiveness, excellence, meritocracy.
The “evaluative state” as the “minimum state”, in which the government gives up its political responsibility, avoids the democratic debate in search of consensus, and rests on the “automatic pilot” of techno-administrative control.
D. Borrelli, Contro l’ideologia della valutazione, 2015
79
79 Against the ideology of evaluation. How ANVUR is killing the university
“ANVUR is much more than an administrative branch. It is the outcome of a cultural and political project aimed at reducing the range of alternatives and hampering pluralism.”
D. Borrelli, Contro l’ideologia della valutazione, 2015
80
80 How to use evaluation effectively?
- Respect the internal logic and rules of the scientific community
- Pro-active evaluation, not punitive evaluation
- Keep the national S&T system anchored to the international system
- Reduce/avoid negative unintended effects (opportunism, etc.)
- Don’t ask science what science can’t deliver (e.g. jobs, competitiveness)
- Evaluation as dynamite: handle with care
81
81 Some concluding remarks
- S&T systems are under pressure
- Evaluation as a tool for the legitimation of R&D and higher education
- Evaluation has become a key policy instrument
- The ideology behind R&D evaluation
- Evaluation exercises heavily criticised from a methodological point of view
- Impact on the scientific community and on researchers’ behaviour (“when you measure a system, you change the system”)
- Evaluation is necessary … but too much evaluation is harmful
- Use and misuse of R&D evaluation (concentration in “excellent” institutions or “spreading” the resources?)
- Evaluation is expensive
- Evaluation in a period of shrinking resources
82
82 Thank you for your attention