Data and text mining workshop: The role of crowdsourcing. Anna Noel-Storr. Wellcome Trust, London, Friday 6th March 2015.

Presentation transcript:

Data and text mining workshop: The role of crowdsourcing. Anna Noel-Storr. Wellcome Trust, London, Friday 6th March 2015

What is crowdsourcing? “…the practice of obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees…” Image credit: DesignCareer

What is crowdsourcing? Brabham’s problem-focused crowdsourcing typology, 4 types: (1) Knowledge discovery and management; (2) Broadcast search; (3) Peer-vetted creative production; (4) Distributed human intelligence tasking.

Micro-tasking: process. Breaking down a large corpus of data into smaller units and distributing those units to a large online crowd: “the distribution of small parts of a problem”.
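The splitting step described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `make_microtasks`, the unit size and the placeholder citation strings are assumptions, not details from the talk.

```python
from typing import Iterator, List, Sequence


def make_microtasks(records: Sequence[str], unit_size: int) -> Iterator[List[str]]:
    """Yield consecutive units of at most `unit_size` records."""
    if unit_size < 1:
        raise ValueError("unit_size must be at least 1")
    for start in range(0, len(records), unit_size):
        yield list(records[start:start + unit_size])


# Hypothetical corpus of ten citations, split into units of four.
citations = [f"citation-{i}" for i in range(1, 11)]
batches = list(make_microtasks(citations, unit_size=4))

for number, batch in enumerate(batches, start=1):
    print(f"unit {number}: {len(batch)} records")
```

Each unit could then be handed to a different member of the crowd; how units are actually assigned is not covered by the slide.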

Human computation. Humans remain better than machines at certain tasks, e.g. identifying pizza toppings from a picture of a pizza; e.g. “preventing obesity without eating like a rabbit”.ti. is autotagged: Animal study.
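The rabbit example can be made concrete with a deliberately naive keyword tagger. The slide does not say how the automatic tag was produced, so the term list and the function `naive_autotag` below are hypothetical; the sketch only shows how matching on words alone can mislabel a title that a human reader would classify correctly at a glance.

```python
import re
from typing import Optional

# Hypothetical trigger words; not taken from any real tagging system.
ANIMAL_TERMS = {"rabbit", "rat", "mouse", "mice"}


def naive_autotag(title: str) -> Optional[str]:
    """Return 'Animal study' if any trigger word appears in the title."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    return "Animal study" if words & ANIMAL_TERMS else None


title = "Preventing obesity without eating like a rabbit"
tag = naive_autotag(title)
print(tag)  # the keyword match fires although the title is not about an animal study
```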

Tools and platforms What platforms and tools exist and how do they work? Image credit: ThinkStock

The Zooniverse “each project uses the efforts and ability of volunteers to help scientists and researchers deal with the flood of data that confronts them”

Classification and annotation: Galaxy Zoo; Operation War Diary

Health-related evidence production: trial identification. Can we use crowdsourcing to identify the evidence in a more timely way? - Known pressure point within review production - Between 2,000 and 5,000 citations per new review, but can be much more - A not much loved task

The Embase project. Step 1: run a very sensitive search in the largest biomedical database for studies. Step 2: use a crowd to screen thousands of search results from Embase and feed the identified reports of RCTs into CENTRAL (Cochrane’s Central Register of Controlled Trials). How will the crowd do this? [Diagram labels: Embase auto, Embase Crowd, CENTRAL]

The screening tool: Three choices; You are not alone! (and you can’t go back); Progress bar; Yellow highlights to indicate a likely RCT; Red highlights

The Embase project: recruitment - people have signed up to screen citations in 12 months - 110,000+ citations have been collectively screened - 4,000 RCTs/q-RCTs identified by the crowd

Why do people do it? Made it very easy to participate (and equally easy to stop!); Gain experience (bulk up the CV); Provide feedback, both to the individual and to the community; Wanting to do something to contribute (healthcare is a strong hook); (people are more likely to come back)

How accurate is the crowd? [Flow diagram labels: RCT, Reject, Unsure; CENTRAL, Bin, Resolver; 5%]

Crowd accuracy. Index test: the crowd. Reference standard: the information specialist(s).
Validation 1: TP 1565, FP 9, FN 2, TN 2888. Sensitivity: 99.9%; Specificity: 99.7%. Enriched sample; blinded to crowd decision; dual independent screeners as reference standard.
Validation 2: TP 415, FP 5, FN 1, TN 2649. Sensitivity: 99.8%; Specificity: 99.8%. Enriched sample; blinded to crowd decision; single independent expert screener (me!) as reference standard; possibility of incorporation bias.
Individual screener accuracy is also carefully monitored.
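The reported percentages can be checked from the counts on this slide. The sketch below uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). It assumes that the first set of counts belongs to Validation 1 and the second to Validation 2; the dictionary and function names are illustrative.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of true RCT reports that the crowd identified."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of non-RCT records that the crowd rejected."""
    return tn / (tn + fp)


# Counts as given on the slide (assignment to validations is an assumption).
validations = {
    "Validation 1": {"tp": 1565, "fp": 9, "fn": 2, "tn": 2888},
    "Validation 2": {"tp": 415, "fp": 5, "fn": 1, "tn": 2649},
}

results = {
    name: (
        round(100 * sensitivity(c["tp"], c["fn"]), 1),
        round(100 * specificity(c["tn"], c["fp"]), 1),
    )
    for name, c in validations.items()
}

for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity {sens}%, specificity {spec}%")
```

Under that assumption the computed values agree with the figures on the slide: 99.9% and 99.7% for the first validation, 99.8% and 99.8% for the second.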

How fast is the crowd? Length of time to screen one month’s worth of records. [Chart: number of weeks at Jan 2014, Jul 2014 and Jan …; values: … weeks, 5 weeks, 2 weeks] More screeners, and more screeners screening more quickly.

More of the same, and more tasks. As the crowd becomes more efficient, we plan to do two things: 1. Increase the databases we search: feed in more citations from other databases. 2. Offer other ‘micro-tasks’: screen, annotate, appraise, e.g. is the healthcare condition Alzheimer’s disease? Y, N, Unsure. And in these tasks the machine plays a vital and complementary role… [Diagram labels: Bin, Y, N]

Perfect partnership: machine-driven probability + collective human decision-making. It’s not one or the other; the ideal is both.

In summary, crowdsourcing: Effective method in large-scale study identification; Identify more studies, more quickly; No compromise on quality or accuracy; Offers meaningful ways to contribute; Feasible to recruit a crowd; Highly functional tool; Complements data and text mining; And enables the move towards the living review.