BabyTalk: Generating English Summaries of Clinical Data

BabyTalk: Generating English Summaries of Clinical Data Ehud Reiter Univ of Aberdeen, CS Dept

Structure Background: data-to-text Babytalk project Results of first evaluation Current work

What is data-to-text Goal: generate English summaries of non-linguistic data Numerical weather predictions Medical records Statistics Etc

Simple Example: Weather Forecasts Input: numerical weather predictions From supercomputer running a numerical weather simulation Output: textual weather forecast We’ve developed several systems Two used commercially (oil rig, road gritting) Users prefer some gen texts to human texts! Demo of pollen system on our webpage So have others (FoG, MultiMeteo, …)

Pollen forecasts Grass pollen levels for Tuesday have decreased from the high levels of yesterday with values of around 4 to 5 across most parts of the country. However, in South Eastern areas, pollen levels will be high with values of 6.

Other data-text apps Medical: to-be-discussed Assistive technology: help blind people access statistical data Financial: summarise stock-market data Education: Summarise assessment results, help write stories Engineering: Sum. gas-turbine data Etc

Why is data-to-text useful The world is drowning in data NLP researchers talk about problems of too much text, but data problems are worse Texts are at least read by someone (writer) Most data is automatically collected and never looked at by a human

Data overload Sensor recording 2 bytes/second: 170KB/day, 63MB/year Millions of sensors in hospitals, jet engines, … Simulations Weather: 30MB for one day in one UK county, from one model Climate models: petabytes of data Too much data, need better tools for utilising!
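The per-sensor figures above follow from simple arithmetic. A minimal check, assuming decimal units (1 KB = 1000 bytes) and a 365-day year; the function name is only illustrative:

```python
# Back-of-envelope check of the per-sensor data volumes quoted on the slide.
BYTES_PER_SECOND = 2             # rate quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
DAYS_PER_YEAR = 365              # assumption: non-leap year


def sensor_volume(bytes_per_second: int = BYTES_PER_SECOND) -> tuple[int, int]:
    """Return (bytes per day, bytes per year) for one continuously recording sensor."""
    per_day = bytes_per_second * SECONDS_PER_DAY
    per_year = per_day * DAYS_PER_YEAR
    return per_day, per_year


per_day, per_year = sensor_volume()
print(f"{per_day / 1e3:.1f} KB/day, {per_year / 1e6:.1f} MB/year")
```

The result is roughly 170KB per day and 63MB per year for a single sensor, which is why millions of sensors quickly become unmanageable.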

Decision Support Data often used for decision support Medical: help doctors make decisions Weather: helps staff on offshore oil rigs plan their operations Engineering: help plan maintenance Etc Often under time pressure Make a decision in 3 min, here is 30MB of data to help you

Using data for decision support Alarming Trigger alarm if value exceeds threshold Or other such simple rule Works, doesn’t get full value from data Visualisation Show data to experts visually People like this, unclear how much it helps, especially when massive amount of data
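The alarming approach can be sketched as a single threshold rule. This is a minimal illustration only: the `Alarm` class, the function name, and the numeric limit used in the usage note are assumptions, not clinical values or part of any system described here.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    channel: str
    time_s: float
    value: float


def threshold_alarms(samples, channel, low=None, high=None):
    """Return an Alarm for each (time, value) sample outside the [low, high] range.

    A limit set to None is not checked.
    """
    alarms = []
    for time_s, value in samples:
        too_low = low is not None and value < low
        too_high = high is not None and value > high
        if too_low or too_high:
            alarms.append(Alarm(channel, time_s, value))
    return alarms
```

For example, `threshold_alarms([(0, 140), (1, 90), (2, 150)], "HR", low=100)` flags only the sample at time 1. The rule reacts to each value in isolation, which is the sense in which it "doesn't get full value from data": it says nothing about trends, causes, or related events.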

Using data for decision support Knowledge-based systems Feed data into an expert system which makes recommendations based on it Can work in some contexts, but problems Domain experts dislike being told what to do Often key data not available to KBS Can be brittle, fragile

Data-text for decision support Idea: use KBS, NLP tech to generate a short text summary of a data set Intermediate between KBS and visualisation Use domain reasoning to highlight key info, infer causal links, add background knowledge But stick to describing data, don’t tell experts what to do!

Data-text for decision support vs alarms: deeper info vs visualisation Just key facts, not everything Supplemented with causal links, etc vs KBS More acceptable to users More robust, since not useless if missing some key data or knowledge

Data-text for decision support Above is still somewhat speculative But people in many domains are interested in exploring the concept to see if it works Esp since current situation is so bad! Of course other uses of data-to-text Assistive technology, education

Language and World How does language relate to the world? Data-to-text is a great way of exploring this The real reason I got into this…

BabyTalk Goal: Summarise clinical data about premature babies in neonatal ICU Input: sensor data; records of actions/observations by medical staff Output: multi-para texts, summarise BT45: 45 mins data, for doctors (completed) BT-Nurse: 12 hrs data, for nurses BT-Family: 24 hrs data, for parents BT-Clan: 24 hrs data, for other friends, family BT-Doc: several hrs data, for doctors

Neonatal ICU

Baby Monitoring SpO2 (SO,HS) ECG (HR) Peripheral Temperature (TP) Core Temperature (TC) Transcutaneous Probe (CO,OX) Arterial Line (Blood Pressure)

Input: Sensor Data

Input: Action Records
FullDescriptor                               Time
SETTING;VENTILATOR;FiO2 (36%)                10.30
MEDICATION;Morphine                          10.44
ACTION;CARE;TURN/CHANGE POSITION;SUPINE      10.46-10.47
ACTION;RESPIRATION;HAND-BAG BABY             10.47-10.51
SETTING;VENTILATOR;FiO2 (60%)                10.47
ACTION;RESPIRATION;INTUBATE                  10.51-10.52
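Records like these have a semicolon-separated descriptor and either a single time or a start-end range. A minimal parsing sketch, assuming exactly that layout; the `ActionRecord` type and `parse_record` function are hypothetical names, not part of BabyTalk:

```python
from typing import NamedTuple


class ActionRecord(NamedTuple):
    path: tuple[str, ...]  # e.g. ("ACTION", "RESPIRATION", "INTUBATE")
    start: str             # time as written in the record, e.g. "10.51"
    end: str               # same as start for single-time records


def parse_record(descriptor: str, time: str) -> ActionRecord:
    """Split a FullDescriptor on ';' and a Time field on '-'."""
    path = tuple(part.strip() for part in descriptor.split(";"))
    start, _, end = time.partition("-")
    return ActionRecord(path, start.strip(), (end or start).strip())


record = parse_record("ACTION;RESPIRATION;INTUBATE", "10.51-10.52")
print(record.path[-1], record.start, record.end)
```

Keeping the descriptor as a path preserves the hierarchy (category, subcategory, specific action), which later stages can use to decide how an event should be interpreted.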

BT45 texts
Human corpus text: At 1046 the baby is turned for re-intubation and re-intubation is complete by 1100 the baby being bagged with 60% oxygen between tubes. During the re-intubation there have been some significant bradycardias down to 60/min, but the sats have remained OK. The mean BP has varied between 23 and 56, but has now settled at 30. The central temperature has fallen to 36.1°C and the peripheral temperature to 33.7°C. The baby has needed up to 80% oxygen to keep the sats up.
Computer-generated text: By 11:00 the baby had been hand-bagged a number of times causing 2 successive bradycardias. She was successfully re-intubated after 2 attempts. The baby was sucked out twice. At 11:02 FIO2 was raised to 79%.

Babytalk architecture Signal analysis: patterns, trends Data interpretation: based on medical knowledge (like expert sys) Doc planning: select and structure events to be mentioned Microplanning: choose words, syntactic structures, referring exp Realisation: generate actual text
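The five stages form a pipeline in which each stage consumes the previous stage's output. The sketch below shows only that shape: every stage body is a toy placeholder invented for illustration (including the heart-rate cutoff of 100), not BabyTalk code.

```python
from functools import reduce
from typing import Any, Callable

Stage = Callable[[Any], Any]


def run_pipeline(stages: list[Stage], data: Any) -> Any:
    """Feed the output of each stage into the next, in order."""
    return reduce(lambda value, stage: stage(value), stages, data)


def signal_analysis(hr):            # raw samples -> low-level patterns
    return [("spike_down", i) for i, v in enumerate(hr) if v < 100]


def data_interpretation(patterns):  # patterns -> domain events
    return [{"type": "bradycardia", "time": i} for _, i in patterns]


def document_planning(events):      # select and order events
    return sorted(events, key=lambda e: e["time"])


def microplanning(plan):            # choose words and structures
    return [("a bradycardia", f"at sample {e['time']}") for e in plan]


def realisation(specs):             # produce the final text
    return " ".join(f"There was {what} {when}." for what, when in specs)


STAGES = [signal_analysis, data_interpretation, document_planning,
          microplanning, realisation]
print(run_pipeline(STAGES, [140, 138, 80, 141]))
```

Because each stage depends only on its input, a stage can be replaced or improved without touching the others.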

Signal Analysis Detect trends, patterns, events, etc Blood oxygen levels increasing Downward spike in heart rate Detect artefacts Changes due to sensor problems Plenty of algorithms exist for this Will not further discuss here
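As a rough illustration of the two tasks named above, the following sketch flags downward spikes against a rolling median and treats physiologically implausible readings as artefacts. It is a deliberately naive stand-in for the many existing algorithms; the window size, drop size, and plausible range are assumptions chosen for the example.

```python
from statistics import median


def downward_spikes(values, window=5, drop=30.0):
    """Return indices whose value is more than `drop` below the median
    of the preceding `window` samples."""
    spikes = []
    for i in range(window, len(values)):
        baseline = median(values[i - window:i])
        if baseline - values[i] > drop:
            spikes.append(i)
    return spikes


def is_artefact(value, plausible=(20.0, 300.0)):
    """Flag readings outside a plausible range as likely sensor problems."""
    low, high = plausible
    return not (low <= value <= high)


hr = [140, 142, 139, 141, 140, 80, 141, 0]
candidates = downward_spikes(hr)
genuine = [i for i in candidates if not is_artefact(hr[i])]
print(candidates, genuine)
```

In this toy series the reading of 0 looks like a spike but is filtered out as an artefact, leaving only the drop to 80 as a genuine event.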

Data Abstraction Detect higher-level events in the data Sequence of bradycardias (downward spikes in HR) Determine medical importance Bradycardia more important if simultaneous desaturation (downward spike in SO) Medical KBS
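The importance rule above (a bradycardia matters more when a desaturation happens at the same time) amounts to an interval-overlap test. A minimal sketch, with hypothetical names and an arbitrary scoring scale:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str
    start: float  # seconds from start of recording
    end: float


def overlaps(a: Event, b: Event) -> bool:
    """True if the two closed time intervals share at least one instant."""
    return a.start <= b.end and b.start <= a.end


def importance(brady: Event, desaturations, base=1, boost=2) -> int:
    """Score a bradycardia, adding `boost` if any desaturation overlaps it."""
    simultaneous = any(overlaps(brady, d) for d in desaturations)
    return base + boost if simultaneous else base


brady = Event("bradycardia", 100, 110)
print(importance(brady, [Event("desaturation", 105, 120)]))
print(importance(brady, [Event("desaturation", 200, 220)]))
```

A real medical knowledge base would encode many such rules; the point here is only that importance depends on relations between events, not on single readings.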

Data Abs: Links Between Events Infer links between events Blood O2 falls, therefore O2 level in incubator is increased HR up because baby is being handled Morphine given as part of the intubation procedure Very important: much of the value added by the text Helps readers build good mental model of what is happening to the baby

Document Planning First NLP stage Decide what events to mention Decide how these are ordered and organised

Content Determination First approach: Include most medically important events Also include moderately important events which are linked to very important events Doesn’t always work
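The first approach can be written as a two-step selection rule. In this sketch the importance scores, the two cutoffs, and the event names are invented for illustration; only the rule itself comes from the slide.

```python
def select_events(importance, links, high=8, moderate=5):
    """Pick events to mention.

    importance: dict mapping event id -> numeric score
    links: set of frozenset({a, b}) pairs marking linked events
    Keeps every event scoring at least `high`, plus events scoring at
    least `moderate` that are linked to one of those.
    """
    very_important = {e for e, score in importance.items() if score >= high}
    selected = set(very_important)
    for event, score in importance.items():
        if moderate <= score < high and any(
            frozenset((event, v)) in links for v in very_important
        ):
            selected.add(event)
    return selected


scores = {"intubation": 9, "morphine": 6, "nappy_change": 6, "temp_drift": 2}
linked = {frozenset(("morphine", "intubation"))}
print(sorted(select_events(scores, linked)))
```

Here the morphine event is kept because it is linked to the intubation, while an equally scored but unlinked event is dropped. The rule considers each event's score and links only, so it has no notion of whether the resulting text still reads as a continuous account.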

Problem: Continuity Omitting intermediate events confuses readers Example: TcPO2 suddenly decreased to 8.1. SaO2 increased to 92. TcPO2 suddenly decreased to 9.3 There is a gradual rise in TcPO2 between the sudden falls This is less important medically But important for reader’s comprehension

Document Structure How do we order/group events By time By medical importance By body subsystem (eg, respiration) Initially focused on time, but users want more emphasis on subsystem Eg, first a “scene” about respiration, then a “scene” about thermoregulation Not constant shifting between two
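Grouping by body subsystem while keeping time order inside each group can be sketched as follows. The event tuples and the choice to order scenes by their earliest event are assumptions made for the example.

```python
def plan_scenes(events):
    """Group events into one "scene" per subsystem.

    events: iterable of (subsystem, time, text) tuples
    Returns a list of (subsystem, [texts]) pairs. Texts are in time order
    within a scene, and scenes are ordered by their earliest event.
    """
    scenes = {}
    for subsystem, _time, text in sorted(events, key=lambda e: e[1]):
        # dicts keep insertion order, so the first event seen fixes scene order
        scenes.setdefault(subsystem, []).append(text)
    return list(scenes.items())


events = [
    ("respiration", 10.47, "hand-bagged"),
    ("thermoregulation", 10.50, "core temperature fell"),
    ("respiration", 10.51, "intubated"),
]
for subsystem, texts in plan_scenes(events):
    print(subsystem, texts)
```

A purely time-ordered plan would interleave the two subsystems; grouping keeps each scene together, avoiding the constant shifting that users objected to.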

Doc Planning: Narrative High-level analysis: need to do a better job of generating a “story” from the data Link events together Include events needed for story progression even if not important “Scene” structure Qualitative observation by users

Microplanning Second NLP stage Choose words and syntactic structure to express information Aggregation Reference

Challenge: Time Need to communicate temporal info Enough so that readers can interpret the data Not too much, text becomes unreadable Imagine story with “At 10.14 John left home. At 10.28 he met Mary in the pub. At 10.39…”

Tenses Use Reichenbach model Speech time: time of report being read Event time: time of event being described Reference time: determined using a salience model Similar to resolving anaphoric reference Usually worked, sometimes failed Need better model for reference time
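In Reichenbach's model, tense follows from the relative order of event time (E), reference time (R), and speech time (S). The sketch below covers only a few common English forms and takes all three times as given; how BabyTalk actually chose the reference time (its salience model) is not shown.

```python
def choose_tense(event: float, reference: float, speech: float) -> str:
    """Map the ordering of E, R and S to an English tense label.

    Only a subset of Reichenbach's combinations is covered; anything
    else returns "unsupported".
    """
    def relation(a: float, b: float) -> str:
        return "before" if a < b else "after" if a > b else "same"

    table = {
        ("before", "before"): "past perfect",     # E < R < S
        ("same", "before"): "simple past",        # E = R < S
        ("before", "same"): "present perfect",    # E < R = S
        ("same", "same"): "simple present",       # E = R = S
        ("same", "after"): "simple future",       # S < R = E
    }
    key = (relation(event, reference), relation(reference, speech))
    return table.get(key, "unsupported")


# Hand-bagging (10:47) described relative to 11:00, in a report read at 12:00.
print(choose_tense(event=1047, reference=1100, speech=1200))
```

With the event before the reference time and the reference time before the speech time, the rule selects the past perfect, as in "By 11:00 the baby had been hand-bagged". The hard part is picking the reference time, since the mapping itself is mechanical.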

What does event time mean? Sometimes explicit time given for event Supposed to be start time of event, sometimes misinterpreted Ex: “After three attempts, at 13.53 a peripheral venous line was inserted successfully.” 13.53 refers to time of first (failed) attempt Start of LINE-INSERT-ATTEMPTS event Readers interpret as time of final (successful) attempt Need better linguistic model of time Linguistic temporal ontology (Moens and Steedman)?

Lexical Choice Need mechanism to map domain events (instances in a Protégé ontology) to linguistic structures Use JESS rules Lexical info from Verbnet, NIH lexicon Engineering challenge Relate to Sheffield work on NLG/ontologies

Vague language Human texts are full of vague language Ex: There is a momentary bradycardia What does “momentary” mean? Our models of this are very crude, need to be improved!

Realisation Last NLG stage Generate actual text, once choices made Use Aberdeen simplenlg package Will not further discuss here

BT45 Evaluation Showed 35 medical professionals 24 scenarios in 3 conditions (8 of each) Visualisation of medical data Textual summary (manually written) Textual summary (from BT45) Asked to make a treatment decision Limited to 3 minutes Measured correctness (against gold standard) Off-ward, using historical data So no other knowledge about baby

Free-text comments Comments were not solicited, but were recorded if made Most important were Better layout (eg, bullet lists) Continuity (as mentioned before)

Decision-Support results No sig difference in time taken Avg decision-quality (scale -1 to 1) Human texts: 0.39 Computer texts: 0.34 Visualisation: 0.33 Human sig better than comp, visual No sig diff comp, visual

Results by subject type Analysis by type of subjects Human texts especially good for junior nurses (ie, least experienced subjects)

Results by scenario Each scenario had a main target action 8 different ones Computer texts as good as human texts for five of these; worse for three No action, manage temperature, monitor equipment These relate to specific problems in the system, which can be fixed

Target Actions with Poor Perf No action: Needs high-level summary, not blow-by-blow event description Manage Temperature: Two temp channels, need to describe together Monitor equipment: Need to mention (not ignore) sensor artefacts

Summary Good performance with human texts shows textual presentation is effective Also seen in previous study Babytalk as good as visualisation, could make better by addressing above issues Even now giving users BabyTalk text as supplement to visualisations could help

Current Work BT-Nurse: shift summaries for nurses Use live data from current babies Evaluate on ward, using babies that subjects (nurses) actually looking after Focus on info relevant to nurse shift planning, not real-time decision support Longer time period (12 hrs) Need more sensor abstraction Longer texts (multi-page)

Current Work BT-Family: information for parents Estimate how stressed parents are, use this to control content, phrasing High stress means less content Relate to Sheffield work on personality?? Express information in language which parents can understand, not medicalese

Current Work BT-Clan: Information for friends, family Social networking perspective: encourage useful support, minimise hassle of dealing with numerous inquiries Parents decide what to tell people Intentional deceit: if granny is frail, don’t tell her bad news Info about parents as well as baby

Research agenda Detecting complex events in the data Integration with medical guidelines Better use of vague language Better stories Role of text in interactive multimodal information presentation system Try in domain of assisted living