Presentation on theme: "Electronic health records: interdisciplinary research and opportunities Prof Jackie Cassell, Dr Elizabeth Ford Division of Primary Care and Public Health."— Presentation transcript:
Electronic health records: interdisciplinary research and opportunities Prof Jackie Cassell, Dr Elizabeth Ford Division of Primary Care and Public Health Brighton and Sussex Medical School
2 Outline Introduction to CPRD Introduction to Farr Institute, £40M consortium Previous Wellcome Trust funded programme of work Current work on free text Potential for collaboration
4 Structure of CPRD database Consultation data Information from each consultation stored in different record tables Clinical records Read code for symptoms and /or diagnosis Test records Read code for test type +/- results Referral records Read code for symptoms / diagnosis plus specialty Prescription records Code for prescribed product Patient data Age, sex and (subsample only) deprivation score Practice data Deprivation score, urban /rural location and NHS region Linkages
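To make the table layout on this slide concrete, here is a minimal sketch of the record types as Python dataclasses. All class and field names are hypothetical illustrations chosen for this example; they are not the actual CPRD schema, and linkages are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClinicalRecord:
    read_code: str  # Read code for symptoms and/or diagnosis


@dataclass
class LabTestRecord:
    read_code: str                # Read code for the test type
    result: Optional[str] = None  # results are not always recorded


@dataclass
class ReferralRecord:
    read_code: str  # Read code for symptoms / diagnosis
    specialty: str  # specialty referred to


@dataclass
class PrescriptionRecord:
    product_code: str  # code for the prescribed product


@dataclass
class Consultation:
    # Each consultation's information is split across separate record tables.
    clinical: List[ClinicalRecord] = field(default_factory=list)
    tests: List[LabTestRecord] = field(default_factory=list)
    referrals: List[ReferralRecord] = field(default_factory=list)
    prescriptions: List[PrescriptionRecord] = field(default_factory=list)


@dataclass
class Patient:
    age: int
    sex: str
    deprivation_score: Optional[float] = None  # available for a subsample only
    consultations: List[Consultation] = field(default_factory=list)


@dataclass
class Practice:
    deprivation_score: float
    urban_rural: str  # urban / rural location
    nhs_region: str
```

A patient with no deprivation score and a single consultation could then be built as `Patient(age=63, sex="F", consultations=[Consultation()])`.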
5 Examples of Read codes
Level 1 Level 2 | Type | Description
Iv | C | Emergency appendicectomy
M18z.12F0C | C | Itch
13HV4Y7Dlj | C | Seven year itch - marital
13HV3000mt | C | Spouse committed adultery
13HV3140mx | C | Oil rig wives syndrome
T5460Y7493 | O | Sucked into aircraft jet, without accident to aircraft, occupant of spacecraft injured
T5468Y748v | O | Sucked into aircraft jet, without accident to aircraft, member of ground crew or airline employee injured
6 Diagnosis (Event no / Free text) pvte. Arranged CT scan, aiming for a laparotomy on 9th March. ~~~~~ ~~~~~~~~~ abdomen and pelvis with contrast - there is a large lobulated mixed density mass arising out of the pelvis. The appearances are those of an ovarian neoplasm. There is a large volume of ascites. No visible liver metastases. THe kidneys are not obstructed. In the left side of the middle abdomen at he lower pole of the left kidney there is a 3 x 2cm lesion in the anterior abdomen which looks like a solid mass of omentum ie omental metastases. there is no retroperitoneal lymphadenopathy. There is some dilatation of the left iliac vein secondary to mass effect of the tumour at the pelvic inlet, which may predispose the patient to a left leg DVT, and removal of pelvic mass and omentectomy Mr ~~~~~~~~ no vomiting. to start pre op chemo no doubt she has advanced carcinoma of the ovary Hoping to perform paracentesis tomorrow to get histological diagnosis. Taking films to ~~~~~~~~~~~ thinks treatment here would be cytotoxics in the first instance and possible surgery at a later date. ~~~~~~ remove staples at the Practice
7 Distress (Event no / Free text) not taken amit yet worried re side effects long chat re family etc going back over all the neg things that have happened in her life re amitriptyline and sleeping. Not much sleep on 10mg so suggested could go upto mg nocte. RV PRN worse no confidence thought of suicide no firm plan would not do it poor app poor sleep thinks worse on amit try paroxetine see 2wks private no given for ~~~ with pt and husband. Relating situation re: trying to move house, very stressful, thrown up lots of issues from the past (ex husband violent drinker etc) and now having panic attacks and afraid to leave house, also diff sleeping. Nil suicidal ideation. Plan: reassured SSRI worth continuing, will add in Trazodone for night time. RV PRN counselling appointment no longer needed will contact if she decides different ap
8 Administrative (Event no / Free text) appeal form Citizens Advice regarding incapacity benefit CITIZENS ADV BUREAU - INV MAX CHG awaiting TAH / OOPHORECTOMY TO HAVE CT SCAN L.F TESTS Warning: Last BP Record > 3 months old Warning: Last BP Record > 3 months old £ CAB op mane background retinopathy Warning: Last Smoking Status Record > 12 months old Warning: Last Smoking Status Record > 12 months old.
The ergonomics of electronic patient records A collaborative project Funded by the Wellcome Trust
12 What determines the balance between free text and coded data in primary care? What determines completeness of recording in primary care? How does free text coding variation affect data accessibility for users of electronic patient records (particularly health researchers)? How can we represent NLP-extracted free text to users (particularly health researchers)?
13 3 major strands, 2 clinical questions How much can automated free text processing contribute to changing estimates of rheumatoid arthritis prevalence in a practice population, and thus facilitate the delivery of active management? To what extent do early clinical manifestations of ovarian cancer appear in free text records alone, and how far could this impact on estimates of delay between first presentation and diagnosis?
14 What kind of questions are we trying to answer now?
15 Quality issues Almost all studies published use codes only False positives – case detecting algorithms False negatives - ? Machine Learning Free text
16 Ongoing or planned projects The onset of Rheumatoid Arthritis: How much information is hidden in free text? Applying astronomical data analysis to electronic primary care patient records to facilitate timely diagnosis of Dementia (submitted to STFC CLASP panel 2/12/14)
17 Rheumatoid Arthritis
18 85 men, median age 65 years (IQR 55-75y) 209 women, median age 63 years (50-74y) 34,738 events recorded during the study period 4,340 text events related to RA
19 Text: How much, Where & When? Around 10-15% of RA patients had free text for disease information when they did not have a code (e.g. tests, referrals, preliminary diagnoses). 25% of disease information was found under the top 3 codes: letter from specialist; patient reviewed; seen in rheumatology clinic. Specific disease information was found in text closer to the diagnosis (specific diagnosis, +ve test results; 1-2 mths); less specific info was found further away (pain, stiffness and swelling; 4 mths prior to diagnosis).
20 Syntactical Issues Vomiting || since waking this morning. No Haemoptysis. O/E - looks pale and unwell. Abdomen soft., bowel sounds normal. had a motion this mornining / normal Stop prednisolone and Indometacin. Double omeprazole and take gaviscon QDS. do HB / ESR / CRP to check on progress of arthritis symptoms of arthritis compeletely disapeared. no past history of indigestion Abbreviations Unpredictable punctuation Spelling mistakes Discourse connectives omitted (e.g. but, and, however)
21 Modelling the context: there is no signs of inflammation | Mother has diabetes | Symptoms suggest sarcoidosis | Previous history of painful joints | He denies any rheumatoid arthritis
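The phrases on this slide show why a condition name alone is not enough: the words before it decide whether it is negated, belongs to a relative, is historical, or is only suspected. Below is a minimal sketch of trigger-phrase context tagging. The trigger list, the label names and the `tag_context` function are assumptions made for this illustration, not the project's actual method; a usable system would need a much larger, validated trigger set.

```python
import re

# Hypothetical trigger phrases, grouped by the context they signal.
# Labels are checked in this order, so negation wins over the others.
CONTEXT_TRIGGERS = {
    "negated": [r"\bno signs of\b", r"\bdenies any\b", r"\bno\b"],
    "family_history": [r"\bmother has\b", r"\bfather has\b"],
    "past_history": [r"\bprevious history of\b"],
    "uncertain": [r"\bsymptoms suggest\b"],
}


def tag_context(phrase: str, condition: str) -> str:
    """Return a context label for `condition` as it is mentioned in `phrase`."""
    text = phrase.lower()
    position = text.find(condition.lower())
    if position == -1:
        return "not_mentioned"
    # Only the words that precede the condition are inspected for triggers.
    preceding = text[:position]
    for label, patterns in CONTEXT_TRIGGERS.items():
        if any(re.search(pattern, preceding) for pattern in patterns):
            return label
    return "affirmed"


examples = [
    ("there is no signs of inflammation", "inflammation"),
    ("Mother has diabetes", "diabetes"),
    ("Previous history of painful joints", "painful joints"),
    ("He denies any rheumatoid arthritis", "rheumatoid arthritis"),
    ("Symptoms suggest sarcoidosis", "sarcoidosis"),
]
for phrase, condition in examples:
    print(f"{condition}: {tag_context(phrase, condition)}")
```

Looking only at text before the condition keeps the sketch short, but it misses trailing cues such as "arthritis ruled out", which is one reason real clinical text needs richer models.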
22 Challenges for Natural Language Processing of medical text Extracting clinical information from breast pathology reports (76,333 reports): 124 ways of saying invasive ductal carcinoma; 95 ways of saying invasive lobular carcinoma; 33 types of negation; a potential 4000 ways of saying invasive ductal carcinoma was not present (Buckley et al., 2012, J Pathol Inform)
23 Dementia Great public health challenge of our era Leading cause of death in women (ONS, 2014) “Diagnosis gap” in general practice with under half of the expected numbers of patients with dementia diagnosed
24 How to find dementia patients without a code? [Timeline figure. Axis: Time. Time markers: 3-5 years prior to code; 12 months after code. Event labels: memory loss symptom, accidental fall, low mood, memory test, referral to memory clinic, dementia diagnosis code, dementia annual review.]
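One simple way to read this slide is as a rule: look for patients who have several of the indicator events but no diagnosis code. The sketch below does that over a flat event list. The `Event` type, the `flag_uncoded_candidates` function and the threshold of two distinct indicators are assumptions for illustration only; this is not the astronomical data analysis approach mentioned in the planned project.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set

# Indicator events taken from the slide's timeline labels.
INDICATORS = {
    "memory loss symptom",
    "accidental fall",
    "low mood",
    "memory test",
    "referral to memory clinic",
}
DIAGNOSIS = "dementia diagnosis code"


@dataclass
class Event:
    patient_id: str
    label: str


def flag_uncoded_candidates(events: Iterable[Event], min_indicators: int = 2) -> List[str]:
    """Return sorted IDs of patients with no diagnosis code but at least
    `min_indicators` distinct indicator events (threshold is hypothetical)."""
    indicators_seen: Dict[str, Set[str]] = {}
    coded: Set[str] = set()
    for event in events:
        if event.label == DIAGNOSIS:
            coded.add(event.patient_id)
        elif event.label in INDICATORS:
            indicators_seen.setdefault(event.patient_id, set()).add(event.label)
    return sorted(
        patient_id
        for patient_id, labels in indicators_seen.items()
        if patient_id not in coded and len(labels) >= min_indicators
    )


events = [
    Event("p1", "memory loss symptom"),
    Event("p1", "dementia diagnosis code"),
    Event("p2", "accidental fall"),
    Event("p2", "low mood"),
    Event("p3", "low mood"),
]
print(flag_uncoded_candidates(events))  # ['p2']
```

Here p1 is excluded because a code exists, and p3 because a single indicator falls below the assumed threshold. A count-based rule ignores event timing, which the slide's timeline suggests matters.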
25 In the pipeline… What influences GPs’ use of codes and free text when recording consultations? –Types of disease – including stigma –Demographics – including BME, sexuality and age –Availability of treatment – including clinical nihilism –Work place determinants – including financial and time constraints –System determinants – ergonomics of the computer system
26 Why do clinicians use codes? Benefits of Codes Ease of access, search & retrieval Targets & financial incentives Evidence of appropriate care provided Ease of abstraction for billing/patient review/referrals Coherence of format/terms between one record and another
27 Why do clinicians use free text? Benefits of Text Narrative is more engaging (next clinician) Allows nuance and expression Captures complexity, evolving circumstances and severity Allows for uncertainty in diagnosis Decisions more easily defensible in medico-legal situations Cognitive load reduced / time required to find code
28 In the pipeline… De-identification of Text Data –Can we produce an evidenced methodological framework to establish criteria for safe and standardised de-identification of medical text? –What standards of privacy protection for medical text data are acceptable to patients, users, and state authorities? –How can we assess these standards?
29 Where next Ethics Governance and reputation Policy History Culture and context Linguistics Machine learning Techniques for extracting content NLP
30 Our team Prof Jackie Cassell (Epidemiology) Dr Liz Ford (Epidemiology) Prof Helen Smith (Primary Care) Prof Kevin Davies (Rheumatology) Prof John Carroll (Informatics) Dr Rosemary Tate (Informatics) Dr Novi Quadrianto (Informatics) Our Partners CPRD Farr CIPHER UCL Research Department of Primary Care and Population Health