1 XML in Healthcare and the Semantic Web
Jonathan Borden, M.D.
Center for Brain and Cranial Diseases, St. Vincent Health System, Erie PA
Invited Expert, W3C Web Ontology Working Group
Chair, ASTM E31.28 Electronic Healthcare Records

2 The Goal
- Answer questions like:
- Of all the patients I operated on for brain tumors between , matching severity of pathology and matching clinical status and who have the P53 mutation, did PCV chemotherapy improve the cure rate at five years?

3 Healthcare: The current situation
- A disaster: 1.1 Trillion $/year in the USA
- 30-40% overhead
- mostly paper based
- highly proprietary commercial systems
- tens of thousands of people die each year due to poor information/errors
- Most of the information is rendered useless

4 Strategies
- Define open standards
- Capture information in an electronic form
- Reduce errors related to information
- Define distributed, web enabled, query models

5 Tactics
- XML, schemas, query model
- Semantic Web/URI graphs
- Data analysis based on actual population rather than small, potentially biased, samples
- Google for biomedical information

6 Why XML?
- Widely implemented with excellent open source tools
- Life of data is longer than life of application
- Data driven, platform independent
- Formal schema and query models

7 Reinventing medical informatics
- Get the data format right and the rest will follow
- Structured information has been the holy grail of medical informatics for the last 30+ years
- XML is the culmination of 30+ years of work in structured information
- Time to do something

8 XML Briefly
- Simplification of SGML … markup language for the web
- content

9 XML and Infosets
- James Steven Smith 3rd
- startElement(patient)
  - startElement(
    - startElement(given);characters(James);...
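The relationship between serialized XML and the event stream on this slide can be sketched with Python's standard `xml.sax` module. The sample document below is an assumption: the slide's markup did not survive, so the element names `name`, `family` and `suffix` are hypothetical, and only `patient` and `given` come from the slide.

```python
import xml.sax

# Hypothetical sample document. Only <patient> and <given> appear on the
# slide; <name>, <family> and <suffix> are assumed for illustration.
SAMPLE = (
    "<patient><name>"
    "<given>James</given><given>Steven</given>"
    "<family>Smith</family><suffix>3rd</suffix>"
    "</name></patient>"
)


class EventLogger(xml.sax.ContentHandler):
    """Record each SAX callback as a readable string."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(f"startElement({name})")

    def characters(self, content):
        # Skip whitespace-only text so the log stays compact.
        if content.strip():
            self.events.append(f"characters({content})")

    def endElement(self, name):
        self.events.append(f"endElement({name})")


def infoset_events(xml_text):
    """Parse the text and return the list of recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events
```

Calling `infoset_events(SAMPLE)` yields the same kind of `startElement(...)`, `characters(...)` sequence the slide lists. The point of the sketch is that an application can consume these events without caring how the document was serialized.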

10 Regular Expressions
- Pattern matching
- *TATA*
- bp ::= G | T | A | C
- tata ::= bp*, T, A, T, A, bp*
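As a minimal sketch, the two productions above translate directly into a regular expression in Python's standard `re` module. The function names are hypothetical and chosen for illustration only.

```python
import re

# bp   ::= G | T | A | C
# tata ::= bp*, T, A, T, A, bp*
TATA = re.compile(r"[GTAC]*TATA[GTAC]*")


def contains_tata(sequence):
    """True when the whole sequence matches the tata production."""
    return TATA.fullmatch(sequence) is not None


def tata_positions(sequence):
    """Start offsets of every TATA occurrence, including overlapping ones."""
    # A zero-width lookahead lets overlapping matches be reported.
    return [m.start() for m in re.finditer(r"(?=TATA)", sequence)]
```

`contains_tata` answers the yes/no question posed by `*TATA*`, while `tata_positions` shows where the motif sits, which is usually what a sequence analysis needs next.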

11 XML DTD

12 Tree Regular Expressions
element foo{
  element bar{
    attribute bop[int]
    element baz{xxx}
  }
  xxx
}

13 ASTM E2182/E2183
- XML DTDs for Healthcare
- Emphasize Human Readability
- Flexibility
- Openhealth reference implementation
- Compatible with HL7 CDA

14 ASTM Healthcare DTDs
- clinical.header
  - compatible with HL7 CDA
- clinical.body
  - specific to document type
  - discharge.summary etc.

15 ASTM E31.28 Clinical Header
ch.person.type =, id*, addr*
ch.organization.type =, id*, addr*
clinical.header = element clinical.header{
  ch.attrib, id*, version.number?, confidentiality.code*,
  patient.encounter?, authenticator*, legal.authenticator*,
  intended.recipient*, originator?, originating.organization?,
  transcriptionist?, provider+,*, patient, events?, codes?,
  related.document*
}

16 ASTM E31.28 Clinical Header
= element { ch.attrib, xlink.attrib?, (|, id*, addr*, type.code?, function?, date.time? }
provider = element provider{ ch.attrib,, function?}

17 ASTM E31.28 Clinical Header
patient.encounter = element patient.encounter{ ch.attrib, (id? & practice.setting? & date.time? & location) }
= & & gender?
patient = element patient { ch.attrib,xlink.attrib?, }

18 Encounter
- …

19 XML examples
- Ms.
- Susan
- Samantha
- Jones

20 XML examples
- …
- Dr. Amanda Smith

21 Using XML to generate reports
- Browser form
- ASTM E2182 XML format
- XSLT transform for display in browser
- XSL-FO transform for printable form (e.g. PDF)



24 ASTM Opnote: Header (1/3)
ENC
Operation
New England Medical Center
Dr. Jonathan Alan Borden M.D....

25 ASTM Opnote: Header (2/3)
… Washington Street Boston MA
Attending Surgeon...

26 ASTM Opnote: Header (3/3)
… John Q Doe Jr XXX.21

27 ASTM Opnote: Body
Right Frontal Brain Tumor
same, probable Astrocytoma
Right Frontal Craniotomy for Excision of Brain Tumor
GETA
The patient presents with severe headaches and blurred vision. An MRI demonstrates a large cystic irregularly shaped mass within the right frontal lobe.
The patient had application of the external fiducial markers and was brought down to the MRI suite where a head MRI was obtained using the frameless stereotactic (3D) protocol. The image set was transferred using the DICOM protocol
cc
Stable, extubated
SICU




31 How it works
Diagram labels: Browser, Apache, XSLT, Servlet engine, xml:db, RDF

32 Form generation
Form.xml, Defaults.xml, Formgen.xsl
XML + XSLT => XHTML

33 Workflow
- Form created
- Transform into ASTM XML format
- XHTML editing (opnote-edit.xsl)
- Sign finished product
- Render as XHTML for viewing, printing
- to Medical Records and Billing

34 Workflow
Diagram labels: generate, edit, sign, Billing, repository

35 Document analysis
- Like gene sequences, it turns out that …
- Medical documentation is highly repetitive
- With hot spots of unique information
- Schema defines template filled with values
- Easily expanded into HTML for human consumption
- Easily analyzed by software

36 Document analysis

37 Integrating binary formats
- MIME XMTP
- HL7 V2
- X12 EDI
- DICOM

38 Internet Telemedicine
- The OceanMed project, 1998
- Merchant vessel, access via satellite gateway
- Digital camera
- Web based physician access


40 XMTP Consult
36 year old male has itchy rash for 6 days
Hydrocortisone cream 1% to affected area t.i.d.
| reply

41 How it works
- Messages arrive in MIME format
- MIME SAX parser converts to XML by SAX events
- XMTP employs XML object model, *not necessarily* serialization format ->
- grove processing

42 XMTP
- From:
- To:
- Content-type: multipart/related; charset=iso
- startDocument()
  - startElement(MIME)
    - startElement(From)
    - endElement(From)
    - startElement(Content-Type, attribute(charset,iso )) characters(multipart/related)
    - endElement(Content-Type)
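The mapping from MIME headers to an XML tree can be sketched with Python's standard `email` and `xml.etree.ElementTree` modules. This is an illustration of the idea, not the XMTP implementation. The sample message, its placeholder addresses and the `Body` element name are all assumptions, and the sketch handles a single-part message only.

```python
from email import message_from_string
import xml.etree.ElementTree as ET

# Hypothetical message; the addresses are placeholders.
RAW = (
    "From: joe@example.org\n"
    "To: sue@example.org\n"
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "\n"
    "Hi Sue! See you in Boston, Joe\n"
)


def mime_to_xml(raw):
    """Map each header to a child of <MIME>; header parameters become attributes."""
    msg = message_from_string(raw)
    root = ET.Element("MIME")
    for name in msg.keys():
        elem = ET.SubElement(root, name)
        if name.lower() == "content-type":
            # The media type becomes element text, as on the slide.
            elem.text = msg.get_content_type()
            # get_params() lists the media type first; the rest are parameters.
            for key, value in msg.get_params()[1:]:
                elem.set(key, value)
        else:
            elem.text = msg[name]
    # "Body" is an assumed element name for the single-part payload.
    body = ET.SubElement(root, "Body")
    body.text = msg.get_payload()
    return root
```

`ET.tostring(mime_to_xml(RAW), encoding="unicode")` serializes the tree, but serialization is optional: downstream tools can work on the in-memory tree, in line with the object-model emphasis of the preceding slide.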

43 The XMTP/MIME grove
Content-type: text/plain
From:
To:
Hi Sue! See you in Boston, Joe
text/plain
Hi Sue! See you in Seattle, Joe

44 The HL7 Grove
- Non-XML syntax => XML Infoset
- MSH|PAT|Jones^James^Stephen^3rd|
startElement(patient)
  startElement(
    startElement(family) characters(Jones); endElement(family)
    …
  endElement(
endElement(patient)
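A minimal sketch of this idea: split the delimited message on `|` and `^`, then emit the same kind of event stream an XML parser would. Only `patient` and `family` are taken from the slide. The other component names are assumptions, and the intermediate wrapper element, whose name is cut off on the slide, is left out.

```python
# Assumed component names for the ^-separated name field; only "family"
# appears on the slide.
NAME_PARTS = ("family", "given", "middle", "suffix")


def segment_fields(segment):
    """Split a pipe-delimited segment into its fields."""
    return segment.split("|")


def name_field_events(field, wrapper="patient"):
    """Turn 'Jones^James^Stephen^3rd' into SAX-style event strings."""
    events = [f"startElement({wrapper})"]
    for tag, value in zip(NAME_PARTS, field.split("^")):
        if value:  # skip empty components
            events.append(f"startElement({tag})")
            events.append(f"characters({value})")
            events.append(f"endElement({tag})")
    events.append(f"endElement({wrapper})")
    return events
```

For the message on the slide, `name_field_events(segment_fields("MSH|PAT|Jones^James^Stephen^3rd|")[2])` produces an event list that any SAX-based consumer could process as if it had come from an XML document.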

45 Simple building blocks
- XML parsers
- XSLT transform engines
- HTTP clients and servers

46 From syntax to semantics
- Layer 1: syntax
  - XML defines syntactic constraints on text
  - other specs define syntactic constraints on binary data
- Layer 2: datatypes
  - integers define mapping from lexical space to value space
  - 10base10 -> 10, 10base2 -> 2
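The lexical-to-value mapping in Layer 2 can be shown in a few lines of Python. The function name is hypothetical. The built-in `int` does the actual work.

```python
def lexical_to_value(lexical, base):
    """Map a lexical form (a string of digits) to its integer value."""
    return int(lexical, base)


def same_value(lexical_a, lexical_b, base=10):
    """Two different lexical forms may denote the same value."""
    return lexical_to_value(lexical_a, base) == lexical_to_value(lexical_b, base)
```

The same characters `10` denote different values depending on the datatype's base, and distinct strings such as `10` and `010` can denote one value. This is why comparisons belong in the value space rather than on the raw text.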

47 The shape of information
syntax -> structure = semantics
…..TATA…..
Diagram labels: gene, tata, snp, Pattern matching, transform

48 Semantics
- Layer 3: hierarchy of classes
  - the set of individuals of a given datatype or object type defines a class
- Ontology: a description of a collection of classes, their properties and the relationships between them

49 Healthcare Ontology

50 RDF in Healthcare
positive
100
The brain demonstrates areas of PML including viral inclusion bodies

51 RDF is...
A standard syntax to represent (edge labeled) directed graphs in XML

52 DLG: Semantic Networks
Diagram labels: vertebrate, mammal, bird, canary, ostrich, heart, spine, hair, fly, wings, walk, doesnt fly, yellow, isa, has, can, freddie, hugo

53 Semantic Networks
- A way to represent natural language circa 1970s
- A format for organizing statements in a way that can be queried by computers

54 Semantic Networks
- Can freddy fly?
- Does hugo have wings?
- Does freddy have a spine?
- Of all the canaries, how many live in cages?
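Questions like these can be answered by walking `isa` links until a statement about the property is found. The sketch below is one reading of the flattened diagram, not the author's network: which node each `has` or `can` edge attaches to, and the placement of freddie under canary and hugo under ostrich, are assumptions.

```python
# Assumed isa edges (child -> parent), reconstructed for illustration.
ISA = {
    "mammal": "vertebrate",
    "bird": "vertebrate",
    "canary": "bird",
    "ostrich": "bird",
    "freddie": "canary",
    "hugo": "ostrich",
}

# Assumed statements: (node, relation, value) -> True or False.
FACTS = {
    ("vertebrate", "has", "heart"): True,
    ("vertebrate", "has", "spine"): True,
    ("mammal", "has", "hair"): True,
    ("bird", "has", "wings"): True,
    ("bird", "can", "fly"): True,
    ("ostrich", "can", "walk"): True,
    ("ostrich", "can", "fly"): False,  # exception overrides the bird default
}


def ask(node, relation, value):
    """Walk up isa links; the nearest statement wins. None means unknown."""
    while node is not None:
        if (node, relation, value) in FACTS:
            return FACTS[(node, relation, value)]
        node = ISA.get(node)
    return None
```

`ask("freddie", "can", "fly")` inherits the answer from bird, while `ask("hugo", "can", "fly")` stops at the ostrich exception. The last question on the slide (how many canaries live in cages) needs data the network does not hold, so a query for it returns `None`.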

55 RDF N-triples syntax
Subject predicate object.
ex:Freddy rdf:type ex:Canary.
ex:Canary rdfs:subClassOf ex:Bird.
ex:Freddy ex:color Yellow.
Diagram labels: Bird, Canary, yellow, isa, Freddie
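As a rough sketch, the three statements above can be held as Python tuples and queried by pattern, with `None` acting as a wildcard. The helper names are hypothetical. The subclass closure shown is a small hand-written illustration and does not use an RDF library.

```python
TRIPLES = [
    ("ex:Freddy", "rdf:type", "ex:Canary"),
    ("ex:Canary", "rdfs:subClassOf", "ex:Bird"),
    ("ex:Freddy", "ex:color", "Yellow"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every non-None position."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def types_of(triples, subject):
    """Direct rdf:type classes plus everything reachable via rdfs:subClassOf."""
    found = set()
    frontier = [o for _, _, o in match(triples, s=subject, p="rdf:type")]
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.add(cls)
            frontier.extend(
                o for _, _, o in match(triples, s=cls, p="rdfs:subClassOf")
            )
    return found
```

`types_of(TRIPLES, "ex:Freddy")` includes `ex:Bird` even though no triple states it directly, which is the kind of inference the `isa` arrows in the diagram stand for.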

56 RDF/XML syntax Yellow

57 RDF/XML syntax: typed Yellow

58 Semantic analysis
- Of all the patients I operated on for brain tumors between , matching severity of pathology and matching clinical status and who have the P53 mutation, did PCV chemotherapy improve the cure rate at five years?

59 Web Ontology Language (OWL)
- Problem (restated): "Tell me what wines I should buy to serve with each course of the following menu. And, by the way, I don't like Sauterne."
- OWL is a language for defining Web ontologies and their associated knowledge bases.

60 Ontologies
- Ontology is a term borrowed from philosophy that refers to the science of describing the kinds of entities in the world and how they are related. In OWL, an ontology is a set of definitions of classes and properties, and constraints on the way those classes and properties can be employed.

61 OWL
- includes:
  - taxonomic relations between classes
  - datatype properties: descriptions of attributes of elements of classes
  - object properties: descriptions of relations between elements of classes
- Datatype properties and object properties are collectively the properties of a class.

62 Simple Named Classes
class, subClassOf
Root classes: Every individual in the OWL world is a member of owl:Thing.
For the sample wines domain, we create three root classes: Winery, Region, and ConsumableThing.

63 Simple Named Classes class, subClassOf wine vin...

64 Defining individuals is identical to

65 Grapes

66 Simple properties zObject Properties

67 Property hierarchy...

68 Domain and range...

69 Restrictions

70 Vintages 1

71 Datatype properties dt;wineYear ::= integer > 1700

72 Properties of individuals

73 Ontology mapping
- sameClassAs
- sameIndividualAs
- samePropertyAs

74 Complex constructs
- Description Logic
  - unionOf
  - intersectionOf
  - complementOf
  - oneOf
  - disjointWith
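One way to build intuition for these constructors is to treat a class as the set of its known individuals. The sketch below does that with Python sets. It is a closed, enumerated approximation under an assumed universe `THING`, and the individual names are hypothetical. OWL itself reasons under an open-world assumption, so this illustrates the set-theoretic reading only and does not model an OWL reasoner.

```python
# Hypothetical universe of individuals, standing in for owl:Thing.
THING = frozenset({"a", "b", "c", "d"})


def one_of(*individuals):
    """Class defined by enumerating its members."""
    return frozenset(individuals)


def union_of(*classes):
    """Individuals belonging to at least one of the classes."""
    return frozenset().union(*classes)


def intersection_of(first, *rest):
    """Individuals belonging to every one of the classes."""
    return frozenset(first).intersection(*rest)


def complement_of(cls, universe=THING):
    """Individuals of the universe that are not in the class."""
    return frozenset(universe) - cls


def disjoint_with(cls_a, cls_b):
    """True when the two classes share no individual."""
    return cls_a.isdisjoint(cls_b)
```

For example, with `x = one_of("a", "b")` and `y = one_of("b", "c")`, `intersection_of(x, y)` contains only `"b"`, and `complement_of(union_of(x, y))` contains only `"d"`.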

75 Healthcare DL ontologies
- OpenGALEN
  - Open terminology
  - French Ministry of Health CCAM
- SNOMED
  - Closed DL terminology

76 Simplified Healthcare Ontology

77 Simplified Healthcare Ontology

78 Healthcare Ontology

79 Putting it all together
- Biomedical information has many vocabularies - each in its own namespace
- genetics: Bio ML
- pathology: SNOMED
- surgery: CPT
- medicine: ICD
- radiology: DICOM

80 Putting it all together
Electronic medical record: genes, diagnoses, drugs, procedures

81 genetics MRI Path-specimen person Gene: p53 Left temporal tumor SNOMED: glioblastoma OWL across schemas

82 Assimilating disparate information glioblastoma p Ring enhancing enhancing astrocytoma p53

83 UMLS next generation
- Ontologies exposed as OWL on web
- Cross references exposed as OWL on web
- Enables searching for and reasoning about terms relating to each other
- Enables searching for and reasoning about terms from multiple terminologies

84 Semantic analysis repository instance Class Property domain type subClass Class type

85 Queries: several views
- Regular expression pattern matching
- Query as universal/existential quantification (FOPL)
- Query as DL classification

86 First Order Predicate Logic
(for-all ?pat
  (exists ?surgeon (last-name ?surgeon Borden))
  (exists ?procedure
    (craniotomy ?procedure)
    (patient ?procedure ?pat)
    (surgeon ?procedure ?surgeon)
    (between (date ?procedure) )
    (sequence ?procedure p53)...
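The existential part of this query can be approximated over plain records with Python's `any`. This is a sketch under stated assumptions: the record layout, patient identifiers and every date below are invented for illustration (the slide's date range did not survive), and the outer quantifier is read as "select each patient for whom such a procedure exists".

```python
from datetime import date

# Hypothetical records; every value here is invented for illustration.
PROCEDURES = [
    {"patient": "pat-1", "kind": "craniotomy", "surgeon": "Borden",
     "date": date(2001, 3, 14), "sequence": "p53"},
    {"patient": "pat-2", "kind": "craniotomy", "surgeon": "Borden",
     "date": date(2001, 6, 2), "sequence": None},
    {"patient": "pat-3", "kind": "laminectomy", "surgeon": "Borden",
     "date": date(2001, 7, 9), "sequence": "p53"},
    {"patient": "pat-4", "kind": "craniotomy", "surgeon": "Smith",
     "date": date(2001, 8, 21), "sequence": "p53"},
]


def has_matching_procedure(patient, procedures, surgeon, start, end, sequence):
    """(exists ?procedure ...): some procedure satisfies every conjunct."""
    return any(
        p["patient"] == patient
        and p["kind"] == "craniotomy"
        and p["surgeon"] == surgeon
        and start <= p["date"] <= end
        and p["sequence"] == sequence
        for p in procedures
    )


def select_patients(procedures, surgeon, start, end, sequence):
    """Patients for whom at least one matching procedure exists."""
    patients = sorted({p["patient"] for p in procedures})
    return [
        pat for pat in patients
        if has_matching_procedure(pat, procedures, surgeon, start, end, sequence)
    ]
```

Each conjunct in the logical form becomes one boolean test, so adding a further condition such as pathology grade means adding one more clause inside `any`.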

87 Future directions
- The technology is here …
- ASTM E
- Define schemas and ontologies
- Standardize data formats
- Collect data
- just do it!

88 Contact Information
Jonathan Borden, M.D.
Center for Brain and Cranial Diseases
St. Vincent Health System
311 W. 24th Street
Erie, PA,
(demo)

