Automatic Report Generation from Ontologies: the MIAKT Approach Kalina Bontcheva, Yorick Wilks Department of Computer Science University of Sheffield.


1 Automatic Report Generation from Ontologies: the MIAKT Approach Kalina Bontcheva, Yorick Wilks Department of Computer Science University of Sheffield

2 Rationale
Natural language generation (NLG) takes as input structured data in a knowledge base or ontology and produces natural language text
Applied to provide automatic documentation of ontologies or to generate textual reports from formal knowledge
Keeps texts constantly up-to-date so they reflect changes in the ontology

3 The MIAKT project
Medical Imaging and Advanced Knowledge Technologies
Breast cancer
Triple assessment process
–Oncologist: clinical assessment
–Histopathologist: cytology
–One or more radiologists: X-ray mammograms, MRI scans
–Surgeon
–Sometimes radiographer
Types of images
–Mammograms, MRI scans, ultrasound…

4 The MIAKT Demonstrator

5 Semantic Image Annotation

6 The Domain Ontology

7 Generation Service Input

8 Generation Service Output

9 Generation Architecture

10 Removing Repeating Triples
Based on the ontology: inverse properties
… involved_in_ta(01401_patient, ta-soton-1069) ≡ involve_patient(ta-soton-1069, 01401_patient)
More complex reasoning will be required to detect facts entailed by facts that have already been stated
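The inverse-property filter on this slide could be sketched roughly as follows. This is a minimal illustration, not the MIAKT implementation: the `INVERSE_OF` table and the function name are assumptions (in MIAKT the inverse relations come from the ontology itself), and the `has-age` value is a made-up placeholder.

```python
# Hypothetical inverse-property table; in MIAKT this would be read from the ontology.
INVERSE_OF = {
    "involved_in_ta": "involve_patient",
    "involve_patient": "involved_in_ta",
}


def remove_inverse_duplicates(triples):
    """Keep only the first of any two triples that state the same fact
    through a pair of inverse properties."""
    seen = set()
    kept = []
    for prop, subj, obj in triples:
        inverse = INVERSE_OF.get(prop)
        already_said = (prop, subj, obj) in seen
        # The inverse statement has subject and object swapped.
        said_as_inverse = inverse is not None and (inverse, obj, subj) in seen
        if already_said or said_as_inverse:
            continue
        seen.add((prop, subj, obj))
        kept.append((prop, subj, obj))
    return kept


triples = [
    ("involved_in_ta", "01401_patient", "ta-soton-1069"),
    ("involve_patient", "ta-soton-1069", "01401_patient"),
    ("has-age", "01401_patient", "AGE_PLACEHOLDER"),  # illustrative value only
]
filtered = remove_inverse_duplicates(triples)
```

Only syntactic inverses are caught here; facts that are merely entailed by earlier ones would need the more complex reasoning the slide mentions.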

11 Discourse Planning
Schemas capture regular patterns in the domain and can be applied recursively
Describe-Patient -> Patient-Attributes, Describe-Procedures
Patient-Attributes -> [attribute(Patient, Attribute)], Patient-Attributes *
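One way to read the two schemas above is as recursive functions over a list of facts. The sketch below is an assumption-laden illustration: the slides do not define Describe-Procedures, so that function is a placeholder, and the fact tuples, the `ATTRIBUTE_PROPERTIES` set, and the placeholder age value are all hypothetical.

```python
# Properties treated as attributes; the names are taken from the examples on slide 12.
ATTRIBUTE_PROPERTIES = {"has-age", "has-size"}


def patient_attributes(patient, facts):
    """Patient-Attributes -> [attribute(Patient, Attribute)], Patient-Attributes *
    Emit one attribute fact, then recurse on the remaining facts."""
    for i, (prop, subj, value) in enumerate(facts):
        if subj == patient and prop in ATTRIBUTE_PROPERTIES:
            rest = facts[:i] + facts[i + 1:]
            return [(prop, subj, value)] + patient_attributes(patient, rest)
    return []  # no attribute facts left: the optional part matches nothing


def describe_procedures(patient, facts):
    """Placeholder: the slides do not spell out this schema. Here it simply
    selects the non-attribute facts that mention the patient."""
    return [f for f in facts
            if f[0] not in ATTRIBUTE_PROPERTIES and patient in f[1:]]


def describe_patient(patient, facts):
    """Describe-Patient -> Patient-Attributes, Describe-Procedures"""
    return patient_attributes(patient, facts) + describe_procedures(patient, facts)


facts = [
    ("involved_in_ta", "01401_patient", "ta-soton-1069"),
    ("has-age", "01401_patient", "AGE_PLACEHOLDER"),  # illustrative value only
]
plan = describe_patient("01401_patient", facts)
```

The resulting plan orders the attribute fact before the procedure fact, regardless of the order in which the facts arrive.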

12 The Property Hierarchy
Special linguistically-motivated properties were introduced to make the NLG modules more generic:
–active-action (e.g. involve_patient)
–passive-action (e.g. involved_in_ta)
–attribute (e.g. has-age, has-size)
–part-whole (e.g. consists-of)
All properties from the ontology were made sub-properties of one of these 4
A more lightweight approach than having a complete linguistic ontology like GUM (Generalised Upper Model)
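The effect of this hierarchy is that a generator only needs to know about four property types. A minimal sketch, assuming the hierarchy is available as a child-to-parent table (the table, the function name, and the `has-diameter` property are hypothetical):

```python
LINGUISTIC_PROPERTIES = {"active-action", "passive-action", "attribute", "part-whole"}

# Hypothetical child -> parent table; in MIAKT this lives in the ontology.
SUPER_PROPERTY = {
    "involve_patient": "active-action",
    "involved_in_ta": "passive-action",
    "has-age": "attribute",
    "has-size": "attribute",
    "has-diameter": "has-size",  # invented example of a deeper sub-property
    "consists-of": "part-whole",
}


def linguistic_type(prop):
    """Walk up the property hierarchy until one of the 4 linguistic
    properties is reached."""
    current = prop
    visited = set()
    while current not in LINGUISTIC_PROPERTIES:
        if current in visited or current not in SUPER_PROPERTY:
            raise KeyError(f"{prop!r} is not under any linguistic property")
        visited.add(current)
        current = SUPER_PROPERTY[current]
    return current


kinds = {p: linguistic_type(p) for p in ("involve_patient", "has-diameter", "consists-of")}
```

Downstream modules can then dispatch on the returned type instead of on individual domain properties.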

13 Ontology-Based Aggregation
Joining attribute and part-whole properties with the same first argument to have more coherent sentences
ATTR(Abnormality: 01401, Mass: 01401_mass)
ATTR(Abnormality: 01401, Margin: i_m_microlob)
ATTR(Abnormality: 01401, Shape: i_shape_round)
ATTR(Abnormality: 01401, Diagnose: i_pr_malig)
Without aggregation: The abnormality has a mass. The abnormality has a microlobulated margin. The abnormality has a round shape. The abnormality has a probably malignant assessment.
With aggregation: The abnormality has a mass, a microlobulated margin, a round shape, and a probably …
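The aggregation step can be illustrated with a short sketch that groups ATTR facts by their first argument and joins the lexicalised values into one sentence. The `LEXICON` table, the tuple encoding of facts, and the function names are assumptions made for the example; the phrases are copied from the un-aggregated sentences on the slide.

```python
# Hypothetical lexicon: instance identifier -> noun phrase.
LEXICON = {
    "01401_mass": "a mass",
    "i_m_microlob": "a microlobulated margin",
    "i_shape_round": "a round shape",
    "i_pr_malig": "a probably malignant assessment",
}

# ATTR facts encoded as ((concept, id), (concept, id)).
FACTS = [
    (("Abnormality", "01401"), ("Mass", "01401_mass")),
    (("Abnormality", "01401"), ("Margin", "i_m_microlob")),
    (("Abnormality", "01401"), ("Shape", "i_shape_round")),
    (("Abnormality", "01401"), ("Diagnose", "i_pr_malig")),
]


def join_phrases(phrases):
    """Join noun phrases into a coordinated list."""
    if len(phrases) == 1:
        return phrases[0]
    if len(phrases) == 2:
        return f"{phrases[0]} and {phrases[1]}"
    return ", ".join(phrases[:-1]) + ", and " + phrases[-1]


def aggregate(facts):
    """Emit one sentence per distinct first argument, in order of first mention."""
    grouped = {}
    for subject, (_concept, obj_id) in facts:
        grouped.setdefault(subject, []).append(LEXICON[obj_id])
    return [
        f"The {concept.lower()} has {join_phrases(phrases)}."
        for (concept, _id), phrases in grouped.items()
    ]


sentences = aggregate(FACTS)
```

With the four facts above, `aggregate` yields a single sentence rather than four, because all facts share the same first argument.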

14 Surface Realisation
The input is an RDF statement and the concept which is going to be the subject of the sentence:
ATTR(Abnormality: 01401, Mass: 01401_mass) + Abnormality: 01401
ATTR and PART_OF relations are already handled by an existing realiser (HYLITE), which treats the RDF as a graph and finds a path through it, starting from the focused concept
Active and passive action properties are mapped to semantic roles like OBJ, PTNT, AGNT
AGNT(Mammography: 01402, PRODUCE_RESULT)
OBJ(PRODUCE_RESULT, Med_Image: 01402_left_cc)
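The mapping from an active-action property to semantic roles might look like the sketch below. It covers only the active case shown on the slide; the passive mapping is not given there, so it is left out. The property name `produce_result` and both function names are assumptions.

```python
def active_action_to_roles(prop, subject, obj):
    """Map an active-action triple prop(subject, obj) to semantic roles:
    the subject becomes the agent of the action, the object its OBJ."""
    action = prop.upper()
    return [("AGNT", subject, action), ("OBJ", action, obj)]


def format_role(role):
    """Render a role tuple in the notation used on the slide."""
    name, first, second = role
    return f"{name}({first}, {second})"


roles = active_action_to_roles(
    "produce_result",              # hypothetical property name
    "Mammography: 01402",
    "Med_Image: 01402_left_cc",
)
rendered = [format_role(r) for r in roles]
```

Here `rendered` reproduces the two role statements listed on the slide.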

15 Domain Portability
Availability of lexical resources for the domain, e.g. UMLS and SPECIALIST or a lexicalised ontology
The classification of the properties into the 4 linguistic ones: possible to do semi-automatically if there are good naming conventions
The 4 linguistic properties may have to be extended to include others if the domain requires it
The main effort will be in the text structuring patterns, which require significant understanding of the system in order to modify them
Machine learning to induce text patterns from labelled examples
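As a rough illustration of semi-automatic classification by naming convention, the sketch below guesses a linguistic property from a property name and returns `None` when a human should decide. The regular expressions are invented heuristics for this example, not rules taken from MIAKT.

```python
import re

# Invented naming-convention heuristics, checked in order.
NAMING_RULES = [
    (re.compile(r"^has[-_]"), "attribute"),
    (re.compile(r"^(consists|part)[-_]of$"), "part-whole"),
    (re.compile(r"^[a-z]+ed[-_](in|by)([-_]|$)"), "passive-action"),
]


def guess_linguistic_property(name):
    """Return a guessed linguistic property, or None if the name matches
    no convention and needs manual classification."""
    for pattern, kind in NAMING_RULES:
        if pattern.match(name):
            return kind
    return None


guesses = {
    name: guess_linguistic_property(name)
    for name in ("has-age", "consists-of", "involved_in_ta", "involve_patient")
}
```

In this sketch `involve_patient` matches no rule, so it is left for manual review, which is the "semi" part of semi-automatic.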

16 Conclusion
Presented an approach for automatic generation of texts from ontologies
MIAKT exploits information from the ontology in order to filter out repetitive information and group together similar facts
Main contribution is in showing how NLG tools can be designed to be easily customisable by non-specialists (through GUI tools)
New application: sekt.semanticweb.org

17 Further Info ers.html

18 The MIAKT lexicon
Currently contains 320+ terms lexicalising:
–76 concepts
–153 instances in the MIAKT ontology
Created manually from:
–BI-RADS and NHS documents
–Online papers and Medline abstracts to verify and enrich the term entries with synonyms

