Applying formal methods to clinical guidelines: the case of temporal information Paolo Terenziani Dipartimento di Informatica, Univ. del Piemonte Orientale.

Presentation transcript:

Applying formal methods to clinical guidelines: the case of temporal information
Paolo Terenziani, Dipartimento di Informatica, Univ. del Piemonte Orientale “Amedeo Avogadro”, Via Bellini 25\g, Alessandria, Italy
Aberdeen, June 4th, 2008

My favourite research topic: dealing with time-related phenomena
[Diagram: TIME, surrounded by the labels Knowl. Repr., Constraints, Periodicity, TDB, Semantics, Constraints DB, Periodicity DB, Medical Info.]

Overview of the presentation
- Experience in the GL context: the GLARE project (sketch)
- Temporal information in clinical records → temporal databases
- Temporal information in the guidelines → reasoning about temporal constraints
- Conclusions

GLARE (GuideLine Acquisition Representation and Execution)
- Joint project (started in 1997):
  Dept. Comp. Sci., Univ. Alessandria (It): P. Terenziani, S. Montani, A. Bottrighi
  Dept. Comp. Sci., Univ. Torino (It): L. Anselma, G. Correndo
  Az. Osp. S. Giovanni Battista, Torino (It): G. Molino, M. Torchio
  Koinè Sistemi S.p.A, Torino (It) (from 2005)
- Domain independent (e.g., bladder cancer, reflux esophagitis, heart failure, ischemic stroke)
- Physician-oriented and user-friendly

GLARE Representation Formalism
In GLARE, a clinical guideline is a hierarchical graph of actions

GLARE: Architecture of the KERNEL
[Architecture diagram. Components shown: Clinical DB, Pharmac. DB, Resource DB, ICD DB; Expert Physician, Acquisition Interface, Guidelines DB, Knowledge Manager; User Physician, Execution Interface, Guidelines Instantiation DB, Patient DB, Execution Module]

GLARE: Architecture of the system
[Architecture diagram. Components shown: GLARE KERNEL, Decision-Making Module, Temporal Reasoning Module, Update Module, Contextualization Module, Model-Checking Module, SPIN]

Overview of the presentation
- Experience in the GL context: the GLARE project (sketch)
- Temporal information in clinical records → temporal databases (TDBs)
- Temporal information in the guidelines → reasoning about temporal constraints
- Conclusions

Coping with temporal information in patients’ records: the need for Temporal Databases
(1) Need for coping with temporal data (both valid time, VT, and transaction time, TT)
(2) Just adding 1 (or 2, or 4) temporal attributes (and maybe some ad-hoc procedures) does not work!
(3) First, a rigorous semantic framework is needed, to give a formal specification to the implementation. Properties: clearness, expressiveness, upward compatibility. Ex. BCDM
(4) Second, the implementation must be proven to respect the semantics. Core issue here: efficient (1-NF) implementations hardly guarantee uniqueness of representation. Ex. TSQL2
(5) Other issues (my work in the area)

(1) Need for coping with temporal data in patients’ clinical records
- The times when patients’ symptoms hold, when laboratory examinations and clinical actions are performed, etc. are core pieces of information → valid time
- The time when data are inserted into or deleted from the DB is important (e.g., justification of actions, legal purposes) → transaction time
Ex. Lab test X was executed on Jan 1st, 2008, and the results were inserted in the patient’s clinical record on Jan 3rd
NOTICE: major problems even with valid time alone!


Ad-hoc approaches are complex and are not likely to work
Example 1. Projection (and temporal coalescing)

Employee
Name    Salary  Title         VT_start   VT_end
Andrea  60000   Ass. Provost  1/1/1993   30/5/1993
Andrea  70000   Ass. Provost  1/6/1993   30/9/1993
Andrea  70000   Provost       1/10/1993  31/1/1994
Andrea  70000   Professor     1/2/1994   31/12/1994

Question: salary history of Andrea
“Intuitive” SQL query:
  SELECT Salary, VT_start, VT_end
  FROM Employee
  WHERE Name = 'Andrea'

Ad-hoc approaches are complex and are not likely to work
Result obtained:
Salary  VT_start   VT_end
60000   1/1/1993   30/5/1993
70000   1/6/1993   30/9/1993
70000   1/10/1993  31/1/1994
70000   1/2/1994   31/12/1994

Desired result:
Salary  VT_start   VT_end
60000   1/1/1993   30/5/1993
70000   1/6/1993   31/12/1994

Ad-hoc approaches are complex and are not likely to work
How to get the desired result using SQL92:
  CREATE TABLE Temp(Salary, VT_start, VT_end) AS
    SELECT Salary, VT_start, VT_end
    FROM Employee
    WHERE Name = 'Andrea';

  Repeat
    UPDATE Temp T1
    SET (T1.VT_end) = (SELECT MAX(T2.VT_end)
                       FROM Temp AS T2
                       WHERE T1.Salary = T2.Salary
                         AND T1.VT_start < T2.VT_start
                         AND T1.VT_end >= T2.VT_start
                         AND T1.VT_end < T2.VT_end)
    WHERE EXISTS (SELECT *
                  FROM Temp AS T2
                  WHERE T1.Salary = T2.Salary
                    AND T1.VT_start < T2.VT_start
                    AND T1.VT_end >= T2.VT_start
                    AND T1.VT_end < T2.VT_end)
  Until no tuples updated

Ad-hoc approaches are complex and are not likely to work
How to get the desired result using SQL92 (continues!):
  DELETE FROM Temp T1
  WHERE EXISTS (SELECT *
                FROM Temp AS T2
                WHERE T1.Salary = T2.Salary
                  AND ((T1.VT_start > T2.VT_start AND T1.VT_end <= T2.VT_end)
                    OR (T1.VT_start >= T2.VT_start AND T1.VT_end < T2.VT_end)))

Ad-hoc approaches are complex and are not likely to work
Underlying semantic phenomenon: projection on temporal relations involves temporal coalescing of value-equivalent tuples
When it occurs (SQL): whenever a proper subset of the attributes of the relations is chosen in the SELECT part of the query
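The coalescing step behind Example 1 can be sketched outside SQL. The following is a minimal illustration, not the method of any temporal DBMS: the function name `coalesce` is hypothetical, periods are assumed to be closed intervals at day granularity, and two periods are merged when they overlap or are adjacent (one ends the day before the other starts).

```python
from datetime import date, timedelta

def coalesce(rows):
    """Merge value-equivalent rows whose closed [start, end] periods
    overlap or are adjacent (assumption: day granularity).
    rows: iterable of (value, start, end) tuples."""
    result = []
    # Sorting groups equal values together, ordered by start date.
    for value, start, end in sorted(rows):
        last = result[-1] if result else None
        if last and last[0] == value and start <= last[2] + timedelta(days=1):
            # Extend the previous period instead of emitting a new row.
            result[-1] = (value, last[1], max(last[2], end))
        else:
            result.append((value, start, end))
    return result

# The Employee relation of Example 1.
employee = [
    ("Andrea", 60000, "Ass. Provost", date(1993, 1, 1), date(1993, 5, 30)),
    ("Andrea", 70000, "Ass. Provost", date(1993, 6, 1), date(1993, 9, 30)),
    ("Andrea", 70000, "Provost", date(1993, 10, 1), date(1994, 1, 31)),
    ("Andrea", 70000, "Professor", date(1994, 2, 1), date(1994, 12, 31)),
]

# Projection on Salary followed by coalescing.
salary_history = coalesce(
    (salary, start, end)
    for name, salary, title, start, end in employee
    if name == "Andrea"
)
```

Here `salary_history` holds two rows, one per salary value, which is the desired result of Example 1; the three 70000 rows collapse because their periods are adjacent.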

Ad-hoc approaches are complex and are not likely to work
How to get the desired result using a Temporal DB (ex. TSQL2):
  SELECT Salary
  FROM Employee
  WHERE Name = 'Andrea'

Ad-hoc approaches are complex and are not likely to work
Example 2. Join (and temporal intersection)

Employee1
Name    Salary  VT_start  VT_end
Andrea  60000   1/1/1993  30/5/1993
Andrea  70000   1/6/1993  31/12/1994

Employee2
Name    Title         VT_start   VT_end
Andrea  Ass. Provost  1/1/1993   30/9/1993
Andrea  Provost       1/10/1993  31/1/1994
Andrea  Professor     1/2/1994   31/12/1994

Query: “combined” history of both Andrea’s salary and title

Ad-hoc approaches are complex and are not likely to work
“Intuitive” SQL query:
  SELECT Salary, Title,
         Employee1.VT_start, Employee1.VT_end,
         Employee2.VT_start, Employee2.VT_end
  FROM Employee1, Employee2
  WHERE Employee1.Name = 'Andrea' AND Employee2.Name = 'Andrea'

Ad-hoc approaches are complex and are not likely to work
Result obtained:
Salary  Emp1.VT_start  Emp1.VT_end  Title         Emp2.VT_start  Emp2.VT_end
60000   1/1/1993       30/5/1993    Ass. Provost  1/1/1993       30/9/1993
60000   1/1/1993       30/5/1993    Provost       1/10/1993      31/1/1994
60000   1/1/1993       30/5/1993    Professor     1/2/1994       31/12/1994
70000   1/6/1993       31/12/1994   Ass. Provost  1/1/1993       30/9/1993
70000   1/6/1993       31/12/1994   Provost       1/10/1993      31/1/1994
70000   1/6/1993       31/12/1994   Professor     1/2/1994       31/12/1994

Ad-hoc approaches are complex and are not likely to work
Result desired:
Salary  Title         VT_start   VT_end
60000   Ass. Provost  1/1/1993   30/5/1993
70000   Ass. Provost  1/6/1993   30/9/1993
70000   Provost       1/10/1993  31/1/1994
70000   Professor     1/2/1994   31/12/1994

Ad-hoc approaches are complex and are not likely to work
How to get the desired result using SQL92:
  SELECT Employee1.Name, Salary, Title, Employee1.VT_start, Employee1.VT_end
  FROM Employee1, Employee2
  WHERE Employee1.Name = Employee2.Name
    AND Employee2.VT_start <= Employee1.VT_start
    AND Employee1.VT_end <= Employee2.VT_end
  UNION
  SELECT Employee1.Name, Salary, Title, Employee1.VT_start, Employee2.VT_end
  FROM Employee1, Employee2
  WHERE Employee1.Name = Employee2.Name
    AND Employee1.VT_start >= Employee2.VT_start
    AND Employee2.VT_end < Employee1.VT_end
    AND Employee1.VT_start < Employee2.VT_end
  UNION
  SELECT Employee1.Name, Salary, Title, Employee2.VT_start, Employee1.VT_end
  FROM Employee1, Employee2
  WHERE Employee1.Name = Employee2.Name
    AND Employee2.VT_start > Employee1.VT_start
    AND Employee1.VT_end < Employee2.VT_end
    AND Employee2.VT_start < Employee1.VT_end
  UNION
  SELECT Employee1.Name, Salary, Title, Employee2.VT_start, Employee2.VT_end
  FROM Employee1, Employee2
  WHERE Employee1.Name = Employee2.Name
    AND Employee2.VT_start > Employee1.VT_start
    AND Employee2.VT_end <= Employee1.VT_end

Ad-hoc approaches are complex and are not likely to work
Underlying semantic phenomenon: join (Cartesian product) on temporal relations involves temporal intersection
When it occurs (SQL): whenever more than one relation is used in the FROM part of the query
Note: the number of terms in the SQL union is 2^n, where n is the number of relations in the FROM part
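Temporal intersection itself is simple once periods are handled as values rather than as pairs of unrelated columns. The sketch below is an illustration under assumptions: `temporal_join` is a hypothetical helper, periods are closed intervals at day granularity, and the data are the Employee1 and Employee2 relations of Example 2.

```python
from datetime import date

def temporal_join(r1, r2):
    """Join on Name and intersect the valid-time periods.
    r1 rows: (name, salary, start, end); r2 rows: (name, title, start, end).
    Periods are closed intervals (assumption)."""
    out = []
    for n1, salary, s1, e1 in r1:
        for n2, title, s2, e2 in r2:
            start, end = max(s1, s2), min(e1, e2)
            # Keep the pair only if the two periods actually intersect.
            if n1 == n2 and start <= end:
                out.append((n1, salary, title, start, end))
    return sorted(out, key=lambda row: row[3])

employee1 = [
    ("Andrea", 60000, date(1993, 1, 1), date(1993, 5, 30)),
    ("Andrea", 70000, date(1993, 6, 1), date(1994, 12, 31)),
]
employee2 = [
    ("Andrea", "Ass. Provost", date(1993, 1, 1), date(1993, 9, 30)),
    ("Andrea", "Provost", date(1993, 10, 1), date(1994, 1, 31)),
    ("Andrea", "Professor", date(1994, 2, 1), date(1994, 12, 31)),
]

combined = temporal_join(employee1, employee2)
```

A single `max`/`min` pair replaces the four-way case analysis of the SQL92 union, which is the kind of built-in behaviour a temporal query language provides.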

Ad-hoc approaches are complex and are not likely to work
How to get the desired result using a Temporal DB (ex. TSQL2):
  SELECT Salary, Title
  FROM Employee1, Employee2
  WHERE Employee1.Name = 'Andrea' AND Employee2.Name = 'Andrea'

Ad-hoc approaches are complex and are not likely to work
Until now, just two simple examples concerning:
- SELECT of a subset of attributes (→ loop to do coalescing)
- FROM with more than one relation (→ exponential union to do intersection)
And what about:
- Union, difference, …, nested queries
- Temporal predicates
- Primary\secondary keys
- Aggregate functions
- Integrity constraints
- Multiple (user-defined!) granularities
- ……
- Arbitrary combinations of all the above issues?

Ad-hoc approaches are complex and are not likely to work
Key message:
- Dealing with temporal data is a general problem in DBs
- Difficult problems (often “hidden” ones) have to be faced
From a Software Engineering point of view, letting applications solve the problem in an ad-hoc way is:
- Expensive in both cost and time
- Likely to lead to errors in the applications
- Likely to make integration (shared data) between applications impossible

Ad-hoc approaches are complex and are not likely to work
Temporal DB: an area of research aiming at providing a once-and-for-all, principled and integrated solution to the problem


BCDM (Bitemporal Conceptual Data Model) Snodgrass & Jensen, 1995 A unifying “consensus” model, capturing the semantics of most approaches in the TDB literature - Based on the relational model, and tuple timestamping - A purely semantic model (efficiency issues are not taken into account)

BCDM Temporal Domains
- Time is linear and totally ordered
- Chronons are the basic time unit
- Time domains are isomorphic to subsets of the domain of natural numbers
  $D_{VT} = \{t_1, t_2, \dots, t_k\}$ (valid time)
  $D_{TT} = \{t'_1, t'_2, \dots, t'_h\} \cup \{UC\}$ (transaction time)
  $D_{TT} \times D_{VT}$ (bitemporal chronons)

BCDM Data
Attribute names: $D_A = \{A_1, A_2, \dots, A_n\}$
Attribute domains: $D_D = \{D_1, D_2, \dots, D_n\}$
Schema of a bitemporal relation: $R = (A_{i_1}, A_{i_2}, \dots, A_{i_j} \mid T)$
Domain of a bitemporal relation: $D_{i_1} \times D_{i_2} \times \dots \times D_{i_j} \times D_{TT} \times D_{VT}$
Tuple of a relation $r(R)$: $x = (a_1, a_2, \dots, a_j \mid t_B)$

BCDM Example. Relation Employee with schema (name, salary, T)
“Andrea was earning 60K at valid times 10, 11, 12. Such a tuple has been inserted into Employee at time 12, and is current now (say now=13)”
(Andrea, 60k | {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), ……})
Each pair is (TT, VT).

BCDM Example. Relation Employee with schema (name, salary, T)
“Andrea was earning 60K at valid times 10, 11, 12. Such a tuple has been inserted into Employee at time 12, and is current now (say now=13)”
(Andrea, 60k | {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (UC,10), (UC,11), (UC,12)})
Each pair is (TT, VT); UC marks the tuple as current in transaction time.

BCDM Bitemporal relation: set of bitemporal tuples. Constraint: Value equivalent tuples are not allowed. (Bitemporal) DB: set of (bitemporal) relations

BCDM Semantics (another viewpoint)

Name    Salary  T
Andrea  60K     {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (UC,10), (UC,11), (UC,12)}
John    50K     {(12,12), (12,13)}

(12,10) → {Employee(Andrea,60K)}
(12,11) → {Employee(Andrea,60K)}
(12,12) → {Employee(Andrea,60K), Employee(John,50K)}
(12,13) → {Employee(John,50K)}
(13,10) → {Employee(Andrea,60K)}
(13,11) → {Employee(Andrea,60K)}
……..
(UC,12) → {Employee(Andrea,60K)}
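The “another viewpoint” reading, one classical relation per bitemporal chronon, can be made concrete with a small sketch. The dictionary encoding and the function name `snapshots` are assumptions made for illustration; using the explicit attribute values as dictionary keys also enforces the BCDM rule that value-equivalent tuples are not allowed.

```python
from collections import defaultdict

UC = "UC"  # "until changed" marker of the transaction-time domain

# Bitemporal relation: explicit attribute values -> set of (TT, VT) chronons.
employee = {
    ("Andrea", "60K"): {(12, 10), (12, 11), (12, 12),
                        (13, 10), (13, 11), (13, 12),
                        (UC, 10), (UC, 11), (UC, 12)},
    ("John", "50K"): {(12, 12), (12, 13)},
}

def snapshots(relation):
    """Invert the relation: bitemporal chronon -> set of facts holding there."""
    by_chronon = defaultdict(set)
    for fact, timestamp in relation.items():
        for chronon in timestamp:
            by_chronon[chronon].add(fact)
    return dict(by_chronon)

view = snapshots(employee)
```

Looking up `view[(12, 12)]` yields both the Andrea and the John fact, while `view[(12, 13)]` yields only John, as in the listing above.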

BCDM PROPERTIES Consistent extension (of “classical” SQL DB) A temporal DB is a set of “classical” DBs, one for each bitemporal chronon Uniqueness of representation (from the constraint about value equivalent tuples)

BCDM Semantics of UC
E.g., the DB’s clock ticks to time 14.

Before:
Name    Salary  T
Andrea  60K     {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (UC,10), (UC,11), (UC,12)}
John    50K     {(12,12), (12,13)}

After:
Name    Salary  T
Andrea  60K     {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (14,10), (14,11), (14,12), (UC,10), (UC,11), (UC,12)}
John    50K     {(12,12), (12,13)}

BCDM deletion (e.g., at time 15)
delete(Employee, (Andrea,60K))

Before:
Name    Salary  T
Andrea  60K     {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (14,10), (14,11), (14,12), (UC,10), (UC,11), (UC,12)}
John    50K     {(12,12), (12,13)}

After (the UC chronons are removed):
Name    Salary  T
Andrea  60K     {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (14,10), (14,11), (14,12)}
John    50K     {(12,12), (12,13)}

BCDM insertion (e.g., at time 16)
insert(Employee, (Andrea,60K|{12,13}))
insert(Employee, (Mary,70K|{16}))

Before:
Name    Salary  T
Andrea  60K     {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (14,10), (14,11), (14,12)}
John    50K     {(12,12), (12,13)}

After:
Name    Salary  T
Andrea  60K     {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (14,10), (14,11), (14,12), (16,12), (16,13), (UC,12), (UC,13)}
John    50K     {(12,12), (12,13)}
Mary    70K     {(16,16), (UC,16)}

BCDM Algebraic Operators (Ex. Projection)
$\pi_D(r) = \{\, z \mid \exists x \in r\, (z[D] = x[D]) \;\wedge\; \forall y \in r\, (y[D] = z[D] \Rightarrow y[T] \subseteq z[T]) \;\wedge\; \forall t \in z[T]\ \exists y \in r\, (y[D] = z[D] \wedge t \in y[T]) \,\}$
- No value-equivalent tuple generated (uniqueness of representation!)
- Coalescing!
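Read operationally, the definition says that the timestamp of each result tuple is the union of the timestamps of all input tuples agreeing with it on D. A minimal sketch under assumptions (the dictionary encoding, the function name `project`, and the sample chronons are all hypothetical):

```python
def project(relation, attrs, keep):
    """BCDM-style projection.
    relation: dict mapping a tuple of attribute values to a set of
              (TT, VT) chronons; attrs names the positions of the values;
              keep lists the attributes D to project on."""
    positions = [attrs.index(a) for a in keep]
    result = {}
    for values, timestamp in relation.items():
        z = tuple(values[p] for p in positions)
        # Union of timestamps: no value-equivalent tuples can be produced.
        result.setdefault(z, set()).update(timestamp)
    return result

# Hypothetical sample data: transaction time 1, valid times 1..5.
employee = {
    ("Andrea", "60K", "Ass. Provost"): {(1, 1), (1, 2)},
    ("Andrea", "70K", "Provost"): {(1, 3), (1, 4)},
    ("Andrea", "70K", "Professor"): {(1, 5)},
}

salaries = project(employee, ("name", "salary", "title"), ("name", "salary"))
```

The two 70K tuples collapse into one entry whose timestamp is `{(1, 3), (1, 4), (1, 5)}`: coalescing comes for free from the set-based timestamp.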

BCDM Algebraic Operators: Properties
BCDM algebraic operators are a consistent extension of SQL’s (reducibility and equivalence)

BCDM Reducibility
[Commutative diagram: taking the snapshot $\rho^T_t$ before or after applying the operator gives the same result]
$\rho^T_t(op^T(r^T)) = op(\rho^T_t(r^T))$

BCDM Equivalence
[Commutative diagram: applying $\tau_t$ before or after the operator gives the same result]
$\tau_t(op(r)) = op^T(\tau_t(r))$

BCDM PROBLEM
Semantically clear but ….. inefficient (not suitable for a “direct” implementation)
(1) Not 1-NF
(2) UC (at each tick of the clock, all current tuples should be updated!)


An example of implementation: TSQL2 (Snodgrass et al., 1995)
Temporal attribute T → four temporal attributes ($TT_S$, $TT_E$, $VT_S$, $VT_E$)
Attribute value: a timestamp or UC
Bitemporal tuple: $(A_1, \dots, A_n \mid TT_S, TT_E, VT_S, VT_E)$
Bitemporal relation: set of bitemporal tuples
Notice: to retain the expressiveness of BCDM, value-equivalent tuples need to be allowed!

An example of implementation: TSQL2 (Snodgrass et al., 1995)

BCDM (semantics):
Name    Salary  T
Andrea  60K     {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (UC,10), (UC,11), (UC,12)}
John    50K     {(12,12), (12,13)}

TSQL2:
Name    Salary  TT_S  TT_E  VT_S  VT_E
Andrea  60K     12    UC    10    12
John    50K     12    12    12    13

An example of implementation: TSQL2 (Snodgrass et al., 1995)
Efficient implementation (data model):
- 1-NF
- UC managed efficiently
- Clear semantics (mapping onto BCDM)
BUT to get efficiency, we lose the uniqueness of representation property

Problem: no uniqueness of representation
Example. At time 10, the fact that Andrea earned 60K from 2 to 3 is inserted in Employee. At time 12, the tuple is updated: Andrea earned 60K from 1 to 4. At time 13, the tuple is (logically) deleted.

BCDM (semantics):
Name    Salary  T
Andrea  60K     {(10,2), (10,3), (11,2), (11,3), (12,1), (12,2), (12,3), (12,4), (13,1), (13,2), (13,3), (13,4)}

TSQL2 (a):
Name    Salary  TT_S  TT_E  VT_S  VT_E
Andrea  60K     10    11    2     3
Andrea  60K     12    UC    1     4

TSQL2 (b):
Name    Salary  TT_S  TT_E  VT_S  VT_E
Andrea  60K     12    UC    1     1
Andrea  60K     10    UC    2     3
Andrea  60K     12    UC    4     4

Problem: no uniqueness of representation
[Figure, VT/TT plane: the TSQL2 implementation stores “covering” rectangles]

Problem: no uniqueness of representation
[Figure, VT/TT plane: TSQL2 representation (a)]
Name    Salary  TT_S  TT_E  VT_S  VT_E
Andrea  60K     10    11    2     3
Andrea  60K     12    UC    1     4

Problem: no uniqueness of representation
[Figure, VT/TT plane: TSQL2 representation (b)]
Name    Salary  TT_S  TT_E  VT_S  VT_E
Andrea  60K     12    UC    1     1
Andrea  60K     10    UC    2     3
Andrea  60K     12    UC    4     4

Problem: no uniqueness of representation
[Figure, VT/TT plane: other TSQL2 representations!!]

Problem: no uniqueness of representation
Potentially, an enormous problem! E.g., “Return all employees earning more than 50K for at most 3 consecutive chronons”

Representation (a):
Name    Salary  TT_S  TT_E  VT_S  VT_E
Andrea  60K     10    11    2     3
Andrea  60K     12    UC    1     4

Representation (b):
Name    Salary  TT_S  TT_E  VT_S  VT_E
Andrea  60K     12    UC    1     1
Andrea  60K     10    UC    2     3
Andrea  60K     12    UC    4     4

Result?
Name    TT_S  TT_E  VT_S  VT_E
Andrea  12    UC    1     4     ?

Problem: no uniqueness of representation
One must guarantee that the temporal DB implementation respects its underlying semantics, independently of the representation.
[Diagram: DB1 and DB2 are each transformed by op_1, …, op_k into DB1’ and DB2’]
Given two “semantically equivalent” temporal DBs, and given any sequence of operations, the results must always be “semantically equivalent”.
Otherwise ….. we cannot trust the DB’s results!

Problem: no uniqueness of representation Solution. Step 1. Formal definition of “semantic equivalence” Snapshot equivalence: Informally: two relations (Databases) are snapshot equivalent if they are identical at each bitemporal chronon
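Snapshot equivalence can be checked mechanically by expanding each covering rectangle into bitemporal chronons. The sketch below is illustrative only: `expand` and `snapshot_equivalent` are hypothetical names, UC is resolved against an explicit `now` (a simplification of the real UC semantics), and the first row of representation (a) is assumed to be TT 10..11, VT 2..3, as the BCDM timestamp of the example implies.

```python
UC = "UC"

def expand(rows, now):
    """Map TSQL2-style rows (name, salary, tt_s, tt_e, vt_s, vt_e)
    to a BCDM-style dict: (name, salary) -> set of (TT, VT) chronons.
    Assumption: a UC end is read as 'up to the current time now'."""
    relation = {}
    for name, salary, tt_s, tt_e, vt_s, vt_e in rows:
        tt_last = now if tt_e == UC else tt_e
        chronons = {(tt, vt)
                    for tt in range(tt_s, tt_last + 1)
                    for vt in range(vt_s, vt_e + 1)}
        if chronons:
            relation.setdefault((name, salary), set()).update(chronons)
    return relation

def snapshot_equivalent(rows_a, rows_b, now):
    """True if both representations hold the same facts at every chronon."""
    return expand(rows_a, now) == expand(rows_b, now)

rep_a = [("Andrea", "60K", 10, 11, 2, 3),
         ("Andrea", "60K", 12, UC, 1, 4)]
rep_b = [("Andrea", "60K", 12, UC, 1, 1),
         ("Andrea", "60K", 10, UC, 2, 3),
         ("Andrea", "60K", 12, UC, 4, 4)]

same = snapshot_equivalent(rep_a, rep_b, now=13)
```

With `now=13`, both representations expand to the same twelve chronons, so they are snapshot equivalent even though they differ row by row.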

Problem: no uniqueness of representation
Solution. Step 2. Definition of manipulation and algebraic operators that preserve snapshot equivalence (e.g., proofs given about TSQL2 (bitemporal) operators)
[Diagram: if $r^B_1$ and $r^B_2$ are snapshot equivalent, then $op^B_i(r^B_1)$ and $op^B_i(r^B_2)$ are snapshot equivalent]

An example of implementation: TSQL2 (Snodgrass et al., 1995)
Last, but not least, a high-level query and manipulation language (a minimal extension to SQL to cope with time) is provided

An example of implementation: TSQL2 (Snodgrass et al., 1995) Other properties proved for TSQL2 (Upward Compatibility) TSQL2 is a consistent extension of SQL

Personal consideration
In my opinion, ad-hoc approaches are not likely to
- Recognize the above problems
- Solve them
- Prove the soundness of the solutions


My past and current contribution to the area
- Treatment of temporal constraints in TDB (with V. Brusoni, L. Console, B. Pernici: IEEE Trans. Data and Knowledge Eng., 1998)
- Treatment of periodic data in TDB (IEEE Trans. Data and Knowledge Eng., 2003, and further ongoing work with B. Stantic, G. Governatori, A. Sattar)
- Treatment of telic vs atelic data in TDB (with R.T. Snodgrass, IEEE Trans. Data and Knowledge Eng., 2004)
- Extension to BCDM to cope with proposal\evaluation of updates (with L. Anselma, A. Bottrighi, S. Montani, ongoing work)
- Extension to BCDM to cope with temporal indeterminacy (with R.T. Snodgrass, ongoing work)

Treatment of telic vs atelic data in TDB (joint work with R.T. Snodgrass)
The distinction dates back to Aristotle’s Categories, and is used in Linguistics, Cognitive Science and, recently, Artificial Intelligence
Core issue: atelic facts (i.e., facts without a goal or culmination), e.g.,
(f1) Andrea earned 60K from 10 to 11
(f2) Andrea earned 60K from 12 to 12
can be represented\evaluated at each bitemporal chronon:
10 → {earn(Andrea,60K)}
11 → {earn(Andrea,60K)}
12 → {earn(Andrea,60K)}
→ BCDM “snapshot” semantics is OK for atelic facts

Treatment of telic vs atelic data in TDB
Core issue: telic facts (i.e., facts with a goal or culmination), e.g.,
(f1) Andrea built a house from 10 to 11
(f2) Andrea built a house from 12 to 12
cannot be represented\evaluated at each bitemporal chronon:
10 → {build_house(Andrea)}
11 → {build_house(Andrea)}
12 → {build_house(Andrea)}
→ BCDM “snapshot” semantics does not work for telic facts
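Why the chronon-by-chronon view fails for telic facts can be seen by round-tripping the two house-building facts through it. This is a small sketch with hypothetical helper names (`to_chronons`, `to_periods`); periods are closed integer intervals.

```python
def to_chronons(periods):
    """Point-based (snapshot) view: the set of chronons covered."""
    return {t for start, end in periods for t in range(start, end + 1)}

def to_periods(chronons):
    """Rebuild maximal periods from a set of chronons (coalescing)."""
    periods = []
    for t in sorted(chronons):
        if periods and t == periods[-1][1] + 1:
            periods[-1] = (periods[-1][0], t)  # extend the current period
        else:
            periods.append((t, t))             # start a new period
    return periods

# (f1) and (f2): two distinct house-building events.
houses = [(10, 11), (12, 12)]

round_trip = to_periods(to_chronons(houses))
```

The round trip returns the single period (10, 12): the snapshot view can no longer tell that two houses were built. For the atelic “earned 60K” facts the same merge is harmless, which is why an interval-based semantics is needed only for telic facts.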

Treatment of telic vs atelic data in TDB
(1) Analysis of the impact of neglecting the telic\atelic distinction in TDBs
(2) Solution:
(2.1) a two-sorted semantics
  - chronon-by-chronon for atelic facts
  - interval-based for telic facts
(2.2) three-sorted data model
(2.3) three-sorted manipulation and algebraic operators
(2.4) properties
(2.5) high-level query language

Treatment of proposal\evaluation of updates (extension to BCDM) (joint work with L. Anselma, A. Bottrighi, S. Montani)
Needed to cope with updates concerning GLs! Important in several other applications, e.g., Wiki-based vocabularies (Citizendium)
Notice: TT is needed to store the history of updates!
Main issues:
- Distinguishing between two levels of users (proposers and evaluators)
- Delaying the effect of proposals
- Coping with alternative tuples

Overview of the presentation
- Experience in the GL context: the GLARE project (sketch)
- Temporal information in clinical records → temporal databases (TDB)
- Temporal information in the guidelines → reasoning about temporal constraints
- Conclusions

Temporal Constraints in Clinical Guidelines Temporal constraints are an intrinsic part of clinical knowledge (e.g., ordering of the therapeutic actions) Different kinds of temporal constraints, e.g., -duration of actions (min / max) -qualitative constraints (e.g., before, during) -delays (min / max) -periodicity constraints on repeated actions

Temporal Constraints in Clinical Guidelines: repetitions
The therapy for multiple myeloma consists of six cycles of 5-day treatment, each one followed by a delay of 23 days (for a total time of 24 weeks). Within each cycle of 5 days, 2 inner cycles can be distinguished: the melphalan treatment, to be provided twice a day, for each of the 5 days, and the prednisone treatment, to be provided once a day, for each of the 5 days. These two treatments must be performed in parallel.
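The arithmetic of the nested repetitions can be checked with a few lines. This is only a worked example of the numbers quoted above; the constant and variable names are hypothetical and nothing here reflects how GLARE stores periodicity constraints.

```python
CYCLES = 6            # outer repetition
TREATMENT_DAYS = 5    # days of treatment in each cycle
DELAY_DAYS = 23       # delay following each treatment block
MELPHALAN_PER_DAY = 2
PREDNISONE_PER_DAY = 1

cycle_length = TREATMENT_DAYS + DELAY_DAYS   # days per cycle
total_days = CYCLES * cycle_length
total_weeks = total_days // 7

# 0-based day indexes on which both drugs are administered in parallel.
treatment_days = [
    cycle * cycle_length + day
    for cycle in range(CYCLES)
    for day in range(TREATMENT_DAYS)
]

melphalan_doses = len(treatment_days) * MELPHALAN_PER_DAY
prednisone_doses = len(treatment_days) * PREDNISONE_PER_DAY
```

Six cycles of 28 days give 168 days, i.e. the 24 weeks stated in the guideline, with 60 melphalan and 30 prednisone administrations overall.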

Managing Temporal Constraints: the Problem
Temporal constraints without temporal reasoning (constraint propagation)
- are useless
- clash against users’ intuitions/expectations
Both representation and inference are NEEDED

Managing Temporal Constraints: the Problem
[Figure: actions A, B, C and the temporal constraints between them]
Implied constraint (temporal reasoning): (1.6) C ends between 30 and 60 m after the start of A
Correct (consistent) assertion: (1.7) C ends between 30 and 50 m after the start of A
Not correct (inconsistent) assertion: (1.8) C ends more than 70 m after the start of A
However: temporal reasoning is NEEDED in order to support such an intended semantics!

Managing Temporal Constraints: the Problem
DESIDERATA for temporal reasoning algorithms
- tractability → “reasonable” response time
- correctness → no wrong inferences
- completeness → reliable answers
DESIDERATA for the representation formalism
- expressiveness → capture most temporal constraints in GLs
TRADE-OFF! → SPECIALIZED APPROACHES (since the 1980s in the AI literature)

Digression: why is completeness fundamental?
[Figure: actions A, B, C and the temporal constraints between them]
Implied constraint (temporal reasoning): (1.6) C ends between 30 and 60 m after the start of A
Suppose that temporal reasoning is NOT complete, so that (1.6) is not inferred. One can add inconsistent information, e.g., “C ends more than 70 m after the start of A”, and the system accepts it! → inconsistent GL!
Complete temporal reasoning is NEEDED in order to guarantee that the GL is temporally consistent (executable!)

Temporal Constraint Treatment
WHEN is temporal reasoning useful in guidelines?
ACQUISITION
- to check consistency (notice: an inconsistent GL cannot be executed!!!)
EXECUTION
- to compare the duration of paths, in hypothetical reasoning (simulation) facilities
- to check that the time of execution of actions on patients is consistent with the constraints in the guideline
- to schedule the next actions

Temporal Representation and Temporal Reasoning for Clinical Guidelines Different kinds of temporal constraints No current approach in the AI literature covers all of them Our proposal: an extension of the “consensus” STP approach [Dechter et al., 91] Our goal: expressiveness + correct, complete and tractable inferences
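For a plain STP, correct and complete propagation amounts to an all-pairs shortest-path computation over the distance graph. The sketch below uses the Floyd-Warshall algorithm; it is not GLARE’s extended algorithm, and the concrete constraints are hypothetical (the original constraints (1.1) to (1.5) are not in the transcript): A, B and C are assumed to run back to back and to last between 10 and 20 minutes each, which reproduces the implied constraint (1.6).

```python
import math

def propagate(n, constraints):
    """All-pairs shortest paths (Floyd-Warshall) on an STP distance graph.
    constraints: {(i, j): (lo, hi)} meaning lo <= t_j - t_i <= hi.
    Returns the minimal distance matrix, or None if inconsistent."""
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (i, j), (lo, hi) in constraints.items():
        d[i][j] = min(d[i][j], hi)    # t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # t_i - t_j <= -lo
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # A negative cycle means the constraints cannot all be satisfied.
    if any(d[i][i] < 0 for i in range(n)):
        return None
    return d

# Time points (minutes): 0 = start of A, 1 = end of A = start of B,
# 2 = end of B = start of C, 3 = end of C. Durations are hypothetical.
base = {(0, 1): (10, 20), (1, 2): (10, 20), (2, 3): (10, 20)}

minimal = propagate(4, base)
implied = (-minimal[3][0], minimal[0][3])   # bounds on end(C) - start(A)

ok = propagate(4, {**base, (0, 3): (30, 50)})         # like assertion (1.7)
bad = propagate(4, {**base, (0, 3): (71, math.inf)})  # like assertion (1.8)
```

Under these assumed durations, `implied` is (30, 60), the tighter assertion is accepted, and the “more than 70 minutes” assertion is rejected because it creates a negative cycle. An incomplete propagator would miss exactly this kind of inconsistency.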

GLARE’s approach (representation)
Labeled tree of STPs (STP-tree)
[Figure: tree of STPs for the multiple myeloma chemotherapy guideline]
The overall therapy (node N1) is composed of 6 cycles of 5 days plus a delay of 23 days. In each cycle (node N2), two therapies are executed in parallel: Alkeran (node N3: Sa and Ea are the starting and ending nodes), to be repeated twice a day, and Deltacorten (node N4: Sd and Ed are the starting and ending nodes), to be repeated once a day. Arcs between any two nodes X and Y in an STP (say N2) of the STP-tree are labeled by a pair [n,m] representing the minimal and maximal distance between X and Y.

Other formal approaches to temporal reasoning in clinical guidelines
Miksch et al.
- an extension of the STP’s metric constraints
- attention to constraint visualization
Shahar et al.
- temporal reasoning oriented towards temporal abstraction (e.g., persistence phenomena)

Conclusions
Formal methods are necessary in order to properly cope with temporal phenomena related to GL management
Temporal information in patients’ records → TDBs needed
- ad-hoc solutions are complex, expensive, and not likely to work
- a clear semantics is needed
- an efficient (1-NF) implementation respecting the semantics is needed (semantic equivalence must be preserved, otherwise results cannot be trusted!)
- any implementation should be a consistent extension of SQL (otherwise, previous work is lost!)
Temporal constraints in GLs → temporal reasoning needed
- a (temporally) inconsistent GL cannot be executed!
- to detect which are the next candidate actions, and when they need to be executed

Conclusions
More generally, I strongly believe that formal methods are recommended\necessary in order to properly cope with many other phenomena related to GL management, including
- Decision support (e.g., Decision Theory)
- Verification (e.g., model-checking, theorem proving, Petri Nets)
- Simulation (e.g., Petri Nets)
- Formal semantics (temporal logics, operational semantics, mapping techniques)
- ………
Thanks for your attention!