Logic form identification of medical clinical trials Clint Tustison
2 Introduction. The what: identify and extract logic forms from the (in)eligibility criteria of medical clinical trials. The why: understand the data; match up the information with other data, i.e., patients' medical records. The how: a syntactic parser and a cognitive modeling architecture.
4 Input: ClinicalTrials.gov. Sponsored by NIH, other federal agencies, and private industry. 8,800 current trials online. 3,000,000 page views per month. Each trial lists purpose, eligibility, location, and more info.
5 Text processing: Convert trials to .xml format. Example: Eligibility Criteria; Inclusion criteria: Adenocarcinoma of the pancreas.
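The text-processing step above can be sketched as follows. This is only an illustration: the slide does not show the actual .xml layout, so the tag names (`trial`, `eligibility`, `inclusion`, `criterion`) and the helper names are assumptions. The second helper recasts a criterion as the sentence form that the later slides feed to the parser ("A criterion equals ...").

```python
import xml.etree.ElementTree as ET

# Hypothetical .xml layout; the real tag names are not given in the slides.
SAMPLE_XML = """<trial>
  <eligibility>
    <inclusion>
      <criterion>Adenocarcinoma of the pancreas.</criterion>
    </inclusion>
  </eligibility>
</trial>"""


def inclusion_criteria(xml_text):
    """Return the text of every inclusion criterion in one trial document."""
    root = ET.fromstring(xml_text)
    found = root.findall("./eligibility/inclusion/criterion")
    return [c.text.strip() for c in found if c.text]


def as_sentence(criterion):
    """Recast a bare criterion as a full sentence for the syntactic parser."""
    body = criterion.strip().rstrip(".")
    # Lower-casing the first letter is a simplification; acronyms would need care.
    return "A criterion equals " + body[0].lower() + body[1:] + "."


if __name__ == "__main__":
    for crit in inclusion_criteria(SAMPLE_XML):
        print(as_sentence(crit))
```

Wrapping each criterion in a full sentence gives the parser a subject and a verb to link against, which a bare noun phrase lacks.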
6 Process: Input. Stages: Clinical Trials (input), Syntactic Parser, Cognitive modeling engine, Post-Processing, Predicate Calculus (output). Example input to the syntactic parser: A criterion equals adenocarcinoma of the pancreas.
7 Syntactic parser: Link-Grammar Parser. Characteristics: syntactic dependency parse; constraints for determining grammaticality; links give clues on how to process constituents. Benefits: written in C and very fast; robust (able to process spelling errors); free; can be easily integrated with other applications.
8 Process: Syntactic Parser. Input: A criterion equals adenocarcinoma of the pancreas. [Link diagram: link labels Xp, Wd, Ds, Ss, Os, Mp, Js, Ds drawn over the tagged words] LEFT-WALL a criterion.n equals.v adenocarcinoma[?].n of the pancreas.n .
9 Intelligent Processing: Soar Architecture. A model and theory of cognition used in AI programming. Translates the syntactic parse to logic output by reading links. Benefits: goal-directed problem solving; agent-based architecture; ability to learn; proven in multiple applications (Natural Language-Soar, Tactical Air-Soar, NASA Test Director-Soar).
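As a rough illustration of "reading links", the sketch below turns a link-grammar parse into predicates with plain Python rules. It is not Soar and not the author's agent. The link endpoints are an assumption inferred from the usual meaning of the labels (Ss: subject to verb, Os: verb to object, Mp: noun to preposition, Js: preposition to its object), and the variable names are simply token positions, so they do not reproduce the numbering used on the slides.

```python
# Tagged words from the parse slide; index 0 is the LEFT-WALL.
TOKENS = ["LEFT-WALL", "a", "criterion.n", "equals.v",
          "adenocarcinoma[?].n", "of", "the", "pancreas.n", "."]

# (label, left word index, right word index); the endpoints are assumed.
LINKS = [("Xp", 0, 8), ("Wd", 0, 2), ("Ds", 1, 2), ("Ss", 2, 3),
         ("Os", 3, 4), ("Mp", 4, 5), ("Js", 5, 7), ("Ds", 6, 7)]

# A link that enters a relation word, paired with the link that leaves it.
RELATION_PAIRS = {"Ss": "Os", "Mp": "Js"}


def base(token):
    """Strip the '.n' / '.v' subscript and the '[?]' unknown-word marker."""
    return token.split(".")[0].replace("[?]", "")


def to_predicates(tokens, links):
    """Build unary predicates for nouns and binary ones for verbs/prepositions."""
    preds = [f"{base(tok)}(N{i})"
             for i, tok in enumerate(tokens) if tok.endswith(".n")]
    for label, left, middle in links:
        closing = RELATION_PAIRS.get(label)
        if closing is None:
            continue
        for label2, start, right in links:
            if label2 == closing and start == middle:
                # The shared middle word names the relation.
                preds.append(f"{base(tokens[middle])}(N{left},N{right})")
    return preds


if __name__ == "__main__":
    print(" & ".join(to_predicates(TOKENS, LINKS)) + ".")
```

A cognitive architecture would reach the same output through goal-directed operators rather than a fixed double loop; the table of link pairs stands in for that knowledge here.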
13 Post-processing: Prolog axioms. Remove elements not included in the language of the criterion. Format elements needed in the output (ampersands).
% reduce(Z, Y): Z is the predicate list Y without criterion/1 and its equals/2 link.
reduce(Z, Y) :-
    member(Criterm, Y), functor(Criterm, criterion, 1), arg(1, Criterm, Critvar),
    member(Predterm, Y), functor(Predterm, _Xterm, 1), arg(1, Predterm, Predvar),
    member(Equalsterm, Y), functor(Equalsterm, equals, 2),
    arg(1, Equalsterm, Critvar), arg(2, Equalsterm, Predvar),
    delete(Y, Criterm, Z2), delete(Z2, Equalsterm, Z).
Turns the previous statement: criterion(N2) & adenocarcinoma(N4) & pancreas(N5) & equals(N2,N4) & of(N4,N5). Into: adenocarcinoma(N4) & pancreas(N5) & of(N4,N5).
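For readers who do not use Prolog, the same reduction can be sketched in Python. This is an illustrative equivalent, not the author's code: predicates are modeled as (name, arguments) tuples, and the function names `reduce_criterion` and `render` are made up for the example.

```python
def reduce_criterion(preds):
    """Drop criterion(X) and any equals(X, Y) whose first argument is that X."""
    crit_vars = {args[0] for name, args in preds
                 if name == "criterion" and len(args) == 1}
    kept = []
    for name, args in preds:
        if name == "criterion" and len(args) == 1:
            continue  # scaffolding added to make a full sentence
        if name == "equals" and len(args) == 2 and args[0] in crit_vars:
            continue  # link between the scaffolding and the real content
        kept.append((name, args))
    return kept


def render(preds):
    """Join predicates with ampersands and close with a period."""
    return " & ".join(f"{name}({','.join(args)})" for name, args in preds) + "."


if __name__ == "__main__":
    statement = [
        ("criterion", ("N2",)),
        ("adenocarcinoma", ("N4",)),
        ("pancreas", ("N5",)),
        ("equals", ("N2", "N4")),
        ("of", ("N4", "N5")),
    ]
    print(render(reduce_criterion(statement)))
```

Run on the statement from the slide, this prints the reduced form shown there: adenocarcinoma(N4) & pancreas(N5) & of(N4,N5).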
14 Output: Eligibility Criteria. Inclusion Criteria: Adenocarcinoma of the pancreas. Logic form: pancreas(N5) & adenocarcinoma(N4) & of(N4,N5).
15 Results/Conclusion: Data can be matched up with patients' medical records to determine whether they meet the criteria posted in the clinical trial. Disadvantages: grammar is difficult to write; only one parsed output per utterance. Advantages: fast; robust; implementation in other languages; can be easily integrated with other applications/corpora.