AIFB Iuriservice II Ontology Development Núria Casellas, Denny Vrandečić, Joan Josep Vallbé, Aleks Jakulin, Mercedes Blázquez.


1 AIFB Iuriservice II Ontology Development. Núria Casellas, Denny Vrandečić, Joan Josep Vallbé, Aleks Jakulin, Mercedes Blázquez. Workshop on Artificial Intelligence and Law, XXII World Congress of Philosophy of Law and Social Philosophy, Granada, May 2005.

2 AIFB May 25, Agenda: Introduction to SEKT Project and Legal Case Study; Methodology; OPJK; Improving knowledge discovery on the competency questions; Architecture

3 The inSEKTs: BT; University of Sheffield; Vrije Universiteit Amsterdam; Sirma AI; Empolis; Universität Karlsruhe; Ontoprise; Universitat Autònoma de Barcelona; Universität Innsbruck; Jozef Stefan Institute; iSOCO; Kea-pro

4 SEKT. Main goals of SEKT: European leadership in semantic technologies; core research combining Human Language Technologies, Knowledge Discovery and Ontology Technologies; providing intelligent knowledge access.

5 Description of the Problem: Legal Domain
In general: complaints about the diligence of the legal administration. Judges are overworked.
In particular: new judges have a lot of theoretical knowledge but little practical knowledge. When they are on duty and confronted with situations in which they are not sure what to do, they “disturb” experienced judges with typical questions, usually their former tutor (preparador).
Existing technology: legal databases are essential in their daily work, but they are based on keywords and Boolean operators, and a search retrieves a huge number of hits.

6 Description of the Problem: Legal Domain
Solution: design an intelligent system to help new judges with their typical problems.
An extended FAQ system using Semantic Web technologies.
Connect the FAQ system with the existing jurisprudence.
Search jurisprudence using Semantic Web technologies.

7 State of the Art in Legal Ontologies
LLD [Language for Legal Discourse, L.T. McCarty, 1989]: atomic formulae, rules and modalities.
NOR [Norma, R.K. Stamper, 1991, 1996]: agents, behavioral invariants, realizations.
LFU [Functional Ontology for Law, A. Valente, 1995]: normative knowledge, world knowledge, responsibility knowledge, reactive knowledge and creative knowledge.
FBO [Frame-Based Ontology of Law, R.W. van Kralingen; P.R.S. Visser, 1995]: norms, acts and concept descriptions.
LRI-Core Legal Ontology [J. Breuker et al., 2002]: objects, processes, physical entities, mental entities, agents, communicative acts.
IKF-IF-LEX Ontology for Norm Comparison [A. Gangemi et al., 2001]: agents, institutive norms, instrumental provisions, regulative norms, open-textured legal notions, norm dynamics.

8 Conceptual distinctions
Professional Knowledge (PK)
Legal Knowledge (LK) → Legal Core Ontologies (LCO) [based on General Theories of Law]
Legal Professional Knowledge (LPK) → OPLK
Judicial Professional Knowledge (JPK) → OPJK

9 Ethnographic survey. Total Autonomous Communities: 14 (out of 17)

10 Preliminary exploitation of data: statistical analysis of results; judicial units: heterogeneity; judge’s profile; protocols of analysis; literal transcripts; completed questionnaires; list of extracted questions

11 OPJK Modeling: identification of possible concepts through ALCESTE’s results and TextToOnto conceptual distribution; domain detection; competency questions discussion and concept extraction

12 Intuitive ontological subdomains: JUDGE ON-DUTY; FAMILY ISSUES; IMMIGRATION; REAL ESTATE; DECISION-MAKING & JUDGMENTS; PROCEEDINGS; JUDICIAL CLERKS; COMMERCIAL LAW; CONTRACT LAW; CRIMINAL LAW; GENDER VIOLENCE; ORDER OF PROTECTION / INJUNCTION

13 Term extraction using TextToOnto

14 Term extraction using TextToOnto and Spanish Gate

15 1. Identify important concepts that should be represented 2. Hierarchy construction 3. Identify relations between them 4. Redefine the ontology, repeating steps 1-4

16 Competency question discussion
Selecting (underlining) all the nouns (usually concepts) and adjectives (usually properties) contained in the competency questions.
How should complaints be handled that are manifestly implausible or that concern facts which clearly do not constitute a criminal offence?
And what if it is a private criminal complaint (querella) that meets all the other procedural requirements, but the facts it concerns lack criminal relevance or are manifestly false?
What happens if a person appears in court wanting to report facts that are hardly credible and unrelated to one another, and the judge doubts the complainant's mental capacity?
Before whom should the appeal for reconsideration (recurso de reforma) against imprisonment be lodged: the on-duty judge or the judge who issued the corresponding detention order?

17 OPJK classes identified

18 OPJK and Proton Integration

19 Improving knowledge discovery on the competency questions

20 Data and Method
Data: 3 text corpora (judges’ questions):
Corpus 1: Scholar “on duty” questions (Spanish Judicial School = 99)
Corpus 2: Practical “on duty” questions (= 163) (field work)
Corpus 3: All practical questions (= 756) (field work)
Method:
TEXT GARDEN (J. Stefan Institute, Ljubljana)
ALCESTE: analysis of the co-occurring lexemes within the simple statements of a text [Reinert 2002, 2003]

21 Analysis of Text
The text needs to be represented in an appropriate way for statistical analysis:
1. Breaking text into “units” (lines, sentences, …)
2. Morphological categorization (adjectives, prepositions, …)
3. Putting words into canonical form: a) Lemmatization (is, was, are → be) b) Stemming (loved, loving → lov+)
4. Analysis: a) Clustering b) Latent semantic indexing c) Correspondence analysis d) Classification e) Visualization
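The first and third preparation steps can be illustrated with a minimal Python sketch. It is not the TEXT GARDEN or ALCESTE implementation: the lemma table `LEMMAS` and the function names are assumptions made up for this example, and step 2 (morphological categorization) is skipped.

```python
import re
from collections import Counter

# Hypothetical toy lemma table (an assumption for illustration only); a real
# pipeline would use a language-specific lemmatizer.
LEMMAS = {"is": "be", "was": "be", "are": "be", "judges": "judge"}


def split_units(text):
    """Step 1: break the text into sentence-like units."""
    return [u.strip() for u in re.split(r"[.!?]+", text) if u.strip()]


def canonical_tokens(unit):
    """Step 3: lowercase, tokenize, and map each word to its canonical form."""
    return [LEMMAS.get(t, t) for t in re.findall(r"[a-z]+", unit.lower())]


def unit_term_counts(text):
    """Represent each unit as a bag of canonical forms, ready for step 4."""
    return [Counter(canonical_tokens(u)) for u in split_units(text)]


counts = unit_term_counts("The judge is on duty. Judges are overworked!")
for unit_counts in counts:
    print(dict(unit_counts))
```

Each unit becomes a bag of canonical forms, which is the kind of unit-by-term representation that clustering or correspondence analysis can then work on.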

22 ALCESTE (Reinert, 1988)
Corpus → segmented in chunks → hierarchical descending clustering → classes of related chunks → list of typical words related to each class
Correspondence analysis → geometric representation
Folch & Habert (2000)

23 Example of Correspondence Analysis and Visualization (ALCESTE, TEXT GARDEN)
[Figure: correspondence-analysis plot of stemmed terms. Visible term groups include enforcement and proceedings terms (e.g. ejecucion+, embarg+, monitorio, demand+), family and gender violence terms (e.g. violencia, alejamiento, proteccion, separacion+) and on-duty terms (e.g. guardia+, policia, detenido+, libertad).]

24 Example of Clustering
Class 1: Judicial unit. funcionar+ (21), juzgar(26), oficina(11), trabaj+(13), decir(26), llam+(16), mand+(12), acudir(11), adjunto(4), busc+(4), consult+(4), dato(6), hablar(4), jurisprudencia(3), local+(3), material(6), necesit+(7), policia(14), prensa(4), sala(4), funerari+(2), hurto(3), informacion(5), miedo(3), robo(3), servicio+(7), sustitu+(4), tecnico(2), venir(15)
Class 2: Family law. alejamiento(22), malo(22), medida(16), orden+(23), proteccion(17), senor+(13), trat+(22), victima(11), mujer(11), padre(7), denunci+(12), domestico(8), violencia(8), agresor(4), dict+(10), madre(7), marido(6), nino(5), pension(4), psicolog+(5), separacion(5), abus+(5), alimento(3), ayud+(4), casa(3), cautelar+(3), divorcio(2), empresa(3), hijo(4), lesion+(6)
Class 3: Proceedings. escrit+(9), fiscal+(13), instruccion(9), ordinario(5), seguir(11), acumular(5), audiencia-provincia(2), conform+(2), contradictori+(3), criterio+(10), cuantia(5), falt+(7), injusto(3), interpretacion(3), ley(6), motiv+(3), pendiente(2), perito(5)
Class 4: Enforcement (judgment). ejecucion(14), ejecut+(15), embarg+(11), finca+(9), depositar+(6), interes+(6), pago(6), suspension(5), deposito(6), entreg+(6), quiebra(5), sentencia(9), solicit+(9), vehiculo(4), acreedor(3), administracion(4), cantidad(4), conden+(4), cost+(4), dinero(4), edicto(2), imposibilidad(3), multa(3), notificacion(4), pagar+(4)

25 Stemming vs Lemmatization
Stemming: the longest string of characters that is common to different words. For all the variants of ‘love’, but also for ‘lover’ (noun) and ‘lovely’ (adjective), it offers the same stem: lov+
Lemmatization respects the category: 3 different lemmas: love (verb), lover (noun), lovely (adjective)
If we apply this process to Spanish or Catalan (or any Romance language), which are highly inflected (60 forms for verbs, without taking into account the compound forms), stemming would hide a lot of information.
EXAMPLES
Stem          Lemma
acumulacion   acumulación
acumularse    acumular
acumul+       ---
admision      admisión
admit+        admitir
celebracion   celebración
celebr+       celebrar
misma+        mismo
mismo+        ---
suspenderse   suspender
suspend+      ---
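A small Python sketch of the contrast, using the ‘love’ example: the stem is computed as the longest common prefix of the forms, while the lemmas come from a hand-written table. The table `LEMMA_OF` is a hypothetical stand-in for a real lemmatizer, and common-prefix stemming is only the definition given on this slide, not necessarily what a production stemmer does.

```python
import os

forms = ["love", "loved", "loving", "lover", "lovely"]

# Stemming as defined on the slide: the longest string common to all forms.
# os.path.commonprefix compares character by character, so it works on any
# list of strings, not only on file paths.
stem = os.path.commonprefix(forms) + "+"

# Hypothetical hand-written lemma table (illustration only).
LEMMA_OF = {
    "love": "love",
    "loved": "love",
    "loving": "love",
    "lover": "lover",
    "lovely": "lovely",
}

stemmed = {stem for _ in forms}            # every form collapses to one stem
lemmatized = {LEMMA_OF[f] for f in forms}  # the category distinction survives

print(stem, sorted(lemmatized))
```

All five forms collapse to a single stem, while lemmatization keeps three distinct entries; this is the information that stemming hides.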

26 Quantitative Comparison
Table comparing the stemmed corpus and the lemmatized corpus by: number of different forms, number of occurrences, maximum frequency of a form, and hapax (the values are not preserved in this transcript).
The lemmatized corpus has fewer word forms than the stemmed version.
LSI on the lemmatized corpus is able to reconstruct documents better, especially in few dimensions.
The clustering of the lemmatized corpus is more detailed.
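The four measures named in the comparison can be computed with a few lines of Python. This is a sketch on toy token lists built from the stem/lemma examples of the previous slide; the numbers it prints are not the values from the original table, and `corpus_stats` is a name chosen for this example.

```python
from collections import Counter


def corpus_stats(tokens):
    """Compute the four measures used in the quantitative comparison."""
    freq = Counter(tokens)
    return {
        "different_forms": len(freq),
        "occurrences": sum(freq.values()),
        "max_freq": max(freq.values(), default=0),
        # Hapax: forms that occur exactly once in the corpus.
        "hapax": sum(1 for count in freq.values() if count == 1),
    }


# Toy data only: the same four tokens in stemmed and in lemmatized form.
stemmed_tokens = ["misma+", "mismo+", "admision", "admit+"]
lemmatized_tokens = ["mismo", "mismo", "admisión", "admitir"]

stemmed_stats = corpus_stats(stemmed_tokens)
lemmatized_stats = corpus_stats(lemmatized_tokens)
print(stemmed_stats)
print(lemmatized_stats)
```

In this toy case the lemmatized list has fewer different forms because ‘misma+’ and ‘mismo+’ share the lemma ‘mismo’.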

27 Comparison of Clustering Results
1. Clustering with the stemmed corpus offers us 4 classes:
‘On-duty’ actions (mixed with Judicial Office) (54.06%)
Proceedings and Trial (18.10%)
Enforcement (judgements) (14.39%)
Family Law (gender violence, divorce, separation…) (13.46%)
2. Clustering with the lemmatized corpus is more detailed and offers 6 classes:
Judicial Office (20.11%)
‘On-duty’ actions (27.25%)
Family Law (gender violence, divorce, separation…) (14.55%)
Proceedings (15.61%)
Trial (8.47%)
Enforcement (judgements) (14.02%)

28 Take-Home Messages: Do text analysis of legal documents! If you do that, do lemmatization!

29 Methodology

30 Initial Methodology
+ Based on 800 competency questions
+ Questions were clustered
+ Middle-out strategy
– Usage of ontology not considered
– Repetitive discussions
– Long discussions

31 Considering the “Why”: no normative knowledge; stick to the questions as sources; model the questions, not the answers

32 Wiki visualization

33 DILIGENT Argumentation Ontology: argumentation ontology defined; based on case studies to identify the most effective types of arguments; argument type recognition based on RST (Rhetorical Structure Theory)

34 Methodology changes
Using DILIGENT made the ontology engineering much faster, amenable to distributed development, better documented, trackable, and more manageable.
DILIGENT itself was also changed!

35 Outlook: better tool support (the off-the-shelf wiki had weaknesses); moderator support in discussions; competency question clustering; gathering further experience from legal and other case studies

36 Architecture

37 High Level Requirements
Judges should not be bothered with a complex user interface. A simple natural language interface is probably appropriate.
The decision as to whether a new question is similar to a stored question (with its corresponding answer) should be based on semantics rather than on simple word matching. An ontology can be used to perform this semantic matching of questions.
The questions included in the system should be of high quality: rather exhaustive and reflecting the actual situation. An extensive survey with more than 250 Spanish judges forms the basis for the questions.
Justify the answer provided by the system with existing jurisprudence: jurisprudence databases; metadata and ontology processing of documents.
Knowledge management at all levels.

38 Example Question-Answer
Question: What problems can we foresee with the analysis of small amounts of drugs, where the identification test destroys the drugs?
Answer: This is an unrepeatable piece of evidence at the trial. In these cases, the Spanish Criminal Procedure Act states that the adversarial principle should be respected. While the trial proceedings are prepared, the judge must explain to all parties that they may choose an expert to perform these tests.

39 Example of judgment: parts. Court and docket number; names of the magistrates; date and place; prefatory statement; history of the case; grounds of decision

40 Relations between the Question/Answer & Judgment
[Diagram labels: FAQ (Question, Answer); Judgement (Summary, Case History, Decision Grounds, Ruling); OPJK; PracticalKnowledge; Instances]

41 Architecture
[Diagram labels: Questions-Answers; Expert Knowledge; Semantic Matching; DB 1 Decisions; DB N Decisions; Ontology Learning & feeding; Ontology Merging; Jurisprudence; Ontology Alignment; Web browser; Natural Language]

42 Expert Knowledge Retrieval: Design - Technological considerations
[Diagram labels: iFAQ System; Multistage Searching Subsystem (Ontology Domain Detection, Keyword Matching, Ontology Graph Path Matching); Ontology Technology; Natural Language Processing; Caching subsystem; Persistence subsystem; Efficiency; Accuracy]

43 Expert Knowledge Retrieval: Chain of Responsibility pattern
[Diagram labels: User Question; iFAQ Search Engine; FAQ Search Factory; Plugged Searching Stages (Ontology Domain Detection, Keyword/synonym matching stage, Ontology graph path matching, Other search engines...); FAQ Candidates; FAQ]
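A minimal Python sketch of how pluggable search stages could be chained in a Chain of Responsibility style. It is not the iFAQ implementation: the class names, the `DOMAIN_KEYWORDS` table and the sample FAQs are all assumptions made up for this example, and the ontology graph path stage is left out.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Faq:
    domain: str
    question: str


class SearchStage(ABC):
    """One pluggable stage: narrows the candidates, then hands them on."""

    def __init__(self, next_stage=None):
        self.next_stage = next_stage

    def handle(self, question, candidates):
        narrowed = self.filter(question, candidates)
        # Stop early when there is no further stage or nothing left to narrow.
        if self.next_stage is None or len(narrowed) <= 1:
            return narrowed
        return self.next_stage.handle(question, narrowed)

    @abstractmethod
    def filter(self, question, candidates):
        ...


# Hypothetical keyword-to-domain table standing in for ontology domain detection.
DOMAIN_KEYWORDS = {"family": {"divorce", "custody"}, "criminal": {"arrest", "detention"}}


class DomainDetectionStage(SearchStage):
    def filter(self, question, candidates):
        words = set(question.lower().split())
        domains = {d for d, kws in DOMAIN_KEYWORDS.items() if words & kws}
        # If no domain is detected, keep every candidate instead of dropping all.
        return [c for c in candidates if c.domain in domains] if domains else list(candidates)


class KeywordMatchingStage(SearchStage):
    def filter(self, question, candidates):
        words = set(question.lower().split())
        return [c for c in candidates if words & set(c.question.lower().split())]


faqs = [
    Faq("family", "who decides custody"),
    Faq("criminal", "how long may detention last"),
    Faq("criminal", "when is arrest lawful"),
]
chain = DomainDetectionStage(KeywordMatchingStage())
result = chain.handle("detention time limit", faqs)
print(result)
```

Cheap stages run first and shrink the candidate set, so that a more expensive stage added at the end of the chain only sees a few FAQs.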

44 Expert Knowledge Retrieval: Semantic Similarity, main steps
NL query → NLP → POS list (lemmas) → Ontology Linking → Semantic Distance Calculation (semantic distance between queries) → Term Coverage Calculation (between queries) → Best match of stored queries

45 Expert Knowledge Retrieval: Semantic Similarity
The semantic distance is based on the weighted navigation distance between terms in the ontology.
Navigation through the ontology means that one moves from one concept to another via one of its relations or attributes (Is a, Follows, Actor, etc.).
The task of associating distance costs is domain specific and needs to be performed by a legal expert.
[Diagram labels: Ontology; Accuse; Actions; Follow; Denounce; Mother; Son]
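One way to read "weighted navigation distance" is as a cheapest path in a graph whose edges are ontology relations with expert-assigned costs. The Python sketch below uses Dijkstra's algorithm for that; the relation costs, the edges between the concepts and the choice to navigate relations in both directions are all assumptions for illustration, not values from the OPJK ontology.

```python
import heapq
import math

# Hypothetical per-relation costs; the slide says a legal expert assigns these.
RELATION_COST = {"is_a": 1.0, "follows": 2.0, "actor": 3.0}

# Hypothetical toy edges between concepts named in the diagram.
EDGES = [
    ("Denounce", "is_a", "Actions"),
    ("Accuse", "is_a", "Actions"),
    ("Accuse", "follows", "Denounce"),
    ("Mother", "actor", "Denounce"),
]


def build_graph(edges):
    graph = {}
    for src, relation, dst in edges:
        cost = RELATION_COST[relation]
        graph.setdefault(src, []).append((dst, cost))
        # Assumption: a relation can be navigated in both directions.
        graph.setdefault(dst, []).append((src, cost))
    return graph


def semantic_distance(graph, start, goal):
    """Cheapest weighted navigation path (Dijkstra); infinite if unreachable."""
    if start not in graph or goal not in graph:
        return math.inf
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, cost in graph[node]:
            new_dist = dist + cost
            if new_dist < best.get(neighbour, math.inf):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return math.inf


GRAPH = build_graph(EDGES)
print(semantic_distance(GRAPH, "Accuse", "Denounce"))
print(semantic_distance(GRAPH, "Mother", "Accuse"))
```

A concept that is not linked into the graph at all, such as `Son` in this toy setup, gets an infinite distance.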

46 Conclusions
Decision support system for inexperienced judges.
Using Semantic Web technology for handling knowledge: provide knowledge for the decision-making process; capture knowledge from experts; share knowledge among all users.
Extended understanding capacities. Background knowledge: Professional Legal Ontology.
Decision explanation.
Improved knowledge acquisition.

47 Expert Knowledge Retrieval: Term Coverage
Terms of the input question are filtered by their part-of-speech category: nouns, verbs, adjectives, and adverbs.
Each term is linked to the ontology if possible.
The algorithm constructs a semantic path from each input term to the terms of the stored query.
Terms which are linked to the ontology: if a term of the user question is linked to the ontology but no corresponding term can be found in the stored question (by ontology navigation) → semantic distance is infinitely large.
Terms which cannot be linked to the ontology: if the term has a corresponding one in the stored question (same lemma) → distance is zero; if there is no corresponding lemma in the stored question → distance is infinite.
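The per-term rules can be sketched in Python as follows. The function `term_distance` follows the three cases described on the slide; the `coverage` score (share of input terms with a finite distance) is an assumption, since the slide does not give the coverage formula, and `toy_distance` with its single entry is a made-up stand-in for real ontology navigation.

```python
import math


def term_distance(term, stored_terms, ontology_terms, ontology_distance):
    """Distance from one input lemma to the closest lemma of a stored question."""
    if term in ontology_terms:
        # Linked term: navigate the ontology towards the linked stored terms.
        linked = [t for t in stored_terms if t in ontology_terms]
        return min((ontology_distance(term, t) for t in linked), default=math.inf)
    # Unlinked term: zero if the same lemma occurs in the stored question.
    return 0.0 if term in stored_terms else math.inf


def coverage(input_terms, stored_terms, ontology_terms, ontology_distance):
    """Hypothetical score: share of input terms that reach the stored question."""
    if not input_terms:
        return 0.0
    reached = sum(
        1
        for term in input_terms
        if math.isfinite(term_distance(term, stored_terms, ontology_terms, ontology_distance))
    )
    return reached / len(input_terms)


# Toy stand-in for ontology navigation (illustrative values only).
TOY_DISTANCES = {("accuse", "denounce"): 2.0}


def toy_distance(a, b):
    if a == b:
        return 0.0
    return TOY_DISTANCES.get((a, b), TOY_DISTANCES.get((b, a), math.inf))


ontology_terms = {"accuse", "denounce", "mother"}
input_terms = ["accuse", "mother", "yesterday"]
stored_terms = ["denounce", "yesterday"]

distances = [term_distance(t, stored_terms, ontology_terms, toy_distance) for t in input_terms]
print(distances, coverage(input_terms, stored_terms, ontology_terms, toy_distance))
```

Here `accuse` reaches `denounce` through the ontology, `mother` is linked but finds no counterpart, and `yesterday` is unlinked but matches by lemma.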

