AIFB Iuriservice II Ontology Development Núria Casellas, Denny Vrandečić, Joan Josep Vallbé, Aleks Jakulin, Mercedes Blázquez.

1 http://www.sekt-project.com AIFB Iuriservice II Ontology Development. Núria Casellas, Denny Vrandečić, Joan Josep Vallbé, Aleks Jakulin, Mercedes Blázquez. Workshop on Artificial Intelligence and Law, XXII World Congress of Philosophy of Law and Social Philosophy, Granada, May 2005

2 http://www.sekt-project.com AIFB May 25, 2005 Agenda: Introduction to SEKT Project and Legal Case Study; Methodology; OPJK; Improving knowledge discovery on the competency questions; Architecture

3 The inSEKTs: BT, University of Sheffield, Vrije Universiteit Amsterdam, Sirma AI, Empolis, Universität Karlsruhe, Ontoprise, Universitat Autònoma de Barcelona, Universität Innsbruck, Jozef Stefan Institute, iSOCO, Kea-pro

4 SEKT: Main goals of SEKT
- European leadership in Semantic Technologies
- Core research: combine Human Language Technologies, Knowledge Discovery and Ontology Technologies
- Provide intelligent knowledge access

5 Description of the Problem: Legal Domain
In general: complaints about the diligence of the legal administration. The judges are overworked.
In particular: new judges
- They have a lot of theoretical knowledge, but little practical knowledge.
- On duty, when they are confronted with situations in which they are not sure what to do, they "disturb" experienced judges with typical questions, usually their former tutor (preparador).
Existing technology: legal databases
- Essential in their daily work
- Based on keywords and Boolean operators
- A search retrieves a huge number of hits

6 Description of the Problem: Legal Domain
Solution: design an intelligent system to help new judges with their typical problems.
- Extended FAQ system using Semantic Web technologies
- Connect the FAQ system with the existing jurisprudence
- Search jurisprudence using Semantic Web technologies

7 State of the Art in Legal Ontologies
- LLD [Language for Legal Discourse, L.T. McCarty, 1989]: Atomic formulae, Rules and Modalities.
- NOR [Norma, R.K. Stamper, 1991, 1996]: Agents, Behavioral invariants, Realizations.
- LFU [Functional Ontology for Law, R.W. van Kralingen; P.R.S. Visser, 1995]: Normative knowledge, World knowledge, Responsibility knowledge, Reactive knowledge and Creative knowledge.
- FBO [Frame-Based Ontology of Law, A. Valente, 1995]: Norms, Acts and Concept Descriptions.
- LRI-Core Legal Ontology [J. Breuker et al., 2002]: Objects, Processes, Physical entities, Mental entities, Agents, Communicative Acts.
- IKF-IF-LEX Ontology for Norm Comparison [A. Gangemi et al., 2001]: Agents, Institutive norms, Instrumental provisions, Regulative norms, Open-textured legal notions, Norm dynamics.

8 Conceptual distinctions
- Professional Knowledge (PK)
- Legal Knowledge (LK) → Legal Core Ontologies (LCO) [based on General Theories of Law]
- Legal Professional Knowledge (LPK) → OPLK
- Judicial Professional Knowledge (JPK) → OPJK

9 Ethnographic survey
[Figure: survey counts per Autonomous Community; the region labels were lost in extraction. Values shown: 14, 7, 5, 29, 8, 16, 8, 8, 10, 12, 10, 61, 16.]
Total Autonomous Communities: 14 (out of 17)

10 Preliminary exploitation of data: Statistical analysis of results; Judicial units: heterogeneity; Judge's profile; Protocols of analysis; Literal transcripts; Completed questionnaires; List of extracted questions

11 OPJK Modeling: Identification of possible concepts through ALCESTE's results and TextToOnto conceptual distribution; Domain detection; Competency questions discussion and concept extraction

12 Intuitive ontological subdomains: JUDGE ON-DUTY, FAMILY ISSUES, IMMIGRATION, REAL ESTATE, DECISION-MAKING & JUDGMENTS, PROCEEDINGS, JUDICIAL CLERKS, COMMERCIAL LAW, CONTRACT LAW, CRIMINAL LAW, GENDER VIOLENCE, ORDER OF PROTECTION / INJUNCTION

13 Term extraction using TextToOnto

14 Term extraction using TextToOnto and Spanish Gate

15
1. Identify important concepts that should be represented
2. Hierarchy construction
3. Identify relations between them
4. Redefine the ontology, repeating steps 1-4

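16 Competency question discussion
Selecting (underlining) all the nouns (usually concepts) and adjectives (usually properties) contained in the competency questions.
- ¿Cuál es el tratamiento de las denuncias manifiestamente inverosímiles o relativas a hechos que evidentemente carecen de tipicidad? [How should complaints be handled that are manifestly implausible or that concern facts that evidently do not constitute a criminal offence?]
- ¿Y si se trata de una querella que reúne todos los demás presupuestos procesales pero los hechos objeto de la misma carecen de relevancia penal o manifiestamente falsos? [And what if it is a criminal complaint (querella) that meets all the other procedural requirements, but the facts it concerns lack criminal relevance or are manifestly false?]
- ¿Qué ocurre si comparece en el juzgado una persona que quiere denunciar hechos difícilmente creíbles, sin relación entre sí, dudándose por el juez de la capacidad mental del denunciante? [What happens if a person appears in court wanting to report facts that are hardly credible and unrelated to one another, and the judge doubts the complainant's mental capacity?]
- ¿Ante quién debe interponerse el recurso de reforma contra la prisión, delante del juez de guardia o del juez que dictó el correspondiente auto de prisión? [Before whom must the appeal for reconsideration (recurso de reforma) against detention be filed: the on-duty judge or the judge who issued the corresponding detention order?]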

17 OPJK classes identified

18 OPJK and Proton Integration

19 Improving knowledge discovery on the competency questions

20 Data and Method
Data: 3 text corpora (judges' questions):
- Corpus 1: Scholar "on duty" questions (Spanish Judicial School = 99)
- Corpus 2: Practical "on duty" questions (= 163) (field work)
- Corpus 3: All practical questions (= 756) (field work)
Method:
- TEXT GARDEN (J. Stefan Institute, Ljubljana)
- ALCESTE: Analysis of the co-occurring lexemes within the simple statements of a text [Reinert 2002, 2003]

21 Analysis of Text
The text needs to be represented in an appropriate way for statistical analysis:
1. Breaking text into "units" (lines, sentences, …)
2. Morphological categorization (adjectives, prepositions, …)
3. Putting words into canonical form:
   a) Lemmatization (is, was, are → be)
   b) Stemming (loved, loving → lov+)
4. Analysis:
   a) Clustering
   b) Latent semantic indexing
   c) Correspondence analysis
   d) Classification
   e) Visualization
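The preparation steps 1 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how ALCESTE or Text Garden work internally: the sentence splitter, the tiny `LEMMAS` table and the `SUFFIXES` list are made up for the example, and step 2 (morphological categorization) is left out because it needs a real tagger.

```python
import re

# Hypothetical toy resources; real systems use full dictionaries and taggers.
LEMMAS = {"is": "be", "was": "be", "are": "be", "loved": "love", "loving": "love"}
SUFFIXES = ("ing", "ed", "es", "e", "s")  # naive suffix list, tried in this order


def split_units(text):
    """Step 1: break the text into sentence-like units."""
    return [u.strip() for u in re.split(r"[.!?]+", text) if u.strip()]


def tokenize(unit):
    """Lower-case a unit and keep only runs of letters."""
    return re.findall(r"[^\W\d_]+", unit.lower())


def lemmatize(token):
    """Step 3a: map a word form to its dictionary lemma, if known."""
    return LEMMAS.get(token, token)


def stem(token):
    """Step 3b: strip the first matching suffix and mark the cut with '+'."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)] + "+"
    return token


text = "She loved the court. They are loving it!"
units = split_units(text)
lemmatized = [[lemmatize(t) for t in tokenize(u)] for u in units]
stemmed = [[stem(t) for t in tokenize(u)] for u in units]
```

The lemmatizer keeps whole words from a dictionary, while the stemmer only truncates, which is why "loved" and "loving" both end up as `lov+`.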

22 ALCESTE (Reinert, 1988)
[Diagram: Corpus → segmented in chunks → hierarchical descending clustering → classes of related chunks → list of typical words related to each class; correspondence analysis → geometric representation.]
Folch & Habert (2000)

23 Example of Correspondence Analysis and Visualization (ALCESTE, TEXT GARDEN)
[Figure: two-axis correspondence-analysis plot of stemmed terms from the judges' questions; the layout was lost in extraction. Terms that appear close together include separacion+, orden+, alejamiento, proteccion, violencia, victima+ (family and gender violence); ejecucion+, embarg+, pago+, quiebra+, finca+ (enforcement); and guardia+, juez, policia, detenido+ (on-duty).]

24 Example of Clustering
Class 1: Judicial unit
funcionar+ (21), juzgar(26), oficina(11), trabaj+(13), decir(26), llam+(16), mand+(12), acudir(11), adjunto(4), busc+(4), consult+(4), dato(6), hablar(4), jurisprudencia(3), local+(3), material(6), necesit+(7), policia(14), prensa(4), sala(4), funerari+(2), hurto(3), informacion(5), miedo(3), robo(3), servicio+(7), sustitu+(4), tecnico(2), venir(15)
Class 2: Family law
alejamiento(22), malo(22), medida(16), orden+(23), proteccion(17), senor+(13), trat+(22), victima(11), mujer(11), padre(7), denunci+(12), domestico(8), violencia(8), agresor(4), dict+(10), madre(7), marido(6), nino(5), pension(4), psicolog+(5), separacion(5), abus+(5), alimento(3), ayud+(4), casa(3), cautelar+(3), divorcio(2), empresa(3), hijo(4), lesion+(6)
Class 3: Proceedings
escrit+(9), fiscal+(13), instruccion(9), ordinario(5), seguir(11), acumular(5), audiencia-provincia(2), conform+(2), contradictori+(3), criterio+(10), cuantia(5), falt+(7), injusto(3), interpretacion(3), ley(6), motiv+(3), pendiente(2), perito(5)
Class 4: Enforcement (judgment)
ejecucion(14), ejecut+(15), embarg+(11), finca+(9), depositar+(6), interes+(6), pago(6), suspension(5), deposito(6), entreg+(6), quiebra(5), sentencia(9), solicit+(9), vehiculo(4), acreedor(3), administracion(4), cantidad(4), conden+(4), cost+(4), dinero(4), edicto(2), imposibilidad(3), multa(3), notificacion(4), pagar+(4)

25 Stemming vs Lemmatization
Stemming: the longest string of characters that is common to different words. For all the variants of 'love', but also for 'lover' (noun) and 'lovely' (adjective), it can offer the stem lov+.
Lemmatization respects the category: 3 different lemmas: love (verb), lover (noun), lovely (adjective).
If we apply this process to Spanish or Catalan (or any Romance language), which are highly inflected (60 forms per verb, without counting the compound forms), stemming would hide a lot of information.
EXAMPLES
Stem           Lemma
acumulacion    acumulación
acumularse     acumular
acumul+        ---
admision       admisión
admit+         admitir
celebracion    celebración
celebr+        celebrar
misma+         mismo
mismo+         ---
suspenderse    suspender
suspend+       ---

26 Quantitative Comparison
                        Stemmed corpus   Lemmatized corpus
Num. different forms    3074             2064
Num. occurrences        19861            19946
Max. freq. of a form    1230             2208
Hapax                   1666             934
The lemmatized corpus has fewer word forms than the stemmed version.
The LSI on the lemmatized corpus is able to reconstruct documents better, especially in few dimensions.
The lemmatized corpus clustering is more detailed.
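The four figures in this comparison are plain frequency counts, so they can be recomputed for any tokenized corpus. The sketch below assumes the corpus is already a flat list of stems or lemmas; the function name `corpus_stats` and the two five-token mini-corpora are hypothetical and only reuse word pairs from the examples slide.

```python
from collections import Counter


def corpus_stats(tokens):
    """Return the four comparison figures for a list of tokens."""
    freq = Counter(tokens)
    return {
        "different_forms": len(freq),                      # distinct word forms
        "occurrences": sum(freq.values()),                 # total tokens
        "max_frequency": max(freq.values()) if freq else 0,
        "hapax": sum(1 for n in freq.values() if n == 1),  # forms seen once
    }


# Hypothetical mini-corpus, once stemmed and once lemmatized.
stemmed = ["misma+", "mismo+", "admit+", "juez", "juez"]
lemmatized = ["mismo", "mismo", "admitir", "juez", "juez"]

stemmed_stats = corpus_stats(stemmed)
lemmatized_stats = corpus_stats(lemmatized)
```

In this toy case the lemmatized list has fewer distinct forms and fewer hapax than the stemmed one, the same direction as the figures reported on the slide.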

27 Comparison of Clustering Results
1. Clustering with the stemmed corpus offers us 4 classes:
   1. 'On-duty' actions (mixed with Judicial Office) (54.06%)
   2. Proceedings and Trial (18.10%)
   3. Enforcement (judgements) (14.39%)
   4. Family Law (gender violence, divorce, separation…) (13.46%)
2. Clustering with the lemmatized corpus is more detailed and offers 6 classes:
   1. Judicial Office (20.11%)
   2. 'On-duty' actions (27.25%)
   3. Family Law (gender violence, divorce, separation…) (14.55%)
   4. Proceedings (15.61%)
   5. Trial (8.47%)
   6. Enforcement (judgements) (14.02%)

28 Take-Home Messages: Do text analysis of legal documents! If you do that, do lemmatization!

29 Methodology

30 Initial Methodology
+ Based on 800 competency questions
+ Questions were clustered
+ Middle-out strategy
– Usage of ontology not considered
– Repetitive discussions
– Long discussions

31 Considering the "Why": No normative knowledge; Stick to the questions as sources; Model the questions, not the answers

32 Wiki visualization

33 Diligent Argumentation Ontology: Argumentation ontology defined; Based on case studies to identify the most effective types of arguments; Argument type recognition based on RST

34 Methodology changes
Using DILIGENT made the ontology engineering…
… much faster
… amenable to distributed development
… better documented
… trackable
… better manageable
Also DILIGENT itself got changed!

35 Outlook
- Better tool support: the off-the-shelf wiki had weaknesses
- Moderator support in discussions
- Competency question clustering
- Gathering further experience from legal and other case studies

36 Architecture

37 High Level Requirements
- Judges should not be bothered with a complex user interface. A simple natural language interface is probably appropriate.
- The decision as to whether a new question is similar to a stored question (with its corresponding answer) should be based on semantics rather than on simple word matching. An ontology can be used to perform this semantic matching of questions.
- The questions included in the system should be of high quality: rather exhaustive and reflecting the actual situation. An extensive survey with more than 250 Spanish judges forms the basis for the questions.
- Justify the answer provided by the system with existing jurisprudence: jurisprudence databases; metadata and ontology processing of documents.
- Knowledge management at all levels

38 Example Question-Answer
Question: What problems can we foresee with the analysis of small amounts of drugs, where the identification test destroys the drugs?
Answer: This is an unrepeatable piece of evidence at the trial. In these cases, the Spanish Criminal Procedure Act states that the adversarial principle should be respected. While the trial proceedings are prepared, the judge must explain to all parties that they may choose an expert to perform these tests.

39 Example of judgment: parts
- Court and docket number
- Names of the magistrates
- Date and place
- Prefatory statement
- History of the Case
- Grounds of Decision

40 Relations between the Question/Answer & Judgment
[Diagram labels: Question, Answer, FAQ, Judgement, Summary, Case History, Decision Grounds, Ruling, OPJK, PracticalKnowledge, Instances]

41 Architecture
[Diagram labels: Questions-Answers, Expert Knowledge, Semantic Matching, DB 1 Decisions, DB N Decisions, Ontology Learning & feeding, Ontology Merging, Jurisprudence, Ontology Alignment, Web browser, Natural Language]

42 Expert Knowledge Retrieval: Design - Technological considerations
[Diagram labels: Ontology Domain Detection, Keyword Matching, Ontology Graph Path Matching, iFAQ System, Multistage Searching Subsystem, Ontology Technology, Natural Language Processing, Caching subsystem, Persistence subsystem, Efficiency, Accuracy]

43 Expert Knowledge Retrieval: Chain of Responsibility pattern
[Diagram labels: User Question, iFAQ Search Engine, FAQ Search Factory, Plugged Searching Stages (Ontology Domain Detection, Keyword/synonym matching stage, Ontology graph path matching, Other search engines...), FAQ Candidates, FAQ]
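A minimal sketch of how pluggable search stages could be chained in the style of the Chain of Responsibility pattern is given below. It is an illustration only, not the iFAQ implementation: the class names, the `TRIGGERS` table and the sample FAQs are hypothetical, and the ontology graph path matching stage is omitted. Each stage narrows the candidate list and hands the remainder to the next stage.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class FAQ:
    question: str
    domain: str
    keywords: Set[str]


class SearchStage:
    """One link in the chain: narrow the candidates, then pass them on."""

    def __init__(self, next_stage: Optional["SearchStage"] = None):
        self.next_stage = next_stage

    def narrow(self, words: Set[str], candidates: List[FAQ]) -> List[FAQ]:
        raise NotImplementedError

    def handle(self, words: Set[str], candidates: List[FAQ]) -> List[FAQ]:
        remaining = self.narrow(words, candidates)
        # Stop early when there is nothing left to disambiguate.
        if self.next_stage is None or len(remaining) <= 1:
            return remaining
        return self.next_stage.handle(words, remaining)


class DomainDetectionStage(SearchStage):
    # Hypothetical trigger words standing in for ontology-based domain detection.
    TRIGGERS = {"protection": "family", "custody": "family", "seizure": "enforcement"}

    def narrow(self, words, candidates):
        domains = {self.TRIGGERS[w] for w in words if w in self.TRIGGERS}
        if not domains:
            return candidates  # no domain detected: leave the list untouched
        return [f for f in candidates if f.domain in domains]


class KeywordStage(SearchStage):
    def narrow(self, words, candidates):
        # Keep FAQs that share at least one keyword with the query.
        return [f for f in candidates if f.keywords & words]


faqs = [
    FAQ("Who issues a protection order?", "family", {"protection", "order"}),
    FAQ("How is a seizure enforced?", "enforcement", {"seizure", "embargo"}),
    FAQ("When is custody decided?", "family", {"custody"}),
]

chain = DomainDetectionStage(next_stage=KeywordStage())
result = chain.handle(set("protection order for the victim".split()), faqs)
```

New stages can be plugged in by subclassing `SearchStage` and linking them through `next_stage`, which mirrors the "plugged searching stages" named on the slide.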

44 Expert Knowledge Retrieval: Ontology Semantic Similarity: Main steps
[Diagram labels: Ontology Linking, NLP, NL query, POS list (lemmas), Semantic Distance Calculation, Semantic distance between queries, Term Coverage Calculation between queries, Best match of stored queries]

45 Expert Knowledge Retrieval: Semantic Similarity
The semantic distance is based on the weighted navigation distance between terms in the ontology.
Navigation through the ontology means that one moves from one concept to another concept via one of its relations or attributes (Is a, Follows, Actor, etc.).
The task of associating distance costs is domain specific and needs to be performed by a legal expert.
[Diagram labels: Ontology, Accuse, Actions, Follow, Denounce, Mother, Son, Mother]
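One way to read "weighted navigation distance" is as a shortest path over the ontology graph, where each relation type carries a cost. The sketch below uses Dijkstra's algorithm for this; that choice, the relation costs, the edges between the concepts and the decision to navigate relations in both directions are all assumptions made for illustration, since the slides only say that a legal expert assigns the costs.

```python
import heapq
import math

# Hypothetical relation costs; in the system these would be set by a legal expert.
RELATION_COST = {"is_a": 1.0, "follows": 2.0, "actor": 3.0}

# Hypothetical ontology fragment: (concept, relation, concept).
EDGES = [
    ("Accuse", "is_a", "Action"),
    ("Denounce", "is_a", "Action"),
    ("Accuse", "follows", "Denounce"),
    ("Mother", "actor", "Denounce"),
]


def build_graph(edges):
    """Turn the triples into an adjacency list with relation costs."""
    graph = {}
    for a, relation, b in edges:
        cost = RELATION_COST[relation]
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))  # assumption: navigable both ways
    return graph


def semantic_distance(graph, source, target):
    """Cheapest weighted path between two concepts; inf if none exists."""
    if source not in graph or target not in graph:
        return math.inf
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, cost in graph[node]:
            new_dist = dist + cost
            if new_dist < best.get(neighbour, math.inf):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return math.inf


graph = build_graph(EDGES)
d_accuse_denounce = semantic_distance(graph, "Accuse", "Denounce")
d_mother_accuse = semantic_distance(graph, "Mother", "Accuse")
d_son_mother = semantic_distance(graph, "Son", "Mother")  # "Son" is not linked here
```

With these made-up costs, cheap relations such as `is_a` pull concepts together, while a concept that is not linked to the graph ends up infinitely far from everything else.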

46 Conclusions
- Decision support system for inexperienced judges
- Using Semantic Web technology for handling knowledge
- Provide knowledge for the decision-making process
- Capture knowledge from experts
- Share knowledge among all users
- Extended understanding capacities
- Background knowledge: Professional Legal Ontology
- Decision explanation
- Improved knowledge acquisition

47 Expert Knowledge Retrieval: Term Coverage
Terms of the input question are filtered by their part-of-speech category: nouns, verbs, adjectives, and adverbs.
Each term is linked to the ontology if possible.
The algorithm constructs a semantic path from each input term to the terms of the stored query.
Terms (of the user question) that are linked to the ontology:
- If no corresponding term can be found in the stored question (by ontology navigation) → the semantic distance is infinitely large.
Terms that cannot be linked to the ontology:
- If they have a corresponding term in the stored question (same lemma) → the distance is zero.
- If there is no corresponding lemma in the stored question → the distance is infinite.
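The distance rules above can be written down as a small function. This is a sketch under assumptions: the `ONTOLOGY_TERMS` set and the `ONTOLOGY_DISTANCE` table are hypothetical stand-ins for the real ontology, a linked term that also occurs in the stored question is given distance zero, and coverage is taken to be the share of query terms with a finite distance, a formula the slides do not spell out.

```python
import math

# Hypothetical lemmas that can be linked to the ontology.
ONTOLOGY_TERMS = {"accuse", "denounce", "mother"}

# Hypothetical precomputed navigation distances between linked lemmas.
ONTOLOGY_DISTANCE = {frozenset(["accuse", "denounce"]): 2.0}


def term_distance(term, stored_terms):
    """Distance from one query lemma to the closest lemma of a stored question."""
    if term in stored_terms:
        return 0.0  # same lemma on both sides
    if term not in ONTOLOGY_TERMS:
        return math.inf  # not linked and no identical lemma in the stored question
    # Linked term: look for a path to any linked term of the stored question.
    distances = [
        ONTOLOGY_DISTANCE.get(frozenset([term, stored]), math.inf)
        for stored in stored_terms
        if stored in ONTOLOGY_TERMS
    ]
    return min(distances, default=math.inf)


def term_coverage(query_terms, stored_terms):
    """Assumed definition: share of query terms with a finite distance."""
    if not query_terms:
        return 0.0
    covered = sum(1 for t in query_terms if math.isfinite(term_distance(t, stored_terms)))
    return covered / len(query_terms)


query = ["accuse", "mother", "yesterday"]
stored = {"denounce", "yesterday"}
distances = {t: term_distance(t, stored) for t in query}
coverage = term_coverage(query, stored)
```

Here "accuse" reaches "denounce" through the ontology, "yesterday" matches by lemma, and "mother" has no path to any stored term, so two of the three query terms are covered.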

