1
Advanced Information Systems Laboratory http://iaaa.cps.unizar.es
Department of Computer Science and Systems Engineering
1st Workshop of COST Action C21: "Ontologies for Urban Development: Interfacing Urban Information Systems"
Building an Address Gazetteer on top of an Urban Network Ontology
J.Nogueras-Iso, F.J.López, J.Lacasta, F.J.Zarazaga-Soria, P.R.Muro-Medrano
Geneva, 6-7 November 2006
2
2 Outline
1. Introduction
2. A typical use-case: IDEZar
3. Ontology building using a manual mapping
4. Ontology building using an automated approach
5. Conclusions
3
3 1. Introduction
The increasing relevance of geographic information (GI) for decision-making and resource management in diverse areas has promoted the creation of Spatial Data Infrastructures (SDI).
SDI: a coordinated approach to the technology, policies, standards, and human resources necessary for the effective acquisition, management, distribution and utilization of GI at different organization levels, involving both public and private institutions.
Gazetteer Service
  A typical component of an SDI
  Directory of instances of a class or classes of features containing some information regarding position
  Looks up geographic feature locations based on geographic identifiers
4
4 Address Gazetteer Service
In SDIs for local administrations such as a city council, address gazetteer services are among the most important services that councils must offer to their citizens.
An Address Gazetteer Service is a gazetteer specialized in Urban Network Features (addresses).
The councils are responsible for the management of urban networks, and these networks are used as reference information for other services at national level, such as cadaster or census services.
5
5 Creation of the contents of a gazetteer
It usually requires combining multiple repositories
  The same feature (concept) is stored in different repositories, each of them contributing a different piece of attribute information
Typical problems of heterogeneity
  Different data models (roles, granularity) and encodings
Our proposal to deal with heterogeneity in this context:
  Build an urban network ontology upon existing feature type taxonomies
6
6 2. A typical use-case: IDEZar
The IDEZar Project is the result of a collaboration agreement signed in March 2004 between the City Council and the University of Zaragoza.
Zaragoza is a medium-sized city (some 650000 inhabitants) in the northeast of Spain (capital of Aragón), growing fast in extension and population. The municipality covers about 1000 km2 and includes several towns.
Objective: development of a local SDI for Zaragoza
  To facilitate, increase and coordinate the use of spatial data by the Council
  To develop applications for the citizens and to provide them with access to public sector information
7
7 IDEZar Service Architecture
http://www.zaragoza.es/idezar/
IDEZar (Local SDI)
  > Catalog
  > Urban-Thematic: public services (libraries, police stations...), private services (pharmacies, parkings...)
  > Base: street maps
  > Environment-Thematic: Agenda 21, protected areas...
  > Street names
  > Arriving at Zaragoza
IDEE (National SDI)
  > IDEE-Nomenclátor: toponyms
  > IDEE-Base: base map up to 1:25000 of Spain
IDEAr (Aragón – Regional SDI)
  > Base: orthoimages
GeoPortal: Street Map and Gazetteer
8
8 Address related repositories
Multiple repositories
Not very different models
  Feature = name + type + additional info (location, range, …)
But different taxonomies for urban network feature types
Not especially synchronized
[Diagram: organizations and the feature type taxonomy each one uses: Zaragoza City Council Informatics Office (AYTO), National Statistics Institute (TVIAN), National Cadaster Office (SIGLA), Tax Office (SIGLA), Urban Planning Office (AYTO, SIGLA), IDEZar (AYTO), Statistics Office (TVIAN). Data exchanged between them: electoral census, inhabitant census, property census, amends (streets, addresses), site development updates, town planning updates, addresses updates, street names, street types, addresses, addresses ranges, maps]
9
9 Address related repositories
Statistics Office repository
  Inhabitant/poll census, exchanges from/to the National Statistics Institute
  TVIAN (Tipo de Vía Normalizada): standardized network feature types of the National Statistics Institute
10
10 Address related repositories
Cadaster Office repository
  Land/Tax management, exchanges from/to the National Cadaster Office
  SIGLA: network feature types of the Cadaster office
11
11 Address related repositories
Informatics Office repository
  Central repository used for the assignment of new street names
  AYTO: network feature types of the council
12
12 Gazetteer content creation
Why do we need to combine the 3 repositories?
  Not all features are in all 3 repositories
  Attribute information is distributed across the different repositories
13
13 Gazetteer content creation II
Problems found while combining
  Matching cannot be based solely on feature names
    2 features may differ in typology but not in name (Spain square vs Spain avenue)
  Which is the most appropriate feature type taxonomy for the gazetteer contents?
Solution proposed: define an urban network ontology
  An ontology explicitly defines the concepts of a domain and the relations between them
  This ontology will provide a unified model of the feature types that can be found in this domain
  It requires making the necessary mappings to the particular taxonomies used in the different council offices or external organizations
14
14 How to build up the ontology
The construction of ontologies upon existing vocabularies is a classical and widely used approach
The underlying problem (ontology alignment)
  How to find the relationships that hold between the entities represented in different taxonomies
Two approaches for the ontology construction
  Manual mapping approach
  Automated approach
[Diagram: the TVIAN, AYTO and SIGLA taxonomies]
15
15 3. Manual Mapping approach
Matching of terms (names + acronyms) between the different taxonomies
Difficulties: lack of semantic descriptions
Categories of matches
  Exact match
  Partial match: one concept is broader or narrower
  No match
  Provisional match: taxonomy errors (homonyms) imply erroneous matches
[Figure: concepts and their acronyms in the SIGLA (Cadaster) and AYTO (City Council) taxonomies. Acronyms shown: "PZ", "PL", "CN", "CM", "CL", "CLP", "CLTP", "AN". Concepts shown: square, residential development, minor road, country house (south of Spain), street, pedestrian street, pedestrian street segment]
[Diagram: the TVIAN, AYTO and SIGLA taxonomies]
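The match categories above could be recorded in a small mapping table. The following is a minimal sketch, not the authors' tooling: the `MatchKind` and `Mapping` names and the `SIGLA:`/`AYTO:` prefixes are assumptions made for illustration, and the sample entries reuse only the square and street examples mentioned elsewhere in these slides.

```python
from dataclasses import dataclass
from enum import Enum


class MatchKind(Enum):
    EXACT = "exact"
    BROADER = "broader"          # partial match: source is broader than target
    NARROWER = "narrower"        # partial match: source is narrower than target
    NONE = "none"
    PROVISIONAL = "provisional"  # suspected taxonomy error (e.g. homonym)


@dataclass(frozen=True)
class Mapping:
    source: str  # hypothetical "TAXONOMY:ACRONYM" key
    target: str
    kind: MatchKind


# Illustrative entries only, taken from examples given in the slides.
MAPPINGS = [
    Mapping("SIGLA:PZ", "AYTO:PL", MatchKind.EXACT),     # square
    Mapping("SIGLA:CL", "AYTO:CL", MatchKind.BROADER),   # street vs traffic-allowed street
    Mapping("SIGLA:CL", "AYTO:CLP", MatchKind.BROADER),  # street vs pedestrianized street
    Mapping("SIGLA:CL", "AYTO:AN", MatchKind.BROADER),   # street vs carfree-designed street
]


def targets_for(source, mappings=MAPPINGS):
    """Return (target, kind) pairs recorded for one source acronym."""
    return [(m.target, m.kind) for m in mappings if m.source == source]
```

A table like this has to be filled in and reviewed by hand for every pair of taxonomies, which is where the cost of the manual approach comes from.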
16
16 A more flexible approach
Previous approach
  Too time-consuming and hard to scale
Improvement
  Use a well-established, shared common core ontology and make mappings between the distinct sources and this common core
New experiment: use of the URBISOC thesaurus
  A thesaurus focused on Spanish terminology for Town Planning, developed by the CINDOC/CSIC institute (Centre for Scientific Information and Documentation / Spanish National Research Council)
[Diagram: the TVIAN, AYTO and SIGLA taxonomies and URBISOC]
17
17 A more flexible approach II
Use of the Towntology ontology editor
  Focused on ontology construction
  Storage of concepts with several definitions that are in a process of selection and characterization
Although it improves scalability, the approach is still time-consuming and error-prone
18
18 4. Ontology building using an automated approach
Why?
  Manual mappings are time-consuming
  Some mappings may not be successful because content creators have not assigned the correct feature type
Technique proposed
  Formal Concept Analysis (FCA) (1980, Wille & Ganter …)
  It enables the extraction of a hierarchy of concepts from the feature instances contained in the source repositories
[Diagram: the TVIAN, AYTO and SIGLA taxonomies and the generated ontology]
19
19 Basics of FCA
Definition of formal contexts: a triple $(G, M, I)$
  $G$: objects
  $M$: attributes
  $I$: binary relation between $G$ and $M$ (incidence matrix)
It is possible to extract formal concepts
  Given $A \subseteq G$ and $B \subseteq M$, a pair $(A, B)$ is a formal concept if and only if
    the set of all attributes shared by the objects in $A$ is identical with $B$, and
    $A$ is also the set of all the objects that have in common the attributes in $B$
Additionally, it is possible to establish a subconcept-superconcept relation
  $(A_1, B_1) \le (A_2, B_2) \iff A_1 \subseteq A_2 \ (\iff B_2 \subseteq B_1)$
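The definitions above can be checked on a toy example. This is a minimal sketch, assuming the formal context is stored as a dictionary from object to attribute set; the object names and the three-object context are hypothetical, and the attribute labels only imitate the AYTO/SIGLA codes used later in the slides.

```python
def common_attributes(objects, context):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set().union(*context.values())
    for g in objects:
        shared &= context[g]
    return shared


def common_objects(attributes, context):
    """Objects that have every attribute in `attributes`."""
    wanted = set(attributes)
    return {g for g, attrs in context.items() if wanted <= attrs}


def is_formal_concept(A, B, context):
    """(A, B) is a formal concept iff A' == B and B' == A."""
    return (common_attributes(A, context) == set(B)
            and common_objects(B, context) == set(A))


def is_subconcept(c1, c2):
    """(A1, B1) <= (A2, B2) iff A1 is a subset of A2."""
    return set(c1[0]) <= set(c2[0])


# Hypothetical toy context: object -> set of boolean attributes it has.
context = {
    "street_1": {"SIGLA_CL", "AYTO_CL"},
    "street_2": {"SIGLA_CL", "AYTO_CLP"},
    "square_1": {"SIGLA_PZ", "AYTO_PL"},
}

street = ({"street_1", "street_2"}, {"SIGLA_CL"})
traffic_street = ({"street_1"}, {"SIGLA_CL", "AYTO_CL"})
```

In this toy context both `street` and `traffic_street` are formal concepts and `traffic_street` is a subconcept of `street`, whereas `({"street_1"}, {"SIGLA_CL"})` is not a concept because `street_1` also carries `AYTO_CL`.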
20
20 Applying FCA
How to obtain a unique repository of instances, i.e. the formal context required by FCA?
Traditional datalinking has been applied to the feature instances contained in the different databases
  Based on the analysis of the lexical and spatial similarities of feature attributes
Transform the datalinking matrix into the incidence matrix
  Each checked cell (match of source features) generates an object/instance in the incidence matrix
  The columns correspond to the transformation of urban network feature type codes (e.g., AYTO CODE, SIGLA CODE) into proper attributes with boolean values
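The transformation into an incidence matrix might look like the sketch below. It assumes the datalinking step has already produced a list of matched feature pairs; the tuple layout `(ayto_id, ayto_code, sigla_id, sigla_code)`, the identifiers and the `AYTO_`/`SIGLA_` column naming are assumptions for illustration, not the authors' data model.

```python
def build_incidence(links):
    """Turn matched feature pairs into (objects, attributes, boolean rows).

    Each link is a hypothetical tuple (ayto_id, ayto_code, sigla_id, sigla_code)
    describing one checked cell of the datalinking matrix.
    """
    links = list(links)
    # One boolean column per feature type code found in either taxonomy.
    attributes = sorted(
        {f"AYTO_{a}" for _, a, _, _ in links}
        | {f"SIGLA_{s}" for _, _, _, s in links}
    )
    objects, rows = [], []
    for ayto_id, ayto_code, sigla_id, sigla_code in links:
        objects.append((ayto_id, sigla_id))  # one object per matched pair
        present = {f"AYTO_{ayto_code}", f"SIGLA_{sigla_code}"}
        rows.append([attr in present for attr in attributes])
    return objects, attributes, rows


# Hypothetical matches between council and cadaster features.
links = [
    (1, "PL", 101, "PZ"),
    (2, "CL", 102, "CL"),
    (3, "CLP", 103, "CL"),
]
objects, attributes, rows = build_incidence(links)
```

Each row then has exactly two true cells, one for the AYTO code and one for the SIGLA code of the matched pair.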
21
21 [Figure: the datalinking matrix (2718 features with 18 AYTO codes against 4318 features with 35 SIGLA codes) is transformed into the incidence matrix by replacing features with their type codes]
22
22 Applying FCA
Obtain the concept lattice from the incidence matrix
  NEXT CLOSED SET algorithm (Ganter 87)
[Figure: FCA turns the incidence matrix into a concept lattice, shown with attributes only. Between the supremum (least common superconcept) and the infimum (greatest common subconcept), the labelled nodes include: AYTO_PL SIGLA_PZ (square); SIGLA_AV (avenue); AYTO_AV SIGLA_AV (traffic allowed avenue); SIGLA_AV AYTO_AVP (pedestrian avenue); SIGLA_CL (street); SIGLA_CL AYTO_CL (traffic allowed street); SIGLA_CL AYTO_CLP (pedestrianized street); SIGLA_CL AYTO_AN (carfree designed street); …]
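The enumeration of concepts can be sketched as follows. This follows a textbook formulation of Ganter's Next Closed Set (Next Closure) procedure rather than the authors' implementation, and the three-object context is a hypothetical toy example.

```python
def make_closure(context, attributes):
    """Return the closure operator B -> B'' on attribute sets."""
    def closure(B):
        extent = [attrs for attrs in context.values() if B <= attrs]
        result = set(attributes)  # no matching object: closure is all attributes
        for attrs in extent:
            result &= attrs
        return result
    return closure


def next_closure(A, attributes, closure):
    """Return the closed set following A in lectic order, or None if A is last."""
    position = {m: i for i, m in enumerate(attributes)}
    A = set(A)
    for i in reversed(range(len(attributes))):
        m = attributes[i]
        if m in A:
            A.discard(m)
        else:
            B = closure(A | {m})
            # Accept B only if it adds no attribute that comes before m.
            if all(position[x] >= i for x in B - A):
                return B
    return None


def all_concepts(context):
    """Enumerate every formal concept as an (extent, intent) pair."""
    attributes = sorted(set().union(*context.values()))
    closure = make_closure(context, attributes)
    concepts = []
    intent = closure(set())
    while intent is not None:
        extent = {g for g, attrs in context.items() if intent <= attrs}
        concepts.append((extent, intent))
        intent = next_closure(intent, attributes, closure)
    return concepts


# Hypothetical toy context: matched feature -> type-code attributes.
context = {
    "plaza": {"AYTO_PL", "SIGLA_PZ"},
    "calle": {"AYTO_CL", "SIGLA_CL"},
    "peatonal": {"AYTO_CLP", "SIGLA_CL"},
}
concepts = all_concepts(context)
intents = [intent for _, intent in concepts]
```

In the resulting list, the concept with the empty intent plays the role of the supremum (it contains every object) and the concept whose intent holds all attributes plays the role of the infimum.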
23
23 Results
Experiment: combining the COUNCIL_FEATURE and CADASTER_FEATURE databases
A concept lattice of 36 concepts from the original 53 concepts
Identification of equivalent concepts in both taxonomies, e.g., square (PL in AYTO and PZ in SIGLA)
And also subconcept-superconcept relations, e.g., identification of street as a broader concept in SIGLA (CL), which has narrower concepts in AYTO:
  traffic-allowed streets (CL)
  pedestrianized streets (CLP)
  carfree-designed streets (AN)
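Equivalences and broader/narrower relations of this kind can be read off the data by comparing which objects carry each code. The sketch below is a shortcut that compares attribute extents directly instead of traversing the full lattice; the function names and the toy context are assumptions for illustration.

```python
def attribute_extents(context):
    """Map each attribute to the set of objects that carry it."""
    extents = {}
    for obj, attrs in context.items():
        for a in attrs:
            extents.setdefault(a, set()).add(obj)
    return extents


def relate_codes(context):
    """Return (equivalent, broader) lists of attribute pairs.

    Two codes are equivalent when they cover exactly the same objects;
    (a, b) is in `broader` when a covers a strict superset of b's objects.
    """
    extents = attribute_extents(context)
    codes = sorted(extents)
    equivalent, broader = [], []
    for a in codes:
        for b in codes:
            if a < b and extents[a] == extents[b]:
                equivalent.append((a, b))
            elif a != b and extents[a] > extents[b]:
                broader.append((a, b))
    return equivalent, broader


# Hypothetical toy context: matched feature -> type-code attributes.
context = {
    "plaza": {"AYTO_PL", "SIGLA_PZ"},
    "calle": {"AYTO_CL", "SIGLA_CL"},
    "peatonal": {"AYTO_CLP", "SIGLA_CL"},
}
equivalent, broader = relate_codes(context)
```

On this toy data the square codes come out as equivalent and `SIGLA_CL` comes out as broader than both `AYTO_CL` and `AYTO_CLP`, mirroring the kind of result reported on the slide.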
24
24 5. Conclusions
The FCA approach seems to be more flexible
  Dynamic building of the ontology (at least, a draft)
  We don't need to define the concepts, we just need to observe the data that exists
We have created a domain-specific ontology that facilitates the interoperability (synchronization, update and merge) of the separate repositories
Future lines
  Improve the efficiency of the method
  Enrich the generated concepts with commonalities found in other feature attributes of the instances (e.g., geometry, perimeter, area)
  Apply to other domains
    Hydrology: NMA vs Water Agency repositories
25
25 Advanced Information Systems Laboratory http://iaaa.cps.unizar.es