OFFICIAL AND CROWDSOURCED GEOSPATIAL DATA INTEGRATION Searching solutions to improve the processes in cartography updating By Jimena Martínez Supervisors:

Presentation transcript:

OFFICIAL AND CROWDSOURCED GEOSPATIAL DATA INTEGRATION Searching solutions to improve the processes in cartography updating By Jimena Martínez Supervisors: Antonio Vázquez and Marianne de Vries

Table of contents
- Background
- Problems
- The idea
- The steps to develop the idea, and an example to show it

Background
Product | Scope | Cartography units | Scale | Updating cycle | Budget
BCN200 | National (Spain) | Provinces | 1/ | years | €
BTN25 | National (Spain) | Sheets | 1/ | years | €
BTA5 | Local (Spanish provinces) | Sheets | 1/ | years | €
MGCP | International (Africa, Middle East) | Cells (208 Spain: 6 countries) | 1/ | years | Spanish budget: €

Problems
1. Why is official cartography never sufficiently up to date?
Timeline labels (Dec, Feb, May 2011, Dec): satellite/aerial image collection date; 1st real change; update process; 2nd real change; release date (2011 version), when the official data reflects the 1st change.

Problems
2. Why is the updating process so long and expensive?
Traditional updating process:
- Vector cartography from last year.
- A set of data sources against which to compare the cartography (images, maps, raster, vector).
- The whole cartography unit is reviewed.
- Too much time goes to reviewing, not much to editing features.

Problems
2. Why is the updating process so long and expensive?
Madrid case (1/200k):
- Time to update: 4 weeks, 1 person
- Percentage of features edited: 30%
- Time to edit these features: 1.5 weeks
Would it be possible to save the other 3.5 weeks?

Problems
1. Why is official cartography never sufficiently up to date?
- The traditional process is based on different data sources.
- The data sources have different collection dates.
2. Why is the updating process so long and expensive?
- The whole cartography must be reviewed against different data sources to detect changes.
As a result:
- Long process
- Expensive process
- The result is not always useful (if highly up-to-date cartography is needed)

The idea
- To develop a general methodology to decide whether crowdsourced (OpenStreetMap) and official geodata can be integrated, in order to use OSM to improve the official cartography updating process.
- A system that finds where the official dataset needs to be updated, and which type of update each feature needs, without reviewing the whole cartography unit.
- Saving costs and obtaining more up-to-date cartography.

Data sources in the updating process
- OSM data: better updated features (not always); vector format; not complete; not homogeneous.
- Official data.
- OSM to indicate where to update.

The idea: differences in updating processes
- NMAs' official data: the government (NMA) produces MAP v.1; tenders (companies) produce MAP v.2 through updating processes. Time frame: months/years.
- Crowdsourced data (OSM): users/NMAs/companies produce MAP v.1…v.n through updating and production processes. Time frame: hours/days.

The idea: differences in updating processes
Timeline labels (Dec, Jan. 2011, Feb, May 2011, June 2011, Dec): satellite/aerial image collection date; 1st real change; OSM update (OSM reflects 1st change); 2nd real change; OSM update (OSM reflects 2nd change); update process; release date (2011 version), when the official data reflects the 1st change.

The idea: which dataset is "better"? (OSM data vs. official data)
- Some studies (Haklay, 2008; Zielstra & Zipf, 2010) take the official dataset as the "truth" against which to compare OSM. As a result, OSM is not 100% complete.
- But what follows from that?
- The desired result will be: differences and types of updates.

The idea: questions to answer (AIM 1, AIM 2, AIM 3)
HOW to integrate OSM and official data?
- Matching data models in a reference semantic supra-model (INSPIRE? ontologies?)
- Quality indicators (traditional and crowd quality parameters)
WHAT features from OSM?
- OSM not as features to take, but as indicators to use.
- If not useful, not used: types of updates.
WHY OSM?
- Amount of data. Up-to-date data. Accurate data (Linus's Law: "Given enough eyeballs, all bugs are shallow").
- Comparative studies.

The idea: the proposed system
- INPUT specifications (specifications, web): official dataset (feature class 1, feature class 2, ..., feature class n) and OSM dataset (feature class 1, feature class 2, ..., feature class n).
- Matching process (feature classes filter) against a reference semantic model (INSPIRE?). Candidates: feature class 1 (50), feature class 2 (80), ..., feature class n (N).
- QC and QA (features filter), based on ISO and crowd quality: feature class 1 (30), ..., feature class n (N-M).
- Updating process: "updating gaps" and types of updates.
- VGI teams / online updating: update OSM.

The steps to reach the goal, and an example to show them
1. Making the matching between data models and features.
2. Studying quality parameters to decide which features could be used.
3. Proposing a new updating process based on flags and types of updates.

1st step: making the matching. Comparing data models
Concept | NMA data model | OSM data model
Format | Database, shp | XML (.osm)
(Geometric) primitives | Node, Arc, Face | Node, Way, Tag, Relations
Feature class | Table, file | Primary tag (key)
Feature (each object) | Row | Primary tag (value)
Attribute | Column | Tag (key)
Values (domains) | Cells | Tag (value)
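
The correspondence in the table can be sketched in a few lines of Python. This is only an illustration: the list of primary keys and the sample tag values are hypothetical, since the slide does not enumerate them. The function splits an OSM tag dictionary into the parts that play the role of feature class, feature type and attribute columns in the NMA model.

```python
# Hypothetical list of "primary" OSM keys; the slide does not enumerate them.
PRIMARY_KEYS = ("highway", "building")


def split_osm_tags(tags):
    """Split an OSM tag dict into (feature_class, feature_type, attributes).

    feature_class <- primary tag key   (plays the role of the NMA table/file)
    feature_type  <- primary tag value
    attributes    <- remaining tags    (play the role of columns and cells)
    """
    for key in PRIMARY_KEYS:
        if key in tags:
            attributes = {k: v for k, v in tags.items() if k != key}
            return key, tags[key], attributes
    # No primary key found: everything is treated as a plain attribute.
    return None, None, dict(tags)


# Illustrative tag values only, not taken from the presentation.
way_tags = {"highway": "motorway", "ref": "A-6", "toll": "no"}
print(split_osm_tags(way_tags))
```

A real matching would need a much longer key list and rules for objects carrying several primary keys.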

1st step: making the matching. An approach (based on H. Uitermark)
Real world: A, B, C, D
1. Official dataset: A1: building of interest; C1: motorway; D1: toll motorway
2. OpenStreetMap: A2: building, church; B2: building, school; C2, D2: highway, motorway
Candidates: {[(A1,A2), (A1,B2)], [(C1,C2), (C1,D2)], [(D1,C2), (D1,D2)]}
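
One way to read the candidate set above is that two features become candidates when their classes map to the same class in a shared reference model. The sketch below assumes exactly that; the two mapping tables are hypothetical stand-ins for the reference semantic model and are not taken from the presentation. Under these assumptions it yields the same candidate groups as the slide.

```python
# Assumed mappings of each data model onto a shared reference class.
# They stand in for the reference semantic model (INSPIRE or an ontology).
OFFICIAL_TO_REF = {
    "building of interest": "building",
    "motorway": "motorway",
    "toll motorway": "motorway",
}
OSM_TO_REF = {
    ("building", "church"): "building",
    ("building", "school"): "building",
    ("highway", "motorway"): "motorway",
}

# Features from the slide: identifier -> class (official) or (key, value) tag (OSM).
official = {"A1": "building of interest", "C1": "motorway", "D1": "toll motorway"}
osm = {
    "A2": ("building", "church"),
    "B2": ("building", "school"),
    "C2": ("highway", "motorway"),
    "D2": ("highway", "motorway"),
}


def candidate_pairs(official, osm):
    """Return one list of (official_id, osm_id) candidates per official feature."""
    groups = []
    for off_id, off_class in official.items():
        ref = OFFICIAL_TO_REF[off_class]
        pairs = [(off_id, osm_id) for osm_id, tag in osm.items()
                 if OSM_TO_REF.get(tag) == ref]
        groups.append(pairs)
    return groups


print(candidate_pairs(official, osm))
```

This is a purely semantic filter; a later geometric or quality check would still have to pick the true match within each group.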

1st step: making the matching. The example: motorways (BCN Spain - OSM)

2nd step: quality study. Studying the quality: traditional parameters
- van Oort (2006): completeness; logical consistency; positional accuracy; attribute accuracy; temporal quality; semantic accuracy; usage, purpose and constraints; lineage; variation in quality; meta-quality; resolution (≈ scale).
- Haklay (2008): completeness; logical consistency; positional accuracy; attribute accuracy; temporal quality; semantic accuracy; usage, purpose and constraints; lineage.
- ISO (2011): completeness; logical consistency; positional accuracy; thematic accuracy; temporal quality; usability element; lineage (19115).

2nd step: quality study. Studying the quality: (some) crowd quality parameters
- Maué (2007), PGIS: reputation of contributors; information asymmetry.
- Haklay (2008): longevity of engagement; number of edits on a feature; number of contributors on a feature; number of bugs fixed.
- van Exel (2010): user quality (local knowledge, experience, recognition); feature-related quality (lineage, positional accuracy, semantic accuracy).
- Others: lineage; homogeneity in quality; time between edits on a feature.
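
Some of these crowd quality parameters are simple counts over a feature's edit history. The following sketch assumes a hypothetical history format, a list of (contributor, ISO timestamp) pairs, and computes three of them: number of edits, number of contributors, and mean time between edits. Neither the format nor the sample values come from the presentation.

```python
from datetime import datetime

# Hypothetical edit history of one feature: (contributor, ISO timestamp).
history = [
    ("user_a", "2010-12-01T10:00:00"),
    ("user_b", "2011-01-15T09:30:00"),
    ("user_a", "2011-02-20T18:45:00"),
]


def crowd_indicators(history):
    """Compute simple per-feature crowd quality indicators."""
    times = sorted(datetime.fromisoformat(ts) for _, ts in history)
    # Gaps between consecutive edits, in days.
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]
    return {
        "edits": len(history),
        "contributors": len({user for user, _ in history}),
        "mean_days_between_edits": sum(gaps) / len(gaps) if gaps else None,
    }


print(crowd_indicators(history))
```

How such counts should be weighted or thresholded to accept or reject a feature is left open here, as it is on the slide.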

2nd step: quality study. Some methods to measure traditional quality (positional accuracy)
- Perkal (1966), positional accuracy: interpretation of the epsilon band.
- Goodchild and Hunter (1997), positional accuracy: complete datasets are needed; a higher quality dataset is needed.
- Haklay (2008), positional accuracy (OS-OSM): complete datasets are needed, so he completed OSM; OS is assumed to be of higher quality than OSM.
- BCN Spain - OSM: no complete data (and nobody is going to complete it), neither BCN nor OSM; it is not known which dataset is better (OSM to update BCN).
Buffer width (blue: lower quality; orange: higher quality):
- Until blue is totally inside orange.
- Until blue is % inside orange.
- Could be impossible to achieve 90-95%.
- Two buffers; compare the overlap areas.

2nd step: quality study. Example: measures of positional accuracy on motorways (BCN Spain, OSM)

2nd step: quality study. Example: measures of positional accuracy on motorways
Plot: % of BCN roads within the OSM buffer, against buffer width (m).
- A 500 m buffer around OSM is needed to reach 80% of the BCN length within the buffer, which indicates a lack of completeness in the OSM dataset.
- The BCN scale is 1/200k, so the buffer must be ≈ 20 m, which means 73% of the length is within the OSM buffer.

2nd step: quality study. Example: measures of positional accuracy on motorways
Plot: % of OSM roads within the BCN buffer, against buffer width (m).
- A 25 m buffer around BCN is needed to reach 90% of the OSM length within the buffer.
- In this case the method works because all OSM motorways are also in the BCN dataset.
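
The buffer measure used in these two examples (percentage of one dataset's length lying within a buffer around the other) can be approximated without a GIS library. The sketch below is an approximation under stated assumptions, not the procedure used in the presentation: each dataset is a single polyline in projected coordinates (metres), and the tested line is cut into short pieces whose midpoints are checked against the buffer width. The coordinates are invented for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_line(p, line):
    """Distance from point p to the nearest segment of a polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def sample_line(line, step):
    """Yield (midpoint, piece_length) for pieces of at most `step` metres."""
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        length = math.hypot(bx - ax, by - ay)
        n = max(1, math.ceil(length / step))
        for i in range(n):
            t = (i + 0.5) / n
            yield (ax + t * (bx - ax), ay + t * (by - ay)), length / n


def percent_within_buffer(tested, reference, width, step=1.0):
    """Approximate % of `tested` length within `width` metres of `reference`."""
    inside = total = 0.0
    for point, piece in sample_line(tested, step):
        total += piece
        if distance_to_line(point, reference) <= width:
            inside += piece
    return 100.0 * inside / total if total else 0.0


# Invented coordinates: the tested line drifts away from the reference line.
reference = [(0.0, 0.0), (100.0, 0.0)]
tested = [(0.0, 5.0), (50.0, 5.0), (50.0, 40.0), (100.0, 40.0)]
for width in (10.0, 25.0, 50.0):
    print(width, round(percent_within_buffer(tested, reference, width), 1))
```

Swapping the two arguments gives the measure in the opposite direction, which is why the two examples above can produce very different curves when one dataset is incomplete.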

2nd step: quality study. Some methods to measure traditional quality (completeness)
lchairs/ OSL Musical Chairs Algorithm:
- Based on the bounding box of each feature.
- 300 m radius to find candidates to match.
- If the bbox matches, then the street name is compared (additionally, Levenshtein distance for streets).
- A higher quality dataset is needed to compare against (higher quality vs. lower quality).
BCN Spain - OSM:
- Not useful for motorways or long features; useful for streets or polygons.
- A convex hull could be used instead of the bbox.
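
The two tests named on this slide, a bounding-box check followed by a street-name comparison, can be sketched as follows. This is not the OSL Musical Chairs implementation: how the 300 m radius is applied (here, as a tolerance added around one box), the name threshold `max_edits`, and the sample features are all assumptions made for the illustration. Coordinates are taken to be projected, in metres.

```python
def bbox(coords):
    """Bounding box (min_x, min_y, max_x, max_y) of a list of (x, y) points."""
    xs, ys = zip(*coords)
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_match(a, b, tolerance=300.0):
    """True if box b overlaps box a grown by `tolerance` metres on every side."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return (ax0 - tolerance <= bx1 and bx0 <= ax1 + tolerance and
            ay0 - tolerance <= by1 and by0 <= ay1 + tolerance)


def levenshtein(s, t):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]


def is_match(official, osm, max_edits=2):
    """Bbox test first; only if it passes is the street name compared."""
    if not bboxes_match(bbox(official["coords"]), bbox(osm["coords"])):
        return False
    return levenshtein(official["name"].lower(), osm["name"].lower()) <= max_edits


# Invented sample streets.
official_street = {"name": "Calle Mayor", "coords": [(0, 0), (100, 0)]}
osm_street = {"name": "calle mayor", "coords": [(250, 10), (350, 10)]}
print(is_match(official_street, osm_street))
```

For long features such as motorways the bounding box covers so much ground that almost everything matches, which is consistent with the slide's remark that the method suits streets and polygons better.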

2nd step: quality study. Conclusions about traditional quality
- A complete dataset (not a measure of completeness) is needed to measure positional accuracy.
- Which parameter comes first: completeness or positional accuracy?
- Congrats! It has been proved that OSM is not complete.
- This brings me back to the first statement: OSM not as features to take, but as indicators to use.
- It doesn't matter if OSM is not complete.
- A new approach: "updating gaps", which include the lack of completeness of OSM.

3rd step: proposed updating process. Traditional classification of updates
Updates:
- Add
- Delete
- Modify: geometry, attributes

3rd step: proposed updating process. Proposed classification of updates (official data vs. OSM data)
ROAD_G (official) = ROAD_G (OSM)?
- YES: ROAD_ATT (official) = ROAD_ATT (OSM)?
  - YES: no update needed.
  - NO: updating gap, type I (attribute updating).
- NO: ROAD_G (official) = OTHER_G (OSM)?
  - YES: updating gap, type II (classification updating).
  - NO: the feature doesn't exist in OSM. Updating gap, type III (OSM can't be used, but an advisory is given).
The feature doesn't exist in the official dataset: updating gap, type IV (automatic updating from OSM?).
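
The decision tree above translates directly into a small classifier. In this sketch the geometric and attribute comparisons are assumed to have been done elsewhere and arrive as booleans; the function and enum names are hypothetical and chosen only for the illustration.

```python
from enum import Enum


class Gap(Enum):
    NONE = "no update needed"
    TYPE_I = "attribute updating"
    TYPE_II = "classification updating"
    TYPE_III = "not in OSM: OSM cannot be used, advisory only"
    TYPE_IV = "not in official dataset: automatic updating from OSM?"


def classify_official_feature(osm_same_class, osm_other_class, attributes_equal):
    """Classify one official road from the outcome of the comparisons.

    osm_same_class   -- its geometry matches an OSM road
    osm_other_class  -- its geometry matches an OSM feature of another class
    attributes_equal -- attributes agree (only used when osm_same_class is True)
    """
    if osm_same_class:
        return Gap.NONE if attributes_equal else Gap.TYPE_I
    if osm_other_class:
        return Gap.TYPE_II
    return Gap.TYPE_III


def classify_osm_feature(matched_in_official):
    """An OSM road with no official counterpart is a type IV gap."""
    return None if matched_in_official else Gap.TYPE_IV


print(classify_official_feature(True, False, False))   # attributes differ
print(classify_osm_feature(False))                     # missing in official data
```

Running both functions over all matched and unmatched features would give the list of flags on which the proposed updating process relies, so that only flagged features need review.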

The result
Madrid case (1/200k):
- Time to update: 1.5 weeks, 1 person
- Percentage of features edited: 30%
- Time saved: 3.5 weeks
- Costs saved: 40%

Next steps
- Find the best method to compare both datasets and try it on different datasets (based on TQ and CQ).
- Obtain the different types of updating gaps automatically.
- Look for a better way to compare data models (not manually).
- Try an automatic method to resolve the updating gaps based on OSM.

Thank you! Dank je wel! Gracias!