OFFICIAL AND CROWDSOURCED GEOSPATIAL DATA INTEGRATION Searching solutions to improve the processes in cartography updating By Jimena Martínez Supervisors:

1 OFFICIAL AND CROWDSOURCED GEOSPATIAL DATA INTEGRATION
Searching solutions to improve the processes in cartography updating
By Jimena Martínez
Supervisors: Antonio Vázquez and Marianne de Vries

2 Table of contents
Background
Problems
The idea
The steps to develop the idea, and an example to show it

3 Background
Dataset  Scope                                 Cartography units               Scale      Updating cycle  Budget
BCN200   National (Spain)                      Provinces                       1/200,000  2 years         300,000 €
BTN25    National (Spain)                      Sheets                          1/25,000   4 years         3,500,000 €
BTA5     Local (Spanish provinces)             Sheets                          1/5,000    4 years         800,000 €
MGCP     International (Africa, Middle East)   Cells (208 Spain: 6 countries)  1/50,000   4 years         Spanish budget: 27,000,000 €

4 Problems
1. Why is official cartography never sufficiently up to date?
[Timeline figure. Dates on the axis: Dec. 2010, Feb. 2011, May 2011, Dec. 2011. Events shown: 1st real change; satellite/aerial image collection date; update process; 2nd real change; release date (2011 version). Caption: the official data reflect the 1st change.]

5 Problems
2. Why is the updating process so long and expensive?
Traditional updating process:
Vector cartography from last year.
A set of data sources against which to compare the cartography (images, maps, raster, vector).
Reviewing the whole cartography unit.
Too much time spent reviewing, not much time spent editing features.

6 Problems
2. Why is the updating process so long and expensive?
Madrid case (1/200k)
Time to update: 4 weeks, 1 person
Percentage of features edited: 30%
Time to edit these features: 1.5 weeks
Would it be possible to save the other 3.5 weeks?

7 Problems
1. Why is official cartography never sufficiently up to date?
The traditional process is based on different data sources.
The data sources have different dates (collection dates).
2. Why is the updating process so long and expensive?
Reviewing the whole cartography against different data sources is needed to detect changes.
As a result:
Long process
Expensive process
The result is not always useful (if highly up-to-date cartography is needed)

8 The idea
To develop a general methodology for deciding whether crowdsourced (OpenStreetMap, OSM) and official geodata can be integrated, so that OSM can be used to improve the official cartography updating process.
A system that finds where the official dataset needs to be updated, and which type of update each feature needs, without reviewing the whole cartography unit.
Saving costs and obtaining more up-to-date cartography.

9 Data sources in the updating process
Datasets: OSM data and official data.
OSM data:
More up-to-date features (not always)
Vector format
Not complete
Not homogeneous
OSM to indicate where to update.

10 The idea
Differences in updating processes
NMA official data: government (NMA) and tenders (companies); MAP v.1, then MAP v.2; updating processes take months/years.
Crowdsourced data (OSM): users/NMAs/companies; MAP v.1…v.n; updating and production processes take hours/days.

11 The idea
Differences in updating processes
[Timeline figure. Dates on the axis: Dec. 2010, Jan. 2011, Feb. 2011, May 2011, June 2011, Dec. 2011. Events shown: satellite/aerial image collection date; 1st real change; OSM update (OSM reflects 1st change); update process; 2nd real change; OSM update (OSM reflects 2nd change); release date (2011 version). Caption: the official data reflect the 1st change.]

12 The idea
Which dataset is “better”: OSM data or official data?
Some studies (Haklay, 2008; Zielstra & Zipf, 2010) take the official dataset as the “truth” against which to compare OSM. As a result, OSM is not 100% complete.
But what happens with that?
The desired result will be: differences and types of updates.

13 The idea
Questions to answer (AIM 1, AIM 2, AIM 3)
HOW to integrate OSM and official data? Matching data models in a reference semantic supra-model (INSPIRE? ontologies?). Quality indicators (traditional and crowd quality parameters).
WHAT features from OSM? OSM not as features to take, but as indicators to use. If not useful, not used: types of updates.
WHY OSM? Amount of data. Up-to-date data. Data accuracy (Linus's Law: “Given enough eyeballs, all bugs are shallow”). Comparative studies.

14 The idea: the proposed system
[Diagram of the proposed system.
Inputs: official data set (feature class 1, feature class 2, ..., feature class n) with its INPUT specifications; OSM data set (feature class 1, feature class 2, ..., feature class n) with its web specifications; reference semantic model (INSPIRE?).
Matching process (feature classes filter): candidates, e.g. feature class 1 (50), feature class 2 (80), ..., feature class n (N).
QC and QA (features filter), using ISO 19157 and crowd quality: e.g. feature class 1 (30), ..., feature class n (N-M).
Updating process: “updating gaps” and types of updates; VGI teams / online updating; update OSM.]

15 The steps to reach the goal
And an example to show them
1. Matching data models and features.
2. Studying quality parameters to decide which features can be used.
3. Proposing a new updating process based on flags and types of updates.

16 1st step: making the matching
Comparing data models
                        NMA data model    OSM data model
Format                  Database, shp     XML (.osm)
(Geometric) primitives  Node, arc, face   Node, way, tag, relations
Feature class           Table, file       Primary tag (key)
Feature (each object)   Row               Primary tag (value)
Attribute               Column            Tag (key)
Values (domains)        Cells             Tag (value)
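As a rough illustration of the comparison above, the following Python sketch reads one OSM way from a small, made-up .osm fragment and splits its tags the way the slide does: primary tag key as feature class, primary tag value as the feature, remaining tags as attributes. The sample XML, the PRIMARY_KEYS list and the function name way_to_record are assumptions made for this example and do not come from the presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical .osm fragment: one way (a motorway stretch) built from two nodes.
OSM_XML = """<osm version="0.6">
  <node id="1" lat="40.40" lon="-3.70"/>
  <node id="2" lat="40.41" lon="-3.69"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="motorway"/>
    <tag k="ref" v="A-6"/>
  </way>
</osm>"""

# Assumption: tag keys treated as "primary" (feature-class) keys in this sketch.
PRIMARY_KEYS = ("highway", "building")


def way_to_record(way):
    """Split an OSM way's tags into feature class, primary value and attributes."""
    tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
    primary = next((k for k in PRIMARY_KEYS if k in tags), None)
    return {
        "id": way.get("id"),
        "feature_class": primary,              # plays the role of the table/file
        "primary_value": tags.get(primary),    # plays the role of the row's type
        "attributes": {k: v for k, v in tags.items() if k != primary},
        "node_refs": [nd.get("ref") for nd in way.findall("nd")],
    }


root = ET.fromstring(OSM_XML)
records = [way_to_record(w) for w in root.findall("way")]
print(records[0])
```

A record of this shape can be compared with a row of the official table once the feature classes have been matched.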

17 1st step: making the matching
An approach (based on H. Uitermark)
Real-world objects: A, B, C, D
1. Official dataset: A1: building of interest; C1: motorway; D1: toll motorway
2. OpenStreetMap: A2: building, church; B2: building, school; C2, D2: highway, motorway
Candidates: {[(A1,A2), (A1,B2)], [(C1,C2), (C1,D2)], [(D1,C2), (D1,D2)]}
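The candidate set on this slide can be reproduced with a small Python sketch. It assumes that candidates are generated purely from a class-level correspondence between the two data models (no spatial test); the CORRESPONDENCE table and the function name find_candidates are hypothetical and only illustrate the idea.

```python
# Features from the slide: official classes and OSM (key, value) primary tags.
official = {"A1": "building of interest", "C1": "motorway", "D1": "toll motorway"}
osm = {
    "A2": ("building", "church"),
    "B2": ("building", "school"),
    "C2": ("highway", "motorway"),
    "D2": ("highway", "motorway"),
}

# Assumed class correspondence; None means "any value of that key".
CORRESPONDENCE = {
    "building of interest": ("building", None),
    "motorway": ("highway", "motorway"),
    "toll motorway": ("highway", "motorway"),
}


def find_candidates(official, osm, correspondence):
    """Pair every official feature with all OSM features of a corresponding class."""
    result = {}
    for off_id, off_class in official.items():
        key, value = correspondence[off_class]
        result[off_id] = [
            (off_id, osm_id)
            for osm_id, (k, v) in osm.items()
            if k == key and (value is None or v == value)
        ]
    return result


candidates = find_candidates(official, osm, CORRESPONDENCE)
for pairs in candidates.values():
    print(pairs)
```

Each official feature ends up with more than one candidate, which is why a later quality and geometry check is needed to pick the actual match.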

18 1st step: making the matching
The example: motorways (BCN Spain-OSM)

19 2nd step: quality study
Studying the quality: traditional parameters
van Oort (2006): completeness; logical consistency; positional accuracy; attribute accuracy; temporal quality; semantic accuracy; usage, purpose and constraints; lineage; variation in quality; meta-quality; resolution (≈ scale).
Haklay (2008): completeness; logical consistency; positional accuracy; attribute accuracy; temporal quality; semantic accuracy; usage, purpose and constraints; lineage.
ISO 19157 (2011): completeness; logical consistency; positional accuracy; thematic accuracy; temporal quality; usability element; lineage (19115).

20 2nd step: quality study
Studying the quality: (some) crowd quality parameters
Maué (2007), PGIS: reputation of contributors; information asymmetry.
Haklay (2008): longevity of engagement; number of edits on a feature; number of contributors on a feature; number of bugs fixed.
van Exel (2010): user quality (local knowledge, experience, recognition); feature-related quality (lineage, positional accuracy, semantic accuracy).
Others: lineage; homogeneity in quality; time between edits on a feature.

21 2nd step: quality study
Some methods to measure traditional quality (positional accuracy)
Perkal (1966): positional accuracy. Interpretation of the epsilon band.
Goodchild and Hunter (1997): positional accuracy. Complete data sets are needed. A higher-quality dataset is needed.
Haklay (2008): positional accuracy (OS-OSM). Complete data sets are needed; he completed OSM. OS is assumed to be of higher quality than OSM.
BCN Spain-OSM: no complete data, neither BCN nor OSM (and nobody is going to complete them). It is not known which data set is better (OSM is used to update BCN).
Buffer widths illustrated, in slide order:
Until blue is totally inside orange.
Until blue is 90-95% inside orange.
Could be impossible to achieve 90-95%.
Two buffers: compare the overlap areas.
[Figure legend: higher quality, lower quality.]

22 2nd step: quality study
Example: measures of positional accuracy on motorways
[Map figure: BCN Spain and OSM motorways.]

23 2nd step: quality study
Example: measures of positional accuracy on motorways
[Chart: buffer width (m) against % of BCN roads within the OSM buffer.]
A 500 m buffer around OSM is needed to reach 80% of the BCN length within the buffer, which indicates a lack of completeness in the OSM dataset.
BCN scale is 1/200k (the buffer must be ≈ 20 m, which means 73% of the length within the OSM buffer).

24 2nd step: quality study
Example: measures of positional accuracy on motorways
[Chart: buffer width (m) against % of OSM roads within the BCN buffer.]
A 25 m buffer around BCN is needed to reach 90% of the OSM length within the buffer.
In this case the method works because every OSM motorway is also in the BCN dataset.
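The buffer measure used in these two examples (share of one dataset's length lying within a buffer of given width around the other) can be sketched in plain Python. This is only an approximation under stated assumptions: instead of building buffer polygons as a GIS would, it cuts the tested line into short pieces and checks the distance from each piece's midpoint to the reference line. The coordinates, the step parameter and all function names are made up for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates, metres)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def distance_to_polyline(p, line):
    """Shortest distance from point p to any segment of the polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def share_within_buffer(tested, reference, width, step=1.0):
    """Approximate share of `tested` length within `width` of `reference`."""
    inside = 0.0
    total = 0.0
    for a, b in zip(tested, tested[1:]):
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        pieces = max(1, math.ceil(length / step))
        piece_len = length / pieces
        for i in range(pieces):
            f = (i + 0.5) / pieces  # midpoint of the i-th piece
            mid = (a[0] + f * (b[0] - a[0]), a[1] + f * (b[1] - a[1]))
            if distance_to_polyline(mid, reference) <= width:
                inside += piece_len
        total += length
    return inside / total if total else 0.0


# Hypothetical lines: the tested road follows the reference, then leaves it.
reference_line = [(0.0, 0.0), (100.0, 0.0)]
tested_line = [(0.0, 10.0), (50.0, 10.0), (50.0, 200.0), (100.0, 200.0)]

for width in (5.0, 20.0, 250.0):
    share = share_within_buffer(tested_line, reference_line, width)
    print(f"buffer {width:5.0f} m: {share:.1%} of tested length inside")
```

Running the measure in both directions (BCN against an OSM buffer, then OSM against a BCN buffer) mirrors the two slides: a low share at a reasonable width in one direction only points to missing features rather than poor positional accuracy.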

25 2nd step: quality study
Some methods to measure traditional quality (completeness)
OSL Musical Chairs algorithm (http://humanleg.org.uk/code/oslmusicalchairs/):
Based on a bounding box (Bbox) on each feature.
300 m radius to find candidates to match.
If the Bbox matches, then the street name is compared; additionally, Levenshtein distance (streets).
A higher-quality data set is needed to compare against.
BCN Spain-OSM:
Not useful for motorways or long features. Useful for streets or polygons.
A convex hull could be used instead of the Bbox.
[Figure legend: higher quality, lower quality.]
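The bounding-box-plus-name idea can be sketched as follows. This is not the OSL Musical Chairs code itself, only a minimal illustration of the two tests named on the slide; the way the 300 m tolerance is applied (growing the reference box on every side), the max_edits threshold, the street data and the function names are all assumptions.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def bbox(points):
    """Bounding box (min x, min y, max x, max y) of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_close(b1, b2, tolerance):
    """True if b2 intersects b1 after b1 is grown by `tolerance` on every side."""
    return not (
        b2[2] < b1[0] - tolerance or b2[0] > b1[2] + tolerance
        or b2[3] < b1[1] - tolerance or b2[1] > b1[3] + tolerance
    )


def completeness(reference, tested, tolerance=300.0, max_edits=2):
    """Share of reference streets with a nearby, similarly named tested street."""
    matched = 0
    for name, points in reference.items():
        ref_box = bbox(points)
        for other_name, other_points in tested.items():
            near = bboxes_close(ref_box, bbox(other_points), tolerance)
            if near and levenshtein(name.lower(), other_name.lower()) <= max_edits:
                matched += 1
                break
    return matched / len(reference)


# Hypothetical streets in planar metres: name -> polyline.
reference_streets = {
    "Calle Mayor": [(0, 0), (100, 0)],
    "Gran Via": [(1000, 1000), (1200, 1000)],
}
tested_streets = {
    "Calle Major": [(10, 5), (90, 5)],          # nearby, one letter off
    "Gran Vía": [(5000, 5000), (5100, 5000)],   # similar name, far away
}

print(completeness(reference_streets, tested_streets))
```

The second street shows why both tests are combined: a similar name alone is not accepted when the boxes are kilometres apart.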

26 2nd step: quality study
Conclusions about traditional quality
A complete data set (not a measure of completeness) is needed to measure positional accuracy.
Which parameter comes first: completeness or positional accuracy?
Congrats! It has been proved that OSM is not complete.
This brings me back to the first statement: OSM not as features to take, but as indicators to use.
It doesn't matter if OSM is not complete.
A new approach: “updating gaps”, which include the lack of completeness of OSM.

27 3rd step: proposing an updating process
Traditional classification of updates
Updates: add; delete; modify (geometry, attributes).

28 3rd step: proposing an updating process
Proposed classification of updates (official data compared with OSM data)
ROAD_G (official) = ROAD_G (OSM)?
  YES: ROAD_ATT (official) = ROAD_ATT (OSM)?
    YES: does not need to be updated.
    NO: updating gap, type I. Attribute updating.
  NO: ROAD_G (official) = OTHER_G (OSM)?
    YES: updating gap, type II. Classification updating.
    NO: does not exist in OSM. Updating gap, type III. OSM can't be used, but the case is flagged.
Does not exist in the official dataset: updating gap, type IV. Automatic updating from OSM?
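The decision tree above can be written as a short Python function. This is a sketch under assumptions: the MatchResult structure, its field names and the returned labels are invented for illustration, and the geometric and attribute comparisons are taken as already computed by an earlier matching step.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MatchResult:
    """Hypothetical outcome of matching one road between the two datasets."""
    # Attributes of the official road; None if the road is absent from official data.
    official_attrs: Optional[Dict[str, str]]
    # Attributes of the OSM road with matching geometry and the same class, if any.
    osm_same_class_attrs: Optional[Dict[str, str]] = None
    # True if the geometry matches an OSM feature of another class.
    osm_other_class_match: bool = False


def classify_update(match: MatchResult) -> str:
    """Return the updating-gap type following the slide's decision tree."""
    if match.official_attrs is None:
        return "type IV: missing in official data"
    if match.osm_same_class_attrs is not None:
        if match.osm_same_class_attrs == match.official_attrs:
            return "no update needed"
        return "type I: attribute updating"
    if match.osm_other_class_match:
        return "type II: classification updating"
    return "type III: missing in OSM"


examples = [
    MatchResult({"ref": "A-6"}, {"ref": "A-6"}),
    MatchResult({"ref": "A-6"}, {"ref": "AP-6"}),
    MatchResult({"ref": "A-6"}, None, osm_other_class_match=True),
    MatchResult({"ref": "A-6"}),
    MatchResult(None, {"ref": "M-50"}),
]
for example in examples:
    print(classify_update(example))
```

Only features that come back with a gap type would be passed to an editor, which is the point of the proposed process: the rest of the cartography unit is not reviewed.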

29 The result
Madrid case (1/200k)
Time to update: 1.5 weeks, 1 person
Percentage of features edited: 30%
Time saved: 3.5 weeks
Costs saved: 40%

30 Next steps
Find the best method to compare both data sets and try it on different data sets (based on traditional quality, TQ, and crowd quality, CQ).
Obtain the different types of updating gaps automatically.
Look for a better (non-manual) way to compare data models.
Try an automatic method to resolve the updating gaps based on OSM.

31 Thank you! Dank je wel! Gracias!
J.MartinezRamos@student.tudelft.nl

