1
Data collection and extraction
MODULE 3: GEOSPATIAL DATA MANAGEMENT
Session 3.5: Implementing the geospatial data management cycle (Part 4): Data collection and extraction
To better understand this session, particularly the data collection section, you are encouraged to read Health GeoLab Collaborative Guidance Document Part Collecting data in the field:
2
Key terms used in this session
Here are the key terms that will be used in this session.
Data: Facts and statistics collected for reference or analysis.
Digitizing: In GIS, the process of converting geographic data from a hardcopy map, a scanned paper map, or a remote sensing image into vector data by tracing the features.
Geographic data: Information describing the location and attributes of things, including their shapes and representation. Geographic data is the composite of spatial data and attribute data.
3
Key terms used in this session
Geospatial data: Also referred to as spatial data. Information about the locations and shapes of geographic features and the relationships between them, usually stored as coordinates and topology.
Statistical data: Also referred to as attribute data. Nonspatial information about a geographic feature, usually stored in a table that can be attached to a geographic object through the use of a unique identifier (ID).
4
Filling identified data gaps
The gaps you identified in the compiled data during the assessment process can be addressed by creating the geographic data needed.
Geographic data = geospatial data + attributes (statistical data)
When it comes to the geospatial data (objects), you can fill the gaps by:
1. Using remote sensors placed on a satellite or in a plane (remote sensing)
2. Processing the images resulting from remote sensing (image processing)
3. Extracting the geographic coordinates of point type objects or the topology of line/polygon objects from 1 or 2 (digitizing)
4. Collecting the geographic coordinates of places in the field using a GNSS-enabled device (surveying)
Methods 3 and 4 are the ones you will most likely be using to address the data gaps.
5
Collecting or extracting geospatial data
This session focuses on the two methods that you will most likely use to fill the identified data gaps: extracting geospatial data (digitizing) and collecting geospatial data in the field (surveying).
6
Collecting or extracting geospatial data
When creating geographic data, it is essential that the geographic objects comply with the defined data set specification:
1. Scale of work: the map scale that corresponds to the needs of the project, as specified in the data set specification
2. Accuracy and precision level for vector format geospatial data: these correspond to the chosen scale of work of the project
3. Accuracy and resolution for raster format geospatial data: as with item 2, these correspond to the chosen scale of work of the project
(Scale, accuracy, precision, and resolution were discussed in Session 3.3.)
7
Collecting or extracting geospatial data
When it comes to attribute data (statistics/information), each attribute must be attached to its corresponding object, preferably through the use of the object's unique ID coming from the master list. Keep this in mind when collecting attribute data.
[Figure labels: Geospatial data, Master list, Attributes; District ID, Stats/Info]
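The link between attributes and objects through the master list ID can be sketched in a few lines of Python. This is only an illustration: the district IDs, names, coordinates, and population figures below are invented, and the function name `join_attributes` is hypothetical.

```python
# Hypothetical master list: unique district ID -> official district name.
master_list = {
    "D001": "North District",
    "D002": "South District",
    "D003": "East District",
}

# Hypothetical geospatial layer: one record per object, keyed by the same ID.
# Coordinates are made up for the example.
geospatial_data = [
    {"district_id": "D001", "centroid": (14.60, 121.00)},
    {"district_id": "D002", "centroid": (14.40, 121.05)},
    {"district_id": "D003", "centroid": (14.50, 121.20)},
]

# Hypothetical attribute (statistics) table collected separately.
attributes = [
    {"district_id": "D001", "population": 12000},
    {"district_id": "D003", "population": 8500},
    {"district_id": "D999", "population": 400},  # ID not in the master list
]


def join_attributes(features, attribute_rows, master, key="district_id"):
    """Attach attribute rows to features through the shared unique ID.

    Returns (joined_features, rejected_rows). Rows whose ID is missing from
    the master list are rejected instead of being silently attached.
    """
    valid_rows = {}
    rejected = []
    for row in attribute_rows:
        if row[key] in master:
            valid_rows[row[key]] = row
        else:
            rejected.append(row)

    joined = []
    for feature in features:
        merged = dict(feature)
        merged["name"] = master.get(feature[key])
        # Features without statistics keep their geometry; values stay absent.
        merged.update(valid_rows.get(feature[key], {}))
        joined.append(merged)
    return joined, rejected


joined, rejected = join_attributes(geospatial_data, attributes, master_list)
for record in joined:
    print(record)
print("Rejected rows:", rejected)
```

Rejecting rows with unknown IDs, rather than guessing a match by name, is a design choice of this sketch; it keeps the master list as the single reference for identifiers.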
8
Collecting or extracting geospatial data
It is therefore key that any field survey aiming at collecting health-related data contains the necessary fields to link with the corresponding master list and its unique IDs!
The slide shows two data collection (survey) forms. The form on the left is the old form used to collect data on health facilities. Notice that the unique ID from the master list is not captured, and that the address (province, municipality, and barangay) was captured together in one question rather than separately. In the form on the right, the unique ID from the master list is captured, and each piece of information is captured separately, so that each corresponds to a separate field in a table.
9
Extracting geospatial data
When data gaps are to be addressed by extracting geospatial data, this can be done by digitizing:
- Remote sensing images
- Paper maps
This method will only work if remote sensing images and/or paper maps that correspond to the data gaps and comply with the data set specification are available. Check back with the different sources of data and ascertain whether they have remote sensing images and/or paper maps from which you can extract data. Otherwise, you will have to skip this step and determine whether you have the time and resources to collect the data in the field instead.
10
Extracting geospatial data
Digitizing in GIS is the process of converting geographic data from a hardcopy map, a scanned paper map, or a remote sensing image into vector data by tracing the features. During the digitizing process, features from the traced map or image are captured as coordinates in point, line, or polygon format.
There are two types of digitizing:
- Manual digitizing
- Automatic digitizing
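To make "captured as coordinates in point, line, or polygon format" concrete, here is a minimal sketch that stores one digitized feature of each type using the GeoJSON geometry layout, in which each position is written as [longitude, latitude]. The feature names and coordinates are invented, and the helper `vertices` is hypothetical; GIS software stores vector data in its own formats.

```python
# Coordinates are invented for illustration and use GeoJSON order:
# [longitude, latitude].
health_facility = {"type": "Point", "coordinates": [121.00, 14.60]}

road = {
    "type": "LineString",
    "coordinates": [[121.00, 14.60], [121.02, 14.61], [121.05, 14.63]],
}

district = {
    "type": "Polygon",
    # One outer ring; the first and last vertices are identical to close it.
    "coordinates": [[
        [120.95, 14.55], [121.10, 14.55], [121.10, 14.70],
        [120.95, 14.70], [120.95, 14.55],
    ]],
}


def vertices(geometry):
    """Return the flat list of [lon, lat] vertices traced for a geometry."""
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Point":
        return [coords]
    if kind == "LineString":
        return list(coords)
    if kind == "Polygon":
        return [vertex for ring in coords for vertex in ring]
    raise ValueError(f"Unsupported geometry type: {kind}")


for name, geom in [("facility", health_facility), ("road", road),
                   ("district", district)]:
    print(name, geom["type"], len(vertices(geom)), "vertices")
```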
11
Extracting geospatial data
Manual digitizing is the human-guided capture of features from a map image or source: a person identifies the features in a paper map or remote sensing image that need to be converted into vector data. It can be done by:
- On-screen digitizing: the source is scanned or downloaded into computer software, and the mouse and keyboard are used to digitize features. This is the most common method of digitizing today.
- Hardcopy digitizing: the hardcopy source is taped to a digitizing table connected to a computer, and a digitizing puck is used to trace features, feeding coordinates and codes into the computer.
12
Extracting geospatial data
Automatic digitizing is the computer-guided capture of features from a map image or source. A scanned map or remote sensing image is uploaded into image processing software that uses pattern recognition technology to generate vector data. The software uses an algorithm to recognize the different features in the source and convert them to vector data. However, the algorithm sometimes misrecognizes features, which introduces errors into the resulting vector data. These errors have to be corrected before the data can be used.
13
Extracting geospatial data
No matter which digitizing process you choose for extracting geospatial data, it is important to properly fill out the attribute table of all data, particularly the unique code that connects each record to the master lists. Doing this will allow the statistical data to be joined with the geospatial data in order to create maps.
Types of digitizing in GIS
The digitizing methods are also described under the following names:
- Manual digitizing (the hardcopy digitizing described earlier) involves tracing geographic features on an external digitizing tablet using a puck (a type of mouse specialized for tracing and capturing geographic features from the tablet).
- Heads-up digitizing (also referred to as on-screen digitizing) is the method of tracing geographic features from another dataset (usually an aerial photograph, satellite image, or scanned image of a map) directly on the computer screen.
- Automated digitizing involves using image processing software that contains pattern recognition technology to generate vector data.
14
Collecting geospatial data
When data gaps are to be addressed by collecting data in the field, it is important to have the necessary elements in place before, during, and after the data collection exercise. This ensures not only the cost-effectiveness of the activity but also the compliance of the collected data with the objectives and expected outcomes of the project, as well as with the pre-defined specification and ground reference.
15
Collecting geospatial data
Let us first discuss field data collection exercises relevant to public health. There are two main types:
1. Surveys for which the main purpose is the collection of health-related data and/or information at different levels of disaggregation (e.g. household, health facility, village, administrative division). This kind of exercise requires the data to be contextualized by geography and time in order to ensure its proper use in a Geographic Information System (GIS).
2. Exercises aiming at collecting the geographic coordinates of a specific object to which health data/information is attached. In public health, this kind of exercise mainly concerns mobile or fixed point type objects and requires the use of a GNSS-enabled device.
It is nevertheless important to note that, in some cases, the absence of a geographic location for the object being surveyed might require the survey instrument to also contain a section for collecting this location.
16
Collecting geospatial data
As mentioned in the previous slide, health-related data and/or information collected through surveys must be contextualized from both a geographic and a time perspective in order to ensure their proper use in a GIS. This means that each piece of data/information being collected should:
1. Relate to the object it is attached to (e.g. person, patient, health facility, village, administrative division), the geography of which is known. The relation between the data/information and the corresponding object is ensured by capturing the unique identifier (code) and name from the official master list directly in the survey instrument.
2. Have a specific time stamp attached to it. The time stamp of the data is generally the survey implementation date. Capturing the time dimension is important from a geographic perspective because the geography may change over time. Each collected piece of data or information needs to relate to the geography as it was observed at that particular time. Some questions in the survey might refer to past events, in which case the dates of those events should be captured accordingly.
17
Collecting geospatial data
As field data collection can be costly and time-consuming, it is important to determine the data collection technology and method that would best fit the objectives and expected outcomes of the project, particularly in terms of scalability and accuracy.
Scalability:
- Cost (manpower, equipment, and associated material). While costs vary from one country to another based on the cost of the equipment and the level of effort required to implement each option, including staff cost, the relative costs between options generally remain the same.
- Complexity of the process to collect the coordinates
Accuracy:
- Capability to obtain geographic coordinates with an accuracy of 15 m, assessed through two indicators: the number of satellite signals received and the direct instrumental accuracy measure
- Possibility to perform a check on the coordinates while still in the field
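The two accuracy indicators can be combined into a simple acceptance check, sketched below. The 15 m target comes from the text; the minimum of 4 satellites is an assumption made for this example (it is the usual minimum for a three-dimensional GNSS position fix), not a value taken from the guidance document. The function name and the sample readings are hypothetical.

```python
def meets_accuracy_target(num_satellites, reported_accuracy_m,
                          max_error_m=15.0, min_satellites=4):
    """Check a GNSS reading against the two accuracy indicators.

    max_error_m follows the 15 m target mentioned in the text.
    min_satellites=4 is an assumption of this sketch, not a value from the
    guidance document; adjust it to the project's specification.
    """
    enough_satellites = num_satellites >= min_satellites
    accurate_enough = reported_accuracy_m <= max_error_m
    return enough_satellites and accurate_enough


# Invented readings as they might be noted from a GNSS-enabled device.
readings = [
    {"site": "A", "satellites": 9, "accuracy_m": 4.0},
    {"site": "B", "satellites": 5, "accuracy_m": 22.5},
    {"site": "C", "satellites": 3, "accuracy_m": 10.0},
]

for reading in readings:
    accepted = meets_accuracy_target(reading["satellites"], reading["accuracy_m"])
    print(reading["site"], "accept" if accepted else "collect again")
```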
18
Collecting geospatial data
Here are the different data collection options, covering a range of technologies and methods.
Different combinations of technologies and methods now exist for collecting geographic coordinates in the field. These combinations do not necessarily ensure the same level of accuracy, and some of them require more resources and time to implement and are therefore less scalable than others.
The table on the slide presents twelve options covering a range of technologies and methods that can be used to collect data in the field. The options leading to the lowest level of accuracy, comparatively speaking, are located on the left side of the table (highlighted in orange), and those leading to the highest level of accuracy are on the right (highlighted in dark green). Please note that the options presented in this table are not exhaustive.
19
Collecting geospatial data
When selecting an option from the table, it is important to remember that:
1. These options assume that there is no pre-existing electronic system in place for data collection in the country. If such a system is already in place, using it might be more scalable than implementing a new one. Determine first whether an electronic data collection system already exists in the country and whether it would meet the needs of the project; using a compatible system saves the time otherwise spent implementing a new one.
2. The use of an electronic form, with predefined content when applicable, can considerably reduce data collection time as well as data entry errors. Such a form should therefore be preferred over a paper one. Predefined content refers to dropdown lists/menus from which the data collector simply chooses one of the available options instead of typing in the answer.
(Continued on next slide)
20
Collecting geospatial data
(Continued from previous slide)
When selecting an option from the table, it is important to remember that:
3. The higher the accuracy of the collected coordinates, the larger the number of potential uses of these coordinates, due to the direct relation that exists between accuracy and geographic scale. It is therefore recommended to always aim for the highest accuracy level possible. This relates back to the topic of scale and accuracy in Session 3.3. If time and resources permit, collect data at the highest accuracy possible. The idea is to collect once, use many times. In doing so, you maximize the potential use of the data and save time and resources down the road, since the data can then be used for both small-scale and large-scale maps.
21
Collecting geospatial data
(Continued from previous slide)
When selecting an option from the table, it is important to remember that:
4. Collecting geographic coordinates solely through offline map applications (options 1 to 3) requires data collectors to know how to navigate on a map and to be able to identify their location and the geography of the area from an above/aerial view. This can be easy or hard depending on the data collectors' knowledge of the area and on the presence or absence of landmarks in the area being mapped. This method can also be applied remotely with the help of people knowledgeable about the area being covered. It is recommended to combine this method with the minimum/maximum expected coordinates described in Section in the guidance document, in order to determine whether the object being mapped falls inside or outside the administrative division it is supposed to be in (this is discussed in later slides).
22
Collecting geospatial data
To be able to implement any one of these options, a certain set of elements is necessary. The following are required for all of the options:
- Data collection form: the set of fields used to capture the geographic component of the survey. The form may be paper or electronic.
- Standard Operating Procedure (SOP): a document describing the steps to be followed by the data collector to implement the selected option in the field.
23
Collecting geospatial data
Data collection form
The proper collection of the geographic coordinates attached to a particular object requires the use of a specific set of fields that should include:
- The official name and unique identifier (code) of the object in question, taken from an official master list;
- The address and location of the object, expressed as the official name and unique identifier (code) of the administrative division in which the object is located; and
- The geographic coordinates of the object, as well as the fields used to assess the quality of these coordinates.
The content of the data collection form will vary based on the option being implemented from the table in slide 18 (Table 1 in the guidance document). Generic versions of the forms recommended for each option are provided in Annex 1 of HGLC Guidance document volume
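The set of fields listed above can be sketched as a record structure. This is an illustrative layout only: the class name, field names, and sample values are hypothetical, and the actual forms are those provided in the guidance document. The survey date field reflects the time stamp requirement discussed in the earlier slides.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional


@dataclass
class FacilityRecord:
    # Object identification, taken from the official master list.
    facility_id: str
    facility_name: str
    # Location expressed through the administrative division's code and name.
    admin_division_code: str
    admin_division_name: str
    # Geographic coordinates in decimal degrees.
    latitude: float
    longitude: float
    # Fields used to assess the quality of the coordinates.
    num_satellites: Optional[int] = None
    reported_accuracy_m: Optional[float] = None
    # Time stamp: generally the survey implementation date.
    survey_date: Optional[date] = None

    def missing_fields(self):
        """Return the names of fields left empty (None or blank string)."""
        return [
            name for name, value in asdict(self).items()
            if value is None or (isinstance(value, str) and not value.strip())
        ]


# Invented sample values.
record = FacilityRecord(
    "HF0001", "Sample Health Center", "AD010203", "Sample Municipality",
    14.60, 121.00, num_satellites=8, reported_accuracy_m=5.0,
    survey_date=date(2020, 1, 15),
)
incomplete = FacilityRecord(
    "", "Sample Health Center", "AD010203", "Sample Municipality",
    14.60, 121.00,
)

print(record.missing_fields())
print(incomplete.missing_fields())
```

Keeping each piece of information in its own field mirrors the point made about the survey forms: every field then maps to one column of the resulting table.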
24
Collecting geospatial data
Standard Operating Procedure (SOP)
An SOP is a document that:
- contains a checklist of the steps for the entire data collection and verification process;
- is critical for effectively collecting the geographic coordinates of a specific object in the field, and should therefore be provided to the data collectors;
- focuses on providing only the essential information, is as clear and simple as possible, covers all the fields in the form, and includes illustrations that help the data collector complete each step.
Based on the above, the SOP for each of Options 1 to 12 will be unique to the specific data collection and verification requirements of that option. An SOP might also need to be adjusted for the specific device being used. The SOPs for all the options are available in the guidance document.
25
Collecting geospatial data
Master list
It is also highly recommended that any data collection exercise be based on official, complete, up-to-date, and uniquely coded lists (master lists) for the:
- Objects: generally infrastructure (e.g. health facility, school, household)
- Administrative divisions: for the country, or the part of the country, in which the objects in question are located
The concept of master lists was discussed in the previous sessions. This slide presents a concrete example of how and why the master list should always be used.
26
Collecting geospatial data
There are also some elements that are only required for some of the options:
- Administrative boundaries geospatial data: the boundaries of the administrative divisions, captured in a GIS format, are needed in order to create the minimum/maximum coordinates document.
- Geographic Information System (GIS) software: to implement options 1 to 12 presented in Table 1, the GIS software needs to be able to:
  - Display vector format layers (polygons and points).
  - Join an Excel file to the attribute table of the map data loaded in the software.
  - Calculate the minimum and maximum latitudes and longitudes of a set of polygons.
- GNSS-enabled device: a device that contains a receiver/antenna to connect to a Global Navigation Satellite System (GNSS) and uses it to collect geographic coordinates. The topic of GNSS is further discussed in Module 4.
- Offline map application: certain offline mapping applications are able to collect geographic coordinates even when the device does not have a GNSS receiver. However, it is still required (options 7 to 12) that the device can connect to the internet prior to conducting the survey so that the map that will be used in the field can be downloaded to the device.
- Maximum-minimum expected coordinates: a simple way to ensure that the geographic coordinates have been collected correctly while still in the field is to check that the collected coordinates fall between the minimum (Min) and maximum (Max) latitude and longitude of the administrative division in which the object is located. A specific document containing the Min/Max coordinates should therefore be prepared before the data collection exercise and given to the data collectors so that they can perform this check while collecting coordinate data in the field.
- Purpose-designed data collection application: preferable, as it reduces the risk of transcription errors. Another advantage of such an application is that part of the information to be entered can be standardized through the use of dropdown menus. Examples are GeoODK and Survey123.
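The Min/Max expected coordinates can be derived from the administrative boundary polygons and then used as a field check, as in the minimal sketch below. The division code `AD01`, the polygon vertices, and the function names are invented for illustration; in practice the calculation would be done in the GIS software on the real boundary layer.

```python
def bounding_box(polygons):
    """Return (min_lat, max_lat, min_lon, max_lon) for a set of polygons.

    Each polygon is a list of (latitude, longitude) vertices.
    """
    lats = [lat for polygon in polygons for lat, lon in polygon]
    lons = [lon for polygon in polygons for lat, lon in polygon]
    return min(lats), max(lats), min(lons), max(lons)


def within_expected_range(lat, lon, box):
    """Check that a collected coordinate falls inside the Min/Max ranges."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


# Hypothetical administrative division made of two polygons
# (for example, a mainland part and an island).
boundaries = {
    "AD01": [
        [(14.50, 120.90), (14.50, 121.05), (14.65, 121.05), (14.65, 120.90)],
        [(14.66, 121.06), (14.66, 121.10), (14.70, 121.10), (14.70, 121.06)],
    ],
}

# The Min/Max document: one row of expected ranges per division.
min_max_table = {code: bounding_box(polys) for code, polys in boundaries.items()}
print(min_max_table)

print(within_expected_range(14.60, 121.00, min_max_table["AD01"]))  # inside
print(within_expected_range(14.60, 121.30, min_max_table["AD01"]))  # outside
```

A point can pass this check and still lie outside the division, because the Min/Max ranges describe a rectangle around the boundary. The check catches gross errors such as swapped or mistyped coordinates; it does not replace a later check against the actual polygon. The sketch also does not handle divisions that cross the 180° meridian.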
27
Collecting geospatial data
Going back to the actual implementation of field data collection: once it has been decided that this method should be used to address the data gap, there are several steps that need to be completed before, during, and after the data collection takes place in the field.
Before data collection
Two primary actions must be considered:
1. Preparing the material needed to implement the selected option. The preparation of the material must cover the:
   - Selection of the option that will be implemented (from Table 1)
   - Adjustment of the form to match the selected option (to be found in Annex 1 of the HGLC guidance document volume 2.4.2)
   - Creation of the electronic form, if required
   - Preparation of the associated annexes
   - Development of the SOP to be implemented in the field
   - Installation of the applications on the devices, if required
   - Verification that the data collection devices are operational and configured to meet the required specifications
   - Preparation of the training material
2. Selecting and training the data collectors and their supervisor(s). Knowledge and experience in using GNSS-enabled devices should be taken into account when selecting data collectors and their supervisor(s). In addition to selecting qualified data collectors, training them and their supervisor(s) is one of the most important steps, if not the most important, in the pre-survey process. The training should aim to provide a good understanding, and ensure the proper use, of the different documents, the data collection equipment, and the SOPs. The training should also cover appropriate troubleshooting methods for commonly encountered issues in the field.
28
Collecting geospatial data
During the survey
The data collectors should follow the SOP using the associated documents that have been provided to them. It is also important to verify the methods used to collect the geographic location information and to address any unexpected issues while the data collectors are in the field. The following verification steps should be followed, depending on available resources and the extent of the surveyed area:
- On-site spot checks of data accuracy and completeness conducted by the data collection supervisor. Part of the accuracy check can be performed using Survey123 for ArcGIS, Maps.Me, or Google Maps as a way to ensure that the point falls in the expected area. To complete on-site spot checks, the supervisor should ensure that all the fields of the geographic component of the survey are entered correctly and that the GPS coordinates fall within the min/max expected ranges.
- Remote verification of the data through periodic submissions of the collected data in a spreadsheet, either using Google Sheets online or by sending a Microsoft Excel spreadsheet to the data collection supervisor.
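The remote verification of a periodic submission can be partly automated. The sketch below assumes the submission has been exported as CSV text; the column names, the expected ranges, and the sample rows are all invented for illustration. It flags rows with empty required fields, non-numeric coordinates, or coordinates outside the Min/Max expected ranges.

```python
import csv
import io

# Hypothetical Min/Max document:
# division code -> (min_lat, max_lat, min_lon, max_lon)
EXPECTED_RANGES = {"AD01": (14.50, 14.70, 120.90, 121.10)}

REQUIRED_FIELDS = ["facility_id", "admin_code", "latitude", "longitude",
                   "survey_date"]

# Invented submission, as it might be exported from a spreadsheet.
SUBMISSION = """facility_id,admin_code,latitude,longitude,survey_date
HF0001,AD01,14.60,121.00,2020-01-15
HF0002,AD01,15.90,121.00,2020-01-15
,AD01,14.55,120.95,2020-01-16
HF0004,AD01,abc,121.05,2020-01-16
"""


def check_row(row):
    """Return a list of problems found in one submitted row."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not (row.get(field) or "").strip():
            problems.append(f"missing {field}")
    try:
        lat = float(row["latitude"])
        lon = float(row["longitude"])
    except (KeyError, TypeError, ValueError):
        problems.append("coordinates are not numeric")
        return problems
    box = EXPECTED_RANGES.get(row.get("admin_code"))
    if box is None:
        problems.append("no expected range for admin_code")
    else:
        min_lat, max_lat, min_lon, max_lon = box
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            problems.append("coordinates outside expected min/max range")
    return problems


# Spreadsheet row numbers start at 2 because row 1 holds the headers.
report = {}
for row_number, row in enumerate(csv.DictReader(io.StringIO(SUBMISSION)), start=2):
    problems = check_row(row)
    if problems:
        report[row_number] = problems

for row_number, problems in report.items():
    print(f"Row {row_number}: {'; '.join(problems)}")
```

Reporting problems by spreadsheet row number makes it easier for the supervisor to send specific corrections back to the data collectors while they are still in the field.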
29
Collecting geospatial data
After the survey
Once the survey is completed, and if not already done as part of the data verification process implemented during data collection, it is important to ensure that the collected data is organized in a structured table that can be saved or exported as a spreadsheet, e.g. in Microsoft Excel. Annex 14 in the guidance document is an example of this. If the data is collected in an electronic form, the data should be available for export in a similar structure. Such a dataset should contain all the fields that are on the data collection form.
The header of each field can be simplified as reported in Annex 14. When doing so, it is important to remember that some software cannot handle more than a certain number of characters or cannot accept certain characters, e.g. spaces. This is the case, for example, with ArcGIS, which will only keep 8 characters when importing such a table and has difficulties handling spaces. The labels reported in Annex 14 take this into account.
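Header simplification can be sketched as a small function that removes spaces and special characters, truncates each name, and keeps the truncated names unique. The default of 8 characters follows the limit quoted above; the actual limit depends on the software and file format used, so it is left as a parameter. The function name and the sample headers are hypothetical, and the labels in Annex 14 remain the reference.

```python
import re


def simplify_headers(headers, max_length=8):
    """Shorten field names and remove unsupported characters.

    max_length=8 follows the limit quoted in the text; check the limit of
    the software actually used, since it may differ.
    """
    simplified = []
    seen = set()
    for header in headers:
        # Spaces become underscores; anything other than letters, digits,
        # and underscores is dropped.
        name = re.sub(r"[^0-9A-Za-z_]", "", header.strip().replace(" ", "_"))
        name = name[:max_length] or "field"
        # Keep truncated names unique by replacing the tail with a counter.
        candidate, counter = name, 1
        while candidate.lower() in seen:
            suffix = str(counter)
            candidate = name[:max_length - len(suffix)] + suffix
            counter += 1
        seen.add(candidate.lower())
        simplified.append(candidate)
    return simplified


headers = [
    "Facility ID",
    "Facility name",
    "Latitude (decimal degrees)",
    "Longitude (decimal degrees)",
    "Survey date",
]
print(simplify_headers(headers))
# ['Facility', 'Facilit1', 'Latitude', 'Longitud', 'Survey_d']
```

Automatically truncated names such as `Facilit1` are hard to read, which is one reason to define short, meaningful labels by hand and document them in the data catalog.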
30
Collecting geospatial data
After the survey
In order to simplify the work of the user, it is always preferable for the spreadsheet containing the dataset to be accompanied by two additional sheets containing:
- The data catalog, i.e. the definition of the content of each field in the dataset
- The metadata associated with the data set
This refers to documenting the data, which is discussed further in the succeeding session.
[Figure: sample data catalog]
31
Data cleaning, validation, and documentation
What should you do now, after extracting and/or collecting data to fill the gaps? The next session, Implementing the geospatial data management cycle (Part 5): Data cleaning, validation, and documentation, discusses how to clean, validate, and document the data.