GIS Fundamentals/ Geographic Database Design

Presentation on theme: "GIS Fundamentals/ Geographic Database Design"— Presentation transcript:

1 GIS Fundamentals/ Geographic Database Design

2 GIS Concepts Information cycle:
Data/Information/System/Information System Geographic Information System Main Components/Characteristics Geographic Database Data Modeling Data Representation Spatial Analysis Implementing a GIS

3 Information Cycle Territory Data GIS DSS Information Decision

4 Data / Information Information is the result of interpretation of relations existing between a certain number of single elements (called data). Example: The Museum located at 5th Avenue, NY, was built in 1898. Data: Museum, address, year of construction.

5 System A system is an organized set of elements that work together to achieve a result. Example: Water supply system Elements: pipes, valves, hydrants, water meters, pumps, reservoirs, etc.

6 Information System (IS)
An Information System is an organized set of elements (data, equipment, procedures, users) that work together to produce a result (information).

7 GIS: “G” & “IS” Definition: A GIS is a collection of computer hardware and software, geographic data, methods, and personnel assembled to capture, store, analyze and display geographically referenced information in order to resolve complex problems of management and planning. A GIS is an IS with geographic data as input and geographic information as output Geographic Data? Geographic Information?

8 Components of a GIS

9 GIS Input Output User Interface Models Other GIS Reports Maps
Geographic Data Geographic Information Input GIS Output Reports Maps Photo. Products Statistics Input Data for models Manipulation Analysis Maps Census Field Data RS Data Others Data Capture Display Storage GIS Components User Interface Models Other GIS

10 GIS: Main Characteristics
Integration of Multiple data: - Sources - Scales - Formats Geographic Database Spatial Analysis

11 Data from multiple sources-at multiple scales-in multiple formats
Census/ Tabular data Maps Picture & Multimedia GPS/ air photos/ satellite images

12 Referencing map features: Coordinate systems & map projections
To integrate geographic data from many different sources, we need to use a consistent spatial referencing system for all data sets

13 The Latitude/Longitude reference system
latitude φ : angle from the equator to the parallel longitude λ : angle from Greenwich meridian

14 Map Projections Curved surface of the earth needs to be “flattened” to be presented on a map: Map Projection Projection is the method by which the curved surface is converted into a flat representation. Projections are classified according to which properties they preserve: area, shape, angles, distance Some distortion is inevitable: Less distortion if maps show only small areas, but large if the entire earth is shown

15 UTM: Universal Transverse Mercator
Minimal distortions of area, angles, distance and shape at large and medium scales Very popular for large and medium scale mapping (e.g., topographic maps) Cylindrical projection with a central meridian that is specific to a standard UTM zone 60 zones around the world One cartographic reference system that deserves more detailed discussion is the UTM system
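A minimal sketch of how a longitude maps to one of the 60 UTM zones, assuming the standard layout of 6-degree-wide zones numbered eastward from 180°W. The special zones around Norway and Svalbard are ignored, and the function names are illustrative rather than taken from the presentation.

```python
import math


def utm_zone(longitude_deg):
    """Return the standard UTM zone number (1-60) for a longitude in degrees."""
    # Normalize to [-180, 180) so that +180 wraps around to zone 1.
    lon = ((longitude_deg + 180.0) % 360.0) - 180.0
    # Each zone is 6 degrees wide; zone 1 starts at 180 degrees west.
    return int(math.floor((lon + 180.0) / 6.0)) + 1


def central_meridian(zone):
    """Return the central meridian (degrees east) of a UTM zone."""
    return zone * 6.0 - 183.0
```

For example, a longitude of about -74° (New York) falls in zone 18, whose central meridian is -75°.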

16 Space as an indexing system

17 The concept of scale Scale is the ratio between distances on a map and the corresponding distances on the earth’s surface e.g., a scale of 1:100,000 means that 1 cm on the map corresponds to 100,000 cm or 1 km in the real world Small scale: small fraction such as 1:10,000,000 shows only large features Large scale: large fraction such as 1:25,000 shows great detail for a small area “small scale” vs “large scale” often confused
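The scale arithmetic can be sketched as follows. The helper names are hypothetical; the only facts used are the ratio definition above and 100,000 cm per km.

```python
def ground_distance_km(map_distance_cm, scale_denominator):
    """Convert a distance measured on the map (cm) to a ground distance (km)."""
    ground_cm = map_distance_cm * scale_denominator
    return ground_cm / 100_000  # 100,000 cm in 1 km


def is_larger_scale(denominator_a, denominator_b):
    """True if 1:denominator_a is a larger scale (larger fraction) than 1:denominator_b."""
    return 1 / denominator_a > 1 / denominator_b
```

With these helpers, 1 cm at 1:100,000 and 4 cm at 1:25,000 both correspond to 1 km on the ground, and 1:25,000 compares as the larger scale against 1:10,000,000 because its fraction is larger.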

18 Multi-scale The same feature represented in different scales.
Example: lake large scale map of 1:25,000 may show individual houses smaller scale map of 1:500,000 shows only points representing villages Large scale (1:25.000) Small scale 1:

19 Multi-formats Raster Vector Raster-Vector-Raster DXF-DGN-etc. Shapefile KML Etc.

20 Geographic Database Geographic Data Characteristics/Examples
Definitions: Entity/Attribute/Dataset/Database Data Modeling Spatial representation Vector/Raster Topology

21 Descriptive Data vs Geographic Data
Descriptive attributes Geographic Data: Spatial attributes Location Form Geographic location is the element that distinguishes geographic information from all other types.

22 Geographic Data Characteristics :
Position: explicit geographic reference Cartesian coordinates: X, Y, Z Geographic coordinates (lat, long) implicit geographic reference Address Place-name Etc. Geometric Form: e.g., a polygon representing a parcel of land Spatial data describes the location and shape of geographic features and their relationship to other features; descriptive data characterizes the geographic features.

23 Example1: Parcel of land
Attribute (descriptive) Data Landowner Area Etc. Spatial data Position Located at 100 Nelson Mandela Ave X= a; Y=b within system (X,Y) Form dimensions (sides and arcs, constituting a polygon)

24 Example 2: District Attribute (Descriptive) data: District-Code
District-Name Population 1990 Population 2000 Population 2010 Spatial data: Geographical Position Polygon

25 Spatial entity We use the term entity to refer to a phenomenon that can not be subdivided into like units. Example: a house is not divisible into houses, but can be split into rooms. Others: a lake, a statistical unit, a school, etc. In database management systems, the collection of objects that share the same attributes. An entity is referenced by a single identifier, perhaps a place-name, or just a code number Each spatial entity has one or more attributes that identify what the entity is, describe it, or represent some magnitude associated with the entity. Indeed, the type of analysis you plan to do depends on the type of attributes you are working with. Example: you can categorize roads by whether they are local roads, highways, etc; by their length; their width; their pavement; etc.

26 Attribute Each spatial entity has one or more attributes that identify what the entity is, and describe it. Example: you can categorize roads by whether they are local roads, highways, etc; by their length; their width; their pavement; etc. The type of analysis you plan to do depends on the type of attributes you are working with.

27 Dataset “A dataset is a single collection of values or objects without any particular requirement as to form of organization.” Example: Streets, rivers, cities, etc.

28 Geographic Database “A geographic database is a collection of spatial data and related descriptive data organized for efficient storage, manipulation and analysis by many users.” It supports all the different types of data that can be used by a GIS such as: Attribute tables Geographic features Satellite and aerial imagery Surface modeling data Survey measurements Fundamentally, a GIS is based on a structured database that describes the world in geographic terms.

29 Data Modeling Data Modeling is the process of defining (geographic features) to be included in the database, their attributes and relationships, and their internal representation in the Database. It involves the development of conceptual, logical and physical models of the geographic Database. The outcomes include a Data Dictionary

30 Modeling Process Reality Modeling Geographic Database (data & treat.)
Abstracting the Real World Reality Modeling (data & treat.) We have real world as perceived by users > And we would build a physical data model We insert an intermediate step: the CDM Geographic Database

31 ANSI/SPARC: Study Group on Data Base Management Systems (1975)
Different users have different views of the world “Real World” External Model 1 External Model 2 External Model 3 Conceptual Model Logical Model Physical Model

32 Conceptual Model A synthesis of all external models (user’s views).
The conceptual level corresponds to a synthesis of all external models. Although an abstraction of the real world, the conceptual model consists of schematic representations of phenomena and how they are related. The organization scheme created at this stage generally deals only with the information content of the database, not the physical storage, so that the same conceptual model may be appropriate for diverse physical implementations. Therefore, the conceptual model is independent of technology.

33 Conceptual Model (cont.)
Easy to read Conceived for the analyst or designer Objective representation of the reality, therefore independently from the selected GDB System One conceptual model for the Database

34 Data Logical Model & Physical Model
We transform the conceptual model into a new modeling level which is more computing oriented: the logical model (Example: the Relational Database approach) We transform the logical model into an internal model (physical model) which is concerned with the byte-level data structure of the database. Whereas the logical model is concerned with tables and data records, the physical model deals with storage devices, file structure, access methods, and locations of data.

35 Several types of data organization
Hierarchical model - Hierarchical relationships between data (parent-child) - obsolete
Network Model - Focus on connections (e.g. airline booking systems) - Adv.: fast, efficient; Disadv.: inflexible - technically obsolete
Relational model - Data is organized within tables (files) and relationships expressed between tables and data elements - True Relational DBMS use the Structured Query Language (SQL)
Object-Oriented model - Focus on objects: efficient storage and retrieval - (Future + Web)

36 Entity-relationship Formalism
Entity name Attributes ENTITY_NAME1 -attribute 1 -attribute 2 ENTITY_NAME2 -attribute 1 -attribute 2 0-N 0-1 Identifier (key-attribute) Association (relationship) Maximum cardinality (indeterminable/any number) (0,N) refers to the cardinality of the relationship. It means that , for example, at a minimum Minimum cardinality (0,N) refers to the cardinality of the relationship

37 An example of land parcels

38 The E/R diagram for land parcels
STREET -name A B SEGMENT -number PARCEL -number 2-N 0-1 1-2 3-N 1-N 2-2 A: Streets have edges (segments) B: parcels have boundaries (segments) C: lines have two endpoints D: parcels have owners, and people own land. C D 2-N 1-N POINT -number -x,y LANDOWNER -name -date-of-birth

39 Data Tables

40 Data Dictionary Definition:
A data catalog that describes the contents of a database. Information is listed about each field in the attribute table and about the format, definitions and structures of the attribute tables. A data dictionary is an essential component of metadata information.

41 Example Definition of entities
RAIL: way of communication and transportation Definition of attributes RAIL-ID: reference numbers for rail segments RAIL_CLASS: single track, double track, electrified, etc. RAIL_NAME: name for particular railway Explanations for measurements of attributes (type of attribute values) or coding practices RAIL-ID: INTEGER RAIL-NAME: CHARACTER, LONG=30
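A data dictionary like this one can also be kept in machine-readable form and used to check records. The sketch below is only an illustration: the class and function names are hypothetical, attribute names are written with underscores, and the type of RAIL_CLASS is an assumption because the slide gives types only for the ID and the name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    definition: str
    py_type: type
    max_length: Optional[int] = None


# Entries follow the RAIL example; RAIL_CLASS as text is an assumption.
RAIL_DICTIONARY = {
    "RAIL_ID": FieldSpec("reference number for a rail segment", int),
    "RAIL_CLASS": FieldSpec("single track, double track, electrified, etc.", str),
    "RAIL_NAME": FieldSpec("name of a particular railway", str, max_length=30),
}


def validate(record, dictionary=RAIL_DICTIONARY):
    """Return a list of problems found when checking a record against the dictionary."""
    problems = []
    for name, spec in dictionary.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, spec.py_type):
            problems.append(f"{name}: expected {spec.py_type.__name__}")
        elif spec.max_length is not None and len(value) > spec.max_length:
            problems.append(f"{name}: longer than {spec.max_length} characters")
    return problems
```

A record that passes returns an empty list; a record with a text ID or a name longer than 30 characters returns one message per problem.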

42 Sample components of a digital EA map

43 EA database entities Street Number Name --- Admin. Unit AU AU_Pop. ---
EA-code Area Pop. Buildings Number HHs Etc. Crew leader area CL-code Name RO responsible Landmark -- ---

44 Example of Relations EA entity can be linked to the entity crew leader area. The table for this entity could have attributes such as the name of the crew leader, the regional office responsible, contact information, and the crew leader code (CL code) as primary code, which is also present in the EA entity. R EA EA-code Area Pop. Crew leader area CL-code Name RO responsible 1-1 1-N
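The 1-N link between crew leader areas and EAs could be expressed in a relational database roughly as follows. This is a sketch using Python's built-in sqlite3 module; the table names, column names and sample rows are all hypothetical, and only the idea of the CL code as a key shared by both entities comes from the slide.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a crew leader area table and an EA table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE crew_leader_area (
            cl_code TEXT PRIMARY KEY,
            name TEXT,
            ro_responsible TEXT
        );
        CREATE TABLE ea (
            ea_code TEXT PRIMARY KEY,
            area REAL,
            pop INTEGER,
            cl_code TEXT REFERENCES crew_leader_area(cl_code)
        );
        """
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO crew_leader_area VALUES (?, ?, ?)",
        [("CL01", "Crew leader A", "RO North"), ("CL02", "Crew leader B", "RO South")],
    )
    conn.executemany(
        "INSERT INTO ea VALUES (?, ?, ?, ?)",
        [
            ("EA001", 1.2, 500, "CL01"),
            ("EA002", 0.8, 700, "CL01"),
            ("EA003", 2.5, 300, "CL02"),
        ],
    )
    return conn


def population_by_crew_leader(conn):
    """Join the two tables on the CL code and total the EA populations per crew leader."""
    return conn.execute(
        """
        SELECT c.cl_code, c.name, COUNT(e.ea_code), SUM(e.pop)
        FROM crew_leader_area AS c
        JOIN ea AS e ON e.cl_code = c.cl_code
        GROUP BY c.cl_code, c.name
        ORDER BY c.cl_code
        """
    ).fetchall()
```

Calling population_by_crew_leader(build_demo_db()) returns one row per crew leader area with the number of EAs it contains and their combined population.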

45 Entity: Enumeration areas
Type (attributes) EA-code Area Pop CL-code … … … Identifier

46 Components of a digital EA database
Boundary database

47 A Simpler Alternative In many countries, EA map design may be simpler than in this example Instead of a fully integrated digital base map in vector format, rasterized images of topographic maps may be used as a backdrop for EA boundaries In some instances, map features may be more generalized, for instance by using only the centerlines for the streets and polygons for entire city blocks rather than for individual houses This can include the use of free data as a baseline or starting point in the creation or updating of census related maps

48 Data Representation Raster Vector Real World

49 Two Fundamental Types of Data
GIS work with two fundamentally different types of geographic information Vector Raster (or Grid) Both types have unique advantages and disadvantages A GIS should be able to handle both types

50 Vector vs Raster or Discrete vs Continuous
River x1,y1 Raster-based analysis: Area of analysis divided into squares of uniform size. - Each cell characterizes the feature of interest within this area with a single value. Vector data: Coordinate-based data structure commonly used to represent linear features. Each feature is represented as a list of ordered x,y coordinates xn,yn

51 Raster Data A raster image is a collection of grid cells - like a scanned map or picture Raster data is extremely useful for continuous data representation elevation slope modeling surfaces Satellite imagery and aerial photos are commonly used raster data sets Easier data structure Certain analysis operations are more easily implemented Good for representing continuous variation (vegetation, contour lines, etc.)

52 Vector Data Vector data are stored as a series of x,y coordinates
Good for discrete data representation points: wells, town centroids lines: roads, rivers, contours polygons: enumeration areas, districts, town boundaries, building footprints

53 Raster-Vector conversion (“vectorization”)
Computer algorithms are used to convert data of one type to the other: Vector --- Raster is trivial Raster ---- Vector is harder
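To illustrate why the vector-to-raster direction is the simple one, here is a minimal sketch that rasterizes a polygon by testing each cell center with a ray-casting point-in-polygon test. The cell-center rule is only one possible convention, and the function names and grid layout (origin at (0, 0), row 0 at the top) are assumptions made for this example.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon given by ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, ncols, nrows, cell_size=1.0):
    """Return a grid (list of rows, top row first) with 1 where the cell center is inside."""
    grid = []
    for row in range(nrows):
        y = (nrows - row - 0.5) * cell_size
        grid.append(
            [
                1 if point_in_polygon((col + 0.5) * cell_size, y, ring) else 0
                for col in range(ncols)
            ]
        )
    return grid
```

Rasterizing the square with corners (1,1) and (4,4) on a 6 by 6 grid of unit cells marks a 3 by 3 block of cells. Going the other way requires tracing cell boundaries back into lines, which is why it is described as harder.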

54 Vector data + image (raster)

55 Vector: Points, lines, polygons
Set of geometric primitives: points lines polygons y node vertex x

56 II I Vector Structure Spaghetti Topology Network (graph)
The vector model has several semantic levels, from the most structured (topology) to the simplest (spaghetti): Spaghetti model Network model (arbitrary graph) Topological model

57 Spaghetti File No Topology = raw file or ‘spaghetti file’
Lines not connected; have no ‘intelligence’

58 Example of “Spaghetti” data structure
Poly  Coordinates
A     (1,4), (1,6), (6,6), (6,4), (4,4), (1,4)
B     (1,4), (4,4), (4,1), (1,1), (1,4)
C     (4,4), (6,4), (6,1), (4,1), (4,4)
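The coordinate lists for polygons A, B and C can be used to show the main weakness of the spaghetti structure: common boundaries are stored once per polygon. The sketch below finds those duplicated segments; the function name is hypothetical, and it only detects segments whose two endpoints match exactly.

```python
from collections import defaultdict

# Polygon rings copied from the slide; each polygon stores its full outline.
SPAGHETTI = {
    "A": [(1, 4), (1, 6), (6, 6), (6, 4), (4, 4), (1, 4)],
    "B": [(1, 4), (4, 4), (4, 1), (1, 1), (1, 4)],
    "C": [(4, 4), (6, 4), (6, 1), (4, 1), (4, 4)],
}


def shared_segments(polygons):
    """Return the segments stored by more than one polygon, with their owners."""
    owners = defaultdict(list)
    for name, ring in polygons.items():
        for p, q in zip(ring, ring[1:]):
            # Sort the endpoints so direction does not matter.
            owners[tuple(sorted((p, q)))].append(name)
    return {seg: names for seg, names in owners.items() if len(names) > 1}
```

Three segments are stored twice here (A/B, A/C and B/C). Nothing in the structure itself says these polygons are neighbors; that has to be computed, and an edit to one copy of a boundary can leave the other copy out of date.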

59 Topology Data structure in which each point, line and piece or whole of a polygon : “knows” where it is “knows” what is around it “understands” its environment “knows” how to get around Helps answer the question what is where?

60 Topology: Spatial Relationships
Left Polygon = A Right Polygon = B Node 1 = Chains A,B,C Chain A is connected to chains B & C Polygon B Contained within polygon A Adjacency Connectivity Containment

61 Example of Topological data structure
Node  X  Y  Lines
I     …  …  1,2,4
II    …  …  4,5,6
III   …  …  1,3,5
IV    …  …  2,3,6
Poly  Lines
A     1,4,5
B     2,4,6
C     3,5,6
Line  From Node  To Node  Left Poly  Right Poly
1     I          III      O          A
2     I          IV       B          O
3     III        IV       O          C
4     I          II       A          B
5     II         III      A          C
6     II         IV       C          B
O = “outside” polygon
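With the line table stored explicitly, adjacency and connectivity become simple lookups instead of geometric calculations. The sketch below assumes the six rows of the line table are lines 1 to 6 in the order listed, which is consistent with the node and polygon lists; the function names are hypothetical.

```python
# line id -> (from node, to node, left polygon, right polygon); "O" = outside.
LINES = {
    1: ("I", "III", "O", "A"),
    2: ("I", "IV", "B", "O"),
    3: ("III", "IV", "O", "C"),
    4: ("I", "II", "A", "B"),
    5: ("II", "III", "A", "C"),
    6: ("II", "IV", "C", "B"),
}


def neighbors(poly):
    """Adjacency: polygons that share at least one line with poly."""
    result = set()
    for _from, _to, left, right in LINES.values():
        if poly in (left, right):
            other = right if left == poly else left
            if other != "O":
                result.add(other)
    return result


def boundary(poly):
    """Lines that form the boundary of poly."""
    return sorted(i for i, (_f, _t, left, right) in LINES.items() if poly in (left, right))


def lines_at_node(node):
    """Connectivity: lines that start or end at node."""
    return sorted(i for i, (frm, to, _l, _r) in LINES.items() if node in (frm, to))
```

For instance, neighbors("A") gives B and C without touching any coordinates, and lines_at_node("I") reproduces the node table entry 1,2,4.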

62 Encoding Topology (not): CAD

63 Encoding Topology: GIS

64 Comparison Advantages: Spaghetti Topology Set of independent objects
Representation of heterogeneous objects within the same model Appropriate to CAD Pre-calculation of topological relations Maintenance of topological constraints correspondence with exchange formats

65 …cont. Disadvantages: Spaghetti Topology
Spatial relationships must be calculated Risk of incoherence (duplication of common boundaries) High cost of updates Many levels of indirection for complex objects Maintenance

66 Some well known Topological models
TIGER: Topologically Integrated Geographic Encoding and Referencing (Census Bureau of the USA) Line is the principal element to which are related points and area features ARC/INFO model: ESRI Point, Line, Polygon

67 Zip Codes Cities Voting Districts Census Tracts Counties MCD’s
TIGER Data: Polygon Zip Codes Cities Voting Districts Census Tracts Counties MCD’s Block Groups

68 TIGER Data: Line Railroads Streams Streets

69 TIGER Data: Point Zip+4 Centroids Place Names Landmarks Key Locations

70 Recapitulation on spatial models
Transformations between models: “vectorization” of raster images (costly) topology toward spaghetti (easy) spaghetti toward topology (possible but costly) The vector model is the most widely used, mainly in its topological form; it is useful to integrate raster and vector

71 Spatial Analysis: Query
select features by their attributes: “find all districts with literacy rates < 60%” select features by geographic relationships “find all family planning clinics within this district” combined attributes/geographic queries “find all villages within 10km of a health facility that have high child mortality” Query operations are based on the SQL (Structured Query Language) concept
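The attribute query and the combined query can be sketched in a few lines. Everything specific here is hypothetical: the sample districts, villages and facility, the use of planar coordinates in kilometres, and the child-mortality threshold of 100 standing in for "high".

```python
import math

DISTRICTS = [
    {"name": "North", "literacy": 55.0},
    {"name": "South", "literacy": 72.5},
]
VILLAGES = [
    {"name": "V1", "x": 0.0, "y": 2.0, "child_mortality": 120},
    {"name": "V2", "x": 30.0, "y": 0.0, "child_mortality": 130},
    {"name": "V3", "x": 3.0, "y": 4.0, "child_mortality": 40},
]
FACILITIES = [{"name": "H1", "x": 6.0, "y": 8.0}]


def select_by_attribute(features, predicate):
    """Attribute query: keep the features for which predicate is true."""
    return [f for f in features if predicate(f)]


def within_distance(feature, targets, max_km):
    """Spatial test: is the feature within max_km of any target (planar distance)?"""
    return any(
        math.hypot(feature["x"] - t["x"], feature["y"] - t["y"]) <= max_km
        for t in targets
    )


def combined_query(villages, facilities, max_km=10.0, mortality_threshold=100):
    """Villages within max_km of a facility that also exceed the mortality threshold."""
    return [
        v["name"]
        for v in villages
        if within_distance(v, facilities, max_km)
        and v["child_mortality"] > mortality_threshold
    ]
```

The attribute part plays the role of an SQL WHERE clause; the spatial part is what a GIS adds on top of it.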

72 Examples: What is at…? Features that meet a set of criteria

73 Spatial Analysis (cont.)
Buffer: find all settlements that are more than 10km from a health clinic Point-in-polygon operations: identify for all villages into which vegetation zone they fall Polygon overlay: combine administrative records with health district data Network operations: find the shortest route from village to hospital
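As an illustration of a network operation, the sketch below finds the shortest route from a village to a hospital with Dijkstra's algorithm. The road network and its lengths are made up, and Dijkstra's algorithm is one common choice rather than something the presentation prescribes.

```python
import heapq

# Hypothetical road segments: (place, place, length in km), usable in both directions.
ROADS = [
    ("village", "junction", 4.0),
    ("village", "market", 2.0),
    ("market", "junction", 1.0),
    ("market", "hospital", 7.0),
    ("junction", "hospital", 3.0),
]


def build_graph(roads):
    """Build an undirected adjacency dictionary from the road segments."""
    graph = {}
    for a, b, length in roads:
        graph.setdefault(a, {})[b] = length
        graph.setdefault(b, {})[a] = length
    return graph


def shortest_route(graph, start, goal):
    """Return (total length, list of places) for the shortest route, or None."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, length in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (dist + length, neighbor, path + [neighbor]))
    return None
```

On this small network the route through the market and the junction (6 km) beats the direct village-to-junction road (7 km in total).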

74 Modeling/Geoprocessing
modeling: identify or predict a process that has created or will create a certain spatial pattern diffusion: how is the epidemic spreading in the province? interaction: where do people migrate to? what-if scenarios: if the dam is built, how many people will be displaced?

75 Spatial relationships
Logical connections between spatial objects represented by points, lines and polygons e.g., - point-in-polygon - line-line - polygon-polygon Fundamental Questions addressed by GIS What is in a particular place? Find the position of some object What is the area of, the distance from specific object? Find patterns- by looking at the distribution of features on the map instead of just an individual feature Combination of data layers Point in polygon operation Line in polygon operation Polygon overlay

76 Spatial Operations “adjacent to” “connected to” “near to”
“intersects with” “within” “overlaps” etc.

77 “is nearest to” Point/point
Which family planning clinic is closest to the village? Point/line Which road is nearest to the village Same with other combinations of spatial features
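Both "nearest" questions reduce to distance comparisons. The sketch below handles the point/point and point/line cases with planar coordinates; the names and sample data are hypothetical, and roads are treated as polylines made of straight segments.

```python
import math


def nearest_point(origin, candidates):
    """Point/point: name of the candidate point closest to origin."""
    return min(
        candidates,
        key=lambda name: math.hypot(
            candidates[name][0] - origin[0], candidates[name][1] - origin[1]
        ),
    )


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_line(origin, lines):
    """Point/line: name of the polyline closest to origin."""
    def distance(name):
        pts = lines[name]
        return min(point_segment_distance(origin, a, b) for a, b in zip(pts, pts[1:]))

    return min(lines, key=distance)
```

For a village at (2, 3), a clinic at (2, 8) is closer than one at (10, 3), and a road passing through (0, 0) and (4, 0) is closer than one running along y = 10.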

78 “is nearest to”: Thiessen Polygons

79 “is near to”: Buffer Operations
Point buffer Affected area around a polluting facility Catchment area of a water source

80 Buffer Operations Line buffer
How many people live near the polluted river? What is the area impacted by highway noise

81 Buffer Operations Polygon buffer
Area around a reservoir where development should not be permitted

82 “ is within”: point in polygon
Which of the cholera cases are within the containment area

83 Problem: We may have a set of point coordinates representing clusters from a demographic survey and we would like to combine the survey information with data from the census that is available by enumeration areas. Solution: “Point-in-Polygon” operation will identify for each point the EA area into which it falls and will attach the census data to the attribute record of that survey point.
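The point-in-polygon solution can be sketched as a small join. The EA polygons, census figures and survey clusters below are invented for the example, and the ray-casting test is one standard way to decide whether a point falls inside a polygon.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon given by ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical enumeration areas with one census attribute each.
EAS = {
    "EA001": {"ring": [(0, 0), (5, 0), (5, 5), (0, 5)], "pop": 500},
    "EA002": {"ring": [(5, 0), (10, 0), (10, 5), (5, 5)], "pop": 700},
}

# Hypothetical survey clusters: id -> (x, y).
CLUSTERS = {"c1": (2, 2), "c2": (7, 3), "c3": (12, 1)}


def attach_census(clusters, eas):
    """For each survey point, find its EA and copy the census data onto its record."""
    joined = {}
    for cluster_id, (x, y) in clusters.items():
        record = {"ea_code": None, "pop": None}
        for ea_code, ea in eas.items():
            if point_in_polygon(x, y, ea["ring"]):
                record = {"ea_code": ea_code, "pop": ea["pop"]}
                break
        joined[cluster_id] = record
    return joined
```

Points that fall outside every EA, like c3 here, keep empty census fields, which is a useful check on coordinate quality.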

84 Polygon Overlay

85 “overlaps”: Polygon overlay

86 Data Layers A GIS combines layers of information about a place to give you a better understanding of that place. What layers of information you combine depends on your purpose. GIS uses geography, or space, as the common key element between datasets (space as an indexing system). Information is linked only if it relates to the same geographic area.

87 Spatial aggregation Example of Spatial aggregation:
fusion of many provinces constituting an economic region

88 Spatial data transformation: interpolation
Example 1: Based on a set of station precipitation surface estimates, we can create a raster surface that shows rainfall in the entire region 13.5 20.1 26.0 27.2 12.7 15.9 24.5 26.1
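One simple way to turn station values into a raster surface is inverse distance weighting (IDW). The slide does not say which interpolation method was used, so IDW is an assumption here, as are the station coordinates; only the eight precipitation values come from the slide.

```python
import math

# (x, y, value): coordinates are hypothetical, values are those shown on the slide.
STATIONS = [
    (1, 1, 13.5), (3, 1, 20.1), (5, 1, 26.0), (7, 1, 27.2),
    (1, 3, 12.7), (3, 3, 15.9), (5, 3, 24.5), (7, 3, 26.1),
]


def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y)."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a station: use its value
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def interpolate_grid(stations, ncols, nrows, cell_size=1.0):
    """Estimate a value at every cell center; row 0 is the top row."""
    return [
        [
            idw((col + 0.5) * cell_size, (nrows - row - 0.5) * cell_size, stations)
            for col in range(ncols)
        ]
        for row in range(nrows)
    ]
```

Calling interpolate_grid(STATIONS, 8, 4) gives a 4 by 8 surface whose values stay between the lowest and highest station readings, since each estimate is a weighted average.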

89 GIS capabilities: Visualization

90 Implementing a GIS Consider the strategic purpose
Plan for the planning Determine technology requirements Determine the end products Define the system scope Create a data design Choose a data model Determine system requirements Analyze benefits and costs Make an implementation plan Source: Thinking About GIS, Third Edition Geographic Information System Planning for Managers

91 GIS: Enables us to handle very large amounts of data
Example: census data – thousands of EAs – hundreds of variables – many complementary data layers (roads, rivers, public facilities) Example: remote sensing – satellites send huge amounts of data that need to be processed, interpreted and stored

92 GIS: Helps to make data re-usable and useful to many more users
Census geography – EA maps do not have to be redrawn every time, only updated – census information can be used for many more applications – data sharing among agencies

93 In Conclusion GIS for inventory/visualization
GIS creates maps from data pulled from databases anytime to any scale for anyone GIS for database management GIS for spatial analysis/modeling GIS a tool to query, analyze, and map data in support of the decision making process. GIS can be viewed three ways: The Database View The Map View The Model View Improve Organizational Integration A GIS can link data sets together by common locational data, such as addresses, which help departments and agencies share their data. By creating a common database, data can be collected once and used many times.

94 What is Not GIS GPS – Global Positioning System …not just software!
…not just for making maps! Maps are an input data to and a “product” of a GIS A way to visualize the analysis

95 Literature related to Census Mapping & GIS
US National Research Council: Tools and Methods for Estimating Populations At Risk
David Martin (1996) Geographic Information Systems: Socioeconomic Applications
Longley et al., Wiley (2005) Geographic Information Systems and Science, second edition
ESRI Press: Unlocking the Census with GIS
Mapping the Census 2000

96 Contact Information: Demographic Statistics Section UN Statistics Division New York

97 Compromise projections
Compromise projections: Do not preserve any property but represent a good compromise between the different objectives. E.g. Robinson’s projection of the world

98 Vector to Raster Conversion: Polygons
b a c

99 Vector to Raster Conversion: Lines

100 Raster to Vector Conversion: Polygons

101 Raster to Vector Conversion: Polygons

