
Advanced GIS Topic 1 Starting Jan. 16, 2007. Outlines About the class setting Materials to be covered and scheduled Quick review of GIS basics First lab.


1 Advanced GIS Topic 1 Starting Jan. 16, 2007

2 Outlines About the class setting Materials to be covered and scheduled Quick review of GIS basics First lab

3 Materials to be covered and scheduled
- Review (weeks 1, 2): Geodatabase (lab 1)
- Spatial data analysis (weeks 3, 4, 5)
  - Vector data analysis (lab 2)
  - Raster data analysis: basic (lab 3), watershed delineation (lab 4)
- Geostatistical analysis (weeks 6, 7, 8): labs 5, 6
- 3-D analysis (weeks 10, 11, 12): labs 7, 8
- Geoprocessing (weeks 13, 14, 15): labs 9, 10

4 What is GIS? A computer system for - collecting, - storing, - manipulating, - analyzing, - displaying, and - querying geographically related information.

5 In general, a GIS covers 3 components
- Computer system
  - Hardware: computer, plotter, printer, digitizer
  - Software and appropriate procedures
- Spatially referenced or geographic data
- People to carry out various management and analysis tasks

6 Geographic Data Geospatial data tells you where it is and attribute data tells you what it is. Metadata describes both geospatial and attribute data. In GIS, geographic data is also called GIS data or spatial data.

7 Traditional method The traditional way to represent geographic data is the paper-based map: geology map, topographic map, city street map (we still use it a lot), ...

8 Characteristics of spatial data
“Mappable” characteristics:
- Location (coordinate system, to be covered in a later lecture)
- Size, measured as the length, area, or perimeter of the feature
- Shape, the geometric type of the feature (point, line, area)
- Discrete or continuous
- Spatial relationships

9 Discrete and continuous
- Discrete data are distinct features that have definite boundaries and identities: a district, houses, towns, agricultural fields, rivers, highways, ...
- Continuous data have no defined borders or distinctive values; instead, there is a transition from one value to another: temperature, precipitation, elevation, ...

10 GIS: a simplified view of the real world
- Discrete features
  - Points
  - Lines
  - Areas
  - Networks: a series of interconnecting lines (road network, river network, sewage network)
- Continuous features
  - Surfaces (elevation surface, temperature surface)

11 Problems caused by the simplified features still exist, but let’s live with them
- Dynamic nature (not static): forests grow, river channels change, cities expand or decline
- Identification of discrete and continuous features: should a road be a line or an area? (scale)
- Some things may not fit any type of feature: fuzzy boundaries, such as the transition area between woodland and grassland
Let’s not worry about these problems now! Just keep them in mind.

12 Points A point is a 0-dimensional object and has only the property of location (x,y). Points can be used to model features such as a well, building, power pole, sample location, etc. Other names for a point are vertex and node.

13 Lines A line is a one-dimensional object that has the property of length. Lines can be used to represent roads, streams, faults, dikes, marker beds, boundaries, contacts, etc. Lines are also called an edge, link, chain, or arc. In an ArcInfo coverage, an arc starts with a node, has zero or more vertices, and ends with a node.

14 Areas (Polygons) A polygon is a two-dimensional object with properties of area and perimeter. A polygon can represent a city, geologic formation, dike, lake, river, etc. Other names for a polygon are face and zone.
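
The length, area, and perimeter properties mentioned on the point, line, and polygon slides can be computed directly from vertex coordinates. The following is a minimal sketch, assuming planar (projected) coordinates and a simple polygon without holes; the function names and the sample `road` and `parcel` features are made up for illustration.

```python
import math

def line_length(vertices):
    """Length of a polyline given as an ordered list of (x, y) vertices."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))

def polygon_perimeter(ring):
    """Perimeter of a polygon whose ring does not repeat the first vertex."""
    return line_length(ring + [ring[0]])

def polygon_area(ring):
    """Area by the shoelace formula; abs() makes the vertex order irrelevant."""
    closed = ring + [ring[0]]
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
    return abs(twice_area) / 2.0

# Hypothetical features in map units.
road = [(0.0, 0.0), (3.0, 4.0)]
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]

print(line_length(road), polygon_perimeter(parcel), polygon_area(parcel))
```

A point has no such measures: it carries only its (x, y) location.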

15 Topology needed
- A collection of numeric data which clearly describes adjacency, containment (coincidence), and connectivity between map features and which can be stored and manipulated by a computer
- A set of rules on how objects relate to each other
- Major difference in file formats
- Higher level objects have special topology rules

16

17 How Topology Works We previously discussed that lines represent linear features, or borders for area features. We also said that every line starts and ends with a node, and has intermediate shape points called vertices to define the shape of the line or border. So when you think about it, lines don’t really exist. They simply represent a relationship between two nodes and zero or more vertices. When two lines cross and form an intersection, they also have a node, since the intersection is the start of one line and the end of the other line. Topology describes the connectivity of the lines and nodes. So for our example, lines A and B are connected by node b: line A goes from node a to node b, and line B goes from node b to node c. Now, we can create a whole string of lines and put them together into an area too. Just like a line, polygons don’t really exist. They simply represent the relationship among lines, which in turn represent the relationship among points.
[Figure: node, line, and polygon; nodes a, b, c joined by lines A and B.] ©Arthur J. Lembo Cornell University

18 How Topology Works Now we have described our location (with x,y coordinates), and our connectivity. What if we had two polygons P1 and P2, could we define the adjacency? Yes, here is how: Line 1 goes from node a to node b. Line 2 goes from node a to node b. Line 3 goes from node b to node a. Polygon P1 is to the left of line 2, and to the right of line 1. Polygon P2 is to the right of line 2, and to the right of line 3. So, we can create a table that “clearly describes location, adjacency, connectivity and containment”, or more specifically, a topology table.
Line  FromNode  ToNode  LeftPolygon  RightPolygon
1     a         b       0            P1
2     a         b       P1           P2
3     b         a       0            P2
Polygon  Lines
P1       1,2
P2       2,3
[Figure: nodes a and b joined by lines 1, 2, and 3, which bound polygons P1 and P2.] ©Arthur J. Lembo Cornell University

19 Traversing Topology Without looking at the picture, you can answer these questions from the table:
- Where is node a? No problem. It has an x,y coordinate.
- What polygon is P1 next to, and where are they adjacent? P1 is next to P2 because line 2 has polygon P1 to the left and P2 to the right. This is adjacency.
- How do I traverse from node b to node a, and then back to node b? Easy! Take line 3 to node a, and you have a choice to take either line 1 or line 2 back to node b. This is connectivity.
- What lines does polygon P1 fall inside of? Easy! Polygon P1 is contained by lines 1 and 2. This is containment.
Line  FromNode  ToNode  LeftPolygon  RightPolygon
1     a         b       0            P1
2     a         b       P1           P2
3     b         a       0            P2
Polygon  Lines
P1       1,2
P2       2,3
[Figure: nodes a and b joined by lines 1, 2, and 3, which bound polygons P1 and P2.] ©Arthur J. Lembo Cornell University
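
The table lookups described above can be sketched in a few lines. This is only an illustration, assuming the line table is held in a plain dictionary; the function names are invented here and are not part of any GIS package.

```python
# Line table from the slide: line -> (from_node, to_node, left_polygon, right_polygon).
# "0" stands for the area outside all polygons, as in the slide's table.
ARCS = {
    1: ("a", "b", "0", "P1"),
    2: ("a", "b", "P1", "P2"),
    3: ("b", "a", "0", "P2"),
}

def adjacent_polygons(polygon):
    """Adjacency: (neighbor, shared line) pairs for the given polygon."""
    result = []
    for line, (_, _, left, right) in ARCS.items():
        if polygon in (left, right):
            other = right if left == polygon else left
            if other != "0":  # skip the outside area
                result.append((other, line))
    return result

def bounding_lines(polygon):
    """Containment: lines that have the polygon on their left or right side."""
    return [line for line, (_, _, left, right) in ARCS.items()
            if polygon in (left, right)]

def lines_from(start, end):
    """Connectivity: lines that run from node `start` to node `end`."""
    return [line for line, (f, t, _, _) in ARCS.items() if (f, t) == (start, end)]

print(adjacent_polygons("P1"))
print(bounding_lines("P1"))
print(lines_from("b", "a"), lines_from("a", "b"))
```

None of these functions touches a coordinate: adjacency, containment, and connectivity all come from the table alone.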

20 Topology © Paul Bolstad, GIS Fundamentals

21 Two basic data models to represent these features
Raster spatial data model
- Defines space as an array of equally sized cells arranged in rows and columns. Each cell contains an attribute value and location coordinates
- Individual cells are the building blocks for creating images of point, line, area, network, and surface features
- Continuous raster: numeric values range smoothly from one location to another, for example, DEM, temperature, remote sensing images, etc.
- Discrete raster: relatively few possible values, which repeat themselves in adjacent cells, for example, land use, soil types, etc.
Vector spatial data model
- Uses x-, y- coordinates to represent point, line, area, network, and surface features
- A point is a single coordinate pair; lines and polygons are ordered lists of vertices; attributes are associated with each feature
- Usually represents discrete features
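
To make the two models concrete, here is a small sketch of how each might be held in memory. The cell size, the land-use codes, and the feature coordinates are all hypothetical; real GIS formats add headers, spatial references, and indexes on top of this.

```python
from collections import Counter

CELL_SIZE = 100.0  # hypothetical cell size in map units

# Discrete raster: rows and columns of cells, one land-use code per cell.
# Codes are made up: "G" grass, "T" trees, "W" water.
land_use = [
    ["G", "G", "W", "W"],
    ["G", "T", "T", "W"],
    ["G", "G", "T", "W"],
]

def class_areas(grid, cell_size):
    """Area per class = number of cells in the class times the area of one cell."""
    counts = Counter(value for row in grid for value in row)
    return {code: n * cell_size * cell_size for code, n in counts.items()}

# Vector: coordinates plus attributes attached to each feature.
house = {"geometry": (250.0, 150.0), "attributes": {"type": "house"}}
river = {"geometry": [(0.0, 300.0), (200.0, 200.0), (400.0, 0.0)],
         "attributes": {"name": "river"}}

print(class_areas(land_use, CELL_SIZE))
print(house["geometry"], len(river["geometry"]))
```

In the raster every cell stores a value, whether or not anything interesting is there; in the vector model only the features themselves are stored.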

22 Digital spatial data [Figure: the real world represented as raster and as vector data.] Source: Defense Mapping School, National Imagery and Mapping Agency

23 Raster and Vector Data Models [Figure: the same real-world scene (river, house, trees) drawn as a vector representation on X and Y axes and as a raster representation on a grid of coded cells.] Source: Defense Mapping School, National Imagery and Mapping Agency

24 Example: Discrete raster

25 Example: continuous raster (Xie et al. 2005)

26 Real world, raster, and vector representations (Heywood et al. 2006)

27 Effects of changing resolution Heywood et al. 2006

28 Vector – Advantages and Disadvantages Advantages Good representation of reality Compact data structure Topology can be described in a network Accurate graphics Disadvantages Complex data structures Simulation may be difficult Some spatial analysis is difficult or impossible to perform

29 Raster – Advantages and Disadvantages Advantages Simple data structure Easy overlay Various kinds of spatial analysis Uniform size and shape Cheaper technology Disadvantages Large amount of data Less “pretty” Projection transformation is difficult Different scales between layers can be a nightmare May lose information due to generalization

30 GIS data formats (files) Shapefiles Coverages TIN (e.g. elevation can be stored as TIN) Triangulated Irregular Network Grid (e.g. elevation can be stored as Grid) Image (e.g. elevation can be stored as image, all remote sensing images) Vector data Raster data

31 Shape Files Nontopological. Advantages: no overhead to process topology. Disadvantages: polygons are double digitized, no topologic data checking. At least 3 files: .shp, .shx, .dbf

32 Coverages Original ArcInfo Format Directory With Several Files Database Files are stored in the Info Directory Uses Arc Node Topology Containment (coincident) Connectivity Adjacency

33 TIN A triangulated irregular network (TIN) is a data model that is used to represent three-dimensional objects. In this case, x, y, and z values represent points. Using methods of computational geometry, the points are connected into what is called a triangulation, forming a network of triangles. The lines of the triangles are called edges, and the interior area is called a face, or facet. While the TIN model is somewhat more complex than the simple point, line, and polygon vector model, or the raster model, it is actually quite useful for representing elevations. For example, a raster grid would require grid cells to cover the entire surface of a geographic area. Also, if we wanted to show great detail, we would have to have small grid cells. Now, if the land area is relatively flat, we would still need the small grid cells. However, with a TIN we would not have to include so many points on the flat areas, but could add more points on the steep areas where we want to show greater detail. The illustration shows how we can create a TIN of the terrain around Ithaca, NY. First, a series of elevation points is created. Second, TIN faces are created from the elevation data. Third, the faces are shaded in to give the impression of a 3D surface. ©Arthur J. Lembo Cornell University

34 Components of a TIN Nodes Edges Triangles Hull Topology ©Arthur J. Lembo Cornell University
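
The TIN components listed above can be illustrated with a toy data structure. This is a sketch under simple assumptions: the four elevation points and the two triangles are invented, and the triangulation is given by hand rather than computed (a real TIN would normally be built with a triangulation algorithm such as Delaunay).

```python
from collections import Counter

# Nodes: node id -> (x, y, z). Values are hypothetical elevation points.
NODES = {
    0: (0.0, 0.0, 120.0),
    1: (10.0, 0.0, 125.0),
    2: (10.0, 10.0, 140.0),
    3: (0.0, 10.0, 130.0),
}

# Triangles (faces): each is a triple of node ids.
TRIANGLES = [(0, 1, 2), (0, 2, 3)]

def triangle_edges(tri):
    """The three edges of a triangle, each stored with the smaller node id first."""
    a, b, c = tri
    return [tuple(sorted(edge)) for edge in ((a, b), (b, c), (c, a))]

def edge_counts(triangles):
    """How many triangles use each edge."""
    return Counter(edge for tri in triangles for edge in triangle_edges(tri))

def hull_edges(triangles):
    """Hull: edges that belong to exactly one triangle."""
    return sorted(e for e, n in edge_counts(triangles).items() if n == 1)

def shared_edges(triangles):
    """Topology: edges shared by two triangles, i.e. where faces are adjacent."""
    return sorted(e for e, n in edge_counts(triangles).items() if n == 2)

print(hull_edges(TRIANGLES))
print(shared_edges(TRIANGLES))
```

Edges, hull, and face adjacency are all derived from the node and triangle lists, which is why those two lists are the core of the structure.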

35 Grid Properties Each Grid Cell holds one value even if it is empty. A cell can hold an index standing for an attribute. Cell resolution is given as its size on the ground. Point and Lines move to the center of the cell. Minimum line width is one cell. Rasters are easy to read and write, and easy to draw on the screen.
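
The statements that cell resolution is the cell's size on the ground and that points move to the center of the cell can be shown with a little arithmetic. This sketch assumes a north-up grid whose origin is the upper-left corner; the extent values and function names are hypothetical.

```python
def cell_of(x, y, x_min, y_max, cell_size):
    """Row and column of the cell containing map coordinate (x, y).

    Rows count downward from the top edge (y_max); columns count
    rightward from the left edge (x_min).
    """
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col

def cell_center(row, col, x_min, y_max, cell_size):
    """Map coordinate of the center of a cell."""
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y

# Hypothetical grid: upper-left corner at (0, 300), 100-unit cells.
X_MIN, Y_MAX, CELL = 0.0, 300.0, 100.0

row, col = cell_of(230.0, 170.0, X_MIN, Y_MAX, CELL)
print(row, col, cell_center(row, col, X_MIN, Y_MAX, CELL))
```

Any point that falls in the same cell ends up at the same center coordinate, which is the loss of positional detail the raster model accepts.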

36 A new data model in ArcGIS
Geodatabase data model
- Uses a relational database that stores geographic data
  - Relational database: a type of database in which the data is organized across several tables. Tables are associated with each other through common fields. Data items can be recombined from different files.
- A container for storing spatial and attribute data and the relationships that exist among them
- Features and their associated attributes can be structured to work together as an integrated system using rules, relationships, and topological associations

37 Geodatabase components- vector data and table Primary (basic) components - feature classes, - feature datasets, - nonspatial tables. complex components building on the basic components: - topology, - relationship classes, - geometric networks

38 Geodatabase components- Raster data
- Raster data is only referenced in a personal geodatabase
- Raster data is physically stored in a multiuser geodatabase
- Raster datasets and raster catalogs
  - A raster dataset is created from one or more individual rasters. When creating a raster dataset from multiple rasters, the data is mosaicked, or aggregated, into a single, seamless dataset in which areas of overlap have been removed. The input rasters must be contiguous (adjacent) and have the same properties, including the same coordinate system, cell size, and data format. For each raster dataset (.img, grid, JPEG, MrSID, TIFF), ArcGIS creates an ERDAS IMAGINE file (.img).
  - A raster catalog is defined as a table in the geodatabase which you can view like any other table in ArcCatalog. Each raster in the catalog is represented by a row in the table. It contains a collection of rasters that can be noncontiguous, stored in different formats, and have other different properties. In order to view all the rasters in the catalog, they must have the same coordinate system and a common geographic extent.

39

40 Attribute data Attribute data describes the “what” of spatial data and is a list or table of data arranged as rows and columns
- Rows are records (map features). Each row represents a map feature, which has a unique label ID or object ID
- Columns are fields (characteristics)
- The intersection of a column and a row shows the value of an attribute, such as color, ownership, magnitude, classification, ...

41 Data types of attribute data: character, integer, floating, date. Each field must be defined with a data type, a data width, and a number of decimal places. The width refers to the number of spaces reserved for a field.
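
A field definition with a type, width, and number of decimal places can be sketched as a small class. This is an illustration only: the `FieldDef` class, the field names, and the formatting rule (numbers right-aligned, text left-aligned) are assumptions made here, not the storage rules of any particular GIS table format.

```python
from dataclasses import dataclass

@dataclass
class FieldDef:
    name: str
    dtype: str        # "character", "integer", "floating", or "date"
    width: int        # number of spaces reserved for the field
    decimals: int = 0

    def format(self, value):
        """Render a value into exactly `width` characters."""
        if self.dtype == "floating":
            text = f"{value:.{self.decimals}f}"
        else:
            text = str(value)
        if len(text) > self.width:
            raise ValueError(f"{text!r} does not fit in field {self.name} (width {self.width})")
        if self.dtype in ("integer", "floating"):
            return text.rjust(self.width)
        return text.ljust(self.width)

area = FieldDef("AREA", "floating", 10, 2)
owner = FieldDef("OWNER", "character", 8)

print(repr(area.format(1234.5)), repr(owner.format("Smith")))
```

A value that needs more characters than the reserved width cannot be stored as is, which is why the width has to be chosen when the field is defined.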

42

43 examples

44 A database needed If many fields are related to one record (feature ID), more tables are needed to store all the attributes; for example, a soil unit can have over 80 estimated physical and chemical properties. A database management system (DBMS) is needed to manage multiple tables. A database is a collection of interrelated tables in digital format. There are four types: flat file, hierarchical database, network database, and relational database. In GIS, we usually use a relational database.

45 [Figure: examples of flat file, hierarchical, network, and relational database designs.] PIN: parcel ID number. Zoning (zonecode): 1-residential, 2-commercial. (Chang, 2004)

46 Relational database A relational database is a collection of tables, also called relations, which can be connected to each other by keys. A primary key represents one or more attributes whose values can uniquely identify a record in a table. Its counterpart in another table for the purpose of linkage is called a foreign key.
Advantages
- Each table in the database can be prepared, maintained, and edited separately from other tables
- Efficient data management and processing, since linking tables for query and/or analysis is often temporary

47 Three tables linked by keys
Registration:
Student#  Class#
1022      101-07
1022      143-01
1022      159-02
4123      211-01
4123      211-02
4123      214-01
Students:
Student#  Advisor
1022      Jones
4123      Smith
Faculty:
Name   Room
Jones  412
Smith  216
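
The three linked tables can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the lowercase table and column names (`student_no`, `class_no`, and so on) are chosen here for illustration, and the data is the data shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE faculty (
    name TEXT PRIMARY KEY,                      -- primary key
    room TEXT
);
CREATE TABLE students (
    student_no TEXT PRIMARY KEY,                -- primary key
    advisor TEXT REFERENCES faculty(name)       -- foreign key to faculty
);
CREATE TABLE registration (
    student_no TEXT REFERENCES students(student_no),  -- foreign key to students
    class_no TEXT
);
INSERT INTO faculty VALUES ('Jones', '412'), ('Smith', '216');
INSERT INTO students VALUES ('1022', 'Jones'), ('4123', 'Smith');
INSERT INTO registration VALUES
    ('1022', '101-07'), ('1022', '143-01'), ('1022', '159-02'),
    ('4123', '211-01'), ('4123', '211-02'), ('4123', '214-01');
""")

# Follow the keys: registration -> students -> faculty.
rows = conn.execute("""
    SELECT r.class_no, s.advisor, f.room
    FROM registration AS r
    JOIN students AS s ON s.student_no = r.student_no
    JOIN faculty AS f ON f.name = s.advisor
    WHERE r.student_no = '4123'
    ORDER BY r.class_no
""").fetchall()

print(rows)
```

The advisor's room is stored once in the faculty table, yet it can be reported for every class a student registers for, because the keys link the tables at query time.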

48 Four tables linked by keys Chang, 2004

49 Relationship of those separate tables One record in one table related to one record in another table One record in one table related to many records in another table Many records in one table related to one record in another table Many records in one table related to many records in another table

50 Join and relate tables Once tables are separated as relational tables, two operations can be used to link those tables during query and analysis:
- Join brings together two tables based on a common key.
- Relate connects two tables (based on keys) but keeps the tables separate.
Keys do not have to have the same name but must be of the same data type.

51 One-to-One Join
Employee-id  Job
1            Digislave
2            Useless Supervisor
Employee-id  Name
1            Tom
2            John
Join Employee-id to Employee-id. After join:
Employee-id  Job                 Name
1            Digislave           Tom
2            Useless Supervisor  John

52 Many-to-One Join
Symbol  Description
Qa      Quaternary Alluvium
Qe      Quaternary Eolian
Pa      Permian Abo
Polygon ID  Symbol
1           Qa
2           Qa
3           Pa
4           Qe
After join on Symbol:
Polygon ID  Symbol  Description
1           Qa      Quaternary Alluvium
2           Qa      Quaternary Alluvium
3           Pa      Permian Abo
4           Qe      Quaternary Eolian
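
Both joins above follow the same lookup logic, sketched here with plain Python dictionaries. The `join` function and the lowercase field names are invented for illustration, and polygon 2 is given the symbol Qa as it appears in the joined table; ArcGIS performs the equivalent operation internally.

```python
def join(left_rows, right_rows, left_key, right_key):
    """Append the matching right-hand record to each left-hand record.

    Assumes each key value occurs at most once in right_rows, which is
    the one-to-one or many-to-one case.
    """
    lookup = {row[right_key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = lookup.get(row[left_key], {})
        extra = {k: v for k, v in match.items() if k != right_key}
        joined.append({**row, **extra})
    return joined

# Many-to-one: several polygons share one description record.
polygons = [
    {"polygon_id": 1, "symbol": "Qa"},
    {"polygon_id": 2, "symbol": "Qa"},
    {"polygon_id": 3, "symbol": "Pa"},
    {"polygon_id": 4, "symbol": "Qe"},
]
units = [
    {"symbol": "Qa", "description": "Quaternary Alluvium"},
    {"symbol": "Qe", "description": "Quaternary Eolian"},
    {"symbol": "Pa", "description": "Permian Abo"},
]

# One-to-one: each employee id occurs once in each table.
jobs = [{"employee_id": 1, "job": "Digislave"},
        {"employee_id": 2, "job": "Useless Supervisor"}]
names = [{"employee_id": 1, "name": "Tom"},
         {"employee_id": 2, "name": "John"}]

for row in join(polygons, units, "symbol", "symbol"):
    print(row)
for row in join(jobs, names, "employee_id", "employee_id"):
    print(row)
```

The joined result has exactly as many rows as the left-hand table, which is what makes a join safe for the layer attribute table.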

53 One-to-Many Relates
Formation            Symbol
Quaternary Alluvium  Qa
Permian Abo          Pa
Symbol  Mineral
Qa      Quartz
Pa      Quartz
Qa      Gypsum
Pa      Feldspar
If the tables are related on Symbol, selecting Polygon-id 1 will select the highlighted areas.

54 Many-to-Many Relates
Formation  Symbol
1          Qa
2
Symbol  Mineral
Qa      Quartz
Pa      Quartz
Qa      Gypsum
Pa      Feldspar
If the tables are related on Symbol, selecting Polygon-id 1 will select the highlighted areas.
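
A relate keeps the two tables separate and only looks up the matching records when a feature is selected. The sketch below uses the formation and mineral tables from the one-to-many slide; the function names and lowercase field names are made up for illustration.

```python
from collections import defaultdict

formations = [
    {"formation": "Quaternary Alluvium", "symbol": "Qa"},
    {"formation": "Permian Abo", "symbol": "Pa"},
]
minerals = [
    {"symbol": "Qa", "mineral": "Quartz"},
    {"symbol": "Pa", "mineral": "Quartz"},
    {"symbol": "Qa", "mineral": "Gypsum"},
    {"symbol": "Pa", "mineral": "Feldspar"},
]

def build_relate(rows, key):
    """Index the related table by key; neither table is modified."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index

minerals_by_symbol = build_relate(minerals, "symbol")

def related_minerals(formation_row):
    """Select a formation record and return all related mineral records."""
    return [r["mineral"] for r in minerals_by_symbol.get(formation_row["symbol"], [])]

for formation in formations:
    print(formation["formation"], related_minerals(formation))
```

Because one formation matches several mineral records, joining these tables would force a choice of which match to keep; the relate returns all of them instead.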

55 In ArcGIS Among those separate tables there is one and only one spatial table (or layer attribute table), which holds the spatial location and the relationship with the spatial data. The other tables are called nonspatial tables, and each can be either joined or related to the spatial table.
- Join tables when each record in the spatial table has no more than one matching record in the nonspatial table
  - One-to-one relation
  - Many-to-one relation
- Relate tables when each record in the spatial table has more than one matching record in the nonspatial table
  - One-to-many relation
  - Many-to-many relation
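
The join-or-relate rule above comes down to counting matches per key. The following is a sketch of that check, assuming the key values of both tables are available as simple lists; the function name `choose_link` is hypothetical.

```python
from collections import Counter

def choose_link(spatial_keys, nonspatial_keys):
    """Return "join" if no spatial record has more than one match, else "relate"."""
    matches = Counter(nonspatial_keys)
    most = max((matches[key] for key in spatial_keys), default=0)
    return "join" if most <= 1 else "relate"

# Many polygons to one description record: join.
print(choose_link(["Qa", "Qa", "Pa", "Qe"], ["Qa", "Qe", "Pa"]))

# One formation to many mineral records: relate.
print(choose_link(["Qa", "Pa"], ["Qa", "Pa", "Qa", "Pa"]))
```

Only duplicates in the nonspatial table matter: repeated keys in the spatial table (many-to-one) still allow a join.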

56

57 The joined table The joined table is only preserved within the map document (the tables remain separate on disk), and the join can be removed at any time.

58

59 Related tables The relate is only preserved within the map document (the tables remain separate on disk), and it can be removed at any time.

60 Geodatabase Before the geodatabase, the many GIS files (spatial data and nonspatial data) of one GIS project were stored separately, so a large GIS project could have hundreds of files. With a geodatabase, all GIS files (spatial data and nonspatial data) in a project can be stored in one geodatabase, using a relational database management system (RDBMS).

61 Types of geodatabases: personal, enterprise

62 Personal Geodatabase The personal geodatabase is stored as a file named filename.mdb that is browsable and editable in ArcGIS, and it can also be opened with Microsoft Access. It can be read by multiple people at the same time, but edited by only one person at a time. The maximum size is 2 GB.

63 Multiuser Geodatabase Multiuser (ArcSDE or enterprise) geodatabases are stored in IBM DB2, Informix, Oracle, or Microsoft SQL Server. A multiuser geodatabase can be edited through ArcSDE by many users at the same time and is suitable for large workgroups and enterprise GIS implementations. There is no size limit, and raster data is supported.

64 3-tier ArcSDE client/server architecture with both the ArcSDE and Oracle RDBMS running on the same server, which minimizes network traffic and client load while increasing the server load compared to 2-tier system, in which the clients directly connect to the RDBMS

65 Personal and Multiuser Geodatabase Comparison source: www.esri.com

66 What is metadata Meta is defined as a change or transformation. Data is described as the factual information used as a basis for reasoning. Put these two definitions together and metadata would literally mean "factual information used as a basis for reasoning which describes a change or transformation." In GIS, metadata is data about the data. It consists of information that describes spatial data and is used to provide documentation for data products. Metadata is the who, what, when, where, why, and how about every facet of the spatial data. According to the Federal Geographic Data Committee (FGDC), metadata is data about the content, quality, condition, and other characteristics of data.

67 Why use and create metadata To help organize and maintain an organization's spatial data - Employees may come and go but metadata can catalogue the changes and updates made to each spatial data set and how each employee implemented them To provide information to other organizations and clearinghouses to facilitate data sharing and transfer - It makes sense to share existing data sets rather than producing new ones if they are already available To document the history of a spatial data set - Metadata documents what changes have been made to each data set, such as changes in geographic projection, adding or deleting attributes, editing line intersections, or changing file formats. All of these could have an effect on data quality.

68 Metadata Should Include Data about
- Date of data collected
- Date of coverage generated
- Bounding coordinates
- Processing steps (software used, RMSE, etc.)
- From where original data came
- Who did processing
- Projection / coordinate system
- Datum
- Units
- Spatial scale
- Attribute definitions
- Who to contact for more information

69 Federal Geographic Data Committee’s (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM) The FGDC is developing the National Spatial Data Infrastructure (NSDI) in cooperation with organizations from State, local and tribal governments, the academic community, and the private sector. The NSDI encompasses policies, standards, and procedures for organizations to cooperatively produce and share geographic data. The objectives of the CSDGM are to provide a common set of terminology and definitions for the documentation of digital geospatial data.

70 CSDGM (FGDC-STD-001-1998)
Metadata =
- Identification_Information
- Data_Quality_Information
- Spatial_Data_Organization_Information
- Spatial_Reference_Information
- Entity_and_Attribute_Information
- Distribution_Information
- Metadata_Reference_Information
Connect to http://www.fgdc.gov/metadata/csdgm/
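
A quick way to see which of the seven sections a metadata record already covers is to check it against the section names. This is only a sketch: the record below is a made-up dictionary, and the check treats all seven sections alike. Which sections and elements are actually mandatory is defined by the standard itself and is not modeled here; a real compliance check would use a tool such as mp.

```python
# The seven top-level sections named on the slide.
CSDGM_SECTIONS = [
    "Identification_Information",
    "Data_Quality_Information",
    "Spatial_Data_Organization_Information",
    "Spatial_Reference_Information",
    "Entity_and_Attribute_Information",
    "Distribution_Information",
    "Metadata_Reference_Information",
]

def missing_sections(record):
    """Sections that are absent or empty in a metadata record (a dict)."""
    return [name for name in CSDGM_SECTIONS if not record.get(name)]

# Hypothetical, partly filled record.
record = {
    "Identification_Information": {"Title": "Hypothetical soil map"},
    "Spatial_Reference_Information": {"Datum": "NAD83"},
    "Metadata_Reference_Information": {"Contact": "GIS lab"},
}

print(missing_sections(record))
```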

71 Metadata tools
Metadata editors:
- tkme / USGS
- ArcCatalog / ESRI
- SMMS / Intergraph
- FGDCMETA / Illinois State Geological Survey
- xtme / USGS
Metadata utilities (check compliance and export to text, HTML, XML, or SGML):
- mp / USGS
- MP batch / Intergraph
- ArcCatalog powered by mp / ESRI
Metadata Server:
- Isite / FGDC
- GeoConnect Geodata Management Server / Intergraph
- ArcIMS Metadata Server / ESRI
mp: Metadata Parser

72 FGDC Clearinghouse The FGDC developed a clearinghouse that allows geospatial data creators to share their data. However, the FGDC Clearinghouse is not a data repository. The data contained within the clearinghouse is actually stored on computer servers maintained by individual contributors. This allows contributors to manage their own data.

73 Two Components The FGDC Clearinghouse consists of 6 gateways and 250 nodes A gateway is a point of entry into the FGDC Clearinghouse A clearinghouse node is a database that contains metadata records. Individual contributors maintain nodes Besides the FGDC Clearinghouse, there are a variety of other communities that use FGDC-compliant metadata as the basis of their data sharing services. These so-called clearinghouse communities are often developed because the participating organizations have data of similar or complementary types. http://clearinghouse1.fgdc.gov/

74 First lab Creating, editing, and managing geodatabase for ArcGIS 9

75 30 minutes 25 minutes 45 minutes 15 minutes

76 Copy the result map of your last step to your homework

77 Copy your exam questions and result to your homework

