University of California – Santa Barbara www.nceas.ucsb.edu Rick Reeves / March 17, 2005 Geospatial Data and Spatial Data Analysis Tools For Ecologists.


1 University of California – Santa Barbara www.nceas.ucsb.edu Rick Reeves / March 17, 2005 Geospatial Data and Spatial Data Analysis Tools For Ecologists

2 Presentation Goals Overview: Geospatial Data Analysis Defining and distinguishing between spatial, geospatial, geographic data Addressing the particular attributes of geospatial data Inventory of Geospatial Data Types Primary data types and common sources for data Survey of Geoprocessing Software Tools Key issues driving choice of geospatial processing software A Tour of NCEAS Scientific Computing Web Site Spatial Datasets, Tools, Tutorials, and Project Archives Some Examples: Geospatial Data Analysis at NCEAS From the Annals of the NCEAS Scientific Programmer: ‘Real World’ solutions to Ecological research challenges

3 Meet the Scientific Programmer Rick’s Academic and Professional Background Undergraduate: Environmental Remote Sensing Graduate: Spatial Operations Research / Location-Allocation Heuristic Development Spatial Modeling branch of Geographic Data Analysis Problem Domain: Transportation and Facility Location within networks Professional: Software Development, geospatial database development, training curriculum development

4 Spatial Data: A Hierarchical Definition Spatial Data Observations are distributed in multidimensional space X / Y / Z coordinates attached to each data element Geospatial Data Spatial Data with attached Geographic coordinates Latitude / Longitude, UTM Optional: data subjected to a map projection transformation Geographic Data Geospatial Data that captures ‘Earth System’ phenomena Terrain height Drainage Network Land surface cover or urban Land Use Meteorological / climate data forecasts Ecologists may work with any or all during a project
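
As a small illustration of attaching geographic coordinates to spatial data, the sketch below derives the UTM zone for a longitude using the standard 6-degree zone layout. The function names are hypothetical, and the special zone exceptions around Norway and Svalbard are deliberately ignored.

```python
import math


def utm_zone(lon_deg: float) -> int:
    """Return the standard 6-degree UTM zone number (1-60) for a longitude.

    Assumption: the Norway/Svalbard zone exceptions are not handled.
    """
    # Wrap the longitude into [-180, 180) so values such as 190 still work.
    lon = ((lon_deg + 180.0) % 360.0) - 180.0
    return int(math.floor((lon + 180.0) / 6.0)) + 1


def utm_hemisphere(lat_deg: float) -> str:
    """Return 'N' for the northern hemisphere (including the equator), else 'S'."""
    return "N" if lat_deg >= 0 else "S"


# Santa Barbara, California lies near 34.42 N, 119.70 W.
label = f"{utm_zone(-119.70)}{utm_hemisphere(34.42)}"
print(label)
```

A full latitude/longitude to UTM easting/northing conversion involves a map projection and is better left to a projection library; this sketch only selects the zone.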

5 Overview: Geospatial / Geographic Data Two Broad Primary Categories Raster: A multi dimensional, regularly-spaced grid of values (samples) Dimensions: Northing, Easting, Altitude, Time Examples: Satellite Image, Digital Terrain, land surface cover maps Vector: Three primary shapes stored in drawing-optimized format Point, Line, Polygon, (TIN, vector field) Thousands of datasets exist in hundreds of formats Remote Sensing Imagery / Digital Elevation Models Surface Features (political, physiographic) as points/lines/polygons Meteorological data (observed / forecasted (short-and long-term)) File format standards set by Industry, Government, user community Data Ingestion: First Step in Geospatial Analysis Data input / format conversion / spatial registration
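
The raster idea above (a regularly spaced grid of samples tied to Northing and Easting) can be sketched as a grid plus an origin and a cell size. This is a minimal illustration, not any particular file format; the class name, field names, and coordinates are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RasterGrid:
    """A hypothetical north-up raster: values[row][col], row 0 is northernmost."""
    x_min: float       # easting of the grid's west edge
    y_max: float       # northing of the grid's north edge
    cell_size: float   # width and height of one square cell, in map units
    values: List[List[float]]

    def cell_index(self, x: float, y: float) -> Optional[Tuple[int, int]]:
        """Map a coordinate to (row, col), or None if it falls outside the grid."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[0]):
            return row, col
        return None

    def sample(self, x: float, y: float) -> Optional[float]:
        """Return the cell value under a coordinate, or None outside the grid."""
        idx = self.cell_index(x, y)
        if idx is None:
            return None
        row, col = idx
        return self.values[row][col]


# A 2 x 3 grid of 30-unit cells with made-up values.
grid = RasterGrid(500000.0, 3810000.0, 30.0, [[10, 20, 30], [40, 50, 60]])
print(grid.sample(500045.0, 3809980.0))
```

Vector data, by contrast, would store explicit coordinate lists for each point, line, or polygon rather than an implicit grid.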

6 Geospatial Data Analysis Geospatial Information Analysis: 3 Categories From O’Sullivan & Unwin (2003) Spatial Data Manipulation: Investigate the relationships between geographic dataset layers Examples: ‘point-in-polygon’, buffer zones around spatial features GIS software typically used to view/ manipulate / create layers Spatial/Statistical Data Analysis: Descriptive and Explanatory: What is there? How do we categorize it? Data points treated as statistical ‘population’, compared to others Spatial Modeling: Construct models to explore and understand geospatial systems Based on ‘abstraction’ of domain-specific problem into a systems framework. Some examples: Predicting network flows; optimizing facility locations among demands Lessons learned building model as valuable as model’s ‘answers’
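
The 'point-in-polygon' and buffer examples mentioned above can be sketched in a few lines. The version below uses the common ray-casting test and a simple Euclidean distance buffer around a point feature; it is an illustrative sketch with hypothetical function names, not the algorithm of any specific GIS package, and points exactly on a polygon edge are not treated specially.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Ray-casting test: count edge crossings of a ray running east from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def within_buffer(p: Point, feature: Point, radius: float) -> bool:
    """True if p lies within a circular buffer of the given radius around feature."""
    return math.hypot(p[0] - feature[0], p[1] - feature[1]) <= radius


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon(2.0, 2.0, square), point_in_polygon(5.0, 2.0, square))
```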

7 The Challenge of Geospatial Analysis Geospatial Data violate some key statistical assumptions Must be addressed in the experimental design and sampling scheme Require specialized assessment techniques to factor out effects Spatial Autocorrelation Samples are NOT randomly selected from normally-distributed population In fact, nearby samples more likely to be similar than distant ones Autocorrelated data points introduce redundancy into the sample set Spatial Scaling AKA Modifiable Areal Unit Problem Statistical relationships in an area may change at different aggregations The placement of sampling grid can introduce artifacts Nonuniform sampling space, edge effects Geospatial Data Attributes have explanatory power Spatial relationships may be causes for observed phenomena
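
One common way to quantify the spatial autocorrelation described above is Moran's I, which is not named on the slide and is brought in here only as an illustration. The sketch assumes samples along a transect with binary weights linking immediate neighbors; the function names are hypothetical.

```python
from typing import List


def chain_weights(n: int) -> List[List[float]]:
    """Binary weights for n samples along a transect: 1 for adjacent pairs."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


def morans_i(values: List[float], weights: List[List[float]]) -> float:
    """Moran's I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    numerator = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denominator = sum(d * d for d in dev)
    return (n / w_total) * (numerator / denominator)


w = chain_weights(6)
clustered = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]    # similar values sit together
alternating = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]  # neighbors always differ

# Positive for the clustered transect, negative for the alternating one.
print(morans_i(clustered, w), morans_i(alternating, w))
```

A positive value indicates that nearby samples resemble each other, which is the redundancy the slide warns about.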

8 Selecting Geospatial Software Tools Geospatial software: layered software architecture Data layer: Efficiently store geospatial data Feature Set + spatial coordinates Analytic Layer: Spatial/statistical analysis algorithms Statistical packages increasingly contain geospatial analysis tools Visualization Layer: Creates data views (AKA maps) Geospatial tools broadly divided in two categories Geographic Information Systems (GIS) Three software layers are each extensive, ‘feature rich’ Geospatial Analysis Packages Data layer is ‘thinner’, Analytic layer ‘thicker’ Visualization layer built on existing data plotting tools

9 Geospatial Software Tools: GIS ‘Value Added’ Data layer is optimized for efficient geospatial data storage/processing Raster and Vector Data storage, ‘mixed mode’ operations Georeferencing tools for data layer projection, spatial registration Map Algebra tools foster analysis and creation of data layers Comprehensive cartographic tools for output map design
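
Map algebra, as mentioned above, creates a new data layer by applying an operation cell by cell to aligned raster layers. The sketch below assumes two small layers with identical dimensions and made-up values; `local_op` is a hypothetical helper, not a function from any GIS product.

```python
from typing import Callable, List

Layer = List[List[float]]


def local_op(a: Layer, b: Layer, fn: Callable[[float, float], float]) -> Layer:
    """Apply fn to each pair of co-located cells in two aligned layers."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("layers must have identical dimensions")
    return [[fn(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


elevation = [[850.0, 1020.0], [1500.0, 990.0]]  # meters, made-up values
forest = [[1.0, 1.0], [0.0, 1.0]]               # 1 = forest, 0 = other cover

# New layer: 1 where a cell is forested AND at or above 1000 m, else 0.
high_forest = local_op(
    elevation, forest, lambda e, f: 1.0 if e >= 1000.0 and f == 1.0 else 0.0
)
print(high_forest)
```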

10 Geospatial Software Tools: GIS Caveats Underdeveloped geostatistical processing tools Vendors pressured to include them in product Yet validation data and algorithm details not available Often, these are critical tools for ecological analysis Steep Learning Curve Identifying, mastering ‘essential’ features a challenge Cost: GIS Software can be expensive Upfront purchase and yearly license fees Time investment in training and data maintenance Workload If non-GIS must be used for part of analysis, time must be spent moving between s/w packages

12 Geospatial Software Tools: Choosing Some Suggested Selection Criteria Research Objectives should drive choice of tools Identify the project’s core geospatial processing needs Platform Flexibility Select tools supported on multiple platforms (hardware/OpSys) Widely supported/used platforms foster collaboration Solution ‘Visibility’ Can you obtain the details of the algorithm? Does the community recognize the accuracy of the algorithm? Costs of implementing your research idea in software Scripted solutions using integrated environments are best R, SAS, MATLAB Avoid development in high-level programming languages

13 Geospatial Software Tools: Choosing Select GIS for core needs: Construct, compare, create multiple spatial data layers Simultaneously analyzing vector and raster data Creating detailed production quality study site maps Your data is exclusively in the GIS product format You require spatial analysis tools unavailable outside GIS Select Geospatial Analysis tools for core needs: Spatial/Statistical data analysis is the focus Your mapping requirements are modest two-dimensional data plots with geographic coordinates, legend You need in-depth understanding of algorithms used Or, you wish to extend / modify the algorithms

14 Sources for Geospatial Software Tools Commercial Software Products For-profit corporations sell or license their software Major players produce comprehensive products ESRI (ArcGIS) is the dominant GIS vendor Their goal: Provide solution for every geospatial application Other vendors offer tailored solutions Examples: ENVI / IDL, ERDAS: Remote Sensing oriented GIS Example: S-PLUS Spatial Statistics: Geospatial statistics and spatial data visualization enhancements to statistical package Example: MATLAB has mapping and image processing toolkits Example: SAS offers GIS, geospatial software tools Commercial products often drive geospatial data formats Example: ESRI Shape File, ERDAS IMG file

15 Sources for Geospatial Software Tools Open Source Software Broad-based effort by worldwide scientific and research community Distributed under General Public License (GPL) Software development and maintenance by the user community Most significant geospatial analysis products: R, GRASS GIS Examples of others: PostGIS, GDAL libraries Visit FreeGIS.org, or the open software foundation sites.

16 Tradeoffs: Commercial GIS Software Centralized documentation and product support….. At a price of $100s to $1000s per year Comprehensive, integrated software product Data/Analytic/Visualization layers populated w/ features Steep learning curve: Where are my ‘essential features?’ Training always available – at a cost…. Details of proprietary geospatial algorithms usually unavailable

17 Tradeoffs: Open Source GIS Software Open Source Software Distributed under General Public License (GPL) Software development and maintenance by the user community Most significant geospatial analysis products: R, GRASS GIS Many applications available via the Internet but…. Quality, features, support, and documentation are inconsistent Algorithms and even source code are freely available Open Source software drawbacks are shrinking as user support community evolves and matures But active participation in the community is advised for those wishing to stay technically proficient

18 Sources for Geospatial Data Government Agencies National Mapping and Survey Agencies: surface cover data USGS Research Centers: Climate forecasting models NOAA, NASA, NCDC For-Profit Corporations The highest-quality UNCLASSIFIED imagery now acquired by the private sector Sometimes, no-cost government data is resold to public Data widely available via the Internet Many data sets available at no- or low-cost Notable Exception: Satellite Remote Sensing data Some discounts available to education and/or research entities The best sites allow ‘search by geographic coordinates’ Examples from NCEAS Scientific Computing web site

19 Popular Geospatial Data Formats Meteorological and Climatological Data Historical measurements Short-term model-based forecasts (3 – 10 days from now) Long-term predictions (10 – 100 years): General Circulation Models Widely-Used Formats: Gridded Binary (GRIB), NetCDF Political and Physiographic features Country Boundaries Road Networks Drainage Networks Widely-Used Formats: Digital Line Graphs (DLG), ESRI Shape Files (.shp) Most GIS/Geospatial packages ingest these formats Or conversion utilities are available to ingest them

20 Popular Geospatial Data Formats Remote Sensing Imagery Many operational systems provide many kinds of images Multispectral Imagery: Landsat, SPOT, IKONOS Data Formats tend to be sensor-specific Most GIS can ingest most imagery types Portal sites Commercial: http://www.vterrain.org/Imagery/commercial.html Govt: http://www.nationalgeographic.com/maps/map_links.html Digital Terrain Models Raster Grid datasets containing elevation measurements Available for complete Earth land surface Primary format: USGS Digital Elevation Model (DEM) AKA National Elevation Dataset (NED) Portal sites: USGS: http://gisdata.usgs.net/Website/Seamless/ Terrainmap.org: http://www.terrainmap.org/

21 Tour of the Scientific Computing Web Site Links to Data Sources Links to Geospatial Software Sources Links to Tutorials and Research Papers Archive of NCEAS Research Projects http://www.nceas.ucsb.edu/scicomp

22 Example: Spatial Modeling: Optimization Route vehicles along network using environmental costs as a metric Simultaneously locate facilities along shipment routes that mitigate environmental costs Optimal Location of species reserve sites Develop and compare performance of alternate solution methods Mathematically optimal but operationally impractical Heuristically derived Near-optimal, usable solution
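
The contrast above between a mathematically optimal method and a heuristic one can be illustrated with a toy facility-location (p-median) problem. This is a sketch under stated assumptions: straight-line distances instead of network costs, made-up coordinates, and a simple greedy heuristic that is not claimed to be the method used in the projects described here. It requires Python 3.8 or later for `math.dist`.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def total_cost(demands: Sequence[Point], sites: Sequence[Point]) -> float:
    """Sum of each demand point's distance to its nearest open site."""
    return sum(min(math.dist(d, s) for s in sites) for d in demands)


def exhaustive_p_median(demands, candidates, p) -> Tuple[Point, ...]:
    """Optimal by brute force: evaluates every combination of p candidates."""
    return min(combinations(candidates, p), key=lambda s: total_cost(demands, s))


def greedy_p_median(demands, candidates, p) -> List[Point]:
    """Heuristic: repeatedly open the site that lowers total cost the most."""
    chosen: List[Point] = []
    remaining = list(candidates)
    for _ in range(p):
        best = min(remaining, key=lambda c: total_cost(demands, chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


demands = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
candidates = [(0, 0), (5, 5), (10, 10), (1, 1)]

optimal = exhaustive_p_median(demands, candidates, 2)
greedy = greedy_p_median(demands, candidates, 2)
print(total_cost(demands, optimal), total_cost(demands, greedy))
```

The exhaustive search examines every combination, which grows combinatorially with the number of candidate sites, while the greedy pass only evaluates each remaining candidate once per opened site. The greedy result can never cost less than the exhaustive one and may cost noticeably more, which is why alternate solution methods are compared.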

23 Spatial Modeling: The Problem Domain

24 Geospatial Dataset: Routes + Locations

25 Spatial Model Solution: Alternative Methods

26 Selecting Species Reserves Locations Dr. Ross Gerrard, UCSB Biogeography Lab, 1996

27 Example: Spatial Data Manipulation Elevation zone threshold calculation Digital Elevation Models for selected worldwide sites Classify sites into 100 meter ‘wide’ elevation zones General Circulation Model climate data extraction Identify, obtain, import GCM data files Import the data into GIS as raster grid Overlay point file, extract matching climate values
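
The two tasks above can be sketched in a few lines: assigning an elevation to a 100 meter zone, and overlaying points on a raster grid to extract the matching cell values. The grid origin, cell size, and values below are made up for illustration, and the function names are hypothetical; this is not the GIS workflow actually used in the project.

```python
import math
from typing import List, Optional, Sequence, Tuple


def elevation_zone(elev_m: float, width_m: float = 100.0) -> Tuple[float, float]:
    """Return the (lower, upper) bounds of the zone that contains elev_m."""
    lower = math.floor(elev_m / width_m) * width_m
    return lower, lower + width_m


def extract_at_points(
    grid: List[List[float]],
    x_min: float,
    y_max: float,
    cell_size: float,
    points: Sequence[Tuple[float, float]],
) -> List[Optional[float]]:
    """Look up the grid cell under each (x, y) point; None if outside the grid.

    Assumes a north-up grid where grid[0] is the northernmost row.
    """
    out: List[Optional[float]] = []
    for x, y in points:
        col = math.floor((x - x_min) / cell_size)
        row = math.floor((y_max - y) / cell_size)
        inside = 0 <= row < len(grid) and 0 <= col < len(grid[0])
        out.append(grid[row][col] if inside else None)
    return out


# A made-up 3 x 3 climate grid with 2.5-degree cells, west edge at 125 W,
# north edge at 40 N.
climate = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
sites = [(-119.7, 34.4), (-130.0, 34.0)]

print(elevation_zone(1234.5))
print(extract_at_points(climate, -125.0, 40.0, 2.5, sites))
```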

28 Digital Elevation Data Ingestion / Clipping

29 Elevation Zone Data Analysis

30 General Circulation Model data extraction

31 Spatial Analysis: ArcGIS and R Platforms ESRI Shape files exported to the R programming environment R Geostatistical and Spatial Analysis methods can then be applied

32 A Sampling: R Geospatial Analysis packages clim.pact: Climate data analysis and downscaling tools geoR: Geostatistical Data Analysis: variograms, etc. maptools: read/manipulate polygon data (ESRI .shp) shapefiles: read/manipulate ESRI shape files sgeostat: Geostatistical modeling code splancs: Spatial and space-time point patterns spatstat: Spatial Point Pattern analysis

33 Concluding thoughts NCEAS Associates extensively use geospatial data in many creative ways Geospatial Data Analysis requires specialized techniques GIS and geospatial analysis tools are available from commercial vendors and the open source community Choosing geospatial data and tools can be overwhelming and distract from the primary ‘science mission’ Scientific Programming Team has geospatial expertise, and can assist NCEAS Associates in this domain Coming soon: Short course on the R Programming Language!

