1 A FRAMEWORK FOR GEOSPATIAL MODELING FROM SPARSE FIELD MEASUREMENTS USING IMAGE PROCESSING AND MACHINE LEARNING
Peter Bajcsy (1), Chulyun Kim (1), Jihua Wang (2) and Yu-Feng Lin (2)
(1) National Center for Supercomputing Applications (NCSA)
(2) Illinois State Water Survey (ISWS)
University of Illinois at Urbana-Champaign (UIUC)

2 Outline
Introduction
Problems Addressed by Spatial Pattern To Learn (SP2Learn)
SP2Learn Architecture and Functionality Overview
Running SP2Learn
Summary

3 Introduction

4 General Problem
Compute a set of geo-spatially dense, accurate predictions of variables, given a set of direct, geo-spatially sparse point measurements and auxiliary variables that have implicit relationships with the predicted variable.
Motivation: minimize the cost of taking direct point measurements, maximize the accuracy of predictions, and automate the discovery of relationships among direct field measurements and indirect variables.

5 Formulation
Input: sets of geo-spatially sparse variables {Vi{pij}}, dense auxiliary variables, and a priori tacit knowledge of experts
Output: geo-spatially dense (raster) {Ok}
Unknown: selection of methods, workflow of operations/methods, parameters of methods, relationships of auxiliary variables w.r.t. Ok, and a quantitative metric of output goodness
[Figure labels: p1j, p2j, V1 & V2, Interpolations, Mathematical models, Auxiliary Variables & Tacit Knowledge, O1]

6 Applied Problem
Recharge and Discharge Rate Prediction
[Figure labels: Discharged, Recharged, Water table elevation, Bedrock elevation]

7 Interdisciplinary Objectives
Ground Water (Hydrologic Science) View:
  Evaluation of Alternative Conceptual (implicit relationships) and Mathematical Models (explicit relationships)
  Accurate Prediction of Groundwater Recharge and Discharge Rates from a Limited Number of Field Measurements
Computer Science View:
  Computer-Assisted Learning to Assess Alternative Conceptual and Mathematical Models
  Optimization of Prediction Models From a Set of Geo-Spatially Sparse Point Measurements
[Figure label: DIALOG]

8 State-of-the-Art Results
Limited Spatial Resolution and Accuracy
[Figure panels: Min. Grid: 805 m x 805 m; Uniform Grid: 80 m x 80 m. Labels: Recharge zone, Discharge zone, Noisy pattern or weak R/D]

9 Existing Software for Groundwater and Surface Water Modeling
MODFLOW: a three-dimensional finite-difference ground-water model; freeware (2005)
PEST: software for model calibration, parameter estimation and predictive uncertainty analysis; freeware (2007); University of Queensland, Australia
Precipitation-Runoff Modeling System (PRMS): a deterministic, distributed-parameter modeling system developed to evaluate the impacts of various combinations of precipitation, climate, and land use on streamflow, sediment yields, and general basin hydrology; freeware (1996); USGS
Deep Percolation Model (DPM): facilitates estimation of ground-water recharge under a large range of climatic, landscape, and land-use and land-cover conditions; USGS

10 Related Work
Singh A. et al., “Expert-Driven ‘Perceptive’ Models for Reducing User Fatigue in an Interactive Hydrologic Model Calibration Framework”
[Figure: Conductivity (K) and Hydraulic heads (H) for the hypothetical aquifer]

11 Motivation
Ground Water (Hydrologic) Science: Currently, no single method can estimate R/D rates and patterns for all practical applications. Therefore, cross-analyzing results from various estimation methods and related field information is likely to be superior to using only a single estimation method.
Computer Science: It is currently impossible (a) to replace an expert who holds a lot of tacit domain knowledge with computer algorithms, or (b) for an expert to learn new I/O relationships from a plethora of possible variables and an extremely large space of processing methods and their parameters. Thus, assisting experts to discover, evaluate and validate new relationships in an iterative way will likely enable (a) better understanding of the underlying phenomena, and (b) more automated and cost-efficient predictions.

12 Problems Addressed by Spatial Pattern To Learn

13 Our Approach
Data-Driven Analyses to Test Alternative Models, and to Search the Space of Processing Operations and Their Parameters:
  Interpolation methods
  Mathematical models
  Image processing algorithms
  Machine learning algorithms
  Scalability of algorithms with large size data
Computer-Assisted Comparisons and Evaluations of Multiple Models and Sub-Optimal Solutions:
  Model/Solution Representation
  Closed Loop (Iterative) Workflows
  Human Computer Interfaces
Overall Approach: An Exploration Framework for a Class of Alternative Models/Hypotheses and Optimal Solutions

14 SP2Learn Problem Formulation
Given a set of geo-spatially sparse field measurements and auxiliary variables, derive an accurate, spatially dense R/D rate map by (a) using a physics-based model, (b) incorporating boundary conditions, and (c) exploring auxiliary variables that represent prior knowledge about R/D patterns but are missing from the physics-based model.

15 Challenges
(1) How to Recognize a ‘Meaningful’ Pattern in the Predicted Map?
(2) How to Quantify the Goodness of the Pattern?
Approach:
(1a) Recognize patterns by utilizing multiple image enhancement and segmentation techniques applied to R/D rate predictions
(1b) Introduce a relationship between the R/D pattern and auxiliary (a priori reference) information
(2a) Define goodness w.r.t. reference information using the expert’s selection of ‘meaningful’ relationships
(2b) Define goodness w.r.t. reference information using the complexity of machine learning

16 Using Physics-Based Model
R/D Rate Prediction
[Figure labels: Field Measurements (+), Discharged, Recharged, Water table elevation, Hydraulic conductivity, Bed rock elevation, Incoming water, Outgoing water]
Ground water flux = hydraulic conductivity * cell area * gradient of water table elevation (head) over cell distance
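The flux relation on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, argument names, units and sample values are hypothetical, and the sign convention (positive flux when the first head is higher) is a choice made here, not something the slide specifies.

```python
def groundwater_flux(conductivity, cell_area, head_from, head_to, cell_distance):
    """Flux = hydraulic conductivity * cell area * head gradient.

    The gradient is the drop in water table elevation (head) between two
    neighbouring cells divided by the distance between them.
    Sign convention (assumed): positive when water flows from
    `head_from` towards `head_to`.
    """
    gradient = (head_from - head_to) / cell_distance
    return conductivity * cell_area * gradient


# Hypothetical values: K = 10 m/day, area = 100 m^2,
# heads of 205 m and 200 m in cells that are 50 m apart.
flux = groundwater_flux(10.0, 100.0, 205.0, 200.0, 50.0)
print(flux)
```

Comparing the flux entering a cell with the flux leaving it is what separates recharged from discharged cells in the figure; how SP2Learn's physics-based model does that balance is not shown on the slide.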

17 Incorporating Spatial Boundary Conditions
BC: R/D rate prediction could have smooth transitions, and recharge & discharge regions (contiguous pixels) should be clearly delineated
Approach: Apply Image Restoration and De-noising Techniques
  Moving average based low pass filter
  TVL (Total Variation regularized L1-norm function) based filter
  Morphological operation based filter
  Using multiple techniques multiple times
[Figure labels: Discharged, Recharged]
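A moving average low pass filter of the kind listed above can be sketched in plain Python as follows. This is a minimal illustration, not SP2Learn's implementation: the function name, the border handling (averaging only the neighbours that exist) and the sample grid are assumptions.

```python
def moving_average(grid, k_rows, k_cols):
    """Low-pass filter a 2-D grid with a k_rows x k_cols mean kernel.

    Assumes odd kernel dimensions so the window is centred on each cell;
    cells near the border average only the neighbours that exist.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    r_half, c_half = k_rows // 2, k_cols // 2
    smoothed = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - r_half), min(n_rows, r + r_half + 1))
                for cc in range(max(0, c - c_half), min(n_cols, c + c_half + 1))
            ]
            row.append(sum(window) / len(window))
        smoothed.append(row)
    return smoothed


# Hypothetical R/D rate map with a single noisy spike in the centre.
noisy = [[0.0, 0.0, 0.0],
         [0.0, 9.0, 0.0],
         [0.0, 0.0, 0.0]]
smoothed = moving_average(noisy, 3, 3)
```

The spike is spread over its neighbourhood, which gives the smooth transitions the boundary condition asks for. Calling the function repeatedly corresponds to "using multiple techniques multiple times".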

18 Exploring Auxiliary Variables Driving R/D Patterns
Prior Tacit Knowledge about R/D and Auxiliary Variables:
  Soil Type: P(R or D area | Soil = Clay) ~ low
  Slope: P(R or D area | Slope = high) ~ low
  Proximity to River: P(R or D area | River is close) ~ high
[Figure panel titles: moving average, normalization+TVL, normalization+TVL moving average]
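One way to check such prior statements against a predicted map is to estimate the conditional frequency of R/D cells for each value of an auxiliary variable. The sketch below assumes co-registered, flattened label and auxiliary lists. The label codes ("R", "D", "N" for neither), the function name and the sample data are hypothetical.

```python
def conditional_frequency(labels, aux_values, aux_value, positive=("R", "D")):
    """Estimate P(cell is R or D | auxiliary variable == aux_value).

    `labels` and `aux_values` are equally long lists holding, per cell,
    the R/D label and the auxiliary variable value.
    Returns None when no cell has the requested auxiliary value.
    """
    matched = [lab for lab, aux in zip(labels, aux_values) if aux == aux_value]
    if not matched:
        return None
    return sum(1 for lab in matched if lab in positive) / len(matched)


# Hypothetical six-cell map: R/D labels and the soil type of each cell.
labels = ["N", "N", "R", "N", "R", "D"]
soils = ["clay", "clay", "clay", "clay", "sand", "sand"]

p_clay = conditional_frequency(labels, soils, "clay")
p_sand = conditional_frequency(labels, soils, "sand")
```

A low value for clay would agree with the expert statement on the slide; a high value would flag either the prediction or the prior for review.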

19 From Auxiliary Variables To Knowledge and Accurate R/D
[Figure labels: Load R/D Map, Load Variables, Integrate Maps, Apply Rules, Create Decision Tree, Define ROI]

20 SP2Learn Output
A set of rules that define relationships between the predicted (R/D rate) variable and auxiliary variables
Modified (more accurate) predictions according to the user-selected rules defining relationships of predicted and auxiliary variables
Sensitivity analysis results with respect to:
  Methods (interpolations, image enhancement, …)
  Models
  Parameters

21 Example Results
[Figure label: ROI]
<RULE ID=138 NUM_OF_CASES=3975 SUPPORT=32.65%>
<IF>Elevation is not in { } AND Soil type is in {Rm=Roscommon muck} AND Proximity to water body is not {near_water} AND Slope is in {0-0.9}</IF>
<THEN>R/D rate is ,-0.002</THEN>
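To make the IF-part of rule 138 concrete, the sketch below tests whether a single map cell satisfies it. The dictionary keys and sample cells are hypothetical, and the elevation condition is left out because its value set is empty in the exported rule shown here.

```python
def matches_rule_138(cell):
    """Check the IF-part of rule 138 for one cell.

    `cell` is a dict with hypothetical keys 'soil_type',
    'proximity_to_water' and 'slope'. The elevation condition is
    omitted because its value set is not given in the rule text.
    """
    return (
        cell["soil_type"] in {"Rm"}                   # Rm = Roscommon muck
        and cell["proximity_to_water"] != "near_water"
        and 0.0 <= cell["slope"] <= 0.9
    )


# Hypothetical cells: one that satisfies the rule and one that does not.
flat_muck = {"soil_type": "Rm", "proximity_to_water": "far", "slope": 0.4}
river_bank = {"soil_type": "Rm", "proximity_to_water": "near_water", "slope": 0.4}

print(matches_rule_138(flat_muck), matches_rule_138(river_bank))
```

Cells that match would be assigned the R/D rate interval from the THEN-part when the rule is applied.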

22 SP2Learn Architecture and Functionality

23 Underlying SP2Learn Technology

24 SP2Learn Functionality Overview
Load Raster Step
Integration Step
Create Mask Step
Rules Step
Attribute Selection Step
Apply Rule Step

25 SP2Learn Workflow

26 On-Line Help

27 Software and Test Data Download
Download web page of Image Spatial Data Analysis group at NCSA:

28 Running SP2Learn

29 Input Data to SP2Learn
Raster files (maps):
  Predicted R/D rate models
  Auxiliary variables
For mask creation:
  Tables with geo-points
  Vector files with boundaries
  Raster files of categorical or continuous variables

30 Image Processing Filtering
Methods:
  Low pass (moving average) filters
  Morphological filters
  TVL1 (Total Variation regularized L1 function)
  Using multiple techniques multiple times
Parameters:
  Kernel size (row dimension, column dimension)
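The morphological filters and their kernel size parameter can be illustrated with a small pure-Python sketch of opening and closing on a rectangular kernel. The function names, the border handling (out-of-bounds cells are ignored) and the sample grids are assumptions, not SP2Learn's code.

```python
def _window(grid, r, c, k_rows, k_cols):
    """Values inside a k_rows x k_cols window centred on (r, c).

    Assumes odd kernel dimensions; cells outside the grid are ignored.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    r_half, c_half = k_rows // 2, k_cols // 2
    return [
        grid[rr][cc]
        for rr in range(max(0, r - r_half), min(n_rows, r + r_half + 1))
        for cc in range(max(0, c - c_half), min(n_cols, c + c_half + 1))
    ]


def dilate(grid, k_rows, k_cols):
    """Replace each cell by the maximum of its window."""
    return [[max(_window(grid, r, c, k_rows, k_cols))
             for c in range(len(grid[0]))] for r in range(len(grid))]


def erode(grid, k_rows, k_cols):
    """Replace each cell by the minimum of its window."""
    return [[min(_window(grid, r, c, k_rows, k_cols))
             for c in range(len(grid[0]))] for r in range(len(grid))]


def opening(grid, k_rows, k_cols):
    """Erosion followed by dilation: removes small isolated regions."""
    return dilate(erode(grid, k_rows, k_cols), k_rows, k_cols)


def closing(grid, k_rows, k_cols):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(grid, k_rows, k_cols), k_rows, k_cols)


# Hypothetical binary masks: a region with a one-cell hole, and a lone speck.
holed = [[1] * 5 for _ in range(5)]
holed[2][2] = 0
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1

filled = closing(holed, 3, 3)
cleaned = opening(speck, 3, 3)
```

A larger kernel removes or fills larger structures, which is why the kernel size is the parameter to explore.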

31 Example Input Maps
[Figure panels: Low Pass Filter, Morphological Closing, Morphological Opening; each shown with Kernel = (10,10) and Kernel = (5,5)]

32 Example Auxiliary Maps
Slope DEM Soil River Stream

33 Loading Files
Load R/D rate models (maps)
Load auxiliary maps to explore alternative models:
  Proximity to water
  Soil type
  Slope

34 Mosaic Maps Large spatial coverage – a set of tiles
Out-of-core representation

35 Viewing Images Right mouse click Check boxes Image information Zoom
Pseudo-color Auto-fit images

36 Registration
Integration of all maps (raster images) to a common projection and spatial resolution
[Figure panels: before “Convert”, after “Convert”]

37 Create Mask
[Figure labels: A, B, C; Mask Parameters, Mask Operations, Visualization Panel]

38 Mask Creation Options in SP2Learn

39 User Defined Mask Creation
Set Parameter: User defined
Mouse click-and-drag selection of region
Click Paint and Show
Click Apply

40 Label Editor Assign categorical labels to colors

41 Attribute Selection Output: Predicted Variable
Input: Auxiliary Variables Check-boxes Show Table Prune Tree

42 Decision Tree Based Modeling
Tree structure can be represented as a set of rules
[Figure: decision tree with test nodes “Distance from river ≤ 100 ft?” and “Soil Type is {sand}?”, yes/no branches, leaves Discharge and Recharge, and example cases A, E, J]
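The statement that a tree can be represented as a set of rules can be sketched by walking every root-to-leaf path. The tree below reuses the two questions from the slide, but its shape and leaf assignments are hypothetical, as is the nested-tuple representation.

```python
def tree_to_rules(node, conditions=()):
    """Turn a decision tree into a list of (conditions, class) rules.

    A leaf is a class-name string; an inner node is a tuple
    (question, yes_branch, no_branch). Each root-to-leaf path
    becomes one rule.
    """
    if isinstance(node, str):
        return [(list(conditions), node)]
    question, yes_branch, no_branch = node
    return (
        tree_to_rules(yes_branch, conditions + (question,))
        + tree_to_rules(no_branch, conditions + (f"NOT ({question})",))
    )


# Hypothetical arrangement of the two tests shown on the slide.
tree = (
    "distance from river <= 100 ft",
    "Discharge",
    ("soil type is sand", "Recharge", "Discharge"),
)

rules = tree_to_rules(tree)
for conds, cls in rules:
    print(" AND ".join(conds), "=>", cls)
```

Each printed line has the same IF/THEN form as the exported rules, which is what lets an expert select or reject individual paths of the tree.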

43 Rules from Decision Tree
Num: Node number in a decision tree
Support (%): Among all cases satisfying the conditions, the ratio of cases having the same class (conclusion)
# of cases: The number of cases satisfying the conditions
Class: Conclusion of a rule
Conditions: Conditions of a rule
MDL Score: MDL score of a decision tree; the lower the score, the better the tree
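The "# of cases" and "Support (%)" columns follow directly from their definitions, as the sketch below shows. The case layout (dicts with a 'label' key), the function name and the sample data are hypothetical. The MDL score is not computed here because the slide does not give its formula.

```python
def rule_statistics(cases, condition, rule_class):
    """Return (# of cases, support %) for one rule.

    # of cases: number of cases satisfying `condition`.
    Support: share of those cases whose label equals `rule_class`.
    """
    matching = [case for case in cases if condition(case)]
    if not matching:
        return 0, 0.0
    hits = sum(1 for case in matching if case["label"] == rule_class)
    return len(matching), 100.0 * hits / len(matching)


# Hypothetical cases: soil type and the R/D class of each cell.
cases = [
    {"soil": "sand", "label": "Recharge"},
    {"soil": "sand", "label": "Recharge"},
    {"soil": "sand", "label": "Discharge"},
    {"soil": "sand", "label": "Recharge"},
    {"soil": "clay", "label": "Discharge"},
]

num_cases, support = rule_statistics(
    cases, lambda case: case["soil"] == "sand", "Recharge"
)
```

Here four cases satisfy the condition and three of them carry the rule's class, so the rule would be listed with 4 cases and 75% support.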

44 Show Decision Tree Show Tree Option

45 Export Rules XML format Export Rules Option

46 Apply Rules
Visualization of:
  Modified output variable
  Changed pixels
  Magnitude of changes (differences)
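The two change visualizations can be derived from the maps before and after rule application, as in this sketch. The function name and sample maps are hypothetical; the grids are assumed to be co-registered and of equal size.

```python
def change_maps(before, after):
    """Compare two equally sized 2-D maps cell by cell.

    Returns (changed, magnitude): a boolean map of changed pixels and
    a map with the absolute difference at each pixel.
    """
    changed, magnitude = [], []
    for row_before, row_after in zip(before, after):
        diffs = [a - b for b, a in zip(row_before, row_after)]
        changed.append([d != 0 for d in diffs])
        magnitude.append([abs(d) for d in diffs])
    return changed, magnitude


# Hypothetical R/D rate maps before and after applying the selected rules.
before = [[0.5, 1.0],
          [2.0, -1.0]]
after = [[0.5, 0.25],
         [2.0, 1.0]]

changed, magnitude = change_maps(before, after)
```

The boolean map shows where the rules altered the prediction, and the magnitude map shows by how much.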

47 Summary
Novel Frameworks and Methodologies for Exploratory Data-Driven Modeling and Scientific Discoveries
Problems addressed in the prototype SP2Learn solution:
  Prediction accuracy improvement by a combination of mathematical models and data-driven (knowledge based) models
  Supervised and unsupervised iterative model optimization
Better Data Utilization!

48 Extra Information
A stack of informatics and cyber-infrastructure software is open source
Other software of potential interest:
  GeoLearn: an exploratory framework for extracting information and knowledge from remote sensing imagery
  CyberIntegrator: supports creation of exploratory workflows, reuse of workflows, remote server execution, data and process provenance tracking and analysis, and streaming data
  Image Provenance to Learn (IP2Learn): supports decision processes based on visual inspection of images
  Load Estimation (work in progress): supports optimal sampling of sediment loads using several sediment-discharge rating curves, bias correction factors and Monte Carlo simulations to predict confidence limits
Download web page of Image Spatial Data Analysis group at NCSA:

49 Acknowledgement
Funding Agencies: NASA, NARA, NSF, NIH, NAVY, DARPA, ONR, NCSA Industrial Partners, NCSA Internal, COM UIUC, State of Illinois
Full Time Employees: Peter Bajcsy, Rob Kooper, Sang-Chul Lee, Luigi Marini
Students: Shadi Ashnai, Melvin Casares, Miles Johnson, Chulyun Kim, Qi Li, Tim Nee, Arlex Torres, Ryo Kondo, Henrik Lomotan, James Rapp
Collaborators:
  College of Applied Health Sciences UIUC, Kinesiology Dept. UIUC, CEE UIUC, CS UIUC, GISLIS UIUC
  UIC, UC Berkeley, Univ. of Texas at Austin, Univ. of Iowa
  ISWS, NARA, Nielsen, State Farm
  Instituto Tecnológico de Costa Rica, UNESCO-IHE Netherlands

50 Thank you!
Questions: Peter Bajcsy
Need More Details: Publications:

51 Backup

