Published by Jessica Connolly. Modified over 2 years ago.

1
1 A Framework for Geospatial Modeling from Sparse Field Measurements Using Image Processing and Machine Learning
  Peter Bajcsy (1), Chulyun Kim (1), Jihua Wang (2) and Yu-Feng Lin (2)
  (1) National Center for Supercomputing Applications (NCSA)
  (2) Illinois State Water Survey (ISWS)
  University of Illinois at Urbana-Champaign (UIUC)

2
2 Outline
  Introduction
  Problems Addressed by Spatial Pattern To Learn (SP2Learn)
  SP2Learn Architecture and Functionality Overview
  Running SP2Learn
  Summary

3
3 Introduction

4
4 General Problem
  Compute a set of geo-spatially dense, accurate predictions of variables, given a set of direct, geo-spatially sparse point measurements and auxiliary variables that have implicit relationships to the predicted variable.
  Motivation:
    minimize the cost of taking direct point measurements
    maximize the accuracy of predictions
    automate the discovery of relationships among direct field measurements and indirect variables

5
5 Formulation
  Input: sets of geo-spatially sparse variables {V_i{p_ij}}, dense auxiliary variables, and a priori tacit knowledge of experts
  Output: geo-spatially dense (raster) maps {O_k}
  Unknown: selection of methods, workflow of operations/methods, parameters of methods, relationships of auxiliary variables w.r.t. O_k, and a quantitative metric of output goodness
  [Diagram labels: p_1j; p_2j; V_1 & V_2; Interpolations; Mathematical models; O_1; Auxiliary Variables & Tacit Knowledge]
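The slide names "Interpolations" as one route from sparse points to a dense raster but does not say which methods are used. As an illustration only, the sketch below uses inverse distance weighting (IDW), a technique chosen here and not taken from the slides; the function names, the `power` parameter and the sample points are all hypothetical.

```python
import math

def idw_interpolate(points, x, y, power=2.0):
    """Estimate a value at (x, y) from sparse (px, py, value) samples.

    Inverse distance weighting is an assumed stand-in for the
    unspecified interpolation methods mentioned on the slide.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            return value  # exactly on a measurement point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def rasterize(points, n_rows, n_cols, cell_size=1.0):
    """Fill a dense n_rows x n_cols grid from the sparse samples."""
    return [
        [idw_interpolate(points, col * cell_size, row * cell_size)
         for col in range(n_cols)]
        for row in range(n_rows)
    ]

# Two hypothetical field measurements on a 1 x 5 grid.
samples = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0)]
grid = rasterize(samples, n_rows=1, n_cols=5)
```

The cell midway between the two samples receives their average, which is the behaviour expected of any distance-weighted scheme with symmetric weights.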

6
6 Applied Problem
  Recharge and Discharge Rate Prediction
  [Figure labels: Bedrock elevation; Water table elevation; Discharged; Recharged]

7
7 Interdisciplinary Objectives
  Ground Water (Hydrologic Science) View:
    Evaluation of Alternative Conceptual (implicit relationships) and Mathematical Models (explicit relationships)
    Accurate Prediction of Groundwater Recharge and Discharge Rates from a Limited Number of Field Measurements
  Computer Science View:
    Computer-Assisted Learning to Assess Alternative Conceptual and Mathematical Models
    Optimization of Prediction Models From a Set of Geo-Spatially Sparse Point Measurements
  [Diagram label: DIALOG]

8
8 State-of-the-Art Results
  Limited Spatial Resolution and Accuracy
  [Figure labels: Min. Grid: 805 m x 805 m; Uniform Grid: 80 m x 80 m; Recharge zone; Noisy pattern or weak R/D; Discharge zone; Discharge; Recharge]

9
9 Existing Software for Groundwater and Surface Water Modeling
  MODFLOW: a three-dimensional finite-difference ground-water model; freeware (2005)
  PEST: software for model calibration, parameter estimation and predictive uncertainty analysis; freeware (2007); University of Queensland, Australia
  Precipitation-Runoff Modeling System (PRMS): a deterministic, distributed-parameter modeling system developed to evaluate the impacts of various combinations of precipitation, climate, and land use on streamflow, sediment yields, and general basin hydrology; freeware (1996); USGS
  Deep Percolation Model (DPM): facilitates estimation of ground-water recharge under a large range of climatic, landscape, and land-use and land-cover conditions; USGS

10
10 Related Work
  Singh A. et al., Expert-Driven Perceptive Models for Reducing User Fatigue in an Interactive Hydrologic Model Calibration Framework
  [Figure caption: Conductivity (K) and hydraulic heads (H) for the hypothetical aquifer]

11
11 Motivation
  Ground Water (Hydrologic) Science: Currently, no single method can estimate R/D rates and patterns for all practical applications. Therefore, cross-analyzing results from various estimation methods and related field information is likely to be superior to using a single estimation method.
  Computer Science: It is currently impossible (a) to replace an expert who holds a lot of tacit domain knowledge with computer algorithms, or (b) for an expert to learn new I/O relationships from a plethora of possible variables and an extremely large space of processing methods and their parameters.
  Thus, assisting experts to discover, evaluate and validate new relationships in an iterative way will likely enable (a) better understanding of the underlying phenomena, and (b) more automated and cost-efficient predictions.

12
12 Problems Addressed by Spatial Pattern To Learn

13
13 Our Approach
  Data-Driven Analyses to Test Alternative Models, and to Search the Space of Processing Operations and Their Parameters
    Interpolation methods
    Mathematical models
    Image processing algorithms
    Machine learning algorithms
    Scalability of algorithms with large size data
  Computer-Assisted Comparisons and Evaluations of Multiple Models and Sub-Optimal Solutions
    Model/Solution Representation
    Closed Loop (Iterative) Workflows
    Human Computer Interfaces
  Overall Approach: An Exploration Framework for a Class of Alternative Models/Hypotheses and Optimal Solutions

14
14 SP2Learn Problem Formulation
  Given a set of geo-spatially sparse field measurements and auxiliary variables, derive an accurate, spatially dense R/D rate map by
    (a) using a physics-based model,
    (b) incorporating boundary conditions, and
    (c) exploring auxiliary variables that represent prior knowledge about R/D patterns but are missing from the physics-based model

15
15 Challenges
  (1) How to Recognize a Meaningful Pattern in the Predicted Map?
  (2) How to Quantify the Goodness of the Pattern?
  Approach:
    (1a) Recognize patterns by applying multiple image enhancement and segmentation techniques to R/D rate predictions
    (1b) Introduce a relationship between the R/D pattern and auxiliary (a priori reference) information
    (2a) Define goodness w.r.t. reference information using experts' selection of meaningful relationships
    (2b) Define goodness w.r.t. reference information using the complexity of machine learning models

16
16 Using Physics-Based Model
  Field Measurements: Bedrock elevation + Water table elevation + Hydraulic conductivity
  R/D Rate Prediction
  Ground water flux = hydraulic conductivity * cell area * gradient of water table elevation (head) over cell distance
  [Figure labels: Incoming water; Outgoing water; Discharged; Recharged]
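The flux formula on this slide can be written out directly. The sketch below is a one-dimensional illustration only: the function names, the uniform conductivity, and the sign convention (positive net rate read as recharge, negative as discharge) are assumptions made here, not details given in the slides.

```python
def cell_flux(conductivity, cell_area, head_from, head_to, cell_distance):
    """Flux between two neighbouring cells, as stated on the slide:
    hydraulic conductivity * cell area * head gradient over cell distance."""
    gradient = (head_from - head_to) / cell_distance
    return conductivity * cell_area * gradient

def net_rates_1d(heads, conductivity, cell_area, cell_distance):
    """Outgoing minus incoming flux for each interior cell of a 1-D row.

    Assumed convention: a positive value means water must be entering
    from above (recharge); a negative value means discharge.
    """
    rates = []
    for i in range(1, len(heads) - 1):
        incoming = cell_flux(conductivity, cell_area,
                             heads[i - 1], heads[i], cell_distance)
        outgoing = cell_flux(conductivity, cell_area,
                             heads[i], heads[i + 1], cell_distance)
        rates.append(outgoing - incoming)
    return rates

# Hypothetical water table elevations (heads) for three cells.
rates = net_rates_1d([10.0, 9.0, 7.0], conductivity=2.0,
                     cell_area=1.0, cell_distance=1.0)
```

With these made-up heads the middle cell loses more water laterally than it receives, so under the assumed convention it is a recharge cell.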

17
17 Incorporating Spatial Boundary Conditions
  BC: the R/D rate prediction could have smooth transitions, and recharge and discharge regions (contiguous pixels) should be clearly delineated
  Approach: Apply Image Restoration and De-noising Techniques
    Moving average based low pass filter
    TVL (Total Variation regularized L1-norm function) based filter
    Morphological operation based filter
    Using multiple techniques multiple times
  [Figure labels: Discharged; Recharged]
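The simplest of the listed techniques, the moving average low pass filter, might look like the following. This is a plain-Python sketch, not SP2Learn code: the function name is hypothetical, odd kernel sizes are assumed, and border pixels are averaged over whatever part of the window falls inside the map, which is one border policy among several.

```python
def moving_average(image, kernel_rows, kernel_cols):
    """Smooth a 2-D list of numbers with a (kernel_rows x kernel_cols) mean.

    Assumes odd kernel sizes. Windows are truncated at the map border
    (an assumed policy; the slides do not specify one).
    """
    n_rows, n_cols = len(image), len(image[0])
    half_r, half_c = kernel_rows // 2, kernel_cols // 2
    smoothed = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half_r), min(n_rows, r + half_r + 1))
                for cc in range(max(0, c - half_c), min(n_cols, c + half_c + 1))
            ]
            row.append(sum(window) / len(window))
        smoothed.append(row)
    return smoothed

# A single noisy spike in an otherwise flat R/D rate map (made-up values).
noisy = [[0.0, 0.0, 0.0],
         [0.0, 9.0, 0.0],
         [0.0, 0.0, 0.0]]
smoothed = moving_average(noisy, 3, 3)
```

Applying the filter repeatedly, or chaining it with other filters, corresponds to the slide's "using multiple techniques multiple times".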

18
18 Exploring Auxiliary Variables Driving R/D Patterns
  Prior Tacit Knowledge about R/D and Auxiliary Variables
    Proximity to River: P(R or D area | river is close) ~ high
    Soil Type: P(R or D area | soil = clay) ~ low
    Slope: P(R or D area | slope = high) ~ low
  [Figure labels: moving average; normalization + TVL]

19
19 From Auxiliary Variables To Knowledge and Accurate R/D
  Load Variables
  Integrate Maps
  Define ROI
  Create Decision Tree
  Load R/D Map
  Apply Rules

20
20 SP2Learn Output
  A set of rules that define relationships between the predicted (R/D rate) variable and auxiliary variables
  Modified (more accurate) predictions according to the user-selected rules defining relationships between predicted and auxiliary variables
  Sensitivity analysis results with respect to
    Methods (interpolations, image enhancement, …)
    Models
    Parameters

21
21 Example Results
  Elevation is not in { } AND
  Soil type is in {Rm=Roscommon muck} AND
  Proximity to water body is not {near_water} AND
  Slope is in {0-0.9}
  R/D rate is , =+ ROI

22
22 SP2Learn Architecture and Functionality

23
23 Underlying SP2Learn Technology

24
24 SP2Learn Functionality Overview
  Load Raster Step
  Integration Step
  Rules Step
  Attribute Selection Step
  Apply Rule Step
  Create Mask Step

25
25 SP2Learn Workflow

26
26 On-Line Help

27
27 Software and Test Data Download Download web page of Image Spatial Data Analysis group at NCSA:

28
28 Running SP2Learn

29
29 Input Data to SP2Learn
  Raster files (maps)
    Predicted R/D rate models
    Auxiliary variables
  For mask creation
    Tables with geo-points
    Vector files with boundaries
    Raster files of categorical or continuous variables

30
30 Image Processing
  Filtering Methods
    Low pass (moving average) filters
    Morphological filters
    TVL1 (Total Variation regularized L1 function)
    Using multiple techniques multiple times
  Parameters
    Kernel size (row dimension, column dimension)

31
31 Example Input Maps
  Morphological Opening: Kernel = (10,10); Kernel = (5,5)
  Morphological Closing: Kernel = (5,5); Kernel = (10,10)
  Low Pass Filter: Kernel = (5,5); Kernel = (10,10)
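Morphological opening and closing with a rectangular kernel can be sketched from their standard definitions: opening is an erosion (window minimum) followed by a dilation (window maximum), and closing is the reverse. The code below is an illustration in plain Python, not the SP2Learn implementation; the helper names and the truncated-window border handling are assumptions.

```python
def _window_reduce(image, kernel, reduce_fn):
    """Apply reduce_fn (min or max) over a (rows, cols) window at each pixel.
    Windows are truncated at the border (assumed policy)."""
    k_rows, k_cols = kernel
    half_r, half_c = k_rows // 2, k_cols // 2
    n_rows, n_cols = len(image), len(image[0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            row.append(reduce_fn(
                image[rr][cc]
                for rr in range(max(0, r - half_r), min(n_rows, r + half_r + 1))
                for cc in range(max(0, c - half_c), min(n_cols, c + half_c + 1))
            ))
        out.append(row)
    return out

def erode(image, kernel):
    return _window_reduce(image, kernel, min)

def dilate(image, kernel):
    return _window_reduce(image, kernel, max)

def opening(image, kernel):
    """Erosion then dilation: removes bright specks smaller than the kernel."""
    return dilate(erode(image, kernel), kernel)

def closing(image, kernel):
    """Dilation then erosion: fills dark holes smaller than the kernel."""
    return erode(dilate(image, kernel), kernel)

# A lone bright pixel and a lone dark pixel on 5 x 5 maps (made-up data).
spike = [[0] * 5 for _ in range(5)]
spike[2][2] = 1
hole = [[1] * 5 for _ in range(5)]
hole[2][2] = 0

opened = opening(spike, (3, 3))
closed = closing(hole, (3, 3))
```

Larger kernels such as the (5,5) and (10,10) settings shown on the slide remove correspondingly larger isolated regions, which is why the kernel size is exposed as a parameter.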

32
32 Example Auxiliary Maps Slope DEM Soil River Stream

33
33 Loading Files
  Load R/D rate models (maps)
  Load auxiliary maps to explore alternative models
    Proximity to water
    Soil type
    Slope
    …

34
34 Mosaic Maps
  Large spatial coverage: a set of tiles
  Out-of-core representation

35
35 Viewing Images
  Right mouse click
    Image information
    Zoom
  Check boxes
    Pseudo-color
    Auto-fit images

36
36 Registration
  Integration of all maps (raster images) to a common projection and spatial resolution
  [Figure labels: Before Convert; After Convert]
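Bringing maps to a common spatial resolution can be illustrated with nearest-neighbour resampling, which is also safe for categorical maps such as soil type because it never blends class values. The sketch below covers only the resolution change, not reprojection; the function name and the cell-size parameters are hypothetical, and the slides do not state which resampling method SP2Learn uses.

```python
def resample_nearest(raster, src_cell, dst_cell):
    """Resample a 2-D list from cell size src_cell to dst_cell
    (same units, same origin) by nearest-neighbour lookup."""
    src_rows, src_cols = len(raster), len(raster[0])
    dst_rows = int(round(src_rows * src_cell / dst_cell))
    dst_cols = int(round(src_cols * src_cell / dst_cell))
    out = []
    for r in range(dst_rows):
        # Centre of the destination cell, expressed in source-cell units.
        src_r = min(src_rows - 1, int((r + 0.5) * dst_cell / src_cell))
        row = []
        for c in range(dst_cols):
            src_c = min(src_cols - 1, int((c + 0.5) * dst_cell / src_cell))
            row.append(raster[src_r][src_c])
        out.append(row)
    return out

# A coarse 2 x 2 map at 10 m cells resampled to 5 m cells (made-up values).
coarse = [[1, 2],
          [3, 4]]
fine = resample_nearest(coarse, src_cell=10.0, dst_cell=5.0)
```

Once every map shares one grid, each pixel position yields one row of (predicted value, auxiliary values), which is the form a decision-tree learner needs.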

37
37 Create Mask
  [Figure labels: A; B; C; Mask Operations; Mask Parameters; Visualization Panel]

38
38 Mask Creation Options in SP2Learn

39
39 User Defined Mask Creation
  Set Parameter: User defined
  Mouse click-and-drag selection of region
  Click Paint and Show
  Click Apply

40
40 Label Editor Assign categorical labels to colors

41
41 Attribute Selection
  Output: Predicted Variable
  Input: Auxiliary Variables
  Check-boxes
  Show Table
  Prune Tree

42
42 Decision Tree Based Modeling
  Tree structure can be represented as a set of rules
  [Diagram labels: Distance from river 100 ft?; Soil Type is {sand}?; yes / no branches; leaves: Discharge, Recharge, Discharge; Case A.., Case E.., Case J..]
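The statement that a tree can be represented as a set of rules can be made concrete: each root-to-leaf path becomes one rule whose conditions are the tests along the path. In the sketch below the tree layout, the "<" comparison in the distance test, and the dictionary keys are assumptions made for illustration, since the slide's diagram does not survive in full.

```python
def tree_to_rules(node, conditions=()):
    """Return a list of (conditions, class_label) pairs, one per leaf."""
    if "label" in node:  # leaf node
        return [(list(conditions), node["label"])]
    rules = []
    rules += tree_to_rules(node["yes"], conditions + (node["test"],))
    rules += tree_to_rules(node["no"], conditions + ("NOT " + node["test"],))
    return rules

# Hypothetical tree loosely following the slide's two tests.
# The "<" operator is an assumption; the slide's operator is not legible.
tree = {
    "test": "distance_from_river < 100 ft",
    "yes": {"label": "Discharge"},
    "no": {
        "test": "soil_type in {sand}",
        "yes": {"label": "Recharge"},
        "no": {"label": "Discharge"},
    },
}

rules = tree_to_rules(tree)
for conditions, label in rules:
    print(" AND ".join(conditions), "=>", label)
```

Each printed line is a conjunction of conditions with a class conclusion, the same shape as the rule shown on the Example Results slide.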

43
43 Rules from Decision Tree
  Num: Node number in the decision tree.
  Support (%): Among all cases satisfying the conditions, the percentage of cases that have the rule's class (conclusion).
  # of cases: The number of cases satisfying the conditions.
  Class: Conclusion of the rule.
  Conditions: Conditions of the rule.
  MDL Score: MDL (minimum description length) score of the decision tree. The lower the score, the better the tree.
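The "# of cases" and "Support (%)" columns follow directly from their definitions above. The sketch below computes both for one rule; the case records, field names and predicate style are hypothetical and are not the SP2Learn data model.

```python
def rule_stats(cases, conditions, rule_class):
    """Return (number of cases satisfying all conditions, support in %).

    cases:      list of dicts, each with a "class" key plus attributes
    conditions: list of predicates taking a case and returning True/False
    """
    matching = [case for case in cases
                if all(condition(case) for condition in conditions)]
    n_cases = len(matching)
    if n_cases == 0:
        return 0, 0.0
    agreeing = sum(1 for case in matching if case["class"] == rule_class)
    return n_cases, 100.0 * agreeing / n_cases

# Made-up pixel records: soil type plus the R/D class of the pixel.
cases = [
    {"soil": "sand", "class": "Recharge"},
    {"soil": "sand", "class": "Recharge"},
    {"soil": "sand", "class": "Recharge"},
    {"soil": "sand", "class": "Discharge"},
    {"soil": "clay", "class": "Discharge"},
]

n_cases, support = rule_stats(
    cases,
    conditions=[lambda case: case["soil"] == "sand"],
    rule_class="Recharge",
)
```

Here four cases satisfy the condition and three of them agree with the rule's class, so the rule would be listed with 4 cases and 75% support.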

44
44 Show Decision Tree Show Tree Option

45
45 Export Rules XML format Export Rules Option

46
46 Apply Rules
  Visualization of
    Modified output variable
    Changed pixels
    Magnitude of changes (differences)
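The three visualized products (modified output, changed pixels, magnitude of changes) can be derived from the original and rule-adjusted maps. The slides do not say how a rule alters a prediction, so the sketch below assumes the simplest case, overwriting pixels that satisfy a rule's condition with a fixed value; all names and data are hypothetical.

```python
def apply_rule(rd_map, aux_map, condition, new_value):
    """Return a copy of rd_map where pixels whose auxiliary value
    satisfies `condition` are replaced by new_value (assumed behaviour)."""
    return [
        [new_value if condition(aux) else rd
         for rd, aux in zip(rd_row, aux_row)]
        for rd_row, aux_row in zip(rd_map, aux_map)
    ]

def change_maps(original, modified):
    """Return (changed-pixel mask, absolute magnitude of change)."""
    changed = [[o != m for o, m in zip(o_row, m_row)]
               for o_row, m_row in zip(original, modified)]
    magnitude = [[abs(m - o) for o, m in zip(o_row, m_row)]
                 for o_row, m_row in zip(original, modified)]
    return changed, magnitude

# Made-up R/D rates and a co-registered soil map.
rd_map = [[0.5, -0.2],
          [0.1, 0.3]]
soil_map = [["clay", "sand"],
            ["sand", "clay"]]

# Hypothetical rule: clay pixels are neither recharge nor discharge.
modified = apply_rule(rd_map, soil_map, lambda soil: soil == "clay", 0.0)
changed, magnitude = change_maps(rd_map, modified)
```

Keeping the mask and the magnitude as separate maps lets a user see both where a rule acted and how strongly it altered the prediction.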

47
47 Summary
  Novel Frameworks and Methodologies for Exploratory Data-Driven Modeling and Scientific Discoveries
  Problems addressed in the prototype SP2Learn solution:
    Prediction accuracy improvement by a combination of mathematical models and data-driven (knowledge based) models
    Supervised and unsupervised iterative model optimization
  Better Data Utilization!

48
48 Extra Information
  A stack of informatics and cyber-infrastructure software is open source
  Other software of potential interest:
    GeoLearn: an exploratory framework for extracting information and knowledge from remote sensing imagery
    CyberIntegrator: supports creation of exploratory workflows, reuse of workflows, remote server execution, data and process provenance tracking and analysis, and streaming data
    Image Provenance to Learn (IP2Learn): supports decision processes based on visual inspection of images
    Load Estimation (work in progress): supports optimal sampling of sediment loads using several sediment-discharge rating curves, bias correction factors and Monte Carlo simulations to predict confidence limits
  Download web page of Image Spatial Data Analysis group at NCSA:

49
49 Acknowledgement
  Funding Agencies: NASA, NARA, NSF, NIH, NAVY, DARPA, ONR, NCSA Industrial Partners, NCSA Internal, COM UIUC, State of Illinois
  Full Time Employees: Peter Bajcsy, Rob Kooper, Sang-Chul Lee, Luigi Marini
  Students: Shadi Ashnai, Melvin Casares, Miles Johnson, Chulyun Kim, Qi Li, Tim Nee, Arlex Torres, Ryo Kondo, Henrik Lomotan, James Rapp
  Collaborators: College of Applied Health Sciences UIUC, Kinesiology Dept. UIUC, CEE UIUC, CS UIUC, GISLIS UIUC; UIC, UC Berkeley, Univ. of Texas at Austin, Univ. of Iowa; ISWS, NARA, Nielsen, State Farm; Instituto Tecnológico de Costa Rica, UNESCO-IHE Netherlands

50
50 Thank you! Questions: Peter Bajcsy Need More Details Publications:

51
51 Backup
