
1 Carolina Environmental Program, UNC Chapel Hill
The Analysis Engine – A New Tool for Model Evaluation, Sensitivity and Uncertainty Analysis, and more…
Alison M. Eyth, Prashant P. Pai
Carolina Environmental Program
University of North Carolina at Chapel Hill
October 19, 2004

2 Background
• Supports data analysis by creating plots and tables
• “Analysis Configurations” facilitate repeated analyses
• Developed as part of the Multimedia Integrated Modeling System (but can be used standalone)
• Java application that runs on Windows, Linux, Unix, and Mac OS X
• Open source – available from http://sourceforge.net/projects/mimsfw
• Three main components:
  – Table application
  – Plotting engine
  – Statistics package

3 Table Application
• Provides the top-level user interface
• File menu accesses import and export functions
• Currently supported file formats include:
  – Comma separated (.csv), custom and tab delimited, fixed column width, SMOKE Report, and ARFF (to support data mining with WEKA)
• Data files are imported as rows and columns
• Multiple files can be loaded, with each file shown in its own tab – tabs include the file name, header, data table, and footer
• Toolbar and pop-up (i.e., right-click) menus provide access to functions such as sort, filter, top N rows, format, plot, and statistics

4 Table Application GUI

5 Toolbar and Pop-up Menu Functions
• Multi-column sort
• Show only the rows with the Top N values
• Show only the rows with the Bottom N values
• Filter rows based on criteria (e.g., NOx > 500)
• Show / hide columns
• Format columns – e.g., color, width, font, number or date style
• Create plots
• Compute statistics
• Edit analysis configuration
• Reset
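
The multi-column sort and Top N / Bottom N functions listed above can be illustrated with a minimal Python sketch. This is not the Analysis Engine's Java code; the row data, column names, and the helper names `multi_column_sort`, `top_n`, and `bottom_n` are all hypothetical and chosen only for illustration.

```python
from operator import itemgetter

# Hypothetical emissions rows; the values are made up for illustration.
rows = [
    {"state": "NC", "county": "Wake", "NOx": 620.0},
    {"state": "NC", "county": "Orange", "NOx": 140.0},
    {"state": "GA", "county": "Fulton", "NOx": 910.0},
    {"state": "GA", "county": "Cobb", "NOx": 480.0},
]


def multi_column_sort(rows, keys):
    """Sort rows by several columns.

    keys is a list of (column, descending) pairs, most significant first.
    """
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last gives a multi-column ordering.
    for column, descending in reversed(keys):
        result.sort(key=itemgetter(column), reverse=descending)
    return result


def top_n(rows, column, n):
    """Return the n rows with the largest values in the given column."""
    return sorted(rows, key=itemgetter(column), reverse=True)[:n]


def bottom_n(rows, column, n):
    """Return the n rows with the smallest values in the given column."""
    return sorted(rows, key=itemgetter(column))[:n]


by_state_then_nox = multi_column_sort(rows, [("state", False), ("NOx", True)])
largest_two = top_n(rows, "NOx", 2)
smallest_one = bottom_n(rows, "NOx", 1)
```

Here `by_state_then_nox` orders rows by state ascending and, within each state, by NOx descending, which mirrors what a two-key sort in a table view would show.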

6 Filter Rows Dialog
• Use Filter Rows to limit the rows shown in the table
• Any number of criteria can be added
• Each criterion has a column, operation, and value
• Available operations include >=, not =, starts with, contains, ends with, does not start with, does not contain, ...
• Select between showing rows matching ALL criteria or ANY
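
The criterion model described above (a column, an operation, and a value, combined with ALL or ANY) can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's implementation: the `OPERATIONS` table covers just the operations named on the slide, and `filter_rows`, the sample rows, and the choice to return every row when no criteria are given are assumptions.

```python
import operator

# Operation names follow the slide; the implementations are assumptions.
OPERATIONS = {
    ">=": operator.ge,
    "not =": operator.ne,
    "starts with": lambda cell, value: str(cell).startswith(str(value)),
    "contains": lambda cell, value: str(value) in str(cell),
    "ends with": lambda cell, value: str(cell).endswith(str(value)),
    "does not start with": lambda cell, value: not str(cell).startswith(str(value)),
    "does not contain": lambda cell, value: str(value) not in str(cell),
}


def filter_rows(rows, criteria, match_all=True):
    """Keep rows matching ALL criteria (match_all=True) or ANY criterion.

    Each criterion is a (column, operation, value) tuple.
    """
    if not criteria:
        # Assumption: with no criteria, nothing is filtered out.
        return list(rows)
    combine = all if match_all else any
    return [
        row
        for row in rows
        if combine(OPERATIONS[op](row[col], val) for col, op, val in criteria)
    ]


# Hypothetical data for illustration.
rows = [
    {"state": "NC", "county": "Wake", "NOx": 620.0},
    {"state": "NC", "county": "Orange", "NOx": 140.0},
    {"state": "GA", "county": "Fulton", "NOx": 910.0},
    {"state": "GA", "county": "Cobb", "NOx": 480.0},
]
criteria = [("NOx", ">=", 500), ("state", "starts with", "N")]

matching_all = filter_rows(rows, criteria, match_all=True)
matching_any = filter_rows(rows, criteria, match_all=False)
```

With these sample rows, ALL keeps only rows that satisfy both criteria, while ANY keeps rows that satisfy at least one of them.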

7 Plotting Options Dialog
• Choose Plot type from Bar, Box, CDF, Discrete Category, Histogram, Rank Order, XY (Scatter), Line, Time Series, and Tornado
• Select Data Columns to plot
• Specify Units and one to three columns to use for labels
• Selected data is passed to the plotting engine

8 Plot Properties are Specified using the Analysis Engine GUI

9 Example Discrete Category Plot
Note: Plots are created using a custom Java interface to R

10 Statistics Dialog
• Provides interface to the statistics package
• Specify statistics to compute and data columns to analyze
• Additional details are specified on other tabs
• Statistics outputs appear as new tabs in the table application
• Statistics are computed using Colt and Weka

11 Example of Histogram Statistics
Note: This is a new tab that supports all the standard functions such as sort, filter, format, and plot

12 Analysis Configuration Dialog
• The Analysis Configuration stores all the table settings and plots that you have created during your session
• The selected plots can be viewed, edited, or deleted
• Plots can be given new names by double-clicking the name
• Some (or all) of the settings can be saved to a configuration file
• Configuration files can be loaded in future sessions or for other data files in the current session

13 Automation
• An optional command line interface may be used to specify:
  – Data files to load
  – Analysis configuration file to use
  – Type of plots to create (e.g., JPG, PDF, PNG)
  – Output directory for plots and tables
• This allows plots and tables to be created in an automated fashion
• Standard analysis products may be created for newly available data sets that have the same format

14 Examples of Potential Applications
• Model Evaluation
  – Sort to find stations at which the error was the largest
  – Plot modeled and observed values on box plots, etc.
  – Create scatter plots of one species vs. another
• Sensitivity and Uncertainty Analysis
  – Perform linear regression and show in plots and tables
  – Compute correlation coefficients
• Emissions Modeling Quality Assurance
  – Find states with top 10 emission values
  – Stacked bar charts to show total emissions
  – Compute histograms
• General Data Analysis
  – Analyze data by sorting, filtering, and computing statistics
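
As a worked illustration of the regression and correlation items above, the following Python sketch computes a Pearson correlation coefficient and an ordinary least-squares line for paired observed and modeled values. The Analysis Engine itself relies on Weka for these calculations; this sketch uses the standard textbook formulas in plain Python instead, and the function names and sample values are hypothetical.

```python
from math import sqrt


def _centered_sums(x, y):
    """Return (sxx, syy, sxy, mean_x, mean_y) for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxx, syy, sxy, mean_x, mean_y


def pearson_r(x, y):
    """Pearson correlation coefficient: sxy / sqrt(sxx * syy)."""
    sxx, syy, sxy, _, _ = _centered_sums(x, y)
    return sxy / sqrt(sxx * syy)


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    sxx, _, sxy, mean_x, mean_y = _centered_sums(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observed and modeled concentrations at four stations.
observed = [1.0, 2.0, 3.0, 4.0]
modeled = [2.0, 4.0, 6.0, 8.0]

r = pearson_r(observed, modeled)
slope, intercept = linear_fit(observed, modeled)
```

In this made-up case the modeled values are exactly twice the observed values, so the fit has a slope of 2, an intercept of 0, and a correlation of 1; real model evaluation data would scatter around the fitted line.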

15 Future Directions
• Initial version will be released on SourceForge by 10/31/04 (which is the end date for the current funding for this work)
• Many potential enhancements are listed on SourceForge, e.g.:
  – Create new rows and columns using functions (e.g., difference, sum)
  – Create plots and tables with data from multiple tabs
• Will likely be used as part of the new emissions quality assurance tool (http://sourceforge.net/projects/emisview)
• Mr. Tommy Cathey will continue to develop the custom Java interface to R at the EPA Scientific Visualization Laboratory in FY05

16 References
• MIMS SourceForge page (for downloads): http://sourceforge.net/projects/mimsfw
• R (for plots): http://www.r-project.org
• Colt (for basic statistics): http://www-itg.lbl.gov/~hoschek/colt
• Weka (for regression and correlation analysis): http://www.cs.waikato.ac.nz/~ml/weka/
• Carolina Environmental Program (for more information): http://www.cep.unc.edu
• Authors: eyth@unc.edu, prapai@unc.edu

