1 Business Intelligence and the Statistician: Tutorial EPA Statistics Users Group Meeting Brand Niemann Senior Enterprise Architect U.S. EPA April 28, 2010 http://semanticommunity.net/ Note: This is also my 2010 Individual Development Plan.
2 Overview 1. What is it? 2. How does it affect the statistician? 3. How is it useful? 4. Who in EPA is using it? 5. Where do I find out more? 6. What is Wolfram Alpha?
3 1. What is it? EPA Statistics Training teaches one how to interpret environmental data and do Exploratory Data Analysis (e.g., S-PLUS). –See epadata.wik.is - Interpretation of Environmental Statistics. State-of-the-art statistical tools have evolved from Exploratory Data Analysis to comprehensive Business Intelligence Assessments (e.g., Spotfire). –See Wikipedia – Business Intelligence. State-of-the-art statistical techniques and tools are needed to provide quantitative data quality assessments and business intelligence results for Data.gov. –Data quality is especially important when using Data.gov/semantic for Linking Open Data.
4 2. How does it affect the statistician? S-PLUS: –Started with Statistical Sciences (1988) and ended up with TIBCO as Spotfire (2008). Spotfire is a business intelligence company, bought by TIBCO in 2007, whose origins trace back to the Human-Computer Interaction Laboratory at the University of Maryland, College Park, MD, in the early 1990s. Interpretation of Environmental Statistics: –OEI training on how simple pictures, graphs, and summaries can be used to interpret data (without equations and computers). Exploratory Data Analysis: –An approach, named by John Tukey, to analyzing data for the purpose of formulating hypotheses worth testing. Business Intelligence: –Computer-based techniques used in spotting, digging out, and analyzing business data to support better business decision-making.
5 3. How is it useful? Data.gov: –Increase public access to high-value, machine-readable datasets generated by the executive branch of the federal government. Data.gov/semantic: –Implement the principles of Linking Open Data to Data.gov. Linking Open Data: –Tim Berners-Lee outlined four principles, paraphrased as follows: 1. Use URIs (like URLs) to identify things. 2. Use HTTP URIs so that these things can be referred to and looked up ("dereference") by people and user agents. 3. Provide useful information (i.e., a structured description, or metadata) about the thing when its URI is dereferenced. 4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
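The four Linking Open Data principles can be illustrated with a small sketch. It assumes a toy in-memory "web" instead of real HTTP requests; the example.org URIs, the field names, and the helper functions `dereference` and `discover` are hypothetical and are not Data.gov identifiers or APIs.

```python
# Hypothetical in-memory "web": each HTTP URI (principles 1 and 2) maps to a
# structured description (principle 3) that links to related URIs (principle 4).
# All URIs and fields here are illustrative assumptions.
DESCRIPTIONS = {
    "http://example.org/dataset/ozone": {
        "title": "Ambient Concentrations of Ozone",
        "links": ["http://example.org/agency/epa"],
    },
    "http://example.org/agency/epa": {
        "title": "U.S. Environmental Protection Agency",
        "links": [],
    },
}


def dereference(uri):
    """Look up a URI and return its structured description (metadata)."""
    return DESCRIPTIONS[uri]


def discover(start_uri):
    """Follow links breadth-first to find related things, visiting each URI once."""
    seen, queue = [], [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        queue.extend(dereference(uri)["links"])
    return seen


for uri in discover("http://example.org/dataset/ozone"):
    print(uri, "->", dereference(uri)["title"])
```

In a real deployment the lookup would be an HTTP request that returns a machine-readable description; the dictionary stands in for that step so the sketch runs offline.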
6 4. Who in EPA is using it? EPA Business Intelligence and Analytics Center, Tim Hinds, Manager (Intranet Site): –EPA provides business intell, analytics tools in SaaS model, FCW, June 18, 2009 (reports as PDF or Web pages). Oracle Business Intelligence Enterprise Edition, Business Objects XI (discontinued use), Informatica PowerCenter, and SAS. –Recent Oracle Business Intelligence Workshop: Most use Excel, but we think Oracle can do more. Put Your Desktop in the Cloud to Support the Open Government Directive and Data.gov/semantic, Brand Niemann, Senior Enterprise Architect: –Apply to EPA Statistics Training – Interpretation of Environmental Data as a pilot. See next slides of screen captures. –Complete the statistical data stories for the 2008 ROE Indicators and other high-value data sets. See next slides of screen captures.
7 4. Who in EPA is using it? EPA Environmental Statistics Training Histogram of Data Rounded to Nearest Tenth Normal Distribution Parameters Table of Contents
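The two training topics named on this slide, a histogram of data rounded to the nearest tenth and the normal distribution parameters, can be sketched roughly as follows. The measurements are made-up values for illustration, and the function names are hypothetical; this is not the training material's own code.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical measurements (made-up values, not EPA data).
measurements = [0.42, 0.47, 0.51, 0.53, 0.56, 0.58, 0.61, 0.64, 0.49, 0.52]


def histogram_by_tenth(values):
    """Count how many values land on each tenth after rounding."""
    return Counter(round(v, 1) for v in values)


def normal_parameters(values):
    """Return the sample mean and sample standard deviation."""
    return mean(values), stdev(values)


counts = histogram_by_tenth(measurements)
for bin_value in sorted(counts):
    # One asterisk per observation gives a quick text histogram.
    print(f"{bin_value:.1f} | {'*' * counts[bin_value]}")

mu, sigma = normal_parameters(measurements)
print(f"mean={mu:.3f} sd={sigma:.3f}")
```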
8 4. Who in EPA is using it? Spotfire Help: Curve Fit Models Spotfire on the Web: Non-interactive
9 4. Who in EPA is using it? EPA Ontology Ozone Concentrations Indicator
10 4. Who in EPA is using it? Spotfire Help: Scatterplots Spotfire on the Web: Non-interactive
11 5. Where do I find out more?
EPA Environmental Statistics Training: –http://epadata.wik.is/Statistics_Users_Group/Interpretation_of_Environmental_Data
Histogram of Data Rounded to Nearest Tenth: –http://epadata.wik.is/Statistics_Users_Group/Environmental_Statistics_Training/Interpretation_of_Environmental_Data#Histogram_of_Data_Rounded_to_Nearest_Tenth
Spotfire Help: Curve Fit Models: –http://stn.spotfire.com/spotfire_client_help/curve/curve_curve_fit_models.htm
Spotfire on the Web: Non-interactive: –http://semanticommunity.net/EPAStatistics/IntrepretationofEnvironmentalData-Spotfire.html
EPA Ontology: –http://epaontology.wik.is/
2008 ROE Ambient Concentrations of Ozone: –http://epaontology.wik.is/2_Air/2.2_What_Are_the_Trends_in_Outdoor_Air_Quality_and_Their_Effects_on_Human_Health_and_the_Environment%3f/2.2.2_ROE_Indicators/184.108.40.206_Ambient_Concentrations_of_Ozone
Spotfire Help: Scatterplots: –http://stn.spotfire.com/spotfire_client_help/scat/scat_what_is_a_scatter_plot.htm
Spotfire on the Web: Non-interactive: –http://semanticommunity.net/EPAStatistics/ROEAirIndicators.html
Spotfire Web Player (Makes Spotfire on the Web Interactive): –http://spotfire.tibco.com/products/web-player/interactive-dashboards.aspx
12 5. Where do I find out more? Seven Basic Tools of Quality:
–A designation given to a fixed set of graphical techniques identified as being most helpful in troubleshooting issues related to quality. They are called basic because they are suitable for people with little formal training in statistics and because they can be used to solve the vast majority of quality-related issues.
–The tools are:
The cause-and-effect or Ishikawa (fishbone) diagram (Online)
The check sheet (Spreadsheet)
The control chart (Spotfire)
The histogram (Spotfire)
The Pareto chart (Spreadsheet)
The scatter diagram (Spotfire)
Stratification (alternately flow chart or run chart) (Visio)
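As one illustration of these tools, the arithmetic behind a Pareto chart can be sketched in a few lines: sort the categories by count and add a cumulative percentage. The issue categories and counts below are hypothetical, and `pareto_table` is an illustrative helper, not part of any tool named on the slide.

```python
def pareto_table(issue_counts):
    """Sort categories by count (largest first) and add cumulative percentages."""
    total = sum(issue_counts.values())
    rows, running = [], 0
    ordered = sorted(issue_counts.items(), key=lambda item: item[1], reverse=True)
    for category, count in ordered:
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


# Hypothetical data-quality issue counts (made up for illustration).
issues = {"bad date": 5, "missing value": 50, "duplicate record": 15, "wrong unit": 30}

for category, count, cumulative in pareto_table(issues):
    print(f"{category:<18} {count:>3} {cumulative:>6.1f}%")
```

The same table can be built in a spreadsheet by sorting the counts in descending order and adding a running-total column.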
13 6. What is Wolfram Alpha? Computational Knowledge Engine Answer Source Information
14 6. What is Wolfram Alpha?
Wolfram Alpha: –http://www.wolframalpha.com/
How long is the US Coastline?: –http://www.wolframalpha.com/input/?i=us+coastline
–Answer: 12380 miles
Source: https://www.cia.gov/library/publications/the-world-factbook/
Source: https://www.cia.gov/library/publications/the-world-factbook/geos/us.html
iPhone App: –http://products.wolframalpha.com/iphone/
General Reference: –http://en.wikipedia.org/wiki/Wolfram_Alpha
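The query link on this slide follows a simple pattern: the question is passed as the `i` parameter of the input page. A minimal sketch of building such a link is shown here, assuming only the URL pattern visible on the slide; `wolfram_alpha_url` is a hypothetical helper, and no request is sent.

```python
from urllib.parse import urlencode

# Base address taken from the slide's example link.
BASE_URL = "http://www.wolframalpha.com/input/"


def wolfram_alpha_url(question):
    """Build a query link by URL-encoding the question as the 'i' parameter."""
    return BASE_URL + "?" + urlencode({"i": question})


print(wolfram_alpha_url("us coastline"))
# prints http://www.wolframalpha.com/input/?i=us+coastline
```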