Presentation on theme: "DRAFT Positioning Data Discovery for Greater Impact October 2014."— Presentation transcript:
DRAFT Positioning Data Discovery for Greater Impact October 2014
DRAFT Agenda: Department of Public Welfare; Data Analytics Landscape; Positioning Endeca; Enablement Highlights and Outcomes; Future Roadmap; Questions & Discussion
DRAFT Department of Public Welfare
DRAFT EDW - Landscape Technology: Cognos 10.2, Informatica 9, Oracle 11G Office Income Maintenance (OIM) Pennsylvania Insurance Department (PID) EDW DW Pennsylvania Department of Education (PDE) DW Office of Medical Assistance Program (OMAP) Office of Children, Youth and Families (OCYF) Office of Child Development & Early Learning (OCDEL) Office of Developmental Programs (ODP) Office of Long Term Living (OLTL) Office of Medical Assistance Program (OMAP) Office of Mental Health and Substance Abuse Services (OMHSAS) Technology: Cognos 7, Decision Stream, Oracle 10G Technology: Cognos, Informatica, Oracle 10G PIMS Bridge (OCDEL-PDE) Enterprise Incident Management ODP OLTL
DRAFT Investment in Information Management. Positioning Data Discovery within Business Analytics Capabilities, from data gathering to strategic impact. Stages: Stage 1, Stage 2, Stage 3, Stage 4. Questions: What is available? What is happening? Why is it happening? What might happen? Capability tiers: Static Reporting; Business Intelligence Analytics; Advanced Analytics. Pre-defined Reporting: prompt reports, scheduled reporting. Ad hoc capabilities: self-service reports, OLAP cubes. Monitoring KPIs: dashboards, scorecards. Predictive Analytics: incident prediction, financial forecasting, service effectiveness, fraud detection and prevention. Mobile Analytics: alerts, on-the-go metrics.
DRAFT Key Drivers for Data Discovery. Challenges and Details. Data Tsunami & Unpredictability: Critical data is being collected at an unprecedented scale from varied sources, driving up analytics complexity. Data volumes and integration efforts are roadblocks to insights. Value Proposition: The value of data explodes when it is linked with other data for correlations. Collecting detailed recipient service measures ensures that program impact can be quantified. Greater Community Impact: The goal is to provide a better quality of life to each person in a shorter timeframe. Help propagate the design of high-impact programs for current and future recipients. Drive actions for the prevention of child abuse. Provide a 360-degree view of clients, families and providers to supplement the department's mission.
DRAFT Oracle Endeca Consulting. In Memory Analytics: In-memory architecture and innovative caching deliver extreme performance. Deep Text Analysis: Powerful text analytics extracts key themes and sentiments. Support for sentiment analysis in 10 languages, localization in 13, and search and self-service term extraction in 33+ enable truly global analytics. Robust Data Integration: Sophisticated data integration and ETL streamline access to enterprise sources, including Oracle Business Intelligence. Enterprise Data Discovery: Agile, data-driven approach requires no up-front modeling, for fast time to value. Oracle Endeca Self Service Discovery: Easily create, configure, and securely share discovery applications within the context of enterprise governance and security. Upload information from a wide array of self-service sources including Excel, JSON, and any data source accessible via JDBC. State-of-the-art search and guided navigation surface insights with a click. Live data enrichment allows users to enhance analytics in the moment. Endeca is a complete solution for agile data discovery across the enterprise, empowering business user independence in balance with IT governance. Endeca offers fast, intuitive access to both traditional analytic data and non-traditional data, including external and unstructured information.
DRAFT Value Proposition & Applications. Unstructured/Semi-Structured Data Analysis: Endeca allows for the ingestion of unstructured and semi-structured data and provides analytics capabilities to uncover hidden trends and details. Transactional/Stage Data Discovery: Endeca allows applications to be created directly on source and stage data, which helps Program Office Business Analysts slice and dice information to uncover previously unrealized questions that complement enterprise reporting requirements. Fragmented Source Compilation: Endeca allows for rapid assimilation of data from multiple sources to garner an executive view of the data across multiple data stores. Self-Service Enablement: Program Office Business Analysts can upload diverse data for snapshot analysis with minimal dependence on IT for basic data setup and support. Web Sentiment Analysis: Web crawls can be set up to gather data and provide online sentiment analysis, which could potentially lead to drawing correlations with enterprise data.
DRAFT Perceived Benefits. 1. Fragmented Source Compilation: Combining EDW, OCYF, and CY48 data allowed program offices to drill into the causes of heightened days for investigation and expose potential reasons for bottlenecks. OIM compilation of demographics, census, and CQCCOM service information helped draw a holistic view of the recipients. 2. Advanced Analytics: Sentiment analysis of structured and unstructured data, including whitelist tagging and text extraction, alongside spreadsheet consumption and visualization. Built-in mapping and advanced visualization engines such as tag clouds, plus capabilities for negative refinements. 3. Data Validations: Provisioning access to view data captured by OCYF opened a window into potential future enterprise reporting needs. Access to previously unavailable SAMS, eCIS, and HCSIS transaction data. 4. Delivery Cycles: Typical delivery cycle for an Endeca project is weeks with a week update cycle based on end-user feedback for required enhancements. 2-4 week cycles for applications built using self-service for a quick window into the data.
DRAFT DPW Enablement Objectives. The objectives targeted with the initial 25-user enablement: Enterprise-wide Adoption: Uncover the potential landscape for applying Endeca within the department. Determine use and adoption of Endeca and the concept of data discovery across program offices. Concept Positioning: Build utilization of the complete set of Endeca’s standard capabilities. Blend its use within the existing Business Intelligence/Data Analytics landscape. Solution Scalability: Determine the factors to consider when deploying within the enterprise for a significant user base. Document governance for People, Process, and Technology considerations encompassing rollouts.
DRAFT Implementation Timeline: Timelines & Targets. Months: FY 2013 – Dec, FY 2014 – Jan, FY 2014 – Feb, FY 2014 – Mar, FY 2014 – Apr, FY 2014 – May, FY 2014 – Jun, FY 2014 – Jul, FY 2014 – Aug, FY 2014 – Sept, FY 2014 – Oct. Wave 1 (Initiation): Basic install, capability demonstration and Self-Service Enablement. Wave 2: Configurations, assessments and initial attempts to build end-user content for program offices. Wave 3: Gain targeted adoption and consensus for an enterprise rollout. Milestones: Wave 1 Lessons Learned; Phase 1; Phase 2; Executive Touch Points.
DRAFT Key Findings and Benefits. The development of the self-service applications for the program areas surfaced common themes across the program offices. Findings – Data/Application Rendition: Allows previously unavailable data to be rendered for mining and analysis. Provides access to unstructured and fragmented data. Allows both traditional and non-traditional sources to be included. Has gaps and limitations that warrant governance through maintenance cycles. Benefits: Exposed fraudulent activity to drive cost savings. Exposed issues with data quality and the corresponding business analysis implications. Showed previously unknown information and sentiments captured within comments. Shortened build cycles of 2-4 weeks for demos/POCs. Accelerated end-user delivery of feedback and enhancements. Ability to decide whether a POC should be developed into an ongoing report. 8-12 week production application delivery alongside total week window for incorporating end-user driven enhancements.
DRAFT Advanced Visualization & Data Mashup. Advanced visualizations like geo-spatial maps allowed for a simplified user experience in uncovering insights. Data mashups allowed for merging and drawing comparisons across internal & external data sources.
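The data mashup idea on this slide can be illustrated with a minimal Python sketch. It is not Endeca code; the county names, recipient counts, population figures, and the `mashup` helper are all hypothetical, chosen only to show an internal dataset being merged with an external one so the two can be compared.

```python
# Hypothetical internal records (e.g. recipients served per county).
internal = [
    {"county": "Adams", "recipients": 120},
    {"county": "Berks", "recipients": 500},
    {"county": "Centre", "recipients": 90},
]

# Hypothetical external reference data (e.g. population per county).
external = {"Adams": 100000, "Berks": 400000}


def mashup(internal_rows, external_population):
    """Left-join internal rows to external data and derive a per-1,000 rate."""
    merged = []
    for row in internal_rows:
        population = external_population.get(row["county"])
        # Rows with no external match are kept, with the derived field empty.
        rate = None
        if population is not None:
            rate = round(row["recipients"] * 1000 / population, 2)
        merged.append({**row, "population": population, "per_1000": rate})
    return merged


for row in mashup(internal, external):
    print(row)
```

Keeping unmatched rows (a left join) is a deliberate choice in this sketch: gaps in the external source stay visible instead of silently dropping internal records.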
DRAFT Negative Refinement. Review of SNAP transactions for the month. It appears that most transactions occur within our state. What happens if we remove PA and border states?
DRAFT Negative Refinement (cont.) The result surfaces information we may not have known: transactions are occurring outside of PA and its bordering states. This is an opportunity for further evaluation and discovery on that information.
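The negative refinement walked through on these two slides amounts to an exclusion filter: drop the dominant values and inspect what is left. The following Python sketch illustrates the idea under stated assumptions. The transaction records are invented for illustration and the `negative_refinement` helper is hypothetical; it is not how Endeca implements the feature.

```python
from collections import Counter

# Hypothetical transaction records: (transaction_id, state_code).
transactions = [
    (1, "PA"), (2, "PA"), (3, "PA"), (4, "NJ"), (5, "OH"),
    (6, "PA"), (7, "FL"), (8, "CA"), (9, "FL"), (10, "NY"),
]

# Pennsylvania plus the six states that border it.
PA_AND_BORDER_STATES = {"PA", "NY", "NJ", "DE", "MD", "WV", "OH"}


def negative_refinement(records, excluded_states):
    """Keep only records whose state is NOT in the excluded set."""
    return [record for record in records if record[1] not in excluded_states]


remaining = negative_refinement(transactions, PA_AND_BORDER_STATES)
by_state = Counter(state for _, state in remaining)

print(remaining)  # the transactions left once PA and border states are removed
print(by_state)   # how the remaining transactions are distributed by state
```

Removing the large, expected group first makes the small remainder easy to scan, which is the point the slides make about discovering unknowns.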
DRAFT Advanced Analytics. Capabilities for Tag Cloud highlights and Summarizations drive Advanced Analytics. The ability to house vast amounts of data within domains enabled “big data” mining and exploration. Ability to quickly perform a ‘negative’ refinement: remove the big number to see what remains, which may reveal new unknowns.
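A tag cloud is, at its core, a ranking of terms by how often they occur. The Python sketch below shows that counting step only, as an illustration of the concept and not of Endeca's text engine. The sample comments, the `BLACKLIST` set, and the `tag_cloud_weights` helper are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical free-text comments; real comment fields are not shown in the slides.
comments = [
    "Provider was late to the visit",
    "Visit rescheduled, provider unavailable",
    "Family reported the provider was helpful",
]

# Hypothetical blacklist of terms to suppress from the cloud.
BLACKLIST = {"the", "was", "to"}


def tag_cloud_weights(texts, blacklist, top_n=3):
    """Count terms across texts, dropping blacklisted ones, most frequent first."""
    terms = []
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        terms.extend(word for word in words if word not in blacklist)
    return Counter(terms).most_common(top_n)


print(tag_cloud_weights(comments, BLACKLIST, top_n=2))
```

In a rendered cloud, each count would drive the display size of its term; a whitelist variant would keep only approved terms instead of removing unwanted ones.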
DRAFT Big Data Mining. The ability to house roughly 100 million records within a single domain provided capabilities to mine otherwise unusable data, resulting in fraud prevention and summarized reporting.
DRAFT Current State & Future Vision. While the current applications were created by IT, development can transition to program office users based on the vision of the rollout. Future Vision: End-User/Program Office Driven Self-Service 50% - 50% BIS/IT Driven Self-Service. Program Office/End-User Driven Self-Service (10% Utilization): Technical/super users within program offices are currently the ones driven to utilize the capability. Limited time/effort availability and tool or conceptual knowledge gaps. Challenges with utilizing self-service capabilities. Considerations to increase adoption: Endeca training (train the trainer), identify program office FTE developers, re-use content across applications. IT Supported Self-Service (90% Utilization): Conducted initial conversations with program offices for insights into challenges with data availability and analysis. Built out drafts to highlight possibilities leveraging Endeca. Follow-up sessions with program office stakeholders to finalize application layouts and drive long-term value. Governance for environment stability and functionality delivery. Considerations for decreasing involvement: involvement just with alleviating roadblocks, augmenting re-usable content (e.g. blacklists).
DRAFT Long Term Concept Positioning. IT/BIS Supported Self-Service; Program Office/End-User Driven Self-Service. Future: Governed use of Self-Service for snapshot analysis. Automated Endeca Production Applications for Regular Use. Uncover use-cases/KPIs for Enterprise Reporting through Cognos.
DRAFT Collaborative Project Delivery
DRAFT Future High Level Roadmap (Today, Tomorrow, Future). 1. Production Pilot 25 NUP. 2. Expand Deployment to Program Areas for Self Service Apps. 3. Deployed to All Program Offices; 100+ Users. 4. Scale Users & Data Volumes; Expand Self Service Apps. 4a. IT Provisioned Applications to Program Offices. 5. Enterprise Wide Adoption.
DRAFT Current (Test & Development) Configuration. Server List: Studio Server; Endeca Server; Integration Suite Server + Text Enrichment & Sentiment Analysis. Two server configurations are listed, each for up to 25 users: up to 4 cores, 8 GB RAM minimum, 16GB+ recommended; and up to 8 cores, 64 GB RAM minimum, 128GB+ recommended.
DRAFT Enterprise Configuration Server List Studio Server Endeca Server Integrator Server OVM User count - Up to 100 users Exalytics Hardware Platform - 40 total cores - Hard partitioning allows you to only license what you need - 2 TB of RAM TB of Flash Disk Server Configuration - Up to 8 cores - 64GB RAM User count - Up to 100 users Server Configuration - Up to 8 cores GB RAM Server Configuration - Up to 24 cores - up to TB RAM Option 1 Perceived Future State
DRAFT Estimated Sizing Server List Pros Improved end user experience and productivity Efficiently leverage the power of Exalytics by licensing 100% of the server Cons No room on Exalytics for future growth Single points of failure at the Studio / Endeca Server tiers Potential Outcomes
DRAFT Integration Suite Server + Text Enrichment & Sentiment Analysis Endeca Server Cluster Node 1 Studio Server User count - up to 150 Users User count - up to 150 Users Server Configuration cores - 8 GB RAM minimum 64GB+ recommended Configuration - Up to 24 cores - Up to 2 TB of RAM TB of Flash Disk OVM Endeca Server Cluster Node 2 Perceived Future State Option 2
DRAFT Estimated Sizing Pros Clustered design removes single points of failure Enable Consistent, Stable, & Scalable Application Room to Grow on each server, supporting Future Growth Greater user adoption and experience High Availability for Business Continuity Cons Clustered design makes CPU studio pricing for unlimited users less attractive Enterprise Configuration Potential Outcomes