FORUM II Best Practices in Data Warehousing in Higher Education: A Framework for Higher Education Reporting April 18, 2005 Slide 1 Cornell University’s.

Presentation transcript:

FORUM II Best Practices in Data Warehousing in Higher Education: A Framework for Higher Education Reporting, April 18, 2005

Slide 1: Cornell University’s Data Warehousing Infrastructure
Presented by: Jeff Christen, Data Warehousing DBA – Team Lead

Slide 2: Jeff Christen
DBA Lead - Data Warehousing
– Ten Years experience as a DBA (Oracle & Informix)
– Last Four Years Focused on Data Warehouse DBA Support
Team Interface to other IT groups
Team Interface to University

Slide 3: Responsibilities of a DW DBA
Production Support – 24x7 (Database & Load)
Performance Monitoring & Tuning
Database Backup & Recovery
Security Implementation
Object & Code Migrations (Dev / Test / Prod)
Infrastructure Development & Maintenance
Enforcement of University Policies & Practices
Integration of Models & ETL into Warehouse Environment
Assist in Data Modeling & ETL Development

Slide 4: Cornell’s Warehousing Challenges
Twelve production data marts
Twenty-five unique loads
Variety of sources (mainframe, PeopleSoft)
Varied load frequencies (daily, weekly, monthly)
Varied load requirements (full, partial, append)
Rapidly shrinking load windows
Multiple server & O.S. environment

Slide 5: DMTools as a Solution
DMTools is a Data Warehousing infrastructure management tool developed and in use by Cornell University.
Allows high data availability: 24x7 access
Repository driven
Manages loads
Toolbox written in Oracle PL/SQL (O.S. independent)
GUI console to manage load related metadata

Slide 6: High Data Availability: Table Renaming Process
Two copies of each data mart table are maintained
A current table and a backup or work table
Data mart users only “see” the current table
New data is loaded into “work” table
Rename tables
– rename person to person_b
– rename person_w to person
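The rename swap on this slide can be tried out with a minimal sketch. It assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for Oracle; the function name `swap_in_new_load`, the column layout, and the step that drops an older `_b` copy before the swap are illustrative assumptions, not part of DMTools. Only the `person`, `person_w` and `person_b` names come from the slide.

```python
import sqlite3

def swap_in_new_load(conn, table):
    """Promote <table>_w to <table>, keeping the previous data as <table>_b."""
    conn.execute("BEGIN")
    try:
        # Assumption: an older backup copy is discarded to free the _b name.
        conn.execute(f"DROP TABLE IF EXISTS {table}_b")
        conn.execute(f"ALTER TABLE {table} RENAME TO {table}_b")
        conn.execute(f"ALTER TABLE {table}_w RENAME TO {table}")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise

# isolation_level=None puts the connection in autocommit mode so that
# the explicit BEGIN/COMMIT above controls the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE person (id INTEGER, name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'old load')")
conn.execute("CREATE TABLE person_w (id INTEGER, name TEXT)")
conn.execute("INSERT INTO person_w VALUES (1, 'new load')")

swap_in_new_load(conn, "person")
current = conn.execute("SELECT name FROM person").fetchall()
backup = conn.execute("SELECT name FROM person_b").fetchall()
```

After the call, queries against `person` return the newly loaded rows and `person_b` holds the previous load. SQLite allows both renames to share one transaction; the slides do not say how the original tool sequences or verifies the two steps.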

Slide 7: High Data Availability: Table Renaming Benefits
Instant access to new data
Rename does not interrupt users
“Backup” table for previous load’s data
Ability to instantly roll back load
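The instant rollback listed above amounts to reversing the renames. The sketch below again assumes SQLite as a stand-in for Oracle; `rollback_load` is a hypothetical helper, and parking the rejected load under the `_w` name is an assumption made for illustration.

```python
import sqlite3

def rollback_load(conn, table):
    """Restore <table>_b as <table>; the rejected load is parked as <table>_w."""
    conn.execute("BEGIN")
    try:
        conn.execute(f"ALTER TABLE {table} RENAME TO {table}_w")
        conn.execute(f"ALTER TABLE {table}_b RENAME TO {table}")
        conn.execute("COMMIT")
    except Exception:
        # Undo the first rename if the second one fails.
        conn.execute("ROLLBACK")
        raise

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE person (id INTEGER, name TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'bad load')")
conn.execute("CREATE TABLE person_b (id INTEGER, name TEXT)")
conn.execute("INSERT INTO person_b VALUES (1, 'previous load')")

rollback_load(conn, "person")
current = conn.execute("SELECT name FROM person").fetchall()
parked = conn.execute("SELECT name FROM person_w").fetchall()
```

No data is copied in either direction, which is why the rollback is as quick as the original swap.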

Slide 8: High Data Availability: Table Renaming Example

Slide 9: High Data Availability: Partition Exchange
Similar to table rename
May exchange table partitions without disrupting users
Maintain full partitioned copy of table & small non-partitioned copy
Data loaded into non-partitioned table, then exchanged with appropriate partition
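In Oracle the exchange is done with an `ALTER TABLE ... EXCHANGE PARTITION ... WITH TABLE ...` statement. SQLite has no native partitioning, so the sketch below only emulates the idea: each "partition" is modelled as its own table, and the exchange trades names between the staging table and one partition. The table names (`enroll_2004`, `enroll_2005`, `enroll_stage`) and the helper `exchange_partition` are hypothetical.

```python
import sqlite3

def exchange_partition(conn, table, partition, staging):
    """Swap the staging table with one partition table by trading names."""
    part = f"{table}_{partition}"
    conn.execute("BEGIN")
    try:
        conn.execute(f"ALTER TABLE {part} RENAME TO {part}_swap")
        conn.execute(f"ALTER TABLE {staging} RENAME TO {part}")
        conn.execute(f"ALTER TABLE {part}_swap RENAME TO {staging}")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise

conn = sqlite3.connect(":memory:", isolation_level=None)
for name in ("enroll_2004", "enroll_2005", "enroll_stage"):
    conn.execute(f"CREATE TABLE {name} (student_id INTEGER, note TEXT)")
conn.execute("INSERT INTO enroll_2004 VALUES (1, '2004 data')")
conn.execute("INSERT INTO enroll_2005 VALUES (2, '2005 stale')")
conn.execute("INSERT INTO enroll_stage VALUES (3, '2005 fresh')")

# Only the 2005 "partition" is replaced; 2004 is never touched.
exchange_partition(conn, "enroll", "2005", "enroll_stage")
fresh = conn.execute("SELECT * FROM enroll_2005").fetchall()
stale = conn.execute("SELECT * FROM enroll_stage").fetchall()
untouched = conn.execute("SELECT * FROM enroll_2004").fetchall()
```

As with the table rename, the previous contents of the partition end up in the small staging table, so the exchange can be reversed by running it again.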

Slide 10: High Data Availability: Partition Exchange Example

Slide 11: High Data Availability: Additional Considerations
Move security between table copies
Remove / add policy to table during exchange
Constraint management
Manage index names between table copies (A or B suffix to guarantee unique names)
Step verification & error handling
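Index names must be unique within a schema, so the two copies of a table cannot carry identically named indexes. One way to picture the A/B suffix scheme is a small helper that flips the suffix when an index is built on the other copy. The exact naming format (`_A` / `_B` at the end of the name) and the function `other_copy_index_name` are assumptions; the slide only states that an A or B suffix is used.

```python
def other_copy_index_name(index_name):
    """Return the index name for the other table copy by flipping the A/B suffix."""
    flip = {"_A": "_B", "_B": "_A"}
    suffix = index_name[-2:].upper()
    if suffix not in flip:
        raise ValueError(f"{index_name!r} does not end in _A or _B")
    return index_name[:-2] + flip[suffix]

work_index = other_copy_index_name("PERSON_IDX1_A")
back_again = other_copy_index_name(work_index)
```

Because the suffix alternates on every load, the index on the work table never collides with the one still attached to the current table.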

Slide 12: Current Warehousing Environment Infrastructure
Production AIX Server: Transformation & Staging
Production AIX Server: Consolidated DataMarts
Production AIX Server: DMTools Repository
Replication Database (PeopleSoft)
Production Windows Server: H/R & Payroll
Production Windows Server: Contributor Relations
Production Windows Server: Student Administration
Flows shown in the diagram: Logging & Metadata; Load (pull) from Replication Database; Load (pull) from Replication Database; Load (push) from Staging Database

Slide 13: DMTools Repository Data Model
DM_RUN_DETAILS, DM_INDEXES, DM_NAMES, DM_TAB_PART_COUNTS, DM_GRANTS, DM_ROW_COUNTS, DM_RUNS, DM_CONSTRAINTS, DM_TABLES, DM_CONS_COLUMNS, DM_POLICIES, DM_IND_PARTITIONS, DM_IND_COLUMNS, DM_ADMIN_USERS, DM_ _CONTENT, DM_ _SUBSCRIPTIONS

Slide 14: DMTools Repository Data Model (System Catalog)

Slide 15: DMTools Repository Data Model (Logging Metrics)

Slide 16: DMTools Repository
Resides in a centralized database
Contains metadata needed for load process
Contains logging data related to load process & load metrics
Holds notification information
Scalable (may use for multiple data marts)

Slide 17: Load Logging
High level logging
– Start & end times for pre load, load, post load
– Load status (completed, running, failed)
– Snapshot date
– Estimated completion time
– Record counts (by table & partition)
Low level logging
– Start/end time & description for every action
– Captures Oracle errors & SQL executing
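The two logging levels can be sketched as a run table plus a per-action detail table. The table names `dm_runs` and `dm_run_details` echo the repository data model slide, but every column, the helper functions, and the use of SQLite are assumptions made for illustration; the actual DMTools schema is not given in the slides.

```python
import sqlite3
from contextlib import contextmanager
from datetime import datetime, timezone

def now():
    return datetime.now(timezone.utc).isoformat()

def init_log(conn):
    # Hypothetical columns; only the table names appear in the slides.
    conn.execute("CREATE TABLE dm_runs (run_id INTEGER PRIMARY KEY, "
                 "data_mart TEXT, status TEXT, start_time TEXT, end_time TEXT)")
    conn.execute("CREATE TABLE dm_run_details (run_id INTEGER, action TEXT, "
                 "start_time TEXT, end_time TEXT, error TEXT)")

def start_run(conn, data_mart):
    cur = conn.execute("INSERT INTO dm_runs (data_mart, status, start_time) "
                       "VALUES (?, 'running', ?)", (data_mart, now()))
    return cur.lastrowid

@contextmanager
def logged_step(conn, run_id, action):
    """Low level logging: record start/end time and any error for one action."""
    started, error = now(), None
    try:
        yield
    except Exception as exc:
        error = repr(exc)
        conn.execute("UPDATE dm_runs SET status = 'failed', end_time = ? "
                     "WHERE run_id = ?", (now(), run_id))
        raise
    finally:
        conn.execute("INSERT INTO dm_run_details VALUES (?, ?, ?, ?, ?)",
                     (run_id, action, started, now(), error))

conn = sqlite3.connect(":memory:", isolation_level=None)
init_log(conn)
run_id = start_run(conn, "student")
with logged_step(conn, run_id, "pre load"):
    pass
try:
    with logged_step(conn, run_id, "load"):
        raise RuntimeError("simulated load error")
except RuntimeError:
    pass

status = conn.execute("SELECT status FROM dm_runs WHERE run_id = ?",
                      (run_id,)).fetchone()[0]
details = conn.execute("SELECT action, error FROM dm_run_details "
                       "ORDER BY rowid").fetchall()
```

The high level row answers "did the load finish?", while the detail rows show which action failed and with what error.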

Slide 18: Load Status Monitoring
Predefined views
– Status views (high level load info for given DM)
– Detail views (used by DBA for troubleshooting)
– Load metrics views (various object counts related to loads)
– Flexibility to build additional views on logging tables as needed
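A status view of the kind described above might report the most recent run for each data mart. This is a sketch only: the view name `dm_load_status`, the `dm_runs` columns, and the sample rows are hypothetical, and SQLite stands in for the Oracle repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dm_runs (
        run_id INTEGER PRIMARY KEY,
        data_mart TEXT,
        status TEXT
    );
    INSERT INTO dm_runs VALUES (1, 'student', 'completed');
    INSERT INTO dm_runs VALUES (2, 'finance', 'completed');
    INSERT INTO dm_runs VALUES (3, 'student', 'failed');

    -- Status view: one row per data mart, showing its latest run.
    CREATE VIEW dm_load_status AS
    SELECT r.data_mart, r.status
    FROM dm_runs r
    WHERE r.run_id = (SELECT MAX(run_id)
                      FROM dm_runs
                      WHERE data_mart = r.data_mart);
""")

latest = conn.execute(
    "SELECT data_mart, status FROM dm_load_status ORDER BY data_mart"
).fetchall()
```

Because the view is defined on the logging tables, further views (for example, failed runs only) can be added without changing the load code.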

Slide 19: DMTools Toolbox
Set of Oracle stored procedures used to perform common load tasks
Resides on each database in the DMTools environment
Procedures bundled in packages related to their function
– dmrunlog
– dmidxpack
…(partial list only)…

Slide 20: Main Procedure
Unique to each data mart
Not owned by DMTools (data mart schema)
Templates for easy starting point
Modify template to accommodate Preload, load, post load
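The three-phase shape of a main procedure can be pictured as a small template that runs pre load, load and post load in order and stops at the first failure. The real templates are Oracle stored procedures whose contents are not shown in the slides; the function `run_load` and the work done inside each phase below are purely hypothetical.

```python
def run_load(steps):
    """Run (phase, callable) pairs in order; stop at the first failure."""
    finished = []
    for phase, action in steps:
        try:
            action()
        except Exception as exc:
            return finished, f"{phase} failed: {exc}"
        finished.append(phase)
    return finished, None

events = []

# Hypothetical phase bodies for one data mart; a real template would
# be filled in with that data mart's own logic.
def preload():
    events.append("work tables prepared")

def load():
    events.append("rows loaded into work tables")

def postload():
    events.append("tables renamed")

finished, error = run_load([
    ("pre load", preload),
    ("load", load),
    ("post load", postload),
])
```

Keeping the phase order in the template and the phase bodies in the data mart's own schema matches the split described on the slide: the procedure is unique to each data mart, while the starting point is shared.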

Slide 21: DMTools Console
GUI interface to the repository
Metadata maintenance
– Adding/removing tables, indexes, grants, etc.
– Migration management
– /pager subscription & notification maintenance
Warnings & reports
– Detect potential load problems prior to load

Slide 22: DMTools Console - GUI Interface (main)

Slide 23: DMTools Console - GUI Interface (add Table)

Slide 24: DMTools Advantages
High data availability (24x7 access)
Instant rollback to pre-load state
Centralized logging and notification
Metadata driven
O.S. independent (Oracle RDBMS only)
Works with various ETL tools
Simple setup and maintenance
Very scalable

Slide 25: Cornell University’s Data Warehousing Infrastructure
Interest? Questions?
Presented by: Jeff Christen, Data Warehousing DBA – Team Lead