Published by Dylan Lamb. Modified over 9 years ago.
1
MySQL Users Conf. 04-19-2005, MIT Lincoln Laboratory
Real-Time Sensor Data Warehouse Architecture Using MySQL Database
Jacob Nikom, MIT Lincoln Laboratory
The MySQL Users Conference 2005, 19 April 2005
This work was sponsored by the U.S. Army Space and Missile Defense Command under Air Force Contract # F19628-00-C-0002. Opinions, interpretations, recommendations and conclusions are those of the author and are not necessarily endorsed by the United States Government.
2
Outline
Introduction
Corporate Information Factory (CIF) and its Data Management Architecture (DMA)
Designing ROCC DMA using CIF architecture
Summary
3
Outline
Introduction
– Reagan Test Site (RTS) and its instrumentation
– What is RTS Operations Coordination Center (ROCC)?
– ROCC primary operations
– ROCC logical component block diagram
– ROCC modernization
– New ROCC Data Management Architecture
Corporate Information Factory (CIF) and its Data Management Architecture (DMA)
Designing ROCC DMA based on CIF architecture
Summary
4
Reagan Test Site (RTS) and its Instrumentation
The Reagan Test Site (RTS) range instrumentation
– Multiple RF sensors collecting data in several regions of the electromagnetic spectrum
– Multiple optical sensors collecting objects’ metrics and spectral characteristics
– Telemetry systems capable of tracking multiple targets
– Mobile and fixed ground safety instrumentation
5
What is RTS Operations Coordination Center (ROCC)?
RTS instrumentation is controlled by the ROCC
ROCC primary operations
– Executes the prepared scenario for the acquisition session
– Manages the data flow from multiple sensors
– Processes the acquired data
– Provides operator displays to track and predict the path of space objects
– Stores the acquired data for later analysis and reporting
– Facilitates training and simulation of performed activities
[Diagram: current DMA. Labels: Sensors, Network, Flat Files, Data Analysis Algorithms, Decision Algorithms, Displays]
6
What kind of system is ROCC?
Control is the process of making a system variable adhere to a particular value, called the reference value
A system designed to follow a changing reference is called a tracking control system
ROCC is a tracking control system following the predefined reference input
[Diagram: feedback control system block diagram. Forward path: reference input r(t), comparator (+/-), error signal e(t), controller, actuating signal m(t), plant, controlled variable c(t). Feedback path: c(t), feedback processor, feedback signal b(t), comparator]
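The signal names on this slide (r, b, e, m, c) can be traced through a few lines of code. The following is a minimal discrete-time sketch, not the ROCC controller: the unity feedback processor, the proportional controller with gain `gain`, and the integrator plant are all assumptions chosen only to make the loop concrete.

```python
def simulate_tracking(reference, gain=0.5, c0=0.0):
    """Discrete-time tracking loop (illustrative assumptions throughout).

    reference: sequence of reference inputs r[k]
    Returns a list of (r, e, c) tuples, one per time step.
    """
    c = c0                      # controlled variable c(t)
    history = []
    for r in reference:
        b = c                   # feedback signal b(t): unity feedback (assumed)
        e = r - b               # comparator: error signal e(t) = r(t) - b(t)
        m = gain * e            # controller: proportional actuating signal m(t) (assumed)
        c = c + m               # plant: simple integrator driven by m(t) (assumed)
        history.append((r, e, c))
    return history


if __name__ == "__main__":
    # A slowly changing (ramp) reference, as in a tracking problem.
    ramp = [0.1 * k for k in range(20)]
    for r, e, c in simulate_tracking(ramp):
        print(f"r={r:.2f}  e={e:+.3f}  c={c:.3f}")
```

With a constant reference the error in this toy loop shrinks by the factor `1 - gain` at every step; with a ramp it settles to a constant lag, which is the usual behaviour of a proportional tracking loop.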
7
Current ROCC DMA Block Diagram
ROCC controls the data acquisition, analysis and distribution processes
Maximizes the quality of delivered data over specified time
[Diagram: tactical decision control loop. Labels: Planning, Reference Data, Data Plant, Sensors, Simulation, Automatic Real-Time Processing & Analysis, Manual Processing & Analysis, Tracking, Fusion, Classification, Identification, Trajectory Estimation, Displays, Voice, Operators, Output Data, Report: Data analysis]
8
ROCC Modernization
Obsolete system hardware
– Old central processors and boards are no longer supported
– Not enough computational power to perform new tasks
– Old components and interfaces are incompatible with modern technology
Aging system software
– Centralized monolithic architecture
– Flat files for storing data
– Use of old procedural languages
– Alphanumeric displays
Modernized system
– Industry standard 32/64-bit Xeon or Opteron servers
– Software vendor independence: Linux and Java
– Database-based storage
– Distributed architecture using publish/subscribe paradigm
– Graphical user interface for visualization tools
– Targeted dataflow rates: 5 MB/s (sustained), 10 MB/s (peak)
– Data accumulation rate: 1 TB/year
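The targeted rates and the yearly accumulation figure can be related with a little arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and assumes that the whole yearly volume is collected at one of the stated rates; the slide states neither, so the result is only an order-of-magnitude reading of the numbers.

```python
MB = 10**6    # decimal megabyte (assumption)
TB = 10**12   # decimal terabyte (assumption)


def acquisition_hours_per_year(yearly_bytes, rate_bytes_per_s):
    """Hours of acquisition needed to accumulate yearly_bytes at a constant rate."""
    return yearly_bytes / rate_bytes_per_s / 3600.0


sustained_hours = acquisition_hours_per_year(1 * TB, 5 * MB)   # about 55.6 hours
peak_hours = acquisition_hours_per_year(1 * TB, 10 * MB)       # about 27.8 hours

if __name__ == "__main__":
    print(f"At 5 MB/s sustained: {sustained_hours:.1f} h of acquisition per year")
    print(f"At 10 MB/s peak:     {peak_hours:.1f} h of acquisition per year")
```

Under these assumptions, 1 TB/year corresponds to roughly 28 to 56 hours of active acquisition per year, which suggests that the system collects data in sessions rather than continuously.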
9
New Data Management Architecture
ROCC data management challenges
– Support powerful high-precision instrumentation with near real-time response
– Support an intensive and costly data collection process involving many human operators, with a high level of reliability
– Support data analysis leading to changes in the data acquisition environment
– Be adequate for a wide range of transaction types, from simple real-time record reads and inserts to complex multidimensional analytical queries
– Manage a combination of streaming data with traditional structures
– Provide request management, configuration management and data quality management capabilities
Search for new data management architecture
– New system represents a conceptual change from the old architecture
– Instrumentation and control software traditionally concentrates on algorithm development and lacks good data architecture
– Need for a framework supporting the “analysis – decision – execution” paradigm
– Enterprise software is a leading implementer of distributed architecture and the publish/subscribe paradigm
10
Outline
Introduction
Corporate Information Factory (CIF) for Data Management Architecture
– What is Corporate Information Factory (CIF)?
– CIF data flow diagram
– CIF data
– CIF layers
– CIF logical component block diagram
Designing ROCC data management architecture using CIF architecture
Summary
11
What is Corporate Information Factory (CIF)? *
Information ecosystem is a model of corporate information processing
– “CIF is the physical embodiment of the notion of an information ecosystem”
CIF consists of the following components
– External world
– Applications
– An integration and transformation layer (I & T layer)
– An operational data store (ODS)
– A data warehouse (DW) with current and historical detailed data
– Data mart(s)
– An internet and intranet
– A metadata repository
– An exploration and data mining warehouse
– Alternative (secondary) storage
– Decision support system (DSS)
CIF approach could be used for modeling information processing in any organization (“forest vs. trees” view)
* “Corporate Information Factory”, by W.H. Inmon, Claudia Imhoff, Ryan Sousa. Wiley, 2nd edition (December 18, 2000)
12
CIF Data Flow Diagram
[Diagram labels: External world, Enterprise transactions, Internet, Enterprise Resource Planning (ERP), External data, Data acquisition, Application layer, CRM (tx), eComm (tx), ERP (tx), BI (tx), Integration & Transform layer, Reference data, Historical reference data, Operational layer, ODS, Operational reports, Warehouse layer, DW, Raw detailed data, Primary storage management, Alternative storage, Metadata management, Data delivery, Report & Analysis layer, Data marts (Finance, Sales, Marketing, Accounting), Exploration warehouse, Data mining warehouse, Statistical analysis, DSS applications, eComm (rpt), CRM (rpt), ERP (rpt), BI (rpt)]
CRM = Customer Relationship Management
BI = Business Intelligence
13
CIF Data
External data
– Data is defined outside of the corporation. Could have erroneous, redundant or unnecessary items
– Data format is defined outside of the corporation. Reformatting could be required
Reference data
– Allows standardizing on commonly used names for important and frequently used information
– Allows consistent interpretation of corporate data across different departments
– Could be aliases for common and often-referenced names
Historical data
– Volume of data: the longer the history, the more data
– Usefulness of data: recent data is more useful than older data
– Granularity of data: older data is likely to be used at the summary level
[Diagram: corporate timeline (Ancient history, Recent history, Most current activity, Immediate future) with data held in DW, ODS and Applications]
14
CIF Layers
Application layer
– Interacting directly with end user
– Gathering detailed transaction data
– Auditing and adjusting data
– Editing data
Integration and transformation layer
– Combining non-integrated data from multiple applications
– Transforming external data into corporate data
– Creating appropriate metadata
– Mathematical transformation
– Reformatting and resequencing
[Diagram labels: CRM (tx), eComm (tx), ERP (tx), BI (tx)]
15
CIF Layers (Continued)
Operational layer
– Subject-oriented
– Integrated
– Volatile
– Current-valued
– Detailed
– Normalized
Warehouse layer
– Subject-oriented
– Integrated
– Nonvolatile
– Time-variant
– Comprised of both summary and detailed data
– Summary data optimized for Report & Analysis queries
– Normalized and de-normalized data
Report & Analysis layer
– Statistical analysis
– Exploration reporting
– Data mining reporting
– DSS analysis and reporting
– Finance
– Sales
– Marketing
– Accounting
[Diagram labels: ODS, Data Warehouse, eComm (rpt), CRM (rpt), ERP (rpt), BI (rpt), Statistics]
16
CIF Logical Component Block Diagram
System controls the corporation's resources using real-time and long-term DSS
Maximizes the expected profit of the corporation over specified time
[Diagram labels: Corporate Goals, Reference Data, Data Plant, Applications, Operational Data Store, Real-time DSS, Tactical decision control loop, Data Warehouse, Long-term DSS, Strategic decision control loop, Output Data, Corporate Report]
17
Outline
Introduction
Corporate Information Factory (CIF) for Data Management Architecture (DMA)
Designing ROCC DMA using CIF architecture
– ROCC data flow diagram
– ROCC data
– ROCC layers
– ROCC logical component block diagram
– Database selection
– Three dangers of database design
Summary
18
ROCC Data Flow Diagram
[Diagram labels: External world, Space, Sensor control data, RIB, Multicast middleware, Data acquisition, Integration & Transform layer, Reference data, Operational layer, Operational data, ODS, DSS applications, Best Choice, Smoother, Data Fusion, Classifier, Short-term reporting & analysis, Quick Look reports, Warehouse layer, DW, Archived data, Secondary storage, Data marts, Data mining warehouse, Report & Analysis layer, Long-term reporting & analysis, Planning, Post overview, BET, Impact, Bias modeling, …]
RIB = ROCC Interface Box
19
ROCC Data
External data
– Data is defined outside of ROCC. Could have erroneous, redundant, or unnecessary items
– Data format is defined outside of ROCC. Reformatting or object conversion could be required
Reference data
– Comprise geophysics models and constants necessary for external data interpretation
– Comprise common locations, sensor names, names of computers and programs
– Comprise the user names, passwords, access rights and privileges
Historical data
– Operational data being migrated to the warehouse become historical data
– Detailed historical data are used to produce summarized historical data
– Historical data are only inserted, never updated
Planning data
– Comprise configuration data for the sensors’ acquisition procedures
– Comprise ROCC software components’ configuration data (XML format)
– Comprise data to plan specific activities to acquire space objects’ coordinates
20
ROCC Layers
External world
– Simultaneous output from multiple sensors up to 10 MB/s
– Capable of producing data autonomously
– Capable of working under the guidance of DSS applications
– Produces data as streams with considerable output rates
Integration and transformation layer
Plays a vitally important role in reconciling the incoming external data content and format with the internal data requirements
– Converts incoming data into appropriate Java objects
– Creates necessary metadata
– Mathematical transformation
– Reformatting and resequencing
[Diagram labels: RIB, Feedback from DSS applications]
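The integration and transformation steps listed on this slide (object conversion, metadata creation, mathematical transformation, resequencing) can be illustrated with a small sketch. The slides describe conversion into Java objects; Python is used here purely for illustration. The comma-separated input format, the `TrackPoint` fields and the unit conversions are hypothetical and do not come from the presentation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackPoint:
    """Internal representation of one measurement (hypothetical fields)."""
    sensor: str
    time_s: float
    range_m: float
    az_rad: float
    el_rad: float


def transform(raw_line):
    """Convert one external record 'sensor,time_s,range_km,az_deg,el_deg' to an object."""
    sensor, t, rng_km, az_deg, el_deg = raw_line.strip().split(",")
    return TrackPoint(
        sensor=sensor,
        time_s=float(t),
        range_m=float(rng_km) * 1000.0,        # mathematical transformation: km to m
        az_rad=math.radians(float(az_deg)),    # degrees to radians
        el_rad=math.radians(float(el_deg)),
    )


def integrate(raw_lines):
    """Convert, resequence by time, and describe a batch of external records."""
    points = [transform(line) for line in raw_lines if line.strip()]
    points.sort(key=lambda p: p.time_s)        # resequencing
    metadata = {                               # metadata created alongside the data
        "count": len(points),
        "sensors": sorted({p.sensor for p in points}),
        "units": {"time": "s", "range": "m", "angles": "rad"},
    }
    return points, metadata


if __name__ == "__main__":
    batch = ["radarB,2.0,150.0,90.0,45.0", "radarA,1.0,100.5,180.0,30.0"]
    pts, meta = integrate(batch)
    print(pts[0], meta)
```

Converting to common units at this boundary is what lets the later layers rely on "common physical units and coordinate systems" without re-checking each record.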
21
ROCC Layers (continued)
Operational layer
– Subject-oriented: focuses on basic transaction processing. Inserts and reads the streams of integrated and transformed sensor data (tracks, IDs, control blocks, etc.)
– Integrated: physical unification and cohesiveness (uniform key structures, table naming conventions, common physical units and coordinate systems, data layouts and metadata)
– Volatile: ODS data could be updated (replaced) as a normal part of processing. After the acquisition session is done, the data are moved to the DW
– Current-valued: ODS data values relate to the current event (current acquisition session). For the next mission the ODS will be updated, and its previous content will be moved to the DW (data migration)
– Detailed: ODS contains inserted values of the published sensor objects and is not expected to hold summary data
– Normalized: ODS contains normalized data
– Decision Support System applications: make real-time operational decisions such as ID assignment, sensor allocation, etc.
[Diagram labels: ODS, DSS applications, Best Choice, Smoother, Data Fusion, Classifier]
22
ROCC ODS Specifics
Data streams of objects
– Streams of measurements usually don’t have very complex structures
– Object-relational mapping is straightforward and not computationally intensive
Indices
– High-speed insertion precludes the use of indices
– The relatively small size of the ODS makes it possible to work without indices
– Indices do exist in the DW
Real-time DSS feedback
– Could control the sensors, which in turn influences the input data
– Typical analytical applications assume that the data producer does not change during the query
Additional benefits
– Necessary operations could be performed during the copying
– Two operational databases could be used in parallel right after the acquisition
– Fault tolerance (primary and secondary ODS)
[Diagram labels: Primary System, Secondary System, Archive System, ODS, DW, Network]
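The split between an index-free ODS and an indexed DW, joined by a migration step after each acquisition session, can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server, and the table names, columns and `session_id` key are hypothetical.

```python
import sqlite3

COLUMNS = "(session_id INTEGER, sensor TEXT, time_s REAL, range_m REAL)"


def make_databases():
    """Create an in-memory ODS and DW (stand-ins for two MySQL servers)."""
    ods = sqlite3.connect(":memory:")
    dw = sqlite3.connect(":memory:")
    # ODS: plain table with no secondary indices, to keep inserts cheap.
    ods.execute("CREATE TABLE track_point " + COLUMNS)
    # DW: same layout, but indexed for analytical queries.
    dw.execute("CREATE TABLE track_point_hist " + COLUMNS)
    dw.execute("CREATE INDEX idx_hist_session_time "
               "ON track_point_hist (session_id, time_s)")
    return ods, dw


def store_stream(ods, rows):
    """Insert a batch of streamed measurements into the ODS in one transaction."""
    with ods:
        ods.executemany("INSERT INTO track_point VALUES (?, ?, ?, ?)", rows)


def migrate_session(ods, dw, session_id):
    """Copy one finished session from the ODS to the DW, then clear it from the ODS."""
    rows = ods.execute(
        "SELECT session_id, sensor, time_s, range_m "
        "FROM track_point WHERE session_id = ?", (session_id,)).fetchall()
    with dw:
        dw.executemany("INSERT INTO track_point_hist VALUES (?, ?, ?, ?)", rows)
    with ods:
        ods.execute("DELETE FROM track_point WHERE session_id = ?", (session_id,))
    return len(rows)


if __name__ == "__main__":
    ods, dw = make_databases()
    store_stream(ods, [(1, "radarA", 0.0, 100.0), (1, "radarA", 0.1, 101.0)])
    print("migrated rows:", migrate_session(ods, dw, 1))
```

Because the copy passes through application code, any "necessary operations" can be applied to the rows between the `SELECT` and the `INSERT`, which is one reading of the first additional benefit listed on the slide.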
23
ROCC Layers (continued)
Historical (data warehouse) layer
– Subject-oriented: organized like the ODS around major ROCC entities, but focused on the modeling and analysis of data
– Integrated: data migrated into the DW from the ODS are integrated with the rest of the DW data
– Time-variant: every datum in the data warehouse is identified with a particular time period. All summarized data are correct only for the particular period with which the corresponding detailed data are identified
– Non-volatile: there are no updates in the warehouse, only inserts. The past cannot be changed, only expanded
– Comprised of both summary and detailed data: once detailed data from the ODS have migrated into the DW, they become a part of history. In addition to the detailed historical data, the DW contains summary data, which are pre-calculated to reduce analytical query times
– ROCC DW specifics: the ROCC DW does not use a multidimensional data model yet, only summarized tables
[Diagram label: Data Warehouse]
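Pre-calculated summary tables in an insert-only warehouse can be sketched as below. As before, Python's built-in `sqlite3` stands in for MySQL, and the table layout and the choice of summary columns are hypothetical. The summary is tied to a time period (`time_start_s`, `time_end_s`), and sessions that already have summary rows are skipped, so existing rows are never updated.

```python
import sqlite3


def open_warehouse():
    """Create an in-memory DW with a detail table and a summary table."""
    dw = sqlite3.connect(":memory:")
    dw.execute("CREATE TABLE track_point_hist "
               "(session_id INTEGER, sensor TEXT, time_s REAL, range_m REAL)")
    dw.execute("CREATE TABLE track_summary "
               "(session_id INTEGER, sensor TEXT, time_start_s REAL, "
               "time_end_s REAL, n_points INTEGER, mean_range_m REAL)")
    return dw


def summarize_new_sessions(dw):
    """Insert summary rows for sessions that have none yet (insert-only, no updates)."""
    with dw:
        dw.execute(
            "INSERT INTO track_summary "
            "SELECT session_id, sensor, MIN(time_s), MAX(time_s), "
            "       COUNT(*), AVG(range_m) "
            "FROM track_point_hist "
            "WHERE session_id NOT IN (SELECT session_id FROM track_summary) "
            "GROUP BY session_id, sensor")


if __name__ == "__main__":
    dw = open_warehouse()
    dw.executemany("INSERT INTO track_point_hist VALUES (?, ?, ?, ?)",
                   [(1, "radarA", 0.0, 100.0), (1, "radarA", 1.0, 102.0)])
    summarize_new_sessions(dw)
    print(dw.execute("SELECT * FROM track_summary").fetchall())
```

Analytical queries can then read `track_summary` directly instead of aggregating the detailed rows on every request.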
24
ROCC Layers (continued)
Analysis and Reporting layer
– Continuous automatic monitoring of sensor metric performance
– Example: Angle Bias Modeling using ROCC Data Warehouse
What is Angle Bias Modeling?
– Creation of a mathematical model to describe differences between reported and actual antenna pointing positions
[Diagram labels: Sensor data collection, RIB, Storing sensor data streams, ODS, Real-time queries, Data migration, Data Warehouse, Analytical queries, Bias Modeling Application, Bias model coefficients, Sensor Control System, Raw pointing information, Bias, Δ, Adjusted pointing using biases, Corrected pointing information]
25
Angle Bias Modeling using ROCC Data Warehouse
Organization of Sensor-Specific Summary Track Data in the Warehouse
– Column groups: Observed Data; Truth Data (time-aligned and in sensor coordinates); Residual Data
– Columns, left to right: Time, Range, Az, El, Iono Corr, Tropo Corr, SNR, Range, Az, El, Delta Rng, Delta Az, SNR, Source
Bias Modeling Application Data Flow
[Diagram labels: Data Warehouse, Observed Data, Atmospheric Data, Truth Data, Generate Residuals, Residual Data, Multivariate Regression, Bias Model Analytic Equation, Bias Model Coefficients, Report, Sensor Control System, Strategic decision control loop]
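The "Generate Residuals" and "Multivariate Regression" steps of this data flow can be sketched with an ordinary least-squares fit. The slides do not give the bias model equation, so the linear form below (residual = c0 + c1*f1 + c2*f2 + ...) and the regressors passed in as `features` are hypothetical; only the residual definition (observed minus truth) follows directly from the slide. The fit uses the normal equations and plain Python so that the example has no external dependencies.

```python
def generate_residuals(observed, truth):
    """Residual = observed value minus time-aligned truth value."""
    return [o - t for o, t in zip(observed, truth)]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):                 # back substitution
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def fit_bias_model(features, residuals):
    """Least-squares coefficients [c0, c1, ...] for residual = c0 + sum(ci * fi)."""
    rows = [[1.0] + list(f) for f in features]       # prepend intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, residuals)) for i in range(p)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Synthetic data: two hypothetical regressors per observation.
    feats = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    truth_az = [10.0, 11.0, 12.0, 13.0, 14.0]
    observed_az = [t + 0.02 + 0.5 * f1 - 0.3 * f2
                   for t, (f1, f2) in zip(truth_az, feats)]
    print(fit_bias_model(feats, generate_residuals(observed_az, truth_az)))
```

In the data flow on the slide, the fitted coefficients are what would be reported and passed to the Sensor Control System; a production fit would more likely use a numerical library than hand-written elimination.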
26
ROCC Logical Component Block Diagram
ROCC controls the RTS resources using tactical and strategic DSS
Maximizes the quality of collected data over specified time
[Diagram labels: Planning, Reference Data, Data Plant, Sensors, Simulation, Displays, Voice, Operators, Operational Data Store, Tactical real-time DSS, Tracking, Fusion, Classification, Identification, Trajectory Estimation, Tactical decision control loop, Data Warehouse, Strategic long-term DSS, Bias Modeling, Sensor Comparison, Strategic decision control loop, Output Data, Report, Data Analysis]
27
Database Selection
Comparison criteria (qualitative values)
Databases compared: MySQL, Oracle, DB2 (IBM), SQL Server (Microsoft), PostgreSQL
– Speed: High, Low
– Sophistication: Moderate, High
– Reliability: High, Moderate, Low
– Administration simplicity: High, Low, Moderate, High
– Standardization: High, Moderate
– Savings: High, Low, High
The same server should work adequately for both ODS and DW
Deficiency in sophistication could be mitigated by custom programming
28
Three dangers of ROCC DMA design
“Balkanization” of data
– Different groups of data have different designs
– Attempts to fit data definitions into the requirements of an existing tool
– In the long run, increases the maintenance cost
Dialectism
– Usage of specific database dialects
– Deviation from existing SQL standards
– Locks the user into a specific vendor
“Dirty” repository design
– Part of the data is stored in the database, another (closely related) part in the file system
– Duplication of data between database and file system
– Increases the maintenance cost
29
Outline
Introduction
Corporate Information Factory (CIF) for Data Management Architecture
Designing ROCC data management architecture using CIF Architecture
Summary
30
Summary
Modernization of the ROCC calls for a new type of data management architecture
– New high-performance hardware
– Significant increase of generated and managed volumes of data
– Introduction of new services
CIF satisfies the requirements
– Designed to support large-scale information systems
– Effectively manages different types of information queries
– Provides flexibility in distributing data between multiple producers and consumers
ODS and DW represent two types of repositories for information requests
– ODS supports near real-time storage requirements and targeted, low-granularity queries
– DW is used for complex queries against summary-level data
ODS and DW are parts of different control loops
– ODS provides information for tactical decisions about near real-time data acquisition
– DW delivers feedback for strategic decisions leading to system improvements
MySQL is a good fit for ODS and DW databases
– Good performance for fast queries in ODS
– Capable of storing large amounts of data in DW
– Simple installation and licensing allow many independent servers to run inside one system, serving as ODS, DW, data marts, etc.
– Excellent Java support allows seamless integration with the rest of the software