Storage and Search -1 JN 2/19/2016 Jacob Nikom Optimizing Concurrent Storage and Retrieval Operations for Real-Time Surveillance Applications February.

Slides:



Advertisements
Similar presentations
Yukon – What is New Rajesh Gala. Yukon – What is new.NET Framework Programming Data Types Exception Handling Batches Databases Database Engine Administration.
Advertisements

Bruce Scharlau, University of Aberdeen, 2012 Data storage options for mobiles Mobile Computing.
MySQL Users Conf MIT Lincoln Laboratory Measuring MySQL Server Performance for Sensor Data Stream Processing Jacob Nikom MIT Lincoln Laboratory.
XIr2 Recommended Performance Tuning Andy Erthal BI Practice Manager.
By Daniela Floresu Donald Kossmann
Presented by, MySQL & O’Reilly Media, Inc. Falcon from the Beginning Jim Starkey
Caching the MDSPlus Data via Hibernate By Ajith M Jose Comp6703 Project Client: Raju Karia Supervisor: Dr. Henry Gardner (Development of “WebScope”)
Web Servers How do our requests for resources on the Internet get handled? Can they be located anywhere? Global?
Mike Jackson EPCC OGSA-DAI Today Release 2.2 Principles and Architectures for Structured Data Integration: OGSA-DAI.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 11 Database Performance Tuning and Query Optimization.
Chapter 8 Physical Database Design. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Overview of Physical Database.
Chapter 1 Introduction to Databases
Chapter 9 Overview  Reasons to monitor SQL Server  Performance Monitoring and Tuning  Tools for Monitoring SQL Server  Common Monitoring and Tuning.
Service Broker Lesson 11. Skills Matrix Service Broker Service Broker, provides a solution to common problems with message delivery and consistency that.
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. 1 Preview of Oracle Database 12 c In-Memory Option Thomas Kyte
Application for Internet Radio Directory 19/06/2012 Industrial Project (234313) Kickoff Meeting Supervisors : Oren Somekh, Nadav Golbandi Students : Moran.
PNUTS: YAHOO!’S HOSTED DATA SERVING PLATFORM FENGLI ZHANG.
Introduction To Databases IDIA 618 Fall 2014 Bridget M. Blodgett.
Database System Concepts and Architecture Lecture # 3 22 June 2012 National University of Computer and Emerging Sciences.
Configuration Management and Server Administration Mohan Bang Endeca Server.
4-1 INTERNET DATABASE CONNECTOR Colorado Technical University IT420 Tim Peterson.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Database Performance Tuning and Query Optimization.
PHP Programming with MySQL Slide 8-1 CHAPTER 8 Working with Databases and MySQL.
About Dynamic Sites (Front End / Back End Implementations) by Janssen & Associates Affordable Website Solutions for Individuals and Small Businesses.
9 Chapter Nine Compiled Web Server Programs. 9 Chapter Objectives Learn about Common Gateway Interface (CGI) Create CGI programs that generate dynamic.
Goodbye rows and tables, hello documents and collections.
Physical Database Design Chapter 6. Physical Design and implementation 1.Translate global logical data model for target DBMS  1.1Design base relations.
Distributed Indexing of Web Scale Datasets for the Cloud {ikons, eangelou, Computing Systems Laboratory School of Electrical.
Lecture Set 14 B new Introduction to Databases - Database Processing: The Connected Model (Using DataReaders)
MySQL. Dept. of Computing Science, University of Aberdeen2 In this lecture you will learn The main subsystems in MySQL architecture The different storage.
© Dennis Shasha, Philippe Bonnet – 2013 Communicating with the Outside.
© Copyright by Deitel & Associates, Inc. and Pearson Education Inc. All Rights Reserved. 1 Tutorial 30 – Bookstore Application: Client Tier Examining.
NoSQL Databases Oracle - Berkeley DB. Content A brief intro to NoSQL About Berkeley Db About our application.
PHP and MySQL CS How Web Site Architectures Work  User’s browser sends HTTP request.  The request may be a form where the action is to call PHP.
1 Wenguang WangRichard B. Bunt Department of Computer Science University of Saskatchewan November 14, 2000 Simulating DB2 Buffer Pool Management.
DBSQL 12-1 Copyright © Genetic Computer School 2009 Chapter 12 Recent Concepts and Application of Databases.
Reaching out… through IT R Document Store - Pilot 001 Presented to.
Frontiers in Massive Data Analysis Chapter 3.  Difficult to include data from multiple sources  Each organization develops a unique way of representing.
Intro – Part 2 Introduction to Database Management: Ch 1 & 2.
Airline Reservation System Dr.Paul Schragger Distributed Systems Presented By Archana Annamaneni.
By Shanna Epstein IS 257 September 16, Cnet.com Provides information, tools, and advice to help customers decide what to buy and how to get the.
© All rights reserved. U.S International Tech Support
DATABASE CONNECTIVITY TO MYSQL. Introduction =>A real life application needs to manipulate data stored in a Database. =>A database is a collection of.
Tycho: A General Purpose Virtual Registry and Asynchronous Messaging System Matthew Grove ACET Invited Talk February 2006.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
Relational Operator Evaluation. Overview Application Programmer (e.g., business analyst, Data architect) Sophisticated Application Programmer (e.g.,
Chapter 8 Physical Database Design. Outline Overview of Physical Database Design Inputs of Physical Database Design File Structures Query Optimization.
8 Chapter Eight Server-side Scripts. 8 Chapter Objectives Create dynamic Web pages that retrieve and display database data using Active Server Pages Process.
Infrastructure for Data Warehouses. Basics Of Data Access Data Store Machine Memory Buffer Memory Cache Data Store Buffer Bus Structure.
Session 1 Module 1: Introduction to Data Integrity
An Efficient Threading Model to Boost Server Performance Anupam Chanda.
Bigtable: A Distributed Storage System for Structured Data
CIP HPC CIP - HPC HPC = High Performance Computer It’s not a regular computer, it’s bigger, faster, more powerful, and more.
FroNtier Stress Tests at Tier-0 Status report Luis Ramos LCG3D Workshop – September 13, 2006.
Unit-8 Introduction Of MySql. Types of table in PHP MySQL supports various of table types or storage engines to allow you to optimize your database. The.
System Architecture & Hardware Configurations Dr. D. Bilal IS 582 Spring 2008.
Apache Solr Dima Ionut Daniel. Contents What is Apache Solr? Architecture Features Core Solr Concepts Configuration Conclusions Bibliography.
LIOProf: Exposing Lustre File System Behavior for I/O Middleware
TerarkDB Introduction Peng Lei Terark Inc ( ). All rights reserved 1.
ISC321 Database Systems I Chapter 2: Overview of Database Languages and Architectures Fall 2015 Dr. Abdullah Almutairi.
Oracle Announced New In- Memory Database G1 Emre Eftelioglu, Fen Liu [09/27/13] 1 [1]
A Web Based Job Submission System for a Physics Computing Cluster David Jones IOP Particle Physics 2004 Birmingham 1.
Data Integrity & Indexes / Session 1/ 1 of 37 Session 1 Module 1: Introduction to Data Integrity Module 2: Introduction to Indexes.
Understanding and Improving Server Performance
Indexes By Adrienne Watt.
Indexing Structures for Files and Physical Database Design
BDII Performance Tests
System Architecture & Hardware Configurations
ICT Database Lesson 1 What is a Database?.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Database System Concepts and Architecture.
Presentation transcript:

Storage and Search -1 JN 2/19/2016 Jacob Nikom Optimizing Concurrent Storage and Retrieval Operations for Real-Time Surveillance Applications February 19, :01:43 MIT Lincoln Laboratory

Storage and Search -2 JN 2/19/2016 Outline Introduction –Data Sources –Data Rates –Archiving Data Flow Hardware Selection Database Server Selection Schema Design Optimizing Search Operations Gauging Concurrent Insert and Search Operations Summary

Storage and Search -3 JN 2/19/2016 Data Sources Coastal Maritime Airborne

Storage and Search -4 JN 2/19/2016 Data Rates Max Feed Rates (msg/sec) Message Size (Numerical) (bytes) Numerical Data Throughput (KB/sec) Message Size (XML) (bytes) XML Data Throughput (KB/sec) Incoming data have to be stored in real time!

Storage and Search -5 JN 2/19/2016 Archiving Data Flow Ground sensor Airborne sensor Maritime sensor User/Client Presentation Tier Web/Application Server Logical Tier Database Server Data Tier Three-tier Architecture Publish/Subscribe OpenFire XML convert Archived Data Archiver Server XML convert

Storage and Search -6 JN 2/19/2016 Outline Introduction Hardware Selection –Hardware Architecture Comparison –Opteron — Xeon Conversion Performance Database Server Selection Schema Design Optimizing Search Operations Gauging Concurrent Insert and Search Operations Summary

Storage and Search -7 JN 2/19/2016 Hardware Architecture Comparison As the number of processing cores increases, so does competition for access to main memory Each core connects directly to the memory using Direct Connect architecture CORE 1 CORE 2 L2 L3 CORE 1 CORE 2 L2 MEMORY I/O CORE 1 CORE 2 L2 CORE 1 CORE 2 L2 MEMORY I/O Opteron Xeon Bottleneck

Storage and Search -8 JN 2/19/2016 Opteron—Xeon Conversion Performance AMD 2.5 GHz 65nm Barcelona vs Intel 3.0 GHz 45nm Xeon [AnandTech 11/27/2007] Dual Opteron GHz Dual Xeon GHz SPECjbb Opteron-based systems deliver better I/O with comparable CPU power Queries/sec 64-bit MySQL , SLES 10, SUN JDK 1.6.0_02 Where the value is above 0, Opteron is faster

Storage and Search -9 JN 2/19/2016 Outline Introduction Hardware Selection Database Server Selection –Single Client Performance –Multiple Clients Performance Schema Design Optimizing Search Operations Gauging Concurrent Insert and Search Operations Summary

Storage and Search -10 JN 2/19/2016 Single Client Performance

Storage and Search -11 JN 2/19/2016 Multiple Clients Performance MySQL server (MyISAM tables) has higher insertion rate and throughput

Storage and Search -12 JN 2/19/2016 Outline Introduction Hardware Selection Database Server Selection Schema Design –Initial Proposals –Alternative Proposals –JAXB Conversion Time Measurements Optimizing Search Operations Gauging Concurrent Insert and Search Operations Summary

Storage and Search -13 JN 2/19/2016 Initial Proposals Data Retrieval Operation XML MessageJava ObjectNumerical TableXML Table Data Insertion Operation XML Message Numerical Table XML Table Advantages Bypasses the marshalling of Java object into XML Storage of original XML messages Direct retrieval of XML messages Unmarshalling Disadvantages More complex schema Requires transactions to synchronize the records Marshalling ORM ORM – Object-Relational Mapping

Storage and Search -14 JN 2/19/2016 Alternative Proposals Data Retrieval XML MessageJava ObjectNumerical Table Data Insertion XML Message Java Object Numerical Table Problem: Can we unmarshal the Java object into XML string quickly enough? Advantages Bypasses the storage of long XML messages in the table Simplifies schema (only one table) Simplifies and accelerates the data retrieval Disadvantages Requires marshalling the object into XML string Marshalling UnmarshallingORM

Storage and Search -15 JN 2/19/2016 JAXB Conversion and Insertion Rates After 100,000 conversions JAXB conversion time drops almost 1000 times!

Storage and Search -16 JN 2/19/2016 JAXB Conversion Time Measurements Even in the case of 10,000 tracks in the message JAXB conversion takes less than a second Even in the case of 10,000 tracks in the message JAXB conversion takes less than a second

Storage and Search -17 JN 2/19/2016 JAXB Conversion Time Measurements (cont.) Even in the worst case the insertion speed keeps up with real-time requirements Number of garbage collection iterations Memory used = 140 MB Number of number of tracks in the message (log scale) Required rate No JAXB, No XML Yes JAXB, No XML Yes JAXB, Yes XML

Storage and Search -18 JN 2/19/2016 Outline Introduction Hardware Selection Database Server Selection Schema Design Optimizing Search Operations –Backtracking Operation –Search Data Flow –Real-time Data Indexation –Composite Indexes Gauging Concurrent Insert and Search Operations Summary

Storage and Search -19 JN 2/19/2016 Backtracking Operation How to find who launched the target? First request type: Specified : Object GlobalID Search Time Window Returns : List of tracks with specified GlobalID within specified Time Window and default region boundaries Second request type: Specified : Area boundaries Search Time Returns : List of tracks containing the GlobalIDs within specified area at the specified time latMax latMin lonMax lonMin altMin altMax t2 t1 Height Depth N Width

Storage and Search -20 JN 2/19/2016 Passes XML to client Converts records to XML string Search Data Flow Sends HTTP request to Web server Gets request and parses it Runs servlet Converts request parameters into query Runs query in database Retrieves selected records Displays back track Gets XML parses it User/Client Presentation tier Web/Application Server Logical tier Database Server Data tier Three-tier Architecture End User (Thick Client) Archived Data Tomcat Web Server Archiver Server XML convert Continuously inserts records into database Sensor Data XML convert Publish/Subscribe OpenFire Archiver Server XML convert Ground sensor Maritime sensor Airborne sensor

Storage and Search -21 JN 2/19/2016 Real-time Data Indexation Indexes help to find the matching rows very quickly by pre-sorting trackIdlatlontrackTime ……… trackIdlatlontrackTime ……… index ……… Traditional (batch) approach to indexing 1.Store data into the relational table 2.Add indexes to the table once the data accumulation is finished 3.Retrieve the data using added indexes Real-time (continuous) approach 1. Proceed with data insertion into the relational table 2. Allow indexes to be updated continuously 3. Retrieve the data using continuously updated indexes

Storage and Search -22 JN 2/19/2016 Composite Indexes 1.Optimize search for multiple columns simultaneously 2.Stored as B-trees and use leftmost prefix rule 3.Have to be constantly updated due to the insertion 4.Have to be stored on the disk 5.Indexed columns are trackGlobalId, lat, lon, alt, trackType, trackTime FIND row WHERE trackId = 19334, trackTime = Problem: How much real-time index updates slow down real-time insertions? FIND row WHERE lat = , lon = , alt = trackTime = Different request types require different queries 2.Multiple query parameters require composite index 3.Due to leftmost index column rule each query needs its own composite index Composite (multi-column) index

Storage and Search -23 JN 2/19/2016 Outline Introduction Hardware Selection Database Server Selection Schema Design Optimizing Search Operations Gauging Concurrent Insert and Search Operations –Insertion/Search Operations Benchmark –Indexed vs Non-indexed Searches –Measuring Insertion Slowdown Summary

Storage and Search -24 JN 2/19/2016 Insertion/Search Operations Benchmark Generating representative insertion data Outcome known in advance (natural data are not suitable) Selected values should be distributed all over the data set Six search parameters: trackGlobalId, lat, lon, alt, trackType, trackTime Values of selected parameters are not important Generated Archived Data ArchiverSearch Values RecTime = RecNumber * Δ time trackGlobalId lat, lon, alt Δ value N records Data properties for benchmarking: Content of the requesttrackGlobalId (VARCHAR)trackTime (BIGINT)trackTypelatlonalt Search for all tracks (contacts) in the specified area — XXXX — Search for all contacts with specified IDXXX ——— Search for all contacts near specified position — XXXXX Search queries and their indexed parameters

Storage and Search -25 JN 2/19/2016 Indexed vs Non-indexed Searches Non-indexed search Indexed search Indexes increase search performance by orders of magnitude!

Storage and Search -26 JN 2/19/2016 Measuring Insertion Slowdown Required insertion rate Required search time Indexes do slow down insertion, but the insertion rate remains very high

Storage and Search -27 JN 2/19/2016 Outline Introduction Hardware Selection Database Server Selection Schema Design Optimizing Search Operations Gauging Concurrent Insert and Search Operations Summary

Storage and Search -28 JN 2/19/2016 Summary Benchmarked and selected the hardware for Archive and Search Operations Demonstrated that Opteron-based systems deliver better I/O than Xeon-based with comparable CPU power Compared different database servers and selected MySQL database server based on insertion rate and throughput Designed different database schemata and selected the simplest one with sufficient insertion performance Investigated search operation performance with constant flow of Insert queries and demonstrated its unsatisfactory performance without indexes Designed search query indexes and measured the data retrieval acceleration Demonstrated that the longer insertion times due to the indexation are still sufficient for successful archive operations