Grid Technology CERN IT Department CH-1211 Geneva 23 Switzerland www.cern.ch/it DBCF GT IT Monitoring WG Technology for Storage/Analysis 28 November 2011.

Presentation transcript:


NoSQL Overview
Highlights
– Non-relational
– Distributed, easy replication support
– Open-source
– Horizontally scalable, high scalability
– Simple API
Use cases
– Large data volumes
– Extreme query workloads
– Schema evolution

The Zoo of solutions

Classification (data model)
NoSQL key-value based
– BerkeleyDB, Dynamo, Voldemort, Redis, Scalaris, etc.
NoSQL column/tabular based
– Hadoop, Cassandra, HBase, Hive, Hypertable, etc.
NoSQL document based
– MongoDB, CouchDB, SimpleDB, Riak, etc.
Relational DBMS
– Oracle, MySQL, etc.
Column based DBMS
– Vertica, Infobright, LucidDB, etc.

NoSQL Key-value Store
– Data items stored and paired with a key
– Data accessible by a hash map
– Fast storage/retrieval of simple data by primary key
– Complex queries are not straightforward
– Modeling applications can get complicated
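
The points above can be illustrated with a minimal in-memory sketch. It is not the API of any product named in these slides; the class name `KeyValueStore`, its methods, and the host names are assumptions made up for illustration. Lookups by primary key go straight through a hash map, while anything resembling a query falls back to inspecting every value.

```python
from typing import Any, Callable, Dict, List, Optional


class KeyValueStore:
    """Toy key-value store backed by a hash map (a Python dict).

    Hypothetical illustration only; real stores add persistence,
    replication, and distribution on top of this idea.
    """

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Store (or overwrite) the value paired with this key.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Primary-key lookup: a single hash-map access.
        return self._data.get(key)

    def scan(self, predicate: Callable[[Any], bool]) -> List[str]:
        # No secondary index exists, so a "complex query" has to
        # inspect every stored value.
        return sorted(k for k, v in self._data.items() if predicate(v))


store = KeyValueStore()
# Host names and metrics below are invented sample data.
store.put("host:lxplus001", {"site": "CERN", "load": 0.42})
store.put("host:lxplus002", {"site": "CERN", "load": 1.87})
store.put("host:node17", {"site": "T1", "load": 0.10})

busy_hosts = store.scan(lambda v: v["load"] > 1.0)
```

Here `get` stays cheap however many items are stored, whereas `scan` grows with the size of the store, which is one way to read "complex queries are not straightforward".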

NoSQL Document Store
– More complex and meaningful data structures
– Based on versioned structured documents
– Values associated with keys are full documents
– The documents are stored in formats like JSON
– Provides more modeling flexibility
– Good for incomplete datasets
– Easy to map data from object-oriented software
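
A rough sketch of the document model, using only the Python standard library rather than MongoDB or CouchDB. The `Alarm` class, the `collection` dict, and the `find` helper are hypothetical names chosen for this example. It shows an object mapped to a JSON document, a second document with missing fields stored alongside it, and a lookup by field that tolerates the gaps.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class Alarm:
    """Hypothetical application object, not taken from the slides."""
    service: str
    severity: str
    tags: List[str] = field(default_factory=list)


# The "collection" maps a key to a full JSON document (stored as text).
collection: Dict[str, str] = {}

# An object from object-oriented code maps directly onto a document.
collection["alarm-1"] = json.dumps(asdict(Alarm("batch", "critical", ["lsf"])))

# An incomplete document is still valid: no severity, no tags.
collection["alarm-2"] = json.dumps({"service": "storage"})


def find(field_name: str, value: Any) -> List[str]:
    """Return the keys of all documents whose field equals value."""
    hits = []
    for key, text in collection.items():
        doc = json.loads(text)
        # .get() returns None for documents that lack the field.
        if doc.get(field_name) == value:
            hits.append(key)
    return hits


critical = find("severity", "critical")
```

No schema had to be declared up front, and the incomplete document needed no placeholder values. Document versioning, which the slide also mentions, is left out of this sketch.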

NoSQL Document Store

                      MongoDB                   CouchDB
Programming Language  C++                       Erlang
HDFS Support          No (GridFS)               No
Document Format       BSON                      JSON
Query Method          Object-based              JavaScript MapReduce
Best Use              Dynamic queries           Pre-defined queries, less dynamic data
Supported / Used by   Foursquare, SourceForge   Several websites

NoSQL Column Store
– Each key is associated with many attributes
– Data stored as column families (a column family is similar to a namespace for a set of related attributes)
– Best known through Google’s BigTable implementation
– Used by the largest and best supported NoSQL implementations
– Store and process very large amounts of data
– Very high throughput
– Strong partitioning support
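
The column-family layout described above can be pictured as nested maps: row key, then column family, then column, then value. The sketch below is an illustration in plain Python and does not use the Cassandra, HBase, or Hypertable APIs. The helper names, row keys, and sample values are assumptions, and hashing the row key is just one possible partitioning scheme, picked here for brevity.

```python
import zlib
from collections import defaultdict
from typing import Dict

# row key -> column family -> column -> value
Row = Dict[str, Dict[str, str]]
table: Dict[str, Row] = defaultdict(lambda: defaultdict(dict))


def put(row_key: str, family: str, column: str, value: str) -> None:
    # Rows, families, and columns are created on first write.
    table[row_key][family][column] = value


def read_family(row_key: str, family: str) -> Dict[str, str]:
    # Read one family without touching the row's other families.
    # .get() avoids creating empty entries as a side effect of reading.
    return dict(table.get(row_key, {}).get(family, {}))


def partition_for(row_key: str, partitions: int) -> int:
    # Assumed scheme: a stable hash of the row key picks the partition.
    return zlib.crc32(row_key.encode("utf-8")) % partitions


# Invented sample data.
put("lxplus001", "metrics", "cpu_load", "0.42")
put("lxplus001", "metrics", "mem_used", "61%")
put("lxplus001", "info", "site", "CERN")
put("node17", "metrics", "cpu_load", "0.10")

metrics = read_family("lxplus001", "metrics")
```

Rows need not share the same columns: `node17` has no `info` family at all, and reading it simply yields an empty result.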

NoSQL Column Store

                      Cassandra               HBase                           Hypertable  Hive
Programming Language  Java                    Java                            C++         Java
HDFS Support          Yes                     Yes                             Yes         Yes
Batch Processing      No                                                                  Yes
Query Method          MapReduce               MapReduce                       HQL         HiveQL
Best Use              Real-time write         Real-time read/write            -           Complex queries
Supported / Used by   Facebook, Reddit, Digg  Facebook, Adobe, Yahoo, Twitter Baidu       Facebook, Amazon
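
MapReduce appears as a query method in the comparison above. As a rough single-process sketch of the idea (not the Hadoop API, and without any distribution across nodes), the example below counts error records per site. The record layout and the function names `map_phase`, `shuffle`, and `reduce_phase` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Invented sample records.
records = [
    {"site": "CERN", "status": "ok"},
    {"site": "CERN", "status": "error"},
    {"site": "T1", "status": "error"},
    {"site": "CERN", "status": "error"},
]


def map_phase(record: Dict[str, str]) -> Iterator[Tuple[str, int]]:
    # Emit (site, 1) for every error record; other records emit nothing.
    if record["status"] == "error":
        yield record["site"], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Group all emitted values by key, as the framework would between phases.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> int:
    # Combine the values for one key into a single result.
    return sum(values)


pairs = (pair for record in records for pair in map_phase(record))
errors_per_site = {
    key: reduce_phase(key, values) for key, values in shuffle(pairs).items()
}
```

In a real deployment the map and reduce calls would run in parallel on the nodes holding the data; only the shape of the computation is shown here.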

Final Considerations
Start prototyping with a few use cases
– Take a few use cases spanning different groups
– One use case based on a NoSQL document store
– One or two use cases based on a NoSQL column store
– Each use case should involve 2+ groups
– Try to maximize the collaboration between groups
Get feedback from the NoSQL team
– Status of their work
– Plan the next steps together

Final Considerations
– Do not forget NoSQL distributions (such as Cloudera)
– Do not forget (R)DBMS!