Data Tier Options NWEN304 Advanced Network Applications.

Data Tier Options
- What are the options?
- How do you use them?
- What are the tradeoffs?
- How will we use this with Heroku?

What is the Scale of the Problem?
[…] hours of video are uploaded to YouTube [YouTube statistics].
2.19,000 downloads from Apple’s App Store [Nerney 2012].
[…],000 photos are uploaded to SnapChat [Van Hoven 2014].
[…],000 tweets are posted on Twitter [Mirani 2013].
[…],000,000 likes on Facebook [Tepper 2012].
[…],000,000 messages sent on WhatsApp [Bushey 2014].
[…],000,000 emails are sent [Knoblauch 2014].

Databases are a solution
- An organised collection of data
- Essential to almost every business
- Emphasises scalability, reliability, security, efficiency, etc.
- Many different types of DBMS
- Optimised for different things

Relational Databases
- RDBMSs have been dominant since the 1980s (MySQL, PostgreSQL, DB2, SQL Server).
- They store data in tables, rows and columns.
- Each table represents a collection of related items, and each item is a row in the table.
An example table for bank customers:
    customer_id  name            date_of_birth
    1            Brian Kim       […]
    2            Karen Johnson   […]
    3            Wade Feinstein  […]

Defining a Schema
- Schema = structure of each table
- SQL = Structured Query Language
    CREATE TABLE customers (
        customer_id INT NOT NULL PRIMARY KEY,
        name VARCHAR(128),
        date_of_birth DATE
    );
The resulting (empty) table:
    customer_id  name  date_of_birth

SQL: Inserting Data
    INSERT INTO customers (customer_id, name, date_of_birth)
    VALUES (1, 'Brian Kim', '…');
The table now holds one row:
    customer_id  name       date_of_birth
    1            Brian Kim  […]

SQL: Retrieving Data
    SELECT * FROM customers WHERE name = 'Brian Kim';
returns the whole row:
    customer_id  name       date_of_birth
    1            Brian Kim  […]
    SELECT date_of_birth FROM customers WHERE name = 'Brian Kim';
returns only the requested column:
    date_of_birth
    […]
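The create / insert / select steps above can be tried end to end with a short script. This is only a sketch: it assumes Python's built-in sqlite3 module and an in-memory SQLite database rather than the MySQL or PostgreSQL servers named on the slides, and the date of birth is a made-up placeholder because the value on the slide did not survive.

```python
import sqlite3

# In-memory SQLite database (an assumption; the slides target MySQL/PostgreSQL).
conn = sqlite3.connect(":memory:")

# Define the schema from the "Defining a Schema" slide.
conn.execute(
    """
    CREATE TABLE customers (
        customer_id   INTEGER NOT NULL PRIMARY KEY,
        name          VARCHAR(128),
        date_of_birth DATE
    )
    """
)

# Insert one row. "1970-01-01" is a placeholder date, not the slide's value.
conn.execute(
    "INSERT INTO customers (customer_id, name, date_of_birth) VALUES (?, ?, ?)",
    (1, "Brian Kim", "1970-01-01"),
)

# SELECT * returns every column of the matching rows.
all_columns = conn.execute(
    "SELECT * FROM customers WHERE name = ?", ("Brian Kim",)
).fetchall()

# Naming a column returns only that column.
dob_only = conn.execute(
    "SELECT date_of_birth FROM customers WHERE name = ?", ("Brian Kim",)
).fetchall()

print(all_columns)
print(dob_only)
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid quoting problems and SQL injection.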

SQL: Multiple Tables
Bank account details for each account:
    account_id  name            date_of_birth  customer_id  account_type  balance
    1           Brian Kim       […]            1            cheque        500
    2           Karen Johnson   […]            2            cheque        8,500
    3           Brian Kim       […]            1            savings       2,500
    4           Wade Feinstein  […]            3            checking      160

SQL: Multiple Tables
    account_id  name            date_of_birth  customer_id  account_type  balance
    1           Brian Kim       […]            1            cheque        500
    2           Karen Johnson   […]            2            cheque        8,500
    3           Brian Kim       […]            1            savings       2,500
    4           Wade Feinstein  […]            3            checking      160
Which rows have to be updated if Brian decides to change his last name? Is there a better way?

SQL: Multiple Tables
accounts:
    account_id  customer_id  account_type  balance
    1           1            cheque        500
    2           2            cheque        8,500
    3           1            savings       2,500
    4           3            checking      160
customers:
    customer_id  name            date_of_birth
    1            Brian Kim       […]
    2            Karen Johnson   […]
    3            Wade Feinstein  […]

SQL: Foreign Key
A foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table.
    CREATE TABLE accounts (
        account_id INT NOT NULL PRIMARY KEY,
        customer_id INT,
        account_type VARCHAR(20),
        balance INT,
        FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
    );

SQL: Enforcing Relationships
The DB can now throw an error if you try to insert a row into the accounts table with a customer_id that isn’t actually in the customers table:
    INSERT INTO accounts (account_id, customer_id, account_type, balance)
    VALUES (1, 555, 'checking', 500);
    Error: Cannot add or update a child row: a foreign key constraint fails
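A small sketch of the same check, assuming Python's sqlite3 module instead of the MySQL server whose error message appears on the slide. Two things here are specific to SQLite rather than taken from the slides: enforcement has to be switched on with `PRAGMA foreign_keys = ON`, and the violation surfaces as `sqlite3.IntegrityError`. The helper name `try_insert_account` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    """
    CREATE TABLE customers (
        customer_id INTEGER NOT NULL PRIMARY KEY,
        name        VARCHAR(128)
    )
    """
)
conn.execute(
    """
    CREATE TABLE accounts (
        account_id   INTEGER NOT NULL PRIMARY KEY,
        customer_id  INTEGER,
        account_type VARCHAR(20),
        balance      INTEGER,
        FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
    )
    """
)
conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Brian Kim')")


def try_insert_account(account_id, customer_id):
    """Return True if the row was accepted, False if the foreign key check failed."""
    try:
        conn.execute(
            "INSERT INTO accounts (account_id, customer_id, account_type, balance) "
            "VALUES (?, ?, 'checking', 500)",
            (account_id, customer_id),
        )
        return True
    except sqlite3.IntegrityError as error:
        print("rejected:", error)
        return False


unknown_customer_ok = try_insert_account(1, 555)  # no customer 555 exists
known_customer_ok = try_insert_account(1, 1)      # customer 1 exists
```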

SQL: Join: Combining Tables
    SELECT customers.name
    FROM customers
    JOIN accounts ON customers.customer_id = accounts.customer_id
    WHERE accounts.balance > 1000
Result:
    name
    Karen Johnson
    Brian Kim
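To see the join produce that result, here is a sketch that loads the two example tables into an in-memory SQLite database (an assumption; any SQL database would do). The date_of_birth column is left out because its values are not needed for the query, and the `ORDER BY` clause is an addition so that the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER NOT NULL PRIMARY KEY,
        name        VARCHAR(128)
    );
    CREATE TABLE accounts (
        account_id   INTEGER NOT NULL PRIMARY KEY,
        customer_id  INTEGER REFERENCES customers(customer_id),
        account_type VARCHAR(20),
        balance      INTEGER
    );
    INSERT INTO customers VALUES
        (1, 'Brian Kim'), (2, 'Karen Johnson'), (3, 'Wade Feinstein');
    INSERT INTO accounts VALUES
        (1, 1, 'cheque', 500), (2, 2, 'cheque', 8500),
        (3, 1, 'savings', 2500), (4, 3, 'checking', 160);
    """
)

# Match each account to its owner, then keep accounts holding more than 1000.
rows = conn.execute(
    """
    SELECT customers.name
    FROM customers
    JOIN accounts ON customers.customer_id = accounts.customer_id
    WHERE accounts.balance > 1000
    ORDER BY accounts.balance DESC
    """
).fetchall()

names = [name for (name,) in rows]
print(names)
```

Because the customer's name lives only in the customers table, renaming Brian means updating a single row, however many accounts he has.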

NoSQL
- NoSQL = “Not Only SQL”
- Developed by Internet companies to deal with demands in performance, availability and data volume. Great for large scale problems.
- Early versions: Google’s BigTable, Amazon’s Dynamo.
- “open source, distributed, non relational databases”
- Types: key-value stores, document stores, column-oriented databases and graph databases. We’ll focus on the first two because these are potential options for using with Heroku.
- There are a lot of them.

Key-Value Stores
Examples: Redis, DynamoDB, Riak, Voldemort (LinkedIn).
    > put "the-key" "the-value"
    > get "the-key"
    version(0:1): "the-value"
- They are optimized for a single use case: extremely fast lookup by a known identifier.
- Effectively they are a hash table distributed across many servers.
- They do not use schemas, so you can store any kind of value.
- The downside is that the values are ‘opaque blobs’: the store cannot support any querying mechanism other than lookup by key.
- Really useful for web sessions and records with ids, e.g. user:$id:name = bob
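The "hash table distributed across many servers" idea can be sketched in a few lines. This is a toy model, not how Redis, Riak or Voldemort are implemented: each "server" is a plain Python dict, and the names `SERVERS`, `server_for`, `put`, `get` and `find_keys_by_value` are all hypothetical.

```python
import hashlib

# Three hypothetical storage nodes, each modelled as an in-process dict.
SERVERS = [{} for _ in range(3)]


def server_for(key):
    """Pick the node responsible for a key by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]


def put(key, value):
    server_for(key)[key] = value


def get(key):
    return server_for(key).get(key)


# No schema: values can be anything, and the store never looks inside them.
put("the-key", "the-value")
put("user:42:name", "bob")
put("session:abc", {"cart": [1, 2, 3]})

print(get("the-key"))
print(get("no-such-key"))


def find_keys_by_value(value):
    """The only way to search by value is to scan every key on every node."""
    return [k for node in SERVERS for k, v in node.items() if v == value]
```

Lookup by key touches exactly one node, which is why it is fast; anything else, such as `find_keys_by_value`, degenerates into a full scan, which is the querying limitation described above.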

Document Stores
Like key-value stores except:
- values are typed => more complex queries
- key-values are stored as JSON documents
- documents belong to collections
Examples: MongoDB, CouchDB, Couchbase

- A collection is a group of MongoDB documents. Similar to an RDBMS table, documents in a collection are typically related or have a similar purpose.
- Collections do NOT enforce a schema, e.g. documents within a collection can have different fields.
    > db.createCollection("people")
    { "ok" : 1 }

MongoDB: Inserting Data
You don’t even need to call createCollection; save is all that is required.
    db.people.save({_id: "the-key", name: "Shawn", age: 24, locationId: 123})

MongoDB: Inserting Data
No predefined schema for documents. Example (MongoDB):
    db.people.save({_id: "the-key", name: "Shawn", age: 24, locationId: 123})
Here people is the collection.

MongoDB: Inserting Data
No predefined schema for documents. Example (MongoDB):
    db.people.save({_id: "the-key", name: "Shawn", age: 24, locationId: 123})
Every document is identified by a key (_id), and you can choose it yourself.

MongoDB: Inserting Data
IDs are auto-generated if not explicitly specified.
    db.people.save({name: "Shawn", age: 24, locationId: 123})
    _id = ObjectId("545bdc1e")

MongoDB: Retrieving Data
Calling find() with no arguments retrieves everything in the collection:
    db.people.find()
    {"_id": "the-key", "age": 24, "name": "Shawn", "locationId": 123}
    {"_id": ObjectId("545bdc1e"), "age": 35, "name": "Bob", "locationId": 456}

MongoDB: Retrieving Data
Unlike a key-value store, you can perform lookups on any field:
    db.people.find({"name": "Shawn"})
    {"_id": "the-key", "age": 24, "name": "Shawn", "locationId": 123}
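The save and find behaviour shown on these slides can be imitated with a small in-memory model. This is a sketch of the concept only, not MongoDB's implementation or its driver API: the `Collection` class is hypothetical, generated ids come from `uuid` rather than being real ObjectIds, and `find` supports only exact-match queries.

```python
import uuid


class Collection:
    """Toy document collection: schemaless dicts, looked up by any field."""

    def __init__(self):
        self._docs = {}

    def save(self, doc):
        doc = dict(doc)  # copy so the caller's dict is not modified
        # Generate an id when the caller did not choose one.
        doc.setdefault("_id", uuid.uuid4().hex)
        self._docs[doc["_id"]] = doc
        return doc["_id"]

    def find(self, query=None):
        query = query or {}
        # A document matches when every queried field is present and equal.
        return [
            doc
            for doc in self._docs.values()
            if all(k in doc and doc[k] == v for k, v in query.items())
        ]


people = Collection()
people.save({"_id": "the-key", "name": "Shawn", "age": 24, "locationId": 123})
bob_id = people.save({"name": "Bob", "age": 35, "locationId": 456})
# Documents in one collection may have different fields.
people.save({"name": "Ada", "email": "ada@example.com"})

everyone = people.find()
shawns = people.find({"name": "Shawn"})
at_location = people.find({"locationId": 456})
print(shawns)
```

Because the store understands the fields inside each document, it can answer `find({"locationId": 456})` directly, which an opaque key-value store cannot.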

Tradeoffs: Reading Data
    Database type  Access type                JOIN
    Relational     Very flexible query model  Yes
    Key-value      Primary key lookup         No
    Document       Primary, secondary lookup  No
- Relational DBs are great for general purpose data storage: very flexible query models etc.
- NoSQL databases are great for special purpose data storage: they can be optimized to fulfill specific usages.

Tradeoffs: Writing Data
Consider a database storing balances for bank accounts:
    account_id  balance
    1           500
    2           8,500
    3           2,500

Tradeoffs: Writing Data
Updating a single field in a table, a single value in a key-value store, or a single document in a document store is easy. NoSQL databases are often optimised to do this.
    UPDATE accounts SET balance = balance - 50 WHERE account_id = 1
    put "1" (get "1" + 100)
    db.accounts.update({_id: 1}, {$inc: {balance: -100}})

Tradeoffs: Writing Data
Consider transferring $100 between two accounts. NoSQL DBs do not necessarily make a multi-statement update atomic: if the system crashes after the first command, nothing guarantees that the second one ever runs, and the $100 is lost.
    > db.accounts.update({_id: 1}, {$inc: {balance: -100}})
    > db.accounts.update({_id: 2}, {$inc: {balance: 100}})
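A toy illustration of that failure mode, using a plain dict as a stand-in for a store without multi-operation transactions. The `Crash` exception and the `crash_between` flag are hypothetical devices to simulate a failure between the two updates; the starting balances follow the example accounts table.

```python
class Crash(Exception):
    """Stands in for the process or server dying mid-transfer."""


def transfer_without_transaction(store, src, dst, amount, crash_between=False):
    store[src] -= amount  # first update is applied immediately
    if crash_between:
        raise Crash("simulated failure between the two updates")
    store[dst] += amount  # second update never happens after a crash


accounts = {1: 500, 2: 8500}
total_before = sum(accounts.values())

try:
    transfer_without_transaction(accounts, 1, 2, 100, crash_between=True)
except Crash:
    pass

total_after = sum(accounts.values())
print(accounts, total_before, total_after)
```

After the simulated crash, account 1 has been debited but account 2 was never credited, so the total held by the bank has silently dropped by the transfer amount.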

Tradeoffs: Writing Data
Relational databases support transactions. Atomic: either both statements succeed or neither does.
    START TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
    COMMIT;
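The all-or-nothing behaviour can be demonstrated with a short sketch, assuming Python's sqlite3 module and an in-memory database in place of a production RDBMS. Used as a context manager, a sqlite3 connection commits when the block finishes and rolls back when an exception escapes it. The `transfer` and `balances` helpers and the `fail_midway` flag are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500), (2, 8500)])
conn.commit()


def transfer(src, dst, amount, fail_midway=False):
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
            (amount, src),
        )
        if fail_midway:
            raise RuntimeError("simulated crash between the two updates")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
            (amount, dst),
        )


def balances():
    return dict(conn.execute("SELECT account_id, balance FROM accounts"))


try:
    transfer(1, 2, 100, fail_midway=True)
except RuntimeError:
    pass
after_failed_transfer = balances()  # the debit was rolled back

transfer(1, 2, 100)
after_successful_transfer = balances()

print(after_failed_transfer)
print(after_successful_transfer)
```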

Other tradeoffs
- Maturity: SQL databases are much older. Some companies have moved from NoSQL to SQL (Pinterest). This will change.
- Scalability, replication, availability and consistency: you can make tradeoffs with NoSQL … more on this later.

General advice
- While developing, start with a relational database: it is more flexible, although potentially less able to deal with large amounts of data.
- Move to an appropriate NoSQL database once you understand the problem space.

Heroku and Datastores
- During the project I suggest you use PostgreSQL. (I think this is enforced now?)
- You will use this again in SWEN304 (Roma’s course?).