VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Data Modeling.

Slides:



Advertisements
Similar presentations
Chapter 10: Designing Databases
Advertisements

Irwin/McGraw-Hill Copyright © 2000 The McGraw-Hill Companies. All Rights reserved Whitten Bentley DittmanSYSTEMS ANALYSIS AND DESIGN METHODS5th Edition.
Dr. Kalpakis CMSC 661, Principles of Database Systems Representing Data Elements [12]
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree.
ETEC 100 Information Technology
Introduction to Structured Query Language (SQL)
Introduction to Structured Query Language (SQL)
1 Chapter 2 Reviewing Tables and Queries. 2 Chapter Objectives Identify the steps required to develop an Access application Specify the characteristics.
A Social blog using MongoDB ITEC-810 Final Presentation Lucero Soria Supervisor: Dr. Jian Yang.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Read Lecturer.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Write Lecturer.
Chapter 6 Physical Database Design. Introduction The purpose of physical database design is to translate the logical description of data into the technical.
Irwin/McGraw-Hill Copyright © 2000 The McGraw-Hill Companies. All Rights reserved Whitten Bentley DittmanSYSTEMS ANALYSIS AND DESIGN METHODS5th Edition.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Document Type Definition.
MONGODB NOSQL SERIES Karol Rástočný 1. Prominent Users 2  AppScale, bit.ly, Business Insider, CERN LHC, craigslist, diaspora, Disney Interactive Media.
Systems analysis and design, 6th edition Dennis, wixom, and roth
ASP.NET Programming with C# and SQL Server First Edition
AL-MAAREFA COLLEGE FOR SCIENCE AND TECHNOLOGY INFO 232: DATABASE SYSTEMS CHAPTER 7 INTRODUCTION TO STRUCTURED QUERY LANGUAGE (SQL) Instructor Ms. Arwa.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation XML Schema 1 Lecturer.
MongoDB An introduction. What is MongoDB? The name Mongo is derived from Humongous To say that MongoDB can handle a humongous amount of data Document.
1 PHP and MySQL. 2 Topics  Querying Data with PHP  User-Driven Querying  Writing Data with PHP and MySQL PHP and MySQL.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation XML Storage Techniques.
Chapter 9 Designing Databases Modern Systems Analysis and Design Sixth Edition Jeffrey A. Hoffer Joey F. George Joseph S. Valacich.
Goodbye rows and tables, hello documents and collections.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation XML Schema 2 Lecturer.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Aggregation.
Lecture 12 Designing Databases 12.1 COSC4406: Software Engineering.
Copyright 2006 Prentice-Hall, Inc. Essentials of Systems Analysis and Design Third Edition Joseph S. Valacich Joey F. George Jeffrey A. Hoffer Chapter.
Lecture2: Database Environment Prepared by L. Nouf Almujally & Aisha AlArfaj 1 Ref. Chapter2 College of Computer and Information Sciences - Information.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Identity Constraints.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation An Introduction to XML.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Data Versioning Lecturer.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Exam and Lecture Overview.
7 1 Chapter 7 Introduction to Structured Query Language (SQL) Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel.
WEEK 1, DAY 2 STEVE CHENOWETH CSSE DEPT CSSE 533 –INTRO TO MONGODB.
10/10/2012ISC239 Isabelle Bichindaritz1 Physical Database Design.
Visual C# 2012 How to Program © by Pearson Education, Inc. All Rights Reserved.
6 1 Lecture 8: Introduction to Structured Query Language (SQL) J. S. Chou, P.E., Ph.D.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Architecture.
Copyright 2006 Prentice-Hall, Inc. Essentials of Systems Analysis and Design Third Edition Joseph S. Valacich Joey F. George Jeffrey A. Hoffer Chapter.
Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.
MongoDB is a database management system designed for web applications and internet infrastructure. The data model and persistence strategies are built.
CSE314 Database Systems Lecture 3 The Relational Data Model and Relational Database Constraints Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E Pearson.
GLOBEX INFOTEK Copyright © 2013 Dr. Emelda Ntinglet-DavisSYSTEMS ANALYSIS AND DESIGN METHODSINTRODUCTORY SESSION EFFECTIVE DATABASE DESIGN for BEGINNERS.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
MongoDB First Light. Mongo DB Basics Mongo is a document based NoSQL. –A document is just a JSON object. –A collection is just a (large) set of documents.
Session 1 Module 1: Introduction to Data Integrity
Copyright © 2009 Pearson Education, Inc. Publishing as Prentice Hall Chapter 9 Designing Databases 9.1.
Introduction to MongoDB. Database compared.
DAY 14: ACCESS CHAPTER 1 RAHUL KAVI October 8,
Copyright 2002 Prentice-Hall, Inc. Modern Systems Analysis and Design Third Edition Jeffrey A. Hoffer Joey F. George Joseph S. Valacich Chapter 12 Designing.
Orion Contextbroker PROF. DR. SERGIO TAKEO KOFUJI PROF. MS. FÁBIO H. CABRINI PSI – 5120 – TÓPICOS EM COMPUTAÇÃO EM NUVEM
CHAPTER 9 File Storage Shared Preferences SQLite.
SQL Basics Review Reviewing what we’ve learned so far…….
COMP 430 Intro. to Database Systems MongoDB. What is MongoDB? “Humongous” DB NoSQL, no schemas DB Lots of similarities with SQL RDBMs, but with more flexibility.
Data Integrity & Indexes / Session 1/ 1 of 37 Session 1 Module 1: Introduction to Data Integrity Module 2: Introduction to Indexes.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Cloud Data Models Lecturer.
Plan for Cloud Data Models
Mongo Database (Intermediate)
Lecturer : Dr. Pavle Mogin
Relational Database Design
MongoDB Er. Shiva K. Shrestha ME Computer, NCIT
MongoDB Distributed Write and Read
javascript for your data
Modern Systems Analysis and Design Third Edition
Teaching slides Chapter 8.
Chapter 12 Designing Databases
CS5220 Advanced Topics in Web Programming Introduction to MongoDB
Appendix D: Network Model
Presentation transcript:

VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Data Modeling Lecturer : Dr. Pavle Mogin

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 1 Plan for MongoDB Modelling Prologue Data Modelling –Embedding –Referencing Data Use and Performance: –Indexing –Reedings: Have a look at Readings on the Home Page

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 2 Prologue MongoDB is an open source project mainly driven by the company 10gen Inc The main goal of the development of MogoDb was to close the gap between fast and highly scalable key- value stores and feature rich RDBMSs Features: –Document Data Model and a relatively rich query language, –Data partitioning by sharding, –Master – slave mode of replication, –No data versioning Prominent users: –SourceForge.net, –Foursquare, –New York Times

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 3 Data Model A MongoDB installation hosts a number of databases A database is a physical data container of a set of collections A collection contains a set of documents A document contains a set of key-value (also referred to as field_name-field_value) pairs –The basic unit of data –Logically analogous to a JSON object Documents have a dynamic schema: –There is no predefined schema of a collection, –Documents are self describing, –In a collection, documents do not need to have the same structure, –Common fields belonging to different documents may have different data types

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 4 Document MongoDB stores all data in documents that are JSON- style data structures composed of key-value pairs: –All database records, –Query selectors (what records to select for rud operations), –Update definitions (which fields to modify), –Index specifications (what fields to index), –Data output by MongoDB Documents are stored on disk in the BSON format - a binary representation of JSON with an additional type information

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 5 Document Structure MongoDB documents are composed of field-and- value pairs and have the following structure: { field 1 : value 1, field 2 : value 2,..., field n : value n } The value of a field can be any of the BSON data types, including other documents, arrays, and arrays of documents

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 6 A Document Example var class = { _id: ObjectId (“509980df3”), course: {code: “SWEN432”, title: “Advanced DB”}, year: 2014, students: [“Matt”, “Jack”,..., “Lingshu”], no_of_st: 11 } The document contains values of varying types –The primary key is _id and it is of the ObjectId type, –The field course is a subdocument, and –students is an array of strings

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 7 Field Names Field names are strings Restrictions on field names: –The field names cannot start with the dollar sign ($) character, –The field names cannot contain the dot (.) character, –The field names cannot contain the null character

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 8 Field _id The field _id is reserved for use as a document’s primary key –Its value must be unique in the collection, –It is immutable, –May be of any type other than an array or a regular expression type –MongoDB creates a unique index on the _id field during the creation of a collection. –It is always the first field in the documents The following are common options for storing values for _id: –Use an ObjectId, –Use a natural unique identifier, if available This saves space and avoids an additional index, –Generate an auto-incrementing number

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 9 BSON Data Types MongoDB supports a decent number of built in BSON data types, some of them are: –Double –String –Array –Binary data –ObjectId –Boolean –Date –Null –Regular Expression –JavaScript –Integer (32-bit and 64-bit)

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 10 ObjectId ObjectId is a 12-byte BSON type, constructed using: –A 4-byte value representing the seconds since the Unix epoch, –A 3-byte machine identifier, –A 2-byte process id, and –A 3-byte counter, starting with a random value ObjectIds are small, most likely unique, and fast to generate MongoDB uses ObjectIds as the default value for the _id field if the _id field is not specified by a client Additional benefits of using ObjectIds for the _id field: –In the mongo shell, you can access the creation time of the ObjectId, using the getTimestamp() method, –Sorting on ObjectId values is roughly equivalent to sorting by creation time.

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 11 Constructing Values in mongo shell To generate a new ObjectId, use the ObjectId() constructor with no argument: var x = ObjectId() In this example, the value of x would be: ObjectId("507f1f77bcf86cd ") To return the timestamp of an ObjectId() object, use the getTimestamp() : ObjectId("507f191e810c19729de860ea").getTimestamp() This operation will return the following Date object: ISODate(" T20:46:22Z") Construct a date using the new Date() constructor: var mydate = new Date() Find the month portion of the mydate value mydate.getMonth()

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 12 The Key Data Modelling Decisions The key challenge in data modeling is balancing: –The needs of the application (queries, updates, data processing), –The performance characteristics of the database engine, and –The data retrieval patterns The key decision in designing a data model for MongoDB is how to represent relationships between data objects Two representation mechanisms: –Embedding and –Referencing

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 13 Embedded Relationships (Example) { _id: “SWEN432” title: “Advanced Databases”, coordinator: guest_lecturer: year: 2014, trimester: 1 } { name: “Pavle”, }, { name: “Aaron” },

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 14 Embedded Data Embedded documents capture relationships by storing related data objects within a single document –Subdocuments may be stored in fields or an array within another document –Leads to denormalization Denormalized data structures allow reading and manipulating related data in a single db operation Embedding is the preferred technique in the case of: –One – to – one relationships, and –One – to – many relationships with no extensive overlapping of objects on the many side A potential disadvantage of embedding is a possibly uncontrolled growth of a document through adding new objects on the many side

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 15 Referencing Assume each lecturer teaches a number of courses –Then embedding may lead to a considerable data redundancy An alternate approach is to store course and lecturer objects as separate documents and link them using references (document keys) In principle, references can be stored in the object on the one side, or in objects on the many side (or even on both sides) –Query patterns and the growth of the number of links influence the decision of link placement

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 16 Relationship by Referencing (Example) Course Document { _id: name: “Pavle”, teach: [“SWEN304”, “SWEN432”] } { _id: name: “Aaron”, } Lecturer Document { _id: “SWEN432”, title: “Advanced Databases”, coordinator:, guest_lecturer:, year: 2014, trimester: 1 }

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 17 Implementing Referencing MongoDB applications use one of two methods for relating documents: –Manual references where you save the _id field of one document in another document as a reference These references are simple and sufficient for most use cases –The other method is to use DBRefs MongoDB documentation recommends using manual references

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 18 Using Manual References var course_id = ObjectId() var coordinator_id = ObjectId() var guest_lec_id = ObjectId() db.classes.insert({ _id: course_id, title: “Advanced Databases”, coordinator: coordinator_id, guest_lecturer: guest_lec_id, year: 2014, trimester: 1 })})

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 19 When to Use Referencing Representing relationships by referencing produces normalized data models Referencing is a preferred technique: –When embedding leads to data redundancy but does not provide sufficient read performance advantages to outweigh the consequences of data duplication, –For representing many – to many relationships, and –To represent large hierarchical structures Referencing provides a more flexible data model than embedding at the expense of issuing follow-up queries to resolve references

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 20 Data Use and Performance The following phenomena and mechanisms influence performance of operations on MongoDB databases: –Atomicity of writes, –Document Growth, –Sharding, –Indexes, and –Capped Collections We comment here all of them except sharding –Sharding is considered within MongoDB Architecture

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 21 Atomicity of Writes(*) In MongoDB, write operations are atomic at the document level –A single write operation affects just one document within a single collection An embedded data model combines all related data for an entity in a single document –A single write operation inserts or updates all entity data A normalized data model splits data across several documents (and possibly collections) –Inserting a single entity requires several write operations that are not atomic collectively However, embedding results in less flexible schemes –A single entry point to data, –Hard to modify applications

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 22 Document Growth (*) Some updates as: –Pushing elements to an array, or –Adding new fields increase a document’s size If the document’s size exceeds the allocated space, MongoDB relocates the document on disk Relocation takes longer than in place updating and may lead to space fragmentation To avoid relocation, referencing instead of embedding should be used

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 23 Indexes MongoDB automatically creates a unique index on the _id field Indexes on fields (other than _id ) that appear often in queries improve performance for common queries Indexes are built as BTrees (facilitating range queries) Adding an index has some negative performance impact for write operations –For collections with high write-to-read ratio, indexes are expensive since each insert must also update any indexes Collections with high read-to-write ratio often benefit from additional indexes

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 24 createIndex() or ensureIndex() db.collection.createIndex(keys, options) Parameters: –keys of the type document: For each field to index, a key-value pair with the field and the index order: 1 for ascending or -1 for descending –options of the type document (optional) The most important options: –unique of the type Boolean : The default value is false –name of the type string : If unspecified, MongoDB generates an index name db.collection.ensureIndex( {_id: 1, year: -1}, {unique: true} )

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 25 Capped Collection(*) Capped collections are fixed-size collections that support those high-throughput operations that insert and retrieve documents based on insertion order Capped collections work in a way similar to circular buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection

Advanced Database Design and Implementation 2015 MongoDB_Data Modeling 26 Summary MongoDB data model: document oriented –A database is a container for a number of document collections, –Documents of a collection may have but don’t have to have the same structure, –A document is a set of key-value pairs in JSON format, –There are no field constraints and no referential integrity constraints, –The unique constraint is supported via createIndex() method The main issue with data modeling is how to represent relationships between entities Embedded relationships Relationships by references