1
Non-traditional Databases List Of Questions
DBMAN 10 Non-traditional Databases List Of Questions
3
Non-Traditional = schema-less
Many implementations, each working very differently and serving a specific need
Common aim: less complex storage, more flexibility, faster access, server clustering
Usually also: arbitrary types of relations, or even relation-less structures
NoSQL = Not SQL = Not Only SQL
4
Comparison
Structure and type of data being kept: SQL/relational databases require a structure with defined attributes to hold the data, unlike NoSQL databases, which usually allow free-flow operations.
Querying: Regardless of their licences, relational databases all implement the SQL standard to some degree, so they can be queried using the Structured Query Language (SQL). NoSQL databases, on the other hand, each implement their own way of working with the data they manage.
5
Comparison
Scaling: Both solutions are easy to scale vertically (i.e. by increasing system resources). However, being more modern (and simpler) applications, NoSQL solutions usually offer much easier means to scale horizontally (i.e. by creating a cluster of multiple machines).
Reliability: When it comes to data reliability and guaranteed execution of transactions, SQL databases are still the better bet.
6
Comparison
Support: Relational database management systems have a decades-long history. They are extremely popular and it is very easy to find both free and paid support. If an issue arises, it is therefore much easier to solve than with the recently popular NoSQL databases, especially if the solution in question is complex in nature (e.g. MongoDB).
Complex data keeping and querying needs: By nature, relational databases are the go-to solution for complex querying and data keeping needs. They are much more efficient and excel in this domain.
7
NoSQL advantages – “BIG DATA” – VVVC
Velocity, Variety, Volume, Complexity
High data velocity: lots of data coming in very quickly, possibly from different locations.
Data variety: storage of data that is structured, semi-structured and unstructured.
Data volume: data that is many terabytes or petabytes in size.
Data complexity: data that is stored and managed in different locations or data centers.
8
Typical “BIG DATA” use cases
Flexible Data Models
  A NoSQL database is able to accept all types of data (structured, semi-structured, and unstructured) much more easily than a relational database, which relies on a predefined schema
  Faster operations on semi-structured / unstructured data
Analytics and Business Intelligence
  Ability to mine the data that is being collected
  “Real-time data warehousing”: no need for a middle layer, no GROUP BY and JOIN queries
9
Typical “BIG DATA” requirements – CAP
Modern transactional capabilities (vs ACID!)
Consistency: all nodes see the same data at the same time (not the same as the “C” in ACID)
Availability: a guarantee that every request receives a response, even if it is not up to date
Partition tolerance: the system continues to operate despite arbitrary partitioning due to network failures
P is a minimum for a distributed system: in case of a communication error (partition), we usually must choose between A and C
Some people say that CA without P = RDBMS, while NoSQL systems usually end up as CP or AP; the CAP theorem rules out C, A and P all holding during a partition
10
Typical “BIG DATA” requirements - PACELC
In case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C)
Else (E), when the system is running normally in the absence of partitions, one has to choose between latency (L) and consistency (C)
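The A-versus-C choice during a partition can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Replica` class, the `write` function and the "CP"/"AP" mode flag are hypothetical names invented for this example, and real systems use far more elaborate replication protocols.

```javascript
// Toy model of two replicas holding one value; all names are hypothetical.
class Replica {
  constructor(name) {
    this.name = name;
    this.value = null;
  }
}

// mode "CP": reject a write that cannot reach every replica (consistency first).
// mode "AP": accept the write on reachable replicas and let the others lag
//            (availability first).
function write(replicas, reachable, value, mode) {
  const targets = replicas.filter((r) => reachable.has(r.name));
  if (mode === "CP" && targets.length < replicas.length) {
    return { ok: false, reason: "partition: write rejected to stay consistent" };
  }
  for (const r of targets) r.value = value;
  return { ok: true, written: targets.map((r) => r.name) };
}

const a = new Replica("a");
const b = new Replica("b");

// Normal operation: both replicas are reachable, so the write lands everywhere.
write([a, b], new Set(["a", "b"]), "v1", "CP");

// Partition: only replica "a" is reachable.
const cpResult = write([a, b], new Set(["a"]), "v2", "CP"); // rejected
const apResult = write([a, b], new Set(["a"]), "v2", "AP"); // accepted, replicas diverge

console.log(cpResult.ok, apResult.ok, a.value, b.value);
```

In "AP" mode the two replicas disagree after the partition (one holds the new value, the other the old one), which is exactly the stale read that the availability definition allows.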
11
NoSQL types
Graph database
Wide column
Key-value
Document storage
LOTS of implementations, with extremely rapidly changing features and applications ...
Especially with document databases and the chaotic world of JS in the past 2-3 years: “How it feels to learn JavaScript in 2016”
12
NoSQL types: Graph database
Data is stored in graphs
OrientDB, Neo4j, etc.
Google: Linked data, Linked open data
RDF = Resource Description Framework
When to use:
  Handling complex relational information
  Modelling and handling classifications
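To make the "classification" use case concrete, here is a minimal sketch of a graph kept as adjacency lists and traversed breadth-first. The data and the `descendants` helper are hypothetical, and this does not reflect how any particular graph database stores or queries its data.

```javascript
// Hypothetical data: a tiny classification graph stored as adjacency lists.
const edges = new Map([
  ["animal", ["mammal", "bird"]],
  ["mammal", ["dog", "cat"]],
  ["bird", ["sparrow"]],
]);

// Breadth-first traversal: collect every node reachable from `start`.
function descendants(graph, start) {
  const seen = new Set();
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of graph.get(node) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return [...seen];
}

console.log(descendants(edges, "animal"));
// → [ 'mammal', 'bird', 'dog', 'cat', 'sparrow' ]
```

In a relational schema the same question ("everything below this node") typically needs a recursive query or repeated self-joins, which is why graph stores are suggested for this kind of data.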
13
NoSQL types: Column store / wide-column store
Sounds like the inverse of a standard (row-oriented) database
Very high performance and a highly scalable architecture
Apache Cassandra: often cited as the fastest
When to use:
  Keeping unstructured, non-volatile information
  Scaling
14
NoSQL types: Key-value store
Least complex, like Dictionary<string, string> in C#
Memcached vs MemcacheDB (cache vs storage)
Oracle NoSQL Database (eventually consistent)
When to use:
  Caching
  Queueing
  Distributing information / tasks
  Keeping live information
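The caching use case can be sketched with a plain in-memory map. This is a stand-in for illustration only: the `KeyValueCache` class and its `ttlMs`/`now` parameters are assumptions made for this example, not the API of Memcached or any other product.

```javascript
// Minimal in-memory key-value cache with optional expiry (illustrative only).
class KeyValueCache {
  constructor() {
    this.store = new Map(); // key -> { value, expiresAt }
  }

  // `now` is injectable so the example is deterministic.
  set(key, value, ttlMs = Infinity, now = Date.now()) {
    this.store.set(key, { value, expiresAt: now + ttlMs });
  }

  get(key, now = Date.now()) {
    const entry = this.store.get(key);
    if (entry === undefined) return undefined;
    if (now >= entry.expiresAt) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }
}

const cache = new KeyValueCache();
cache.set("session:42", "alice", 1000, 0); // stored at t=0, lives for 1000 ms

const hit = cache.get("session:42", 500);   // still valid at t=500
const miss = cache.get("session:42", 1500); // expired at t=1500

console.log(hit, miss); // → alice undefined
```

The whole interface is "put a value under a key, get it back by key", which is what makes this the least complex of the four types.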
15
Document Storage formats
HTML = document descriptor, bad for storage, not strict
XML, YAML = data descriptor, strict format
JSON = object descriptor, strict format
16
Storage formats
XML: many tools, fast and memory efficient, hard to read, hard to parse, XPath is official
YAML: few tools, 2x memory, easy to read, easy to parse
JSON: more and more tools, 2x memory, hard to read, easy to parse, JSONPath was long unofficial (standardized as RFC 9535 in 2024)
17
NoSQL types: Document database
Expands on the basic idea of key-value stores
“Documents” contain data (using a storage format!) and each document is assigned a unique key
Apache CouchDB, Lotus Notes
JSON: MongoDB
XML: BDB, Oracle, IBM DB2, SAP/Sybase, SQL Server, PostgreSQL
When to use:
  Nested information
  JSON: JavaScript/programming language friendly (MEAN = MongoDB, Express.js, Angular, Node.js)
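A minimal sketch of the "document under a unique key" idea, using a `Map` as a stand-in for a collection and JSON as the storage format. The order document and its field names are hypothetical.

```javascript
// Hypothetical order document: nested data lives inside one document
// instead of being spread over several tables.
const order = {
  _id: "order-1001",
  customer: { name: "Alice", city: "Budapest" },
  items: [
    { sku: "A1", qty: 2, price: 10 },
    { sku: "B7", qty: 1, price: 25 },
  ],
};

// A Map stands in for the collection: unique key -> serialized document.
const collection = new Map();
collection.set(order._id, JSON.stringify(order));

// Reading the document back yields the nested structure in one lookup.
const loaded = JSON.parse(collection.get("order-1001"));
const total = loaded.items.reduce((sum, item) => sum + item.qty * item.price, 0);

console.log(loaded.customer.city, total); // → Budapest 45
```

Unlike a plain key-value store, a real document database also understands the inside of the document, so it can index and query nested fields such as the customer's city.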
18
Document databases: XML databases
“XML-enabled” vs “native XML” databases
The “big four” support the XML type in CLOB fields
Typically, an XML-enabled database is best suited where the majority of the data is non-XML
For datasets where the majority of the data is XML, a native XML database is better suited
Native XML database (BDB, not really used any more...):
  Defines a (logical) model for an XML document
  Has an XML document as its fundamental unit of (logical) storage
  Is not required to have any particular underlying physical storage model
19
Document databases: JSON / MongoDB
JSON-like documents (called BSON: Binary JSON)
“Libbson expects that you are always working with UTF-8 encoded text”
Ad hoc queries that can include user-defined JavaScript functions
Indexing is similar to RDBMSes
Replication/load balancing: high availability with replica sets; one replica set = two or more copies of the data
VERY efficient storage and operations for HIGHLY flexible data
20
MongoDB vs SQL: terminology
SQL                            MongoDB
database                       database
table                          collection
row                            document or BSON document
column                         field
index                          index
table joins                    embedded documents and linking
primary key                    primary key (automatically set to the _id field)
aggregation (e.g. group by)    aggregation pipeline
21
MongoDB vs SQL
SQL: CREATE TABLE
MongoDB: implicitly created when inserting the first record
  db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
  (later, we can insert users with more attributes)
or:
  db.createCollection("users")
22
MongoDB vs SQL
SQL: ALTER TABLE ... ADD ...
MongoDB:
  db.users.update( { }, { $set: { join_date: new Date() } }, { multi: true } )
23
MongoDB vs SQL
SQL: ALTER TABLE ... DROP COLUMN ...
MongoDB:
  db.users.update( { }, { $unset: { join_date: "" } }, { multi: true } )
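Because there is no table-wide schema, "adding a column" or "dropping a column" really means touching every document. The effect of `$set` and `$unset` with `multi: true` can be approximated on a plain array as follows. This is a sketch of the semantics only, not of how MongoDB executes updates; `setAll` and `unsetAll` are hypothetical helper names and the date string is made up.

```javascript
// Plain objects standing in for the documents of a `users` collection.
const users = [
  { user_id: "abc123", age: 55, status: "A" },
  { user_id: "bcd001", age: 30, status: "D" },
];

// Rough equivalent of update( { }, { $set: fields }, { multi: true } ):
// add (or overwrite) the given fields on every document.
function setAll(docs, fields) {
  for (const doc of docs) Object.assign(doc, fields);
}

// Rough equivalent of update( { }, { $unset: { fieldName: "" } }, { multi: true } ):
// remove the field from every document.
function unsetAll(docs, fieldName) {
  for (const doc of docs) delete doc[fieldName];
}

setAll(users, { join_date: "2016-01-01" });
const afterSet = users.every((u) => "join_date" in u);

unsetAll(users, "join_date");
const afterUnset = users.some((u) => "join_date" in u);

console.log(afterSet, afterUnset); // → true false
```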
24
MongoDB vs SQL
SQL: INSERT INTO / UPDATE / DELETE
MongoDB:
  db.users.insert( { user_id: "abc123", age: 55, status: "A" } )   // INSERT INTO
  db.users.remove( { status: "D" } )   // DELETE
  db.users.update( { age: { $gt: 25 } }, { $set: { status: "C" } }, { multi: true } )   // UPDATE ... SET
  db.users.update( { status: "A" } , { $inc: { age: 3 } }, { multi: true } )   // UPDATE ... SET age = age + 3
25
MongoDB vs SQL
SQL: SELECT ... FROM ... WHERE
MongoDB: db.users.find()   // arguments: where + fields
  db.users.find().limit(5).skip(10)
  db.users.find( { }, { user_id: 1, status: 1 } )
  db.users.find( { status: "A" }, { user_id: 1, status: 1, _id: 0 } )
  db.users.find( { age: { $gt: 25, $lte: 50 } } )
  db.users.find( { user_id: /^bc/ } )
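For readers who think in array operations, the `find` calls can be read as filter-plus-projection over a list of documents. The sketch below mimics a few of them on a plain array; the sample data is invented, and this only mirrors the meaning of the queries, not MongoDB's execution or index use.

```javascript
// Plain objects standing in for the documents of a `users` collection.
const users = [
  { _id: 1, user_id: "abc123", age: 55, status: "A" },
  { _id: 2, user_id: "bcd001", age: 30, status: "A" },
  { _id: 3, user_id: "bce002", age: 22, status: "D" },
];

// find( { age: { $gt: 25, $lte: 50 } } ): both conditions must hold.
const midAge = users.filter((u) => u.age > 25 && u.age <= 50);

// find( { user_id: /^bc/ } ): regular expression match on a field.
const bcUsers = users.filter((u) => /^bc/.test(u.user_id));

// find( { status: "A" }, { user_id: 1, status: 1, _id: 0 } ):
// filter first, then keep only the listed fields (and drop _id).
const projected = users
  .filter((u) => u.status === "A")
  .map(({ user_id, status }) => ({ user_id, status }));

// find().limit(5).skip(10): skip is applied before limit,
// whatever the order of the chained calls.
const page = users.slice(10, 10 + 5);

console.log(midAge.length, bcUsers.length, projected, page.length);
```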
26
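MongoDB vs SQL: aggregation operators
SQL         MongoDB
WHERE       $match
GROUP BY    $group
HAVING      $match
SELECT      $project
ORDER BY    $sort
LIMIT       $limit
SUM()       $sum
COUNT()     $sum
join        No direct corresponding operator; use $unwind for somewhat similar functionality (since MongoDB 3.2, $lookup performs a left outer join)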
27
MongoDB vs SQL
SQL: SELECT COUNT(*) ...
MongoDB:
  db.users.count()
  or db.users.find().count()
SQL: SELECT SUM(price) ... / SELECT cust_id, SUM(price) ... GROUP BY cust_id ORDER BY ...
MongoDB:
  db.orders.aggregate( [ { $group: { _id: null, total: { $sum: "$price" } } } ] )
  db.orders.aggregate( [ { $group: { _id: "$cust_id", total: { $sum: "$price" } } }, { $sort: { total: 1 } } ] )
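The two aggregation pipelines correspond to a grand total and a per-customer total sorted ascending. A rough plain-JavaScript equivalent, with invented sample orders, is sketched here; it mirrors the result shape of `$group` and `$sort`, not the pipeline machinery itself.

```javascript
// Plain objects standing in for the documents of an `orders` collection.
const orders = [
  { cust_id: "A", price: 25 },
  { cust_id: "B", price: 10 },
  { cust_id: "A", price: 5 },
  { cust_id: "B", price: 40 },
];

// { $group: { _id: null, total: { $sum: "$price" } } }
// Grouping on null puts every document in one group.
const grandTotal = orders.reduce((sum, o) => sum + o.price, 0);

// { $group: { _id: "$cust_id", total: { $sum: "$price" } } }
const totals = new Map();
for (const o of orders) {
  totals.set(o.cust_id, (totals.get(o.cust_id) ?? 0) + o.price);
}

// { $sort: { total: 1 } }: ascending by the computed total.
const perCustomer = [...totals]
  .map(([cust_id, total]) => ({ _id: cust_id, total }))
  .sort((x, y) => x.total - y.total);

console.log(grandTotal, perCustomer);
// → 80 [ { _id: 'A', total: 30 }, { _id: 'B', total: 50 } ]
```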
28
MongoDB
Usually works well if bi-directional speed / reports are not that important
Practice has shown that a RamDisk + LOAD DATA INFILE + MyISAM can still be faster than thousands of insert commands on a MongoDB
(everything depends on the design: a badly designed db with a fast db engine is always SLOWER AND WORSE than a well-designed db with a slow db engine)
When using NoSQL, “jury rigging” is harder
Must be very careful about the configuration of the individual components
30
List of questions
Types of data models (hierarchical, tree, relational, OO)
Basic units of the RDBMS systems (table, column, record)
Types of relations between entities/tables in an RDBMS (1:1, 1:N, N:M, and the implementation in SQL tables)
Types of keys in RDBMS tables (PK, FK, Unique, Simple, Complex)
Elements of an ER diagram (entity, field, relation)
Purpose of normalization, anomalies
Normalization levels: base model .. BCNF
Verification of dependency preserving decomposition
Verification of lossless decomposition
31
List of questions
SQL: Types of table joins (inner, right outer, left outer, full outer)
SQL: object types (table, view, cursor, procedure, function, trigger, user, role, login)
SQL: list and purpose of sub-languages (DQL, DCL, DDL, DML), main commands, main dialects, main suffixes
SQL: Constraint types (PK, FK, UNIQUE, NOT NULL, CHECK, auto_increment/identity/sequence)
SQL: Purpose and usage of GROUP BY, GROUPING SETS, CUBE, ROLLUP
OLTP vs OLAP comparison: use cases, differences
Transaction management, ACID vs CAP vs PACELC
32
List of questions
Storage models/layers of an RDBMS
RAID levels (0, 1, 1+0, 0+1, 5)
Index properties (clustered/unclustered, dense/sparse, PK/FK/unique, simple/composite)
OOP vs RDBMS: class relations (association, aggregation, composition) vs. SQL relations (HAS-A/PART-OF)
OOP vs RDBMS: class inheritance vs. SQL tables (IS-A/INSTANCE-OF): single/class/concrete/generic
NoSQL advantages/use cases/VVVC
NoSQL database types (graph, column, key-value, document)
Document databases: XML vs JSON vs BSON, MongoDB principles (no syntax required!)
33
Never forget! “The basic requirement for data consistency is the convergence of the coherence”