COLUMN-BASED DBS BigTable, HBase, SimpleDB, and Cassandra.


1 COLUMN-BASED DBS BigTable, HBase, SimpleDB, and Cassandra

2 But first, the third assignment. This is due on Monday the 18th, by the beginning of class. As with the first assignment, contact the grader when you are done. Build a Neo4j database with the Neo4j web GUI (localhost:7474) and Cypher and/or Gremlin. Note that the Console tab gives access to both the documentation and Gremlin. You can use either Cypher or Gremlin (or both) to do your assignment.

3 3rd assignment, continued. Your Neo4j database should model customer sites and service personnel. Use at least 15 sites and 6 personnel. Each site is a node, and each service person is a node. As calls come in, a property describing the nature of the problem is created for the given site, and a person is assigned to that site (a relationship is made). Each site has a property that specifies the nature of its problem. Each person has a property that specifies the sorts of problems he/she can solve.

4 3rd assignment, continued. Support the following operations: creating a site; creating a service person; assigning a problem property to a site (a site can have many of these); assigning a specialty to a service person (a person can have many of these); assigning a person to a site; removing a problem property and the relationship that corresponds to it; removing a site; removing a service person; anything you want to add…

5 Column-based DBs. BigTable was the first notable column-based DB. There is no fixed schema. Tables are sparse, i.e., empty columns are not stored. Groups (or families) of columns are stored together.

6 Basic concepts. The first column is a key. The column structure comes next: groups of columns. We can select a whole group or a given column. The idea is that the columns in a group are often accessed together. Generally, new columns can be added to a row at run time, but new families might require going offline.
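
The layout on this slide can be sketched as nested maps. This is only a toy illustration in Python, not how BigTable or HBase store data; the class name `SparseTable`, its methods, and the sample rows are all hypothetical.

```python
class SparseTable:
    """Toy wide-column table: row key -> family -> column -> value."""

    def __init__(self, families):
        # Families are fixed when the table is created; columns are not.
        self.families = set(families)
        self.rows = {}

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Only columns that are actually written take up space.
        self.rows.setdefault(row_key, {}).setdefault(family, {})[column] = value

    def get_family(self, row_key, family):
        """Return every column of one family for a row (the whole group)."""
        return dict(self.rows.get(row_key, {}).get(family, {}))

    def get_column(self, row_key, family, column):
        """Return one column value, or None if this row does not have it."""
        return self.rows.get(row_key, {}).get(family, {}).get(column)


table = SparseTable(families=["contact", "orders"])
table.put("alice", "contact", "email", "alice@example.com")
table.put("alice", "contact", "phone", "555-0100")
table.put("bob", "contact", "email", "bob@example.com")
# A new column can be added to one row at run time.
table.put("bob", "orders", "2013-03-01", "book")
```

Writing to a family that was not declared raises an error here, which mirrors the point that adding a family is a heavier operation than adding a column.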

7 Cassandra: columns and rows. The column is the basic unit of data. A column is a name-value pair; the name is a key and the value is atomic. Each pair has a timestamp, which is used to manage update conflicts and old data. A row is a collection of columns associated with a row key. This is a larger-grained key: it identifies a row, not a column. A collection of similar rows is a column family.
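
A minimal sketch of how a per-column timestamp can resolve update conflicts, assuming a simple "newest timestamp wins" rule. The names `Column` and `apply_write` are hypothetical, and on a tie this sketch keeps the existing value, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str        # the column key
    value: str       # atomic value
    timestamp: int   # supplied by the writer


def apply_write(row, column):
    """Store the column unless the row already holds a newer version."""
    current = row.get(column.name)
    if current is None or column.timestamp > current.timestamp:
        row[column.name] = column
    return row


# A column family maps row keys to rows; a row maps column names to columns.
column_family = {}
row = column_family.setdefault("user:42", {})
apply_write(row, Column("city", "Boulder", timestamp=100))
apply_write(row, Column("city", "Denver", timestamp=200))
# A write that arrives late but carries an older timestamp is ignored.
apply_write(row, Column("city", "Aspen", timestamp=150))
```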

8 Cassandra: standard and super columns, and keyspaces. If the columns in a family are simple, it is a standard column family. The rows in a column family do not have to have the same structure: you can add columns to one row without having to add them to the other rows in the family. A super column is a pair consisting of a name and a value, where the value is another map of columns. Standard and super column families are kept in keyspaces; a keyspace is essentially a database.
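
The nesting described on this slide can be pictured with plain dictionaries. This is a sketch only: the family names `Users` and `Addresses`, the sample data, and the helper `is_super` are hypothetical, and timestamps are left out for brevity.

```python
keyspace = {
    # Standard column family: row key -> {column name: value}
    "Users": {
        "alice": {"email": "alice@example.com", "city": "Boulder"},
        "bob": {"email": "bob@example.com"},  # rows need not share columns
    },
    # Super column family: row key -> {super column name: {column name: value}}
    "Addresses": {
        "alice": {
            "home": {"street": "1 Main St", "zip": "80301"},
            "work": {"street": "9 Pearl St"},
        },
    },
}

# Add a column to one row without touching the other rows in the family.
keyspace["Users"]["bob"]["phone"] = "555-0100"


def is_super(column_family):
    """Treat a family as 'super' if every column value is itself a map of columns."""
    return all(
        isinstance(value, dict)
        for row in column_family.values()
        for value in row.values()
    )
```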

9 Cassandra: updates and reads. Updates: the commit log is written first, then the update goes to an in-memory store called the memtable; at that point the write has succeeded. Writes are batched in memory and later written to on-disk structures called SSTables. Variable consistency: level 1 is the default for reads, so we get the first replica to respond even if it is stale. A read repair then brings stale replicas up to date, so subsequent reads will get the newest value. This is good for high read throughput.
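
The update path on this slide can be sketched for a single node as follows. It is a toy model under stated assumptions: the class `Node`, the `memtable_limit` flush trigger, and the tuple-based "SSTable" are hypothetical stand-ins, and real details such as commit log recycling and compaction are not modeled.

```python
class Node:
    """Toy single-node write path: commit log -> memtable -> SSTable."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in-memory store for recent writes
        self.sstables = []     # immutable sorted snapshots, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()
        # The write counts as successful once the log and memtable are updated.
        return True

    def flush(self):
        """Write the batched memtable out as one sorted, immutable table."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newer tables shadow older ones, so search from the newest.
        for table in reversed(self.sstables):
            for stored_key, value in table:
                if stored_key == key:
                    return value
        return None


node = Node(memtable_limit=2)
node.write("a", 1)
node.write("b", 2)   # reaches the limit, so the memtable is flushed
node.write("a", 3)   # newer value lives in the memtable only
```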

10 Cassandra: writes. Level 1 means the write goes to a commit log and is then confirmed to the user; some writes might be lost if they are not propagated to other replicas. Quorum consistency: for a read, a majority of replicas must respond, and the value with the newest timestamp is returned; replicas without the most recent version are brought up to date by a read repair. A write has to be propagated to a majority of replicas before it is successful and the client is notified.
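
A minimal sketch of quorum reads and writes as described on this slide. The functions `quorum`, `quorum_read`, and `quorum_write` are hypothetical, each replica is modeled as a dictionary of `key -> (value, timestamp)`, and the sketch assumes that the first replicas in the list are the ones that respond.

```python
def quorum(replica_count):
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def quorum_read(replicas, key):
    """Return the newest value among a majority, repairing stale responders."""
    responders = replicas[:quorum(len(replicas))]  # assumption: these respond first
    answers = [replica[key] for replica in responders if key in replica]
    if not answers:
        return None
    newest = max(answers, key=lambda pair: pair[1])  # compare timestamps
    for replica in responders:
        if replica.get(key) != newest:
            replica[key] = newest  # read repair
    return newest[0]


def quorum_write(replicas, key, value, timestamp, reachable):
    """Write to the reachable replicas; succeed only if a majority took it."""
    for index in reachable:
        replicas[index][key] = (value, timestamp)
    return len(reachable) >= quorum(len(replicas))


replicas = [{"k": ("old", 1)}, {"k": ("new", 2)}, {"k": ("old", 1)}]
value = quorum_read(replicas, "k")
```

In this run the first replica is repaired as a side effect of the read, while the third replica, which was not asked, stays stale until it takes part in a later read or write.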

11 Cassandra writes, continued. Consistency level All: all replica nodes must respond to a read or write. This is very sensitive to nodes being down. Notes: a single application can use varying levels of consistency. Cassandra uses a distributed cluster model; no node in a cluster is a master.
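
The sensitivity of level All to node failures can be made concrete with a little arithmetic. This is a sketch: the functions `required_responses` and `can_serve` are hypothetical helpers, and the replica count of 3 is an assumed example.

```python
def required_responses(level, replica_count):
    """How many replicas must respond for a request at the given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replica_count // 2 + 1
    if level == "ALL":
        return replica_count
    raise ValueError(f"unknown consistency level: {level}")


def can_serve(level, replica_count, replicas_up):
    """True if enough replicas are up to satisfy the level."""
    return replicas_up >= required_responses(level, replica_count)


# With 3 replicas and one of them down, only ALL becomes unavailable.
availability = {
    level: can_serve(level, replica_count=3, replicas_up=2)
    for level in ("ONE", "QUORUM", "ALL")
}
```

Because the level is chosen per request, one application can read at ONE for speed and write at QUORUM or ALL where it needs stronger guarantees.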

12 Cassandra: transactions. We cannot perform a series of reads and writes and then decide whether to abort. Writes are atomic at the row level, so a column insertion or update is a single write that succeeds or fails. There are apparently third-party transaction libraries that can be used to coordinate reads and writes and create true atomic transactions.

13 Cassandra: query language. First, set your keyspace. The query language has basic Get, Set, and Delete operations: create a column family, set a column value, get a column value or values, delete a column family, delete a column. There are also SQL-like commands, including SQL-like set queries. We can create indices on both row keys and column keys.

14 Applications of Cassandra. Content management systems and blogging systems.

15 Installing Cassandra. Go to: http://cassandra.apache.org/download/ Download and uncompress the archive. Look at: http://wiki.apache.org/cassandra/GettingStarted Go to the cassandra folder and run bin/cassandra -f. On my Mac, I needed to use sudo. I also had to create the cassandra folders listed in the GettingStarted instructions. Try running bin/cassandra-cli (the command-line interface).

16 Or to get it with a GUI. Go to: http://blog.shelan.org/2012/06/cassandra-gui-20-making-things-little.html Run wso2server.sh (or wso2server.bat). Go to: https://localhost:9443 Log in at https://your-ip-address:9443/services (NOT localhost).

17 Another choice. Go to: http://www.datastax.com/resources/articles/getting-started-with-apache-cassandra Install it and run it. Go to: http://localhost:8888/opscenter/index.html To explore the example db: http://localhost:8888/opscenter/online_help/docs/explorer/index.html

18 Note on Windows 7. You might have to set your JAVA_HOME variable, usually to c:\Progra~1\Java\jdk1.7.0 (or similar).

19 PostgreSQL: install. Go to: http://bitnami.org/stacks Install WAPP (Windows) or MAPP (Mac). Start up the web server. Start up PostgreSQL. Go to: http://127.0.0.1/phppgadmin/

