
Discussion MySQL&Cassandra ZhangGang 2012/11/22. Optimize MySQL.


1 Discussion: MySQL & Cassandra. ZhangGang, 2012/11/22

2 Optimize MySQL

3 Indexes in MySQL: the dump includes the indexes.
–type_*_job has no index.
–in_*_job has two indexes.
–key_*_* has three indexes.

4 Optimize MySQL: innodb_file_per_table (can be set in my.cnf)
–Default: innodb_file_per_table = OFF. All tables and indexes are stored in one big file named ibdata1.
–With innodb_file_per_table = ON, each InnoDB table and its indexes are stored in their own file.
–Effect on query time:
Query                                before   now
Diskspace group by ProcessingType    94s      80s
CPUTime group by Site (08-10)        86s      73s
CPUTime group by Site (10-12)        97s      81s
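The before/now timings above can be turned into relative reductions with a few lines of arithmetic. This is only a sketch over the three numbers on the slide; the names `timings` and `reduction_percent` are made up for illustration.

```python
# Timings in seconds copied from the slide: (before, now).
timings = {
    "Diskspace group by ProcessingType": (94, 80),
    "CPUTime group by Site (08-10)": (86, 73),
    "CPUTime group by Site (10-12)": (97, 81),
}


def reduction_percent(before, now):
    """Relative reduction in query time, in percent of the old time."""
    return (before - now) / before * 100.0


reductions = {query: reduction_percent(b, n) for query, (b, n) in timings.items()}

for query, pct in reductions.items():
    print(f"{query}: {pct:.1f}% less time")
```

All three queries come out at roughly 15-16% less time, which is a modest but consistent gain for a one-line configuration change.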

5 Optimize MySQL: innodb_buffer_pool_size
–According to advice found on the web, 70-80% of memory is a safe bet.
–My computer has 2GB of memory, and I set innodb_buffer_pool_size=1GB. But when I run the script that communicates with MySQL, the computer becomes very slow and has to be restarted.
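For reference, the 70-80% rule can be written out as arithmetic for the 2GB machine. This is a rough sketch; `buffer_pool_range` is a hypothetical helper, and the comment about dedicated servers is an assumption about where the rule comes from, not something stated on the slide.

```python
GIB = 1024 ** 3


def buffer_pool_range(total_bytes, low=0.70, high=0.80):
    """Return the (low, high) buffer pool sizes suggested by the 70-80% rule."""
    return int(total_bytes * low), int(total_bytes * high)


total = 2 * GIB    # memory of the machine on the slide
chosen = 1 * GIB   # value actually set for innodb_buffer_pool_size

low, high = buffer_pool_range(total)
print(f"rule of thumb: {low / GIB:.2f} to {high / GIB:.2f} GiB")
print(f"chosen: {chosen / GIB:.2f} GiB = {chosen / total:.0%} of memory")

# Assumption: the rule is usually quoted for servers dedicated to MySQL.
# On a desktop that also runs the OS and the client script, even 50% can
# leave too little free memory.
```

The chosen 1GB is already below the rule's range, so the slowdown probably points at the rest of the system competing for the remaining memory rather than at the setting being too small.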

6 Learning Cassandra

7 RDBMS: relies on the ‘join’ operation, so the data can be highly normalized with little redundancy. NoSQL: in contrast with the RDBMS, to get better performance and high scalability it gets rid of the ‘join’ operation, which means denormalizing the data and maintaining multiple copies of it (increasing the redundancy). This is what Cassandra does.
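The trade-off can be sketched with plain Python dictionaries. The `sites` and `jobs` data and the function name are invented for illustration; nothing here talks to a real database.

```python
# Normalized (RDBMS style): the site name is stored once, and reading a
# job together with its site name needs a join on site_id.
sites = {1: {"name": "SiteA"}, 2: {"name": "SiteB"}}
jobs = [
    {"job_id": 100, "site_id": 1, "cpu_time": 30},
    {"job_id": 101, "site_id": 2, "cpu_time": 45},
    {"job_id": 102, "site_id": 1, "cpu_time": 12},
]


def jobs_with_site_joined(jobs, sites):
    """Emulate 'jobs JOIN sites' by looking up each job's site."""
    return [
        {
            "job_id": job["job_id"],
            "site": sites[job["site_id"]]["name"],
            "cpu_time": job["cpu_time"],
        }
        for job in jobs
    ]


# Denormalized (Cassandra style): the site name is copied into every job
# row, so a read needs no join, at the price of redundant data.
jobs_denormalized = [
    {"job_id": 100, "site": "SiteA", "cpu_time": 30},
    {"job_id": 101, "site": "SiteB", "cpu_time": 45},
    {"job_id": 102, "site": "SiteA", "cpu_time": 12},
]

print(jobs_with_site_joined(jobs, sites) == jobs_denormalized)
```

Both forms answer the same question; the denormalized one does the lookup work at write time instead of at read time.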

8 Learning Cassandra: column-oriented or row-oriented?
–Cassandra is based on Dynamo and BigTable, so it is not incorrect to say it is column-oriented. But each row has a unique key, which makes its data accessible, so it may be more helpful to think of it as an indexed, row-oriented store.
–Cassandra stores data in a multidimensional hash table. That means you don’t have to decide ahead of time precisely what your data structure must look like, or what fields your records will need.
–In Cassandra, we should think of our queries first, and then provide the data that answers them.

9 Learning Cassandra: Installing Cassandra
–Compared with Hadoop HBase, installing Cassandra is simple. Just download the source code, set JAVA_HOME correctly, and run the command “ant”; Cassandra is then installed.
–Start the Cassandra server: >> bin/cassandra -f
–Use the command line interface: >> bin/cassandra-cli

10 Learning Cassandra: The Cassandra Data Model
–Cassandra also has concepts like row, column family, and column, but their meanings are different.
–A column is a name/value pair -----> cell
–A column family is a container for rows that have similar, but not identical, column sets -----> table
–A keyspace is the outermost container for data in Cassandra -----> database

11 Learning Cassandra –If we wanted to create a group of related columns, Cassandra allows us to do this with something called a super column family. A super column family can be thought of as a map of maps.
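The "map of maps" picture can be sketched as a nested Python dictionary. The row key, super column names, and values below are made up; a real super column family lives in Cassandra, not in a dict.

```python
# row key -> super column name -> sub-column name -> value
super_cf = {
    "job_100": {
        "resources": {"cpu_time": "30", "disk_space": "12"},
        "placement": {"site": "SiteA"},
    },
}


def get_subcolumn(cf, row_key, super_column, sub_column):
    """Read one value by walking the nested maps."""
    return cf[row_key][super_column][sub_column]


def get_super_column(cf, row_key, super_column):
    """Read a whole group of related columns at once."""
    return cf[row_key][super_column]


print(get_subcolumn(super_cf, "job_100", "resources", "cpu_time"))
print(get_super_column(super_cf, "job_100", "placement"))
```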

12 Learning Cassandra: Keyspace ----- has a name and a set of attributes that define keyspace-wide behavior. There are some basic attributes that we can set per keyspace:
–Replication factor: refers to the number of nodes that will act as copies of each row of data.
–Replica placement strategy: refers to how the replicas will be placed in the cluster (SimpleStrategy, OldNetworkTopologyStrategy, NetworkTopologyStrategy).
–Column families: a keyspace is a container for a list of one or more column families. Column families represent the structure of our data.
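How replication factor and placement strategy interact can be sketched for the simplest case. The code below imitates the SimpleStrategy idea (first replica on the node that owns the key's token, further replicas on the next nodes clockwise on the ring). The four-node ring, the node names, and the MD5-based `token` function are assumptions made for illustration, not Cassandra's actual code.

```python
import hashlib
from bisect import bisect_left


def token(row_key):
    """Hypothetical token function: the MD5 digest of the key as an integer."""
    return int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)


def simple_strategy_replicas(ring, row_key, replication_factor):
    """Pick replica nodes by walking the ring clockwise.

    ring is a list of (token, node_name) pairs sorted by token.
    """
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token, wrapping around to 0.
    start = bisect_left(tokens, token(row_key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Four made-up nodes spread evenly over the 128-bit token space.
ring = [(i * 2 ** 128 // 4, f"node{i + 1}") for i in range(4)]

replicas = simple_strategy_replicas(ring, "job_100", replication_factor=3)
print(replicas)
```

The topology-aware strategies follow the same ring walk but also take racks and data centers into account, which this sketch leaves out.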

13 Learning Cassandra: Column Families ----- a column family is a container for an ordered collection of rows. It is like a table in an RDBMS, but it is not one.
–It’s schema-free because although the column families are defined, the columns are not.
–A column family has two attributes: a name and a comparator. The comparator value indicates how columns will be sorted when they are returned to us in a query.
–A Cassandra column family is similar to a four-dimensional hash: [Keyspace][ColumnFamily][Key][Column]
–If we define the column family as super, it becomes a five-dimensional hash: [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn]
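The four-dimensional hash [Keyspace][ColumnFamily][Key][Column] can be imitated with nested dictionaries. This is a toy model with invented names ("Monitoring", "Jobs"); it also shows the schema-free point, since the two rows carry different columns.

```python
store = {}


def insert(store, keyspace, column_family, key, column, value):
    """Set store[keyspace][column_family][key][column], creating levels as needed."""
    row = store.setdefault(keyspace, {}).setdefault(column_family, {}).setdefault(key, {})
    row[column] = value


# Two rows in the same column family with different column sets.
insert(store, "Monitoring", "Jobs", "job_100", "site", "SiteA")
insert(store, "Monitoring", "Jobs", "job_100", "cpu_time", "30")
insert(store, "Monitoring", "Jobs", "job_101", "site", "SiteB")
insert(store, "Monitoring", "Jobs", "job_101", "status", "done")

value = store["Monitoring"]["Jobs"]["job_100"]["cpu_time"]
print(value)
print(sorted(store["Monitoring"]["Jobs"]["job_101"]))
```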

14 Learning Cassandra: Column Family Options ----- There are a few additional parameters that we can define for each column family:
–keys_cached
–rows_cached
–read_repair_chance
–preload_row_cache
–…

15 Learning Cassandra: Column Sorting ----- In Cassandra, we specify how column names will be compared for sort order when results are returned to the client. Here are some choices:
–AsciiType
–BytesType
–LongType
–UTF8Type
–…
Sorting is a design decision.
–In an RDBMS we can use ORDER BY to change the order at query time. In Cassandra, we can’t change the order after we have specified it when creating the column family.
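Why the comparator matters can be seen by sorting the same column names in two ways. This sketch only mimics the effect: plain string sorting stands in for a text comparator such as UTF8Type, and `key=int` stands in for LongType (which in Cassandra compares 8-byte encoded values, not decimal strings).

```python
column_names = ["9", "10", "100", "2"]

# Text comparison: names are compared character by character.
as_text = sorted(column_names)

# Numeric comparison: names are compared by their integer value.
as_long = sorted(column_names, key=int)

print(as_text)
print(as_long)
```

With text ordering, "10" and "100" sort before "2", so a range query over numeric-looking column names returns them in an order that is easy to misread. Because the order is fixed when the column family is created, the choice has to match the queries in advance.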

16 Learning Cassandra
Secondary Indexes:
–Secondary indexes are supported since Cassandra 0.7. This means we can create indexes on column values.
Denormalization:
–Normalization is not an advantage when working with Cassandra, because it performs best when the data model is denormalized.
–Instead of modeling the data first and then writing queries, with Cassandra we model the queries and let the data be organized around them. Think of the most common query paths the application will use, and then create the column families that we need to support them.
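The idea of an index on column values can be sketched as a reverse lookup from value to row keys. This illustrates the concept only and is not how Cassandra implements secondary indexes; the `rows` data and `build_index` are invented.

```python
from collections import defaultdict

# row key -> columns
rows = {
    "job_100": {"site": "SiteA", "status": "done"},
    "job_101": {"site": "SiteB", "status": "running"},
    "job_102": {"site": "SiteA"},
}


def build_index(rows, column):
    """Map each value of the given column to the set of row keys holding it."""
    index = defaultdict(set)
    for row_key, columns in rows.items():
        if column in columns:  # rows without the column are simply not indexed
            index[columns[column]].add(row_key)
    return index


site_index = build_index(rows, "site")
print(sorted(site_index["SiteA"]))
```

Without such an index, "all jobs at SiteA" would require scanning every row; with it, the query becomes a single lookup by value.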

17 Learning Cassandra: Design patterns
–Materialized View: write our data to a second column family that is created specifically to answer a particular query.
–Valueless Column: the column name itself can carry the useful information, so the value is left empty. Often used in materialized views.
–Aggregate Key: when using the Valueless Column pattern, we may also need the Aggregate Key pattern. The key looks like xxx:xxx (with a colon as the separator).
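The three patterns fit together naturally, as this small sketch tries to show. It builds a second "column family" (a dict) keyed by an aggregate key `site:date`, whose column names are job ids with empty values. All names and data are hypothetical.

```python
# Primary column family: row key -> columns.
jobs = {
    "job_100": {"site": "SiteA", "date": "2012-11-20"},
    "job_101": {"site": "SiteB", "date": "2012-11-20"},
    "job_102": {"site": "SiteA", "date": "2012-11-20"},
    "job_103": {"site": "SiteA", "date": "2012-11-21"},
}


def aggregate_key(*parts):
    """Aggregate Key pattern: join the parts with a colon."""
    return ":".join(parts)


def build_view(jobs):
    """Materialized View: a second structure shaped for one query,
    'which jobs ran at site X on date Y?'."""
    view = {}
    for job_id, columns in jobs.items():
        row_key = aggregate_key(columns["site"], columns["date"])
        # Valueless Column: the column name (job id) is the data.
        view.setdefault(row_key, {})[job_id] = ""
    return view


jobs_by_site_and_date = build_view(jobs)
print(sorted(jobs_by_site_and_date["SiteA:2012-11-20"]))
```

In a real application the view would be written at the same time as the primary row, so the two column families stay in step.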

18 Learning Cassandra: API & Python library
–There is a client generation layer, provided by the Thrift API and the Avro project.
–There are also high-level Cassandra clients for different languages. For Python there is a library named pycassa, which makes it easy to communicate with Cassandra from Python. I am now getting familiar with it.
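A first pycassa round trip might look roughly like the sketch below. It assumes pycassa is installed and a Cassandra node is listening on localhost:9160 with the keyspace and column family already created; the names passed in are placeholders. `InMemoryColumnFamily` is a made-up stand-in with the same `insert`/`get` shape, so the example can run without a server.

```python
def open_column_family(keyspace, column_family, servers=("localhost:9160",)):
    """Connect through pycassa (imported lazily so the rest runs without it)."""
    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    pool = ConnectionPool(keyspace, list(servers))
    return ColumnFamily(pool, column_family)


def save_and_load(cf, row_key, columns):
    """Write a row, then read it back by key."""
    cf.insert(row_key, columns)
    return cf.get(row_key)


class InMemoryColumnFamily:
    """Hypothetical stand-in for a column family, for trying the code offline."""

    def __init__(self):
        self._rows = {}

    def insert(self, row_key, columns):
        self._rows.setdefault(row_key, {}).update(columns)

    def get(self, row_key):
        return dict(self._rows[row_key])


row = save_and_load(InMemoryColumnFamily(), "job_100", {"site": "SiteA"})
print(row)
```

Against a running server, the stand-in would be replaced by something like `open_column_family("Keyspace1", "ColumnFamily1")`, where both names are placeholders for whatever was defined in the cluster.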

19 Thanks. Now discussing…

