open source, advanced key-value store, data structure server


1 open source, advanced key-value store, data structure server
REmote DIctionary Server: an open source, advanced key-value store, often described as a data structure server.
Web resources and books covering Redis:
- The Little Redis Book by Karl Seguin, a free and concise book that will get you started with Redis.
- Redis Cookbook (O'Reilly Media, 2011).
Redis at a glance:
- API: clients for many languages
- Written in: C
- Concurrency: in memory, with asynchronous saves to disk after a defined time. Append-only mode is available, with different fsync policies.
- Replication: master/slave
- Misc: also lists, sets, sorted sets, hashes, queues

2 Hello Redis
Install
$ wget
$ tar xzf redis tar.gz
$ cd redis-2.4.8
$ make
Start Redis server & make connection
redis-2.4.8]$ src/redis-server redis.conf
[16849] 04 Mar 02:03:59 * Server started, Redis version 2.4.8
[16849] 04 Mar 02:03:59 * The server is now ready to accept connections on port 6379
Connect server
$ src/redis-cli
redis > set player:666:name binzhang
OK
redis > get player:666:name
"binzhang"

3 Learn More : data structure server
Introduction to Redis Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets. You can run atomic operations on these types, like appending to a string; incrementing the value in a hash; pushing to a list; computing set intersection, union and difference; or getting the member with highest ranking in a sorted set. In order to achieve its outstanding performance, Redis works with an in-memory dataset. Depending on your use case, you can persist it either by dumping the dataset to disk every once in a while, or by appending each command to a log. Redis also supports trivial-to-setup master-slave replication, with very fast non-blocking first synchronization, auto-reconnection on net split and so forth. Other features include a simple check-and-set mechanism, pub/sub and configuration settings to make Redis behave like a cache. You can use Redis from most programming languages out there. Redis is written in ANSI C and works in most POSIX systems like Linux, *BSD, OS X and Solaris without external dependencies. There is no official support for Windows builds, although you may have some options.

4 Agenda
- Redis Manifesto
- Data structures: strings, hashes, lists, sets and sorted sets
- Leveraging Redis
- Redis admin and maintenance
- The architecture of Redis
- Cases

5 Redis Manifesto http://antirez.com/post/redis-manifesto.html
- Redis is a DSL (Domain Specific Language) that manipulates abstract data types and is implemented as a TCP daemon. Keys are binary-safe strings and values are different kinds of abstract data types.
- Redis has a persistence option, but memory storage is #1.
- The Redis API is a direct consequence of fundamental data structures.
- Code is like a poem.
- We believe designing systems is a fight against complexity. Most of the time the best way to fight complexity is by not creating it at all.
- The Redis API has two levels: 1) a subset of the API that fits naturally into a distributed version of Redis and 2) a more complex API that supports multi-key operations.
- We optimize for joy. When there is no longer joy in writing code, the best thing to do is stop.

6 Redis Manifesto : where is Redis?
Redis is extremely fast, making it perfectly suited for applications that are write-heavy, data that changes often, and data that naturally fits one of Redis’s data structures (for instance, analytics data). A scenario where you probably shouldn’t use Redis is if you have a very large dataset of which only a small part is “hot” (accessed often) or a case where your dataset doesn’t fit in memory.

7 Who is using Redis
Taobao, Sina Weibo, and many others:
Superfeedr, Vidiowiki, Wish Internet Consulting, Ruby Minds, Boxcar, Zoombu, Dark Curse, OKNOtizie, Moodstocks (uses Redis as its main database by means of Ohm), Favstar, Heywatch, Sharpcloud, Wooga (for the games "Happy Hospital" and "Monster World"), Engage, PoraOra, Leatherbound, AuthorityLabs, Fotolog, TheMatchFixer, Check-Host, ShopSquad, localshow.tv, PennyAce, Nasza Klasa, Forrst, Surfingbird

8 Agenda
- Redis Manifesto: simple, fast
- Data structures: strings, hashes, lists, sets and sorted sets
- Leveraging Redis
- The architecture of Redis
- Redis admin and maintenance
- Cases

9 Data structure: Key-Value Data Store
- Keys are strings that identify pieces of data (values).
- Values are arbitrary byte arrays that Redis doesn't care about.
- Redis exposes five specialized data structures: strings, hashes, lists, sets and sorted sets.
- Pub/Sub
- Querying with Redis
Together these make Redis fast and easy to use, but not suitable for every scenario.

10 Data structure : KEY value
Before we dive into the specific data types, here are a few things to keep in mind when designing the key structure that holds your data.
Redis keys are binary safe: you can use any binary sequence as a key, from a string like "foo" to the content of a JPEG file. The empty string is also a valid key.
A few rules about keys:
- Use separators to define a namespace with a semantic value for your business. An example is cache:project:319:tasks, where the colon acts as a namespace separator.
- Very long keys are a bad idea. A key of 1024 bytes costs memory, and looking it up in the dataset may require several costly key comparisons.
- Very short keys are often a bad idea too. There is little point in writing "u:1000:pwd" if you can write "user:1000:password" instead: the latter is more readable, and the added space is small compared to the space used by the key object itself and the value object. Short keys do consume a bit less memory, but there are no big performance improvements for extremely small keys.
- Try to stick with a schema. For instance "object-type:id:field" can be a nice idea, as in "user:1000:password". Dots work well for multi-word fields, as in "comment:1234:reply.to".
- Keys should not contain whitespace: versions of Redis prior to 1.2 had trouble with this, and even now it's not guaranteed that any edge-case bugs have been ironed out.
In short, design keys that combine readability (to help you) with a regular, reasonable size (to help Redis).
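The key rules above can be captured in a small helper. This is a minimal sketch in Python, assuming the "object-type:id:field" schema; the function name make_key is hypothetical and not part of any Redis client.

```python
def make_key(object_type: str, object_id: int, field: str) -> str:
    """Build a key following the "object-type:id:field" schema (illustrative helper)."""
    parts = (object_type, str(object_id), field)
    for part in parts:
        # Reject empty parts, embedded separators and whitespace,
        # since they would make the namespace ambiguous.
        if not part or ":" in part or any(ch.isspace() for ch in part):
            raise ValueError(f"invalid key component: {part!r}")
    return ":".join(parts)


print(make_key("user", 1000, "password"))     # user:1000:password
print(make_key("comment", 1234, "reply.to"))  # comment:1234:reply.to
```

Centralizing key construction in one place keeps the schema consistent across an application, which is the point of the "stick with a schema" rule.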

11 Data structure : KEY value
Redis supports different kinds of values:
- Binary-safe strings.
- Lists of binary-safe strings.
- Hash maps of strings.
- Sets of binary-safe strings, which are collections of unique unsorted elements.
- Sorted sets, similar to sets but where every element is associated with a floating-point score. The elements are taken sorted by score.
Binary-safe means that values are essentially byte strings and can contain any byte (including the null byte). This is possible because the Redis protocol uses Pascal-style strings, prefixing any string with its length in bytes.
Pub/sub channels are a new addition to Redis and are semantically different from the other types, existing in their own namespace (they shouldn't really be considered a type of value that gets assigned to a key).

12 Data structure : strings, hashes, lists, sets , sorted sets.
Strings: the simplest and most basic data type (max 512 MB in length).
Redis strings are binary safe, which means that a Redis string can contain any kind of data, for instance a JPEG image or a serialized Ruby object.
redis > set users:leto "{name: leto, planet: dune, likes: [spice]}"
OK
redis > setnx users:leto "{name: leto, planet: dune, likes: [spice]}"
(integer) 0
redis > strlen users:leto
(integer) 42
redis > getrange users:leto 27 40
"likes: [spice]"
redis > append users:leto " OVER 9000!!"
(integer) 54
redis > get users:leto
"{name: leto, planet: dune, likes: [spice]} OVER 9000!!"
Earlier we learnt that Redis doesn't care about your values. Most of the time that's true. However, a few string commands are specific to some types or structures of values. As a vague example, the append and getrange commands above could be useful in some custom space-efficient serialization. As a more concrete example, see the incr, incrby, decr and decrby commands on the next page.
String commands:
- APPEND key value: Append a value to a key
- DECR / DECRBY key: Decrement the integer value of a key by one or by the given number
- GET key: Get the value of a key
- GETRANGE key start end: Get a substring of the string stored at a key
- GETSET key value: Set the string value of a key and return its old value
- INCR / INCRBY key: Increment the integer value of a key by one or by the given number
- MGET key [key ...]: Get the values of all the given keys
- MSET key value [key value ...]: Set multiple keys to multiple values
- SET key value: Set the string value of a key
- SETEX key seconds value: Set the value and expiration of a key
- SETRANGE key offset value: Overwrite part of a string at key starting at the specified offset
- STRLEN key: Get the length of the value stored in a key
The C structure sdshdr declared in sds.h represents a Redis string:
struct sdshdr { long len; long free; char buf[]; };

13 Data structure : strings, hashes, lists, sets , sorted sets.
If we store a counter in a key, we can use commands such as INCR (or INCRBY) and DECR (or DECRBY) to increment or decrement its value. To store page visit data, we could use a key of the form "visits:pageid:totals".
redis > SET visits:2:totals
OK
redis > get visits:2:totals
" "
redis > INCR visits:635:totals
(integer) 1
redis > INCR visits:2:totals
(integer)
redis > get visits:2:totals
" "
redis > strlen visits:2:totals
(integer) 7
redis > incr users:leto
(error) ERR value is not an integer or out of range
You can do a number of interesting things using strings in Redis. For instance you can:
- Use strings as atomic counters using commands in the INCR family: INCR, DECR, INCRBY.
- Append to strings with the APPEND command.
- Use strings as random access vectors with GETRANGE and SETRANGE.
- Encode a lot of data in little space, or create a Redis-backed Bloom filter using GETBIT and SETBIT.
INCR is atomic. This means that even multiple clients issuing INCR against the same key will never run into a race condition. For instance, it can never happen that client 1 reads "10", client 2 reads "10" at the same time, both increment to 11, and both set the new value to 11. The final value will always be 12, because the read-increment-set operation is performed while no other client is executing a command.
As you can imagine, Redis strings are great for analytics. Try incrementing users:leto (a non-integer value) and see what happens (you should get an error).
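To make the atomicity argument concrete, here is a toy in-memory model in Python. It is a sketch under stated assumptions, not a Redis client: the class CounterStore and its methods are hypothetical, and a lock stands in for the way Redis executes one command at a time.

```python
import threading


class CounterStore:
    """Toy in-memory stand-in for INCR on string keys (not a Redis client)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = str(value)

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def incr(self, key, amount=1):
        # Read, increment and write happen under one lock, so no other
        # caller can see or overwrite the intermediate state.
        with self._lock:
            raw = self._data.get(key, "0")  # a missing key counts as 0
            try:
                value = int(raw)
            except ValueError:
                raise ValueError("value is not an integer or out of range") from None
            value += amount
            self._data[key] = str(value)
            return value


store = CounterStore()
store.set("visits:635:totals", 10)

# Two "clients" increment the same key concurrently.
threads = [
    threading.Thread(target=store.incr, args=("visits:635:totals",))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(store.get("visits:635:totals"))  # 12
```

Without the lock around the whole read-increment-set sequence, both threads could read "10" and both write "11", which is exactly the lost update that INCR rules out.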

14 Data structure : strings, hashes, lists, sets , sorted sets.
Much like traditional hashtables, hashes in Redis store several fields and their values inside a specific key, so they are the perfect data type to represent objects (e.g. a user with a number of fields like name, surname, age, and so forth).
Example: designing a key namespace to store our users.
redis> hset users:jdoe name "John Doe"
(integer) 1
redis> hmset users:jdoe phone " "
OK
redis> hincrby users:jdoe visits 1
redis > hget users:jdoe phone
" "
A hash with a few fields (where few means up to one hundred or so) is stored in a way that takes very little space, so you can store millions of objects in a small Redis instance. Every hash can store up to 2^32 - 1 field-value pairs (more than 4 billion).
Use hashes when possible. Small hashes are encoded in a very small space, so you should try representing your data using hashes every time it is possible. For instance, if you have objects representing users in a web application, instead of using different keys for name, surname, email and password, use a single hash with all the required fields.
What is the maximum number of keys a single Redis instance can hold, and what is the maximum number of elements in a list, set or sorted set? In theory Redis can handle up to 2^32 keys, and was tested in practice to handle at least 250 million keys per instance. We are working in order to experiment with larger values. Every list, set and sorted set can hold 2^32 elements. In other words, your limit is likely the available memory in your system.

15 Data structure : strings, hashes, lists, sets , sorted sets.
redis > hgetall users:jdoe
1) "name"
2) "John Doe"
3) " "
4)
5) "phone"
6) " "
7) "visits"
8) "1"
Hash commands:
- HDEL key field [field ...]: Delete one or more hash fields
- HEXISTS key field: Determine if a hash field exists
- HGET key field: Get the value of a hash field
- HGETALL key: Get all the fields and values in a hash
- HINCRBY key field increment: Increment the integer value of a hash field by the given number
- HKEYS key: Get all the fields in a hash
- HLEN key: Get the number of fields in a hash
- HMGET key field [field ...]: Get the values of all the given hash fields
- HMSET key field value [field value ...]: Set multiple hash fields to multiple values
- HSET key field value: Set the string value of a hash field
- HSETNX key field value: Set the value of a hash field, only if the field does not exist
- HVALS key: Get all the values in a hash

16 Data structure : strings, hashes, lists, sets , sorted sets.
Redis lists are simply lists of strings, sorted by insertion order. You can add elements to a Redis list by pushing them on the head (on the left) or on the tail (on the right) of the list.
LPUSH mylist a # now the list is "a"
LPUSH mylist b # now the list is "b","a"
RPUSH mylist c # now the list is "b","a","c" (RPUSH was used this time)
Lists support constant-time insertion and deletion of elements near the head and tail, even with many millions of inserted items. Accessing elements is very fast near the extremes of the list but slow in the middle of a very big list, as it is an O(N) operation. You might want to use lists to implement structures such as queues.
The max length of a list is 2^32 - 1 elements (4294967295, more than 4 billion elements per list).
Implementing a job queue with lists: let's implement our queues on top of lists, which provide atomic push/pop operations and have constant access time to the list's head and tail. We'll also keep a set that lists all the existing queues for introspection purposes. Since sets assure uniqueness, we don't need to worry whether our queue already exists in the set.
Quick reference for additions to lists. When dealing with indexes, the head of the list is element 0. When counting from the end, -1 refers to the last element, -2 to the next-to-last, etc.
- RPUSH list-name value: Inserts the given value at the tail of the list-name list. If the list does not exist, it is created.
- LPUSH list-name value: Like RPUSH, but inserts the element at the head of the list.
- LRANGE list-name start-index stop-index: Returns the list elements in the specified range (including the rightmost element specified).
- LTRIM list-name start-index stop-index: Trims the list so that it only contains the elements in the specified range. It's similar to LRANGE, but instead of just returning the elements, it trims the list.
- LLEN list-name: Returns the length of the given list.
- LREM list-name count value: Removes count occurrences of value from the list. If count is positive, the elements are removed starting from the head; if it's negative, they are removed starting from the tail; and if it's 0, all occurrences of value are removed.
Quick reference for removals from lists:
- LPOP list-name: Removes and returns the element at the head of the list.
- RPOP list-name: Like LPOP, but performs the action at the tail of the list.
- BLPOP list-name1 [list-name2 ...] timeout-value: A blocking POP operation. It returns when any list has an element. If multiple lists have elements, list-name1 takes precedence over list-name2, and so forth.
- BRPOP list-name1 [list-name2 ...] timeout-value: Like BLPOP, but performs the action at the tails of the lists.

17 Data structure : strings, hashes, lists, sets , sorted sets.
Typical uses of lists:
- Model a timeline in a social network, using LPUSH to add new elements to the user's timeline and LRANGE to retrieve a few recently inserted items. LRANGE can also be used for paging.
- Use LPUSH together with LTRIM (O(N)) to create a list that never exceeds a given number of elements and just remembers the latest N elements. This is similar to capped collections in MongoDB.
- Use lists as a message passing primitive.
List commands:
- BLPOP key [key ...] timeout: Remove and get the first element in a list, or block until one is available
- BRPOP key [key ...] timeout: Remove and get the last element in a list, or block until one is available
- BRPOPLPUSH source destination timeout: Pop a value from a list, push it to another list and return it; or block until one is available
- LINDEX key index: Get an element from a list by its index
- LINSERT key BEFORE|AFTER pivot value: Insert an element before or after another element in a list
- LLEN key: Get the length of a list
- LPOP key: Remove and get the first element in a list
- LPUSH key value [value ...]: Prepend one or multiple values to a list
- LPUSHX key value: Prepend a value to a list, only if the list exists
- LRANGE key start stop: Get a range of elements from a list
- LREM key count value: Remove elements from a list
- LSET key index value: Set the value of an element in a list by its index
- LTRIM key start stop: Trim a list to the specified range
- RPOP key: Remove and get the last element in a list
- RPOPLPUSH source destination: Remove the last element in a list, append it to another list and return it
- RPUSH key value [value ...]: Append one or multiple values to a list
- RPUSHX key value: Append a value to a list, only if the list exists
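The LPUSH plus LTRIM pattern for a capped timeline can be sketched with a plain Python list. This is an in-memory illustration only: the class CappedList and its method names are hypothetical and merely mirror the command semantics (inclusive stop index, negative indexes counting from the tail).

```python
class CappedList:
    """In-memory model of a list kept short with LPUSH followed by LTRIM."""

    def __init__(self, max_len):
        self.max_len = max_len
        self.items = []

    def lpush(self, value):
        self.items.insert(0, value)      # LPUSH: newest element goes to the head
        del self.items[self.max_len:]    # LTRIM 0 max_len-1: drop the oldest
        return len(self.items)

    def lrange(self, start, stop):
        # The stop index is inclusive; -1 refers to the last element.
        n = len(self.items)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if stop < 0:
            return []
        return self.items[start:stop + 1]


timeline = CappedList(max_len=3)
for post_id in range(1, 6):
    timeline.lpush(f"post:{post_id}")

print(timeline.lrange(0, -1))  # ['post:5', 'post:4', 'post:3']
print(timeline.lrange(0, 1))   # ['post:5', 'post:4']
```

Trimming on every push keeps the list bounded, so reading the latest N items stays cheap no matter how many items were ever pushed.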

18 Data structure : strings, hashes, lists, sets , sorted sets.
Sets are an unordered collection of strings. It is possible to add, remove, and test for existence of members in O(1) (constant time regardless of the number of elements contained inside the set).
A set cannot contain duplicates. This means that adding a member does not require a "check if exists, then add" operation.
Sets are a natural fit for circles, because sets represent collections of data and have native functionality to do interesting things like intersections and unions.
The max number of members in a set is 2^32 - 1 (4294967295, more than 4 billion members per set).
- You can track unique things using Redis sets. Want to know all the unique IP addresses visiting a given blog post? Simply use SADD every time you process a page view.
- Redis sets are good for representing relations.
- You can use sets to extract elements at random using the SPOP or SRANDMEMBER commands.

19 Data structure : strings, hashes, lists, sets , sorted sets.
We want to store several circles for each of our users, so it makes sense for our key to include a bit about the user and a bit about the actual circle (circle:jdoe:family etc.).
redis> sadd circle:jdoe:family users:anna
redis> sadd circle:jdoe:family users:richard
redis> sadd circle:jdoe:family users:mike
(integer) 1
redis> sadd circle:jdoe:soccer users:mike
redis> sadd circle:jdoe:soccer users:adam
redis> sadd circle:jdoe:soccer users:toby
redis> sadd circle:jdoe:soccer users:apollo
redis> smembers circle:jdoe:family
1) "users:richard"
2) "users:mike"
3) "users:anna"
redis> hgetall users:mike
(...)
redis> sinter circle:jdoe:family circle:jdoe:soccer
1) "users:mike"
redis> sunion circle:jdoe:family circle:jdoe:soccer
1) "users:anna"
2) "users:mike"
3) "users:apollo"
4) "users:adam"
5) "users:richard"
6) "users:toby"

20 Data structure : strings, hashes, lists, sets , sorted sets.
Set commands:
- SADD key member [member ...]: Add one or more members to a set
- SCARD key: Get the number of members in a set
- SDIFF key [key ...]: Subtract multiple sets
- SDIFFSTORE destination key [key ...]: Subtract multiple sets and store the resulting set in a key
- SINTER key [key ...]: Intersect multiple sets
- SINTERSTORE destination key [key ...]: Intersect multiple sets and store the resulting set in a key
- SISMEMBER key member: Determine if a given value is a member of a set
- SMEMBERS key: Get all the members in a set
- SMOVE source destination member: Move a member from one set to another
- SPOP key: Remove and return a random member from a set
- SRANDMEMBER key: Get a random member from a set
- SREM key member [member ...]: Remove one or more members from a set
- SUNION key [key ...]: Add multiple sets
- SUNIONSTORE destination key [key ...]: Add multiple sets and store the resulting set in a key

21 Data structure : strings, hashes, lists, sets , sorted sets.
Use sets to implement tags:
redis> sadd news:1000:tags 1
(integer) 1
redis> sadd news:1000:tags 2
redis> sadd news:1000:tags 5
redis> sadd news:1000:tags 77
redis> sadd tag:1:objects 1000
redis> sadd tag:2:objects 1000
redis> sadd tag:5:objects 1000
redis> sadd tag:77:objects 1000
To get all the tags for a given object:
redis> smembers news:1000:tags
1. 5
2. 1
3. 77
4. 2
We may also want the list of all the objects that have tags 1, 2, 10 and 27 at the same time:
redis> sinter tag:1:objects tag:2:objects tag:10:objects tag:27:objects
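The two-way tagging scheme can be modelled with ordinary Python sets. This is a sketch for illustration, assuming the same key names as the slide; the helper functions tag_object and objects_with_all_tags are hypothetical, and object 1001 is made-up sample data.

```python
from collections import defaultdict

# key name -> set of members, standing in for the Redis keyspace
sets = defaultdict(set)


def tag_object(object_id, tag_ids):
    """Maintain both directions: object -> tags and tag -> objects."""
    for tag_id in tag_ids:
        sets[f"news:{object_id}:tags"].add(tag_id)     # SADD news:<id>:tags <tag>
        sets[f"tag:{tag_id}:objects"].add(object_id)   # SADD tag:<tag>:objects <id>


def objects_with_all_tags(tag_ids):
    """Model of SINTER over the tag:<id>:objects sets."""
    groups = [sets.get(f"tag:{tag_id}:objects", set()) for tag_id in tag_ids]
    return set.intersection(*groups) if groups else set()


tag_object(1000, [1, 2, 5, 77])
tag_object(1001, [1, 2])  # second object, sample data for the demo

print(sorted(objects_with_all_tags([1, 2])))          # [1000, 1001]
print(sorted(objects_with_all_tags([1, 2, 5])))       # [1000]
print(sorted(objects_with_all_tags([1, 2, 10, 27])))  # []
```

A tag set that does not exist behaves like an empty set, so asking for tags 10 and 27 yields an empty intersection rather than an error.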

22 Set: Wildcard autocomplete
- Split every username into three-letter chunks: simonw => sim, imo, mon, onw
- Create a set for each chunk: sim => { simonw, asimov, fasim }
- If the user types "simo", return the intersection of the "sim" and "imo" sets.
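The chunking scheme can be tried out with Python sets standing in for Redis sets. This is a minimal sketch: the function names chunks, add_username and search are hypothetical, and the final substring check is an added assumption to filter out names that contain the chunks in separate places.

```python
from collections import defaultdict


def chunks(text, size=3):
    """Split text into overlapping chunks: simonw -> sim, imo, mon, onw."""
    text = text.lower()
    return [text[i:i + size] for i in range(len(text) - size + 1)]


index = defaultdict(set)  # chunk -> set of usernames (one Redis set per chunk)


def add_username(name):
    for chunk in chunks(name):
        index[chunk].add(name)  # SADD <chunk> <name>


def search(query):
    parts = chunks(query)
    if not parts:
        return set()  # query shorter than one chunk
    # SINTER over the sets of every chunk in the query
    candidates = set.intersection(*(index.get(p, set()) for p in parts))
    # Assumption: drop candidates where the chunks match non-contiguously.
    return {name for name in candidates if query.lower() in name.lower()}


for username in ("simonw", "asimov", "fasim"):
    add_username(username)

print(chunks("simonw"))        # ['sim', 'imo', 'mon', 'onw']
print(sorted(search("simo")))  # ['asimov', 'simonw']
```

Because the chunks overlap, the match can start anywhere inside the username, which is what makes this a wildcard search rather than a prefix search.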

23 Data structure : strings, hashes, lists, sets , sorted sets.
Every member of a sorted set is associated with a score, which is used to order the set from the smallest to the greatest score. Members are unique; scores may be repeated.
With sorted sets you can add, remove, or update elements very quickly (in a time proportional to the logarithm of the number of elements, O(log(N))). You can also get ranges by score or by rank (position) very quickly.
- ZADD can be used both to add items to the set and to update the score of an existing member.
- The ZRANGE family of commands returns items by their index position within the sorted set. The optional WITHSCORES argument returns the score for each item in the same response.
- ZRANGEBYSCORE queries the sorted set by score instead of by index.
redis > zadd friends:leto 100 ghanima 95 paul 95 chani 75 jessica 1 vladimir
redis > zrange friends:leto 0 -1 withscores
1) "vladimir"
2) "1"
3) "jessica"
4) "75"
5) "chani"
6) "95"
7) "paul"
8) "95"
9) "ghanima"
10) "100"

24 Data structure : strings, hashes, lists, sets , sorted sets.
Zset as index: any time you need to look up data based on range queries, you should be storing it in a sorted set. They're indexes that you have to maintain yourself.
redis > zadd hackers 1940 "Alan Kay" "Richard Stallman" "Linus Torvalds" "Alan Turing"
redis > zrange hackers 0 -1
1) "Alan Turing"
2) "Alan Kay"
3) "Richard Stallman"
4) "Linus Torvalds"
redis > zrangebyscore hackers
1) "Richard Stallman"
2) "Linus Torvalds"
redis > zrangebyscore hackers -inf 1950
redis > zremrangebyscore hackers
(integer) 2
redis > zrange hackers 0 10
2) "Linus Torvalds"
Sorted set commands:
- ZADD key score member [score] [member]: Add one or more members to a sorted set, or update its score if it already exists
- ZCARD key: Get the number of members in a sorted set
- ZCOUNT key min max: Count the members in a sorted set with scores within the given values
- ZINCRBY key increment member: Increment the score of a member in a sorted set
- ZINTERSTORE destination numkeys key [key ...] [WEIGHTS weight [weight ...]] [AGGREGATE SUM|MIN|MAX]: Intersect multiple sorted sets and store the resulting sorted set in a new key
- ZRANGE key start stop [WITHSCORES]: Return a range of members in a sorted set, by index
- ZRANGEBYSCORE key min max [WITHSCORES] [LIMIT offset count]: Return a range of members in a sorted set, by score
- ZRANK key member: Determine the index of a member in a sorted set
- ZREM key member [member ...]: Remove one or more members from a sorted set
- ZREMRANGEBYRANK key start stop: Remove all members in a sorted set within the given indexes
- ZREMRANGEBYSCORE key min max: Remove all members in a sorted set within the given scores
- ZREVRANGE key start stop [WITHSCORES]: Return a range of members in a sorted set, by index, with scores ordered from high to low
- ZREVRANGEBYSCORE key max min [WITHSCORES] [LIMIT offset count]: Return a range of members in a sorted set, by score, with scores ordered from high to low
- ZREVRANK key member: Determine the index of a member in a sorted set, with scores ordered from high to low
- ZSCORE key member: Get the score associated with the given member in a sorted set
- ZUNIONSTORE destination numkeys key [key ...] [WEIGHTS weight [weight ...]] [AGGREGATE SUM|MIN|MAX]: Add multiple sorted sets and store the resulting sorted set in a new key

25 Sorted set : Prefix autocomplete
Goal: type "binz", get back "binzhang".
Turn the first 4 or 5 characters of each string into an integer (you can imagine every character as a digit of a radix-256 number, for instance, but there are better representations) and add all your usernames to a sorted set with score=integer.
Then use ZRANGEBYSCORE to get all the elements within a given score range:
ZRANGEBYSCORE key min max [WITHSCORES] [LIMIT offset count]
This method is much more scalable, since ZRANGEBYSCORE is O(log(N)+M), where N is the number of elements in the sorted set and M is the number of elements returned.
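The scoring idea can be sketched in Python with a sorted list standing in for the sorted set. This is an illustration under stated assumptions: the prefix length of 4, the helper names (prefix_score, score_range, SortedIndex, autocomplete) and the sample usernames are all hypothetical, and the final startswith check covers queries longer than the scored prefix.

```python
import bisect

PREFIX_LEN = 4  # assumption: score only the first 4 bytes of each username


def prefix_score(text, length=PREFIX_LEN):
    """Read the first bytes as a radix-256 number, padding short strings with zeros."""
    raw = text.encode("utf-8")[:length].ljust(length, b"\x00")
    return int.from_bytes(raw, "big")


def score_range(prefix, length=PREFIX_LEN):
    """Lowest and highest score any string starting with `prefix` can have."""
    raw = prefix.encode("utf-8")[:length]
    low = int.from_bytes(raw.ljust(length, b"\x00"), "big")
    high = int.from_bytes(raw.ljust(length, b"\xff"), "big")
    return low, high


class SortedIndex:
    """Sorted list of (score, member) pairs, modelling ZADD and ZRANGEBYSCORE."""

    def __init__(self):
        self._entries = []

    def zadd(self, score, member):
        bisect.insort(self._entries, (score, member))

    def zrangebyscore(self, low, high):
        # (low,) sorts before every (low, member); (high + 1,) after every (high, member)
        start = bisect.bisect_left(self._entries, (low,))
        stop = bisect.bisect_left(self._entries, (high + 1,))
        return [member for _, member in self._entries[start:stop]]


usernames = SortedIndex()
for name in ("binzhang", "binary", "bob", "alice"):
    usernames.zadd(prefix_score(name), name)


def autocomplete(prefix):
    low, high = score_range(prefix)
    hits = usernames.zrangebyscore(low, high)
    # Needed when the typed prefix is longer than PREFIX_LEN bytes.
    return [name for name in hits if name.startswith(prefix)]


print(autocomplete("binz"))  # ['binzhang']
print(autocomplete("bi"))    # ['binary', 'binzhang']
```

The binary searches play the role of the O(log(N)) part of ZRANGEBYSCORE, and copying out the matching slice is the M part.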

26 sorted sets : Inverted-Index Text Search with Redis
An inverted index is a bunch of sets mapping terms to document IDs.
- Assign each document an ID.
- Apply stemming and stopwords first.
- Create an inverted index, with one set per word: a set of docIDs for each term.
Quick reference for the inverted-index algorithm:
- ZINCRBY zset-name increment element: Adds or increments the score of an element in a sorted set. As with ZADD and SADD, the set will be created if it doesn't exist.
- ZINTERSTORE destination-zset number-of-zsets-to-intersect zset1 [zset2 ...] [WEIGHTS weight1 [weight2 ...]] [AGGREGATE SUM | MIN | MAX]: Gets the intersection of a given number of ZSETs and stores the result in a new ZSET. It's also possible to pass along a multiplication factor for each ZSET (WEIGHTS) or to specify the aggregation function. By default, it's a sum of the scores in all the sets, but it can also be the maximum or minimum value.
- ZREVRANGE zset-name start-index stop-index [WITHSCORES]: Returns the elements in the sorted set within the given range, in descending order. The command can also optionally include the scores of the elements in the returned result. The ZRANGE command performs the same operation, but in ascending order.
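The indexing and query steps can be modelled in memory with Python dictionaries. This is a simplified sketch: stemming is skipped, the stopword list is a tiny illustrative one, the function names are hypothetical, and the sample documents are made up for the demo.

```python
import re
from collections import defaultdict

STOPWORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}  # illustrative only


def tokenize(text):
    """Lowercase, split into words and drop stopwords (no stemming in this sketch)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [word for word in words if word not in STOPWORDS]


# term -> {doc_id: score}; each inner dict stands in for one sorted set
index = defaultdict(dict)


def add_document(doc_id, text):
    for term in tokenize(text):
        # Model of: ZINCRBY <term> 1 <doc_id>
        index[term][doc_id] = index[term].get(doc_id, 0) + 1


def search(query):
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Model of ZINTERSTORE with AGGREGATE SUM: keep documents present in
    # every term's set and add up their scores.
    common = set(postings[0]).intersection(*postings[1:])
    scored = {doc: sum(p[doc] for p in postings) for doc in common}
    # Model of ZREVRANGE 0 -1: highest score first (ties broken by doc ID here).
    return sorted(scored, key=lambda doc: (-scored[doc], doc))


add_document(1, "Redis is a fast key-value store")
add_document(2, "Redis sorted sets make a fast index, fast")
add_document(3, "A slow disk")

print(search("fast redis"))  # [2, 1]
print(search("slow redis"))  # []
```

Document 2 ranks first because "fast" occurs twice in it, so its summed score is 3 against 2 for document 1.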

27 Learn More Data structure : Pub/Sub
Redis has native support for the publish/subscribe (or pub/sub) pattern. Receivers subscribe to messages that match a specific pattern (for instance, messages that are sent to a specific "channel"), and a producer/emitter sends messages to it. This lets the emitter and receivers be loosely coupled: they don't need to know each other.
The pub/sub commands:
- PSUBSCRIBE pattern [pattern ...]: Listen for messages published to channels matching the given patterns
- PUBLISH channel message: Post a message to a channel
- PUNSUBSCRIBE [pattern [pattern ...]]: Stop listening for messages posted to channels matching the given patterns
- SUBSCRIBE channel [channel ...]: Listen for messages published to the given channels
- UNSUBSCRIBE [channel [channel ...]]: Stop listening for messages posted to the given channels
This means that we can easily create channels like "chat:cars" for car-talk, or "chat:sausage" for food-related conversation. The channel names are not related to the Redis keyspace, so you don't have to worry about conflicts with existing keys. With this knowledge, it is trivial to implement chat and notification systems, either for end-users or to stream messages between logical parts of applications. Pub/sub can even be used as a building block of a robust queueing system.

28 Learn More Data structure : Pub/Sub
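Subscriber session:
redis > subscribe irc:football
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "irc:football"
3) (integer) 1
1) "message"
2) "irc:football"
3) "have a good day"
1) "message"
2) "irc:football"
3) "water"
Publisher session:
redis > PUBLISH irc:football "Rock you"
(integer) 0
redis > PUBLISH irc:football "have a good day"
(integer) 1
redis > PUBLISH irc:football "water"
PUBLISH returns the number of clients that received the message, so the first call, sent before anyone had subscribed, returns 0.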

29 Agenda Redis Manifesto
Data structure : strings, hashes, lists, sets and sorted sets. Leveraging Redis The architecture of REDIS Redis Admin and maintenance Cases

30 Leveraging Redis
Operations on keys
Big O notation (time complexity)
Sort
EXPIRE
Transaction
Optimistic locking using check-and-set (compare: select for update)
Pipelining (commands in batch)

31 Keys operation DEL/EXISTS/EXPIRE/TTL/PERSIST/RANDOMKEY/RENAME
KEYS pattern: Lists all the keys in the current database that match the given pattern. [slow]
Example patterns: KEYS h*llo, KEYS h?llo, KEYS h[ae]llo
TYPE key-name: Tells the type of the key. Possible types are: string, list, hash, set, zset, and none.
redis > type circle:jdoe:soccer
set
MONITOR: Outputs the commands received by the Redis server in real time. [debug purpose only]
Redis can handle up to 2^32 keys.
Command reference:
DEL key [key ...]: Delete a key
EXISTS key: Determine if a key exists
EXPIRE key seconds: Set a key's time to live in seconds
EXPIREAT key timestamp: Set the expiration for a key as a UNIX timestamp
KEYS pattern: Find all keys matching the given pattern
MOVE key db: Move a key to another database
OBJECT subcommand [arguments [arguments ...]]: Inspect the internals of Redis objects
PERSIST key: Remove the expiration from a key
RANDOMKEY: Return a random key from the keyspace
RENAME key newkey: Rename a key
RENAMENX key newkey: Rename a key, only if the new key does not exist
SORT key [BY pattern] [LIMIT offset count] [GET pattern [GET pattern ...]] [ASC|DESC] [ALPHA] [STORE destination]: Sort the elements in a list, set or sorted set
TTL key: Get the time to live for a key
TYPE key: Determine the type stored at key
EVAL script numkeys key [key ...] arg [arg ...]: Execute a Lua script server side
What is the maximum number of keys a single Redis instance can hold, and what is the maximum number of elements in a list, set or sorted set?
In theory Redis can handle up to 2^32 keys, and was tested in practice to handle at least 250 million keys per instance. We are working in order to experiment with larger values. Every list, set, and sorted set can hold 2^32 elements. In other words, your limit is likely the available memory in your system.
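
The three KEYS patterns can be tried locally. The sketch below assumes a made-up list of key names and uses Python's fnmatchcase, which handles `*`, `?` and `[...]` much like the Redis glob style but is not the server's own matcher.

```python
from fnmatch import fnmatchcase

# Hypothetical key names, chosen only to exercise the three patterns.
keyspace = ["hello", "hallo", "hillo", "hllo", "heeeello", "world"]


def keys(pattern, names=keyspace):
    """Scan every key name and keep the matches, which is why KEYS is O(N)."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(keys("h*llo"))     # '*' matches any run of characters, including none
print(keys("h?llo"))     # '?' matches exactly one character
print(keys("h[ae]llo"))  # '[ae]' matches one character from the set
```

Because every key name has to be visited, the cost grows with the size of the keyspace, which is the reason for the [slow] warning.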

32 Big O Notation
Big O notation describes how fast a command is based on the number of items we are dealing with.
O(1): the fastest. Whether we are dealing with 5 items or 5 million, you'll get the same performance. Example: SISMEMBER, which checks whether a value belongs to a set.
O(N): linear commands, comparable to a full table scan. Examples: KEYS, and LTRIM, where N is the number of elements being removed.
O(log(N)): ZADD, where N is the number of elements already in the set.
O(log(N)+M): ZREMRANGEBYSCORE, where N is the total number of elements in the set and M is the number of elements to be removed.
O(N+M*log(M)): SORT.
The Redis documentation tells us the Big O notation for each of its commands. It also tells us which factors influence the performance.

33 sort Sort the values within a list, set or sorted set.
redis > rpush users:leto:guesses
(integer) 8
redis > sort users:leto:guesses
1) "2"
2) "2"
3) "4"
4) "5"
5) "9"
6) "10"
7) "10"
8) "19"
redis > sadd friends:ghanima leto paul chani jessica alia duncan
(integer) 6
redis > sort friends:ghanima limit 0 3 desc alpha
1) "paul"
2) "leto"
3) "jessica"
Redis is single-threaded, so run SORT on a slave if the dataset is large.
SORT can store its result to a key. One pattern for paginating through expensive sort results (millions of items, for example) is to save the result to a temporary key, set an expiry on it and use that for pagination via the LRANGE command.

34 Transaction
Every Redis command is atomic, including the ones that do multiple things:
INCR is essentially a GET followed by a SET.
GETSET sets a new value and returns the original.
SETNX (short for "SET if Not eXists") first checks if the key exists, and only sets the value if it does not.
MSETNX fails if any key already exists (less important now that we have hashes).
redis > SETNX mykey "Hello"
(integer) 1
redis > SETNX mykey "World"
(integer) 0
redis > GET mykey
"Hello"
The MULTI command runs multiple commands as an atomic group. It marks the start of a transaction block; subsequent commands are queued for atomic execution using EXEC. DISCARD can be used to abort a transaction. In this case, no commands are executed and the state of the connection is restored to normal.
MULTI/EXEC guarantees that:
The commands will be executed in order.
The commands will be executed as a single atomic operation (without another client's command being executed halfway through).
Either all or none of the commands in the transaction will be executed.
redis > multi
OK
redis > hincrby groups:1percent balance
QUEUED
redis > exec
1) (integer)
2) (integer)
If the server crashes midway through applying a MULTI/EXEC block, the partially applied changes will NOT be rolled back.

35 Optimistic locking using check-and-set
WATCH / UNWATCH
WATCHed keys are monitored in order to detect changes against them. If at least one watched key is modified before the EXEC command, the whole transaction aborts, and EXEC returns a Null multi-bulk reply to notify that the transaction failed.
Let's suppose Redis doesn't have INCR:
WATCH mykey
val = GET mykey
val = val + 1
MULTI
SET mykey $val
EXEC
Using WATCH to implement ZPOP:
WATCH zset
element = ZRANGE zset 0 0
MULTI
ZREM zset element
EXEC
WATCH explained
So what is WATCH really about? It is a command that makes the EXEC conditional: we are asking Redis to perform the transaction only if no other client modified any of the WATCHed keys. Otherwise the transaction is not entered at all. (Note that if you WATCH a volatile key and Redis expires the key after you WATCHed it, EXEC will still work.)
WATCH can be called multiple times. Each WATCH call watches for changes from the moment of the call up to the moment EXEC is called. You can also send any number of keys to a single WATCH call.
When EXEC is called, all keys are UNWATCHed, regardless of whether the transaction was aborted or not. Also, when a client connection is closed, everything gets UNWATCHed.
It is also possible to use the UNWATCH command (without arguments) in order to flush all the watched keys. This is useful when we optimistically lock a few keys because we may need a transaction to alter them, but after reading their current content we don't want to proceed. In that case we just call UNWATCH so that the connection can be used freely for new transactions.
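
The check-and-set idea behind WATCH can be sketched with a toy store that counts writes per key. Everything here is an assumption made for illustration: MiniStore, its method names and the interfere hook are hypothetical and are not part of any Redis client library.

```python
class MiniStore:
    """Toy single-process stand-in for the server; not a Redis client."""

    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> number of writes, used to detect changes

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def watch(self, *keys):
        """Remember the current version of each key, like WATCH key [key ...]."""
        return {k: self.versions.get(k, 0) for k in keys}

    def exec(self, watched, queued):
        """Run the queued writes only if no watched key changed; else return None."""
        if any(self.versions.get(k, 0) != v for k, v in watched.items()):
            return None  # plays the role of the null reply from EXEC
        return [self.set(k, v) or "OK" for k, v in queued]


def incr_with_watch(store, key, interfere=None, max_retries=5):
    """INCR rebuilt from WATCH / GET / MULTI / SET / EXEC with a retry loop."""
    for attempt in range(max_retries):
        watched = store.watch(key)
        value = int(store.get(key) or 0) + 1
        if interfere and attempt == 0:
            interfere()  # simulate another client writing between GET and EXEC
        if store.exec(watched, [(key, value)]) is not None:
            return value
    raise RuntimeError("too much contention")


store = MiniStore()
store.set("mykey", 10)
plain = incr_with_watch(store, "mykey")
contended = incr_with_watch(store, "mykey", interfere=lambda: store.set("mykey", 100))
print(plain, contended)
```

In the second call the first attempt is aborted because another write landed after the WATCH, so the loop re-reads the key and retries. Real client code using WATCH typically follows the same retry shape.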

36 Expiration
Redis allows you to mark a key for expiration. You can give it an absolute time in the form of a Unix timestamp (seconds since January 1, 1970) or a time to live in seconds.
expire pages:about 30: delete the key after 30 seconds
expireat pages:about <unix-timestamp>: delete the key at an absolute time, for example 12:00 a.m. December 31st, 2012
ttl pages:about: check the remaining time to live
persist pages:about: remove the expiration
setex pages:about 30 '<h1>about us</h1>': set a string and specify an expire time in one command
Keys are expired in two ways:
Lazy expiration: a key is expired simply when some client tries to access it and the key is found to be timed out.
Active expiration: once every second, Redis tests 100 random keys from the set of keys with an expire, and deletes all the keys found expired. If more than 25 keys were expired, it starts again from the first step.
One way to use Redis as a cache is to set a time to live on cached entries. If you tune the TTL well enough, and you know how many new objects are created every second, you can avoid Redis using more than a given amount of RAM.
It is easy to be tripped up by the semantics of the EXPIRE command. This is why: for both replication and the append-only log file to work reliably, it is essential that replaying the same sequence of commands results in the same underlying data structure being created. Timing-based commands such as EXPIRE cannot be allowed to cause different results on a master and slave once replication lag between the two has been taken into account. The restrictions on EXPIRE are all caused by this constraint.
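
The two expiration paths, lazy on access and sampled once per cycle, can be modelled in a few lines. This is a sketch under stated assumptions: ExpiringStore is a hypothetical class, the clock is passed in explicitly as `now` so the behaviour is easy to follow, and the 100-key sample and 25-key threshold are the numbers quoted on the slide.

```python
import random


class ExpiringStore:
    """Toy model of lazy plus sampled expiration; the clock is passed in explicitly."""

    def __init__(self):
        self.data = {}
        self.expires = {}  # key -> absolute expiry time in seconds

    def setex(self, key, ttl, value, now):
        self.data[key] = value
        self.expires[key] = now + ttl

    def get(self, key, now):
        """Lazy expiration: an expired key is removed when somebody reads it."""
        if key in self.expires and self.expires[key] <= now:
            self._delete(key)
        return self.data.get(key)

    def active_cycle(self, now, sample_size=100, threshold=25, rng=random):
        """Sample volatile keys, delete the expired ones, repeat while many expire."""
        removed = 0
        while self.expires:
            sample = rng.sample(list(self.expires), min(sample_size, len(self.expires)))
            expired = [k for k in sample if self.expires[k] <= now]
            for key in expired:
                self._delete(key)
            removed += len(expired)
            if len(expired) <= threshold:
                break
        return removed

    def _delete(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)


store = ExpiringStore()
store.setex("pages:about", 30, "<h1>about us</h1>", now=0)
print(store.get("pages:about", now=10))   # read before the TTL has elapsed
print(store.get("pages:about", now=31))   # read after the TTL has elapsed
```

The sampling loop trades precision for bounded work: a cycle only keeps going while a large share of the sample turns out to be expired.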

37 What if no available memory
Redis will return an error on write operations, but read-only queries still work.
You can specify "maxmemory" to define a hard limit for memory usage.
maxmemory-policy specifies the algorithm to use when memory needs to be reclaimed:
volatile-lru (default): remove a key among the ones with an expire set, trying to remove keys not recently used.
volatile-ttl: remove a key among the ones with an expire set, trying to remove keys with a short remaining time to live.
volatile-random: remove a random key among the ones with an expire set.
allkeys-lru: like volatile-lru, but will remove every kind of key, both normal keys and keys with an expire set.
allkeys-random: like volatile-random, but will remove every kind of key, both normal keys and keys with an expire set.
LRU and minimal TTL algorithms are not precise algorithms. By default Redis will check three keys ("maxmemory-samples") and pick the one that was used least recently.
################################### LIMITS ####################################
# Set the max number of connected clients at the same time. By default there
# is no limit, and it's up to the number of file descriptors the Redis process
# is able to open. The special value '0' means no limits.
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
#
# maxclients 128
# Don't use more memory than the specified amount of bytes.
# When the memory limit is reached Redis will try to remove keys
# accordingly to the eviction policy selected (see maxmemory-policy).
# If Redis can't remove keys according to the policy, or if the policy is
# set to 'noeviction', Redis will start to reply with errors to commands
# that would use more memory, like SET, LPUSH, and so on, and will continue
# to reply to read-only commands like GET.
# This option is usually useful when using Redis as an LRU cache, or to set
# an hard memory limit for an instance (using the 'noeviction' policy).
# WARNING: If you have slaves attached to an instance with maxmemory on,
# the size of the output buffers needed to feed the slaves are subtracted
# from the used memory count, so that network problems / resyncs will
# not trigger a loop where keys are evicted, and in turn the output
# buffer of slaves is full with DELs of keys evicted triggering the deletion
# of more keys, and so forth until the database is completely emptied.
# In short... if you have slaves attached it is suggested that you set a lower
# limit for maxmemory so that there is some free RAM on the system for slave
# output buffers (but this is not needed if the policy is 'noeviction').
#
# maxmemory <bytes>
# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
# is reached? You can select among five behavior:
# volatile-lru -> remove the key with an expire set using an LRU algorithm
# allkeys-lru -> remove any key accordingly to the LRU algorithm
# volatile-random -> remove a random key with an expire set
# allkeys-random -> remove a random key, any key
# volatile-ttl -> remove the key with the nearest expire time (minor TTL)
# noeviction -> don't expire at all, just return an error on write operations
# Note: with all the kind of policies, Redis will return an error on write
# operations, when there are not suitable keys for eviction.
# At the date of writing this commands are: set setnx setex append
# incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd
# sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby
# zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby
# getset mset msetnx exec sort
# The default is:
#
# maxmemory-policy volatile-lru
# LRU and minimal TTL algorithms are not precise algorithms but approximated
# algorithms (in order to save memory), so you can select as well the sample
# size to check. For instance for default Redis will check three keys and
# pick the one that was used less recently, you can change the sample size
# using the following configuration directive.
#
# maxmemory-samples 3
redis > set newkey "maxsize"
(error) ERR command not allowed when used memory > 'maxmemory'
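
The sampled ("approximated") eviction described above can be illustrated as follows. This is only a sketch of the idea, not the server's implementation: evict_one and its parameters are hypothetical, it covers the LRU and random policies only (volatile-ttl is left out), and `samples` plays the role of maxmemory-samples.

```python
import random


def evict_one(last_used, volatile, policy="volatile-lru", samples=3, rng=random):
    """Pick a victim key by sampling a few candidates.

    last_used: key -> last access time; volatile: set of keys that have a TTL.
    Returns the chosen key, or None when the policy has no candidates.
    """
    if policy.startswith("volatile"):
        candidates = [k for k in last_used if k in volatile]
    else:
        candidates = list(last_used)
    if not candidates:
        return None  # nothing evictable: a write would get an error instead
    sample = rng.sample(candidates, min(samples, len(candidates)))
    if policy.endswith("random"):
        return sample[0]
    # Approximated LRU: the least recently used key *within the sample*
    return min(sample, key=lambda k: last_used[k])


# Hypothetical access times: a larger number means more recently used.
last_used = {"a": 5, "b": 1, "c": 9, "d": 3}
print(evict_one(last_used, volatile={"b", "c"}))
print(evict_one(last_used, volatile=set()))
print(evict_one(last_used, volatile=set(), policy="allkeys-lru", samples=4))
```

With only three samples the victim is not guaranteed to be the globally oldest key; raising the sample size makes the choice more precise at the cost of more work per eviction.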

38 Redis Pipelining(How)
Send multiple commands to the server without waiting for the replies at all, and finally read the replies in a single step.
$ (echo -en "PING\r\nPING\r\nPING\r\n"; sleep 1) | nc localhost 6379
+PONG
+PONG
+PONG
This avoids paying the cost of the round-trip time (RTT) for every call:
Client: INCR X
Client: INCR X
Client: INCR X
Client: INCR X
Server: 1
Server: 2
Server: 3
Server: 4

39 Redis Pipelining(why we need)
Redis is a TCP server using the client-server model and what is called a Request/Response protocol. The client sends a query to the server, and reads from the socket, usually in a blocking way, for the server response. The server processes the command and sends the response back to the client. So, for instance, a four-command sequence is something like this:
Client: INCR X
Server: 1
Client: INCR X
Server: 2
Client: INCR X
Server: 3
Client: INCR X
Server: 4
Each request/response pair pays one network round trip, so network latency adds up with every command.
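
A rough back-of-the-envelope model shows why batching matters. The function names and the timing figures below are assumptions chosen for illustration; the model ignores bandwidth, buffering and reply parsing.

```python
def total_time_ms(n_commands, rtt_ms, server_ms_per_command, pipelined):
    """Rough latency model: one round trip per batch or one per command."""
    round_trips = 1 if pipelined else n_commands
    return round_trips * rtt_ms + n_commands * server_ms_per_command


def inline_payload(commands):
    """Join inline commands with CRLF terminators into one write."""
    return "".join(cmd + "\r\n" for cmd in commands).encode("ascii")


payload = inline_payload(["PING"] * 3)
print(payload)

# Assumed figures: 1 ms round trip, 0.01 ms of server work per command.
one_by_one = total_time_ms(4, rtt_ms=1.0, server_ms_per_command=0.01, pipelined=False)
batched = total_time_ms(4, rtt_ms=1.0, server_ms_per_command=0.01, pipelined=True)
print(one_by_one, batched)
```

When the round trip is much larger than the per-command server time, as in this assumed setup, almost all of the one-by-one cost is network waiting, and pipelining removes most of it.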

40 Agenda Redis Manifesto
Data structure : strings, hashes, lists, sets and sorted sets. Leveraging Redis Keys/O(n)/sort/expire/transaction/watch Redis Admin and maintenance Select database Monitor Redis Configure Persistence Starting a Redis Slave Handling a Dataset larger than memory Upgrade Redis BackUp Redis Sharding Redis Benchmarks The architecture of REDIS Cases

41 Databases in Redis
A database contains a set of data. Its purpose is to group all of an application's data together and to keep it separate from another application's.
Databases are simply identified by a number, with the default database being number 0.
The number of databases is set via the "databases" parameter in the config file.
Change to a different database via the SELECT command:
redis > select 1
OK
redis [1]> select 0
OK
redis >
# Set the number of databases.
# The default database is DB 0, you can select a different one on a per-connection basis using SELECT <dbid>
# where dbid is a number between 0 and 'databases'-1
# databases 16

42 Monitor Redis
MONITOR is actually part of the Redis replication system. If you telnet directly to Redis and type monitor, you'll see a live dump of all commands executing against the database. This is really useful for debugging.
Slow log: config set slowlog-log-slower-than 0
redis-stat: similar to prstat (https://github.com/antirez/redis-tools)
INFO command:
src]$ ./redis-cli
redis > info
redis_version:2.4.8
redis_git_sha1:
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.1.2
process_id:5898
uptime_in_seconds:163519
uptime_in_days:1
lru_clock:
used_cpu_sys:1.19
used_cpu_user:2.36
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:2
connected_slaves:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:
used_memory_human:1.32M
used_memory_rss:
used_memory_peak:
used_memory_peak_human:14.34M
mem_fragmentation_ratio:2.63
mem_allocator:jemalloc-2.2.5
loading:0
aof_enabled:0
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:
bgrewriteaof_in_progress:0
total_connections_received:757
total_commands_processed:150064
expired_keys:0
evicted_keys:0
keyspace_hits:50019
keyspace_misses:3
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:1875
vm_enabled:0
role:master
slave0: ,31424,online
db0:keys=8,expires=0
db1:keys=1,expires=0

43 Redis Admin and maintenance: Configure Persistence(1)
Persistence modes: snapshotting and AOF. Persistence should be configured in a way that suits your dataset and usage patterns.
Snapshotting consists of saving the entire database to disk in the RDB format (a compressed database dump). This can be done periodically at set times, or every time a configurable number of keys changes.
# save <seconds> <changes>
save after 900 sec (15 min) if at least 1 key changed
save after 300 sec (5 min) if at least 10 keys changed
save after 60 sec if at least 10000 keys changed
The alternative is using an Append Only File (AOF). This might be a better option if you have a large dataset or your data doesn't change very frequently.
./redis-server --appendonly yes
It is possible to combine both AOF and RDB in the same instance. A master -> slave setup can also be an option.
Both modes use sequential I/O.
When Redis starts, it reads the RDB or AOF file to load all data into memory.
################################ SNAPSHOTTING #################################
#
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving at all commenting all the "save" lines.
save 900 1
save 300 10
save 60 10000
# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes
# The filename where to dump the DB
dbfilename dump.rdb
# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# Also the Append Only File will be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir ./
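
The "save <seconds> <changes>" rule reads as: snapshot when both thresholds of any one save point are met. A small sketch of that check follows; the function name is hypothetical, and the third save point (60, 10000) is assumed to be the stock redis.conf value.

```python
# (seconds, changes) pairs; (60, 10000) is assumed to be the stock value.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    save_points=SAVE_POINTS):
    """True when any save point has both its time and its change count reached."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


print(should_snapshot(900, 1))      # a single change, but 15 minutes have passed
print(should_snapshot(300, 9))      # 5 minutes, one change short of the 300/10 point
print(should_snapshot(60, 10000))   # a burst of writes within one minute
```

A busy dataset is therefore snapshotted often, while a nearly idle one is still saved at least every 15 minutes once anything has changed.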

44 Redis Admin and maintenance: Configure Persistence(2)
Snapshotting performs point-in-time snapshots of the dataset at specified intervals (a full dump of your database to disk, overwriting the previous dump only if successful).
You can manually trigger snapshotting with the SAVE and BGSAVE commands.
BGSAVE forks the main Redis process and saves the DB to disk in the background.
SAVE performs the same operation as BGSAVE but does so in the foreground, thereby blocking your Redis server.
Snapshots are also used when performing a master -> slave synchronization.
The Append Only File (AOF) keeps a log of the commands that change your dataset in a separate file.
It is an append-only log, so there are no seeks; a damaged tail can be repaired with redis-check-aof.
appendfsync controls how often the AOF gets synced to disk (fsync syscall): always (able to group commit), everysec, or no.
BGREWRITEAOF rewrites the AOF to match the current database. Depending on how often you update existing data, this will greatly reduce the size of the AOF. (For example, if you are incrementing a counter 100 times, you'll end up with a single key in your dataset containing the final value, but 100 entries in your AOF. 99 of those entries are not needed to rebuild the current state.) If your data changes very often, the on-disk file will grow very fast, so you should compact it by issuing BGREWRITEAOF regularly. The rewrite is done in the background.
AOF guarantees correct MULTI/EXEC transaction semantics, and will refuse to reload a file that contains a broken transaction at the end of the file. A utility shipped with the Redis server can trim the AOF file to remove the partial transaction at the end. Note: since the AOF file is populated using a single write(2) call at the end of every event loop iteration, an incomplete transaction can only appear if the disk where the AOF resides gets full while Redis is writing.
So how durable is Redis, with its main persistence engine (AOF) in its default configuration?
Worst case: it guarantees that write(2) and fsync(2) are performed within two seconds.
Normal case: it performs write(2) before replying to the client, and performs an fsync(2) every second.
What is interesting is that in this mode Redis is still extremely fast, for a few reasons. One is that fsync is performed on a background thread; the other is that Redis only writes in append-only mode, which is a big advantage.
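
The counter example for BGREWRITEAOF can be replayed in a few lines. This sketch models only string keys and three commands (SET, INCR, DEL); the tuple-based log format and the function names are assumptions for illustration, not the real AOF format.

```python
def replay(log):
    """Rebuild the dataset by replaying logged commands in order."""
    data = {}
    for command, key, *args in log:
        if command == "SET":
            data[key] = args[0]
        elif command == "INCR":
            data[key] = int(data.get(key, 0)) + 1
        elif command == "DEL":
            data.pop(key, None)
    return data


def rewrite(log):
    """Emit the shortest log that recreates the current state."""
    return [("SET", key, value) for key, value in replay(log).items()]


# 100 increments of one counter, plus a key that is created and then deleted.
aof = [("INCR", "counter")] * 100 + [("SET", "name", "redis"), ("DEL", "name")]
compact = rewrite(aof)
print(len(aof), "->", len(compact))
print(replay(compact) == replay(aof))
```

The rewritten log is built from the current dataset rather than by editing the old log, which is why it can be produced in the background while new commands keep arriving.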

45 Redis Admin and maintenance: Master --Slave
Uses for a slave: load-balancing read queries, standby, backup, data-warehouse (DW) queries.
Redis supports master-slave replication natively:
A master can have multiple slaves.
Slaves are able to accept connections from other slaves.
Redis replication is non-blocking on the master side: the master will continue to serve queries while one or more slaves perform the first synchronization.
Replication can be configured in two ways.
In the configuration file, before starting a server:
slaveof master-ip-or-hostname masterport
masterauth master-password
By connecting to a running server and using the SLAVEOF command:
SLAVEOF master-ip-or-hostname [masterport]
CONFIG SET masterauth password
How Redis replication works
If you set up a slave, upon connection it sends a SYNC command. It doesn't matter if it's the first time it has connected or if it's a reconnection.
The master then starts background saving, and collects all new commands received that will modify the dataset. When the background saving is complete, the master transfers the database file to the slave, which saves it on disk, and then loads it into memory. The master will then send to the slave all accumulated commands, and all new commands received from clients that will modify the dataset. This is done as a stream of commands and is in the same format as the Redis protocol itself.
You can try it yourself via telnet. Connect to the Redis port while the server is doing some work and issue the SYNC command. You'll see a bulk transfer and then every command received by the master will be re-issued in the telnet session.
Slaves are able to automatically reconnect when the master <-> slave link goes down for some reason. If the master receives multiple concurrent slave synchronization requests, it performs a single background save in order to serve all of them.
When a master and a slave reconnect after the link went down, a full resync is performed.

46 Handling a Dataset Larger Than Memory
Virtual Memory (VM) is available since version 2.0 (deprecated after Redis 2.4).
vm-enabled yes
vm-swap-file
VM allows a dataset bigger than your available RAM by swapping rarely used values to disk and keeping all the keys and the frequently used values in memory.
The keys are always kept in memory. Values can be swapped.
The Redis server might end up blocking clients in order to fetch values from disk.
Snapshots are slow, because Redis needs to read all the values swapped to disk in order to write them to the RDB file. AOF is better in this case.
VM also affects the speed of replication, because Redis masters need to perform a BGSAVE when a new slave connects.
The use of SSDs (flash storage) is encouraged.
IMPORTANT NOTE: Redis VM is now deprecated. Redis 2.4 will be the latest Redis version featuring Virtual Memory (but it also warns you that Virtual Memory usage is discouraged). We found that using VM has several disadvantages and problems. In the future of Redis we want to simply provide the best in-memory database (but persistent on disk as usual) ever, without considering at least for now the support for databases bigger than RAM. Our future efforts are focused on providing scripting, cluster, and better persistence.
Still, there are scenarios where using VM makes sense. In order to enable it, you'll need to add this to your configuration file:
vm-enabled yes
There are other settings that you should pay attention to when enabling VM:
vm-swap-file specifies the location of the swap file in your filesystem.
vm-max-memory allows you to specify the maximum amount of memory Redis should use before beginning to swap values. Beware that this is a soft limit, because keys are always kept in memory and because Redis won't swap values to disk while creating a new snapshot.
vm-pages specifies the number of pages in your swap file.
vm-page-size defines the size of a page in bytes. The page size and the number of pages are very important, because Redis won't allocate more than one value to the same page, so together these determine the amount of data your swap file can handle.
vm-max-threads is the maximum number of threads available to perform I/O operations. Setting it to 0 enables blocking VM, which means that your Redis server will block all clients when it needs to read a value from disk. Once again, depending on your data access patterns, this may or may not be the best option.
As with any other disk-based database, Redis VM will perform better the faster your I/O is. You can read more about VM use cases, configuration details, and tradeoffs in the official Redis documentation.

47 Upgrading Redis
Redis can't do online binary upgrades. The solution:
Start a new Redis server in slave mode.
Switch the clients over to the slave.
Promote the new server to the master role.
Make sure to test this before doing it on your production servers.
To make the example easier to understand, let's assume we have a Redis server listening on port 6379.
1. Install the new Redis version without restarting your existing server.
2. Create a new redis.conf, specifying that Redis runs on port 6380 (assuming you're on the same system; if you're not, you can still use 6379 or any other available port) and a different DB directory (you don't want to have 2 Redis servers reading or writing the same files).
3. Start the new server.
4. Connect to the new server and issue the command:
SLAVEOF localhost 6379
This will trigger a BGSAVE on the master server, and upon completion the new (slave) server will start replicating. You can check the current status using the INFO command on the slave. When you see master_link_status:up, the replication is active.
5. Since your new Redis server is now up-to-date, you can start moving over your clients to this new server. You can verify the number of clients connected to a server with the INFO command; check the connected_clients variable.
6. When all your clients are connected to the slave server, you still have two tasks to complete: disable the replication and shut down the master server.
   1. Connect to the slave server and issue:
   SLAVEOF NO ONE
   This will stop replication and effectively promote your slave into a master. This is important in Redis 2.2, as master servers are responsible for sending expirations to their slaves.
   2. Now connect to your old master server and issue:
   SHUTDOWN
   The old master server will perform a SAVE and shut down.
   3. Your new Redis system is up and running, but make sure that all your configuration files, init scripts, backups, etc. are pointing to the right location and starting the correct server.

48 Backing up Redis
How you back up depends on which Redis persistence model you're using.
With the default persistence model (snapshotting), you're best off using a snapshot as a backup. For a cold backup, run redis-cli BGSAVE and then copy the resulting dump file.
If you're using only AOF, you'll have to back up your log in order to be able to replay it on startup. Run BGREWRITEAOF regularly.
You can also take backups on a slave Redis instance, or use a slave as the backup.
Should your Redis server refuse to start due to a corrupted AOF (which can happen if the server crashes or is killed while writing to the file), you can use the redis-check-aof utility to fix your AOF:
redis-check-aof --fix filename

49 Sharding Redis Where is Redis Cluster
Redis Cluster is under development: probably a reasonable beta for summer 2012, with the first stable release shipping before the end of 2012.
Until then, sharding has to be implemented in the client library or the application; you should probably use consistent hashing.
With sharding, you will not be able to perform some operations that affect multiple keys, because those keys might be in different shards (servers).
Where's Redis Cluster? Redis development is currently focused on Redis 2.6 that will bring you support for Lua scripting and many other improvements. This is our current priority, however the unstable branch already contains most of the fundamental parts of Redis Cluster. After the 2.6 release we'll focus our energies on turning the current Redis Cluster alpha into a beta product that users can start to seriously test. It is hard to make forecasts since we'll release Redis Cluster as stable only when we feel it is rock solid and useful for our customers, but we hope to have a reasonable beta for summer 2012, and to ship the first stable release before the end of 2012.
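
Client-side sharding with consistent hashing can be sketched as a hash ring. This is an illustrative sketch, not the scheme of any particular client library: the HashRing class, the node names and the choice of SHA-256 with 100 virtual nodes per server are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Client-side consistent hashing; node names are hypothetical."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual nodes per server, to even out load
        self.ring = []            # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key):
        """First virtual node clockwise from the key's hash."""
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]


ring = HashRing(["redis-a:6379", "redis-b:6379", "redis-c:6379"])
keys = [f"player:{i}:name" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.add("redis-d:6379")
moved = sum(1 for k in keys if ring.node_for(k) != before[k])
print(f"{moved} of {len(keys)} keys moved after adding a node")
```

Only keys that now belong to the new server change owner, whereas a plain `hash(key) % number_of_servers` scheme would remap most keys whenever a server is added. Multi-key operations still only work when all the keys involved map to the same node.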

50 Benchmarks--- How fast is Redis?
The redis-benchmark utility simulates SETs/GETs done by N clients at the same time, sending M total queries.
src]$ ./redis-benchmark -q -n
PING (inline): requests per second
PING: requests per second
MSET (10 keys): requests per second
SET: requests per second
GET: requests per second
INCR: requests per second
LPUSH: requests per second
LPOP: requests per second
SADD: requests per second
SPOP: requests per second
LPUSH (again, in order to bench LRANGE): requests per second
LRANGE (first 100 elements): requests per second
LRANGE (first 300 elements): requests per second
LRANGE (first 450 elements): requests per second
LRANGE (first 600 elements): requests per second

51 Benchmarks- Redis VS memcached
A good example is the dialog between the Redis (antirez) and memcached (dormando) developers:
antirez 1 - On Redis, Memcached, Speed, Benchmarks and The Toilet
dormando - Redis VS Memcached (slightly better bench)
antirez 2 - An update on the Memcached/Redis benchmark
You can see that in the end, the difference between the two solutions is not so staggering, once all technical aspects are considered. Please note both Redis and memcached have been optimized further after these benchmarks.
Finally, when very efficient servers are benchmarked (and stores like Redis or memcached definitely fall in this category), it may be difficult to saturate the server. Sometimes the performance bottleneck is on the client side, not the server side. In that case, the client (i.e. the benchmark program itself) must be fixed, or perhaps scaled out, in order to reach the maximum throughput.

52 Benchmarks--- How fast is Redis?
Redis is a server: all commands involve network or IPC round trips. The cost of most operations is dominated by network/protocol management, so a low-latency network helps.
Redis commands return an acknowledgment for all usual commands.
Redis is an in-memory data store with some optional persistence options. Some persistence options add latency (consider huge pages and SSDs).
Redis is a single-threaded server. It is not designed to benefit from multiple CPU cores. People are supposed to launch several Redis instances to scale out on several cores if needed.
Redis favors fast CPUs with large caches and not many cores.

53 Agenda
Redis Manifesto
Data structure: strings, hashes, lists, sets and sorted sets
Leveraging Redis
Redis Admin and maintenance: Configure Persistence / Redis Slave / Handling a dataset larger than memory / Upgrade Redis / BackUp Redis / Sharding Redis / Benchmarks
The architecture of REDIS: How Redis works / Latency in Redis / Memory efficiency in 2.2 / Redis Security
Cases

54 How Redis works
How a command received from a client is processed internally by Redis:
Redis uses a single thread that synchronously manages all network connections. A thin event library abstracts several Unix system calls (epoll, select, kqueue).
Requests are managed as commands. Using a command table, and according to which event is read from the sockets, a command handler is invoked to perform the desired action.
redis: the server daemon. It consists of a single redis.c file of about 6K LOC.
networking: the networking code, in particular the event-based logic, which can be implemented with the epoll, kqueue and select system calls.
datastructure: the important data structures used by the server. Crucial, for example, is sds.c, which implements the Redis string on which all the code is built.
redis-cli: the command line client.
At startup the server initializes the event library, creating the event loop and the server socket. Eventually it enters the main loop for managing I/O with clients:
void aeMain(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    while (!eventLoop->stop) {
        /* Optional hook that runs before the loop blocks waiting for events. */
        if (eventLoop->beforesleep != NULL)
            eventLoop->beforesleep(eventLoop);
        aeProcessEvents(eventLoop, AE_ALL_EVENTS);
    }
}
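The single-threaded loop plus command table can be sketched in a few lines of Python. This is a toy model, not Redis code: the handler names, the inline text protocol and the use of a socketpair in place of a real client connection are all assumptions made for illustration.

```python
import selectors
import socket

store = {}


def cmd_set(args):
    store[args[0]] = args[1]
    return "OK"


def cmd_get(args):
    return store.get(args[0], "(nil)")


# Command table: command name -> handler, looked up for every request.
COMMAND_TABLE = {"SET": cmd_set, "GET": cmd_get}


def dispatch(line: str) -> str:
    """Parse one inline command and invoke its handler from the table."""
    parts = line.split()
    if not parts:
        return "ERR empty command"
    handler = COMMAND_TABLE.get(parts[0].upper())
    if handler is None:
        return "ERR unknown command"
    try:
        return handler(parts[1:])
    except IndexError:
        return "ERR wrong number of arguments"


def process_events(sel, timeout=0.1):
    """One loop iteration: wait for readable sockets, then serve them in turn."""
    for key, _ in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if not data:  # peer closed the connection
            sel.unregister(conn)
            conn.close()
            continue
        for line in data.decode().splitlines():
            conn.sendall((dispatch(line) + "\n").encode())


if __name__ == "__main__":
    # A socketpair stands in for a real client connection.
    server_side, client_side = socket.socketpair()
    sel = selectors.DefaultSelector()  # picks epoll/kqueue/select as available
    sel.register(server_side, selectors.EVENT_READ)
    client_side.sendall(b"SET player:666:name binzhang\nGET player:666:name\n")
    process_events(sel)
    print(client_side.recv(4096).decode())
```

Because one thread runs both the readiness wait and every handler, no locking is needed around `store`; the price is that a slow handler stalls every other connection.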

55 Latency in Redis
Latency induced by network and communication
  Use aggregated commands (MSET/MGET) and pipelining.
Single threaded nature of Redis
  Redis has a mostly single threaded design: since Redis 2.4, background threads perform some slow I/O operations, mainly related to disk I/O, but this does not change the fact that Redis serves all the requests using a single thread.
  All the requests are served sequentially.
Latency generated by slow commands
  If a request is slow to serve, all the other clients wait for it to be served.
  Slow commands are those operating on many elements, like SORT, LREM, SUNION and others. For instance, taking the intersection of two big sets can take a considerable amount of time.
  Run all your slow queries on replicas.
Latency generated by fork
  The fork operation (running in the main thread) can induce latency by itself.
Latency induced by swapping (operating system paging)
Latency due to AOF and disk I/O
Measuring latency
  If you are experiencing latency problems, you probably know how to measure them in the context of your application, or maybe your latency problem is evident even macroscopically. However, redis-cli can be used to measure the latency of a Redis server in milliseconds, just try: redis-cli --latency -h `host` -p `port`
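The effect of one slow command on everyone else follows directly from sequential service. A minimal simulation in Python; the function name and the workload numbers are assumptions chosen for illustration:

```python
def completion_latencies(requests):
    """requests: list of (arrival_ms, service_ms), sorted by arrival time.

    Returns each request's latency (finish time minus arrival time) when a
    single thread serves the requests strictly one after another.
    """
    clock = 0.0
    latencies = []
    for arrival, service in requests:
        start = max(clock, arrival)  # wait until the single thread is free
        clock = start + service
        latencies.append(clock - arrival)
    return latencies


# Assumed workload: one 500 ms command followed by three 0.1 ms GETs.
print(completion_latencies([(0, 500), (1, 0.1), (2, 0.1), (3, 0.1)]))
```

The three cheap requests inherit almost the full 500 ms of the slow one, which is the reasoning behind moving slow queries to a replica.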

56 Memory-efficient lists
adlist.h: a generic doubly linked list implementation
typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;
typedef struct list {
    listNode *head;
    listNode *tail;
    void *(*dup)(void *ptr);
    void (*free)(void *ptr);
    int (*match)(void *ptr, void *key);
    unsigned int len;
} list;
O(1) push and pop are nice, but the prev/next pointers cost many bytes relative to the payload when the value is only a few bytes.
Ziplist (list-max-ziplist-entries 512 & list-max-ziplist-value 64)
  Saves memory by using a little more CPU
  Packs the list into a single block of memory
  A value header holds the encoding / value length
  LPUSH / LPOP are O(memory size)
  Good fit for small payloads and lists of limited size
Related definitions: listNode in adlist.h, zskiplistNode in redis.h.
# Hashes are encoded in a special way (much more memory efficient) when they
# have at max a given number of elements, and the biggest element does not
# exceed a given threshold. You can configure these limits with the following
# configuration directives.
hash-max-zipmap-entries 512
hash-max-zipmap-value 64
# Similarly to hashes, small lists are also encoded in a special way in order
# to save a lot of space. The special representation is only used when
# you are under the following limits:
list-max-ziplist-entries 512
list-max-ziplist-value 64
# Sets have a special encoding in just one case: when a set is composed
# of just strings that happen to be integers in radix 10 in the range
# of 64 bit signed integers.
# The following configuration setting sets the limit in the size of the
# set in order to use this special memory saving encoding.
set-max-intset-entries 512
# Similarly to hashes and lists, sorted sets are also specially encoded in
# order to save a lot of space. This encoding is only used when the length and
# elements of a sorted set are below the following limits:
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
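The trade-off between a linked list and a packed block can be made concrete with a toy model in Python. This is not the real ziplist encoding: the class name and the one-byte length header are simplifying assumptions, and the linked-list overhead assumes three 8-byte pointers per node on a 64-bit build.

```python
class PackedList:
    """Toy packed list: entries stored back to back as <1-byte len><bytes>."""

    MAX_VALUE = 255  # simplification: a single-byte length header

    def __init__(self):
        self.buf = b""

    def lpush(self, value: bytes) -> None:
        if len(value) > self.MAX_VALUE:
            raise ValueError("value too large for this toy encoding")
        # Prepending copies the whole block: O(memory size), not O(1).
        self.buf = bytes([len(value)]) + value + self.buf

    def lpop(self) -> bytes:
        if not self.buf:
            raise IndexError("pop from empty list")
        n = self.buf[0]
        value = self.buf[1:1 + n]
        self.buf = self.buf[1 + n:]  # again a full copy of the remainder
        return value

    def __iter__(self):
        i = 0
        while i < len(self.buf):
            n = self.buf[i]
            yield self.buf[i + 1:i + 1 + n]
            i += 1 + n


pl = PackedList()
for v in (b"a", b"bb", b"ccc"):
    pl.lpush(v)

payload = sum(len(v) for v in pl)
packed_overhead = len(pl.buf) - payload  # one header byte per entry here
linked_overhead = 3 * 8 * 3              # 3 pointers x 8 bytes x 3 nodes (assumed 64-bit)
print(packed_overhead, linked_overhead)
```

For payloads of a few bytes the pointer overhead dwarfs the data, which is why the packed form is only used below the configured entry-count and value-size limits, where the copying cost stays small.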

57 Memory-efficient hashes
Zipmap (hash-max-zipmap-entries 512 & hash-max-zipmap-value 64)
Keys and values are length-prefixed "objects", so a lookup takes O(N) where N is the number of elements in the zipmap and *not* the number of bytes needed to represent the zipmap.
Other data structures have similar improvements (sorted sets, intset).
/* Memory layout of a zipmap, for the map "foo" => "bar", "hello" => "world":
 *
 * <zmlen><len>"foo"<len><free>"bar"<len>"hello"<len><free>"world"
 *
 * <zmlen> is 1 byte length that holds the current size of the zipmap.
 * When the zipmap length is greater than or equal to 254, this value
 * is not used and the zipmap needs to be traversed to find out the length.
 *
 * <len> is the length of the following string (key or value).
 * <len> lengths are encoded in a single value or in a 5 bytes value.
 * If the first byte value (as an unsigned 8 bit value) is between 0 and
 * 252, it's a single-byte length. If it is 253 then a four bytes unsigned
 * integer follows (in the host byte ordering). A value of 255 is used to
 * signal the end of the hash. The special value 254 is used to mark
 * empty space that can be used to add new key/value pairs.
 *
 * <free> is the number of free unused bytes after the string, resulting
 * from modification of values associated to a key (for instance if "foo"
 * is set to "bar", and later "foo" will be set to "hi", I'll have a free
 * byte to use if the value will enlarge again later, or even in order to
 * add a key/value pair if it fits).
 *
 * <free> is always an unsigned 8 bit number, because if after an
 * update operation there are more than a few free bytes, the zipmap will be
 * reallocated to make sure it is as small as possible.
 *
 * The most compact representation of the above two elements hash is actually:
 *
 * "\x02\x03foo\x03\x00bar\x05hello\x05\x00world\xff"
 *
 * Note that because keys and values are prefixed length "objects",
 * the lookup will take O(N) where N is the number of elements
 * in the zipmap and *not* the number of bytes needed to represent the zipmap.
 * This lowers the constant times considerably.
 */
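The layout described in the comment can be reproduced with a short Python sketch. It is a simplified model, not the Redis implementation: the function names are made up, lengths must fit in one byte, `<free>` is always written as 0, and the 254 "empty space" marker is not handled.

```python
def zipmap_encode(pairs):
    """Encode (key, value) byte-string pairs in the compact layout above."""
    if len(pairs) >= 254:
        raise ValueError("toy encoder keeps the element count in <zmlen>")
    out = bytearray([len(pairs)])                # <zmlen>
    for key, value in pairs:
        if len(key) > 252 or len(value) > 252:
            raise ValueError("toy encoder handles single-byte lengths only")
        out += bytes([len(key)]) + key           # <len>key
        out += bytes([len(value), 0]) + value    # <len><free>value, free = 0
    out.append(0xFF)                             # end-of-hash marker
    return bytes(out)


def zipmap_get(blob, wanted):
    """Linear scan that hops from entry to entry using the length prefixes."""
    i = 1                                        # skip <zmlen>
    while blob[i] != 0xFF:
        klen = blob[i]
        key = blob[i + 1:i + 1 + klen]
        i += 1 + klen
        vlen, free = blob[i], blob[i + 1]
        value = blob[i + 2:i + 2 + vlen]
        i += 2 + vlen + free                     # skip value and unused bytes
        if key == wanted:
            return value
    return None


blob = zipmap_encode([(b"foo", b"bar"), (b"hello", b"world")])
print(blob)
print(zipmap_get(blob, b"hello"))
```

The scan touches one header per element rather than every byte, which is the O(N)-in-elements behavior the comment points out.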

58 Skip list for sorted sets
Consists of several levels; each level is a sorted list
All keys appear in level 1
If key x appears in level n, then it also appears in all levels below n
An element in level n points (via a down pointer) to the element with the same key in the level below
Each level has int_min and int_max sentinels
Top points to the smallest element in the highest level
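A compact skip list can be sketched in Python as follows. This is a generic textbook variant, not the Redis zskiplist: each node keeps an array of forward pointers (one per level) instead of separate per-level nodes linked by down pointers, and the maximum level and promotion probability are assumed values.

```python
import random


class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # forward[i] = next node on level i


class SkipList:
    MAX_LEVEL = 16
    P = 0.25  # assumed probability of promoting a node one level up

    def __init__(self, seed=None):
        self.rand = random.Random(seed)
        self.head = Node(None, self.MAX_LEVEL)  # sentinel, plays the "min" role
        self.level = 1

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self.rand.random() < self.P:
            level += 1
        return level

    def insert(self, key):
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):  # start at the top level
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node                     # last node before key on level i
        level = self._random_level()
        self.level = max(self.level, level)
        new = Node(key, level)
        for i in range(level):                   # splice into levels 0..level-1
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, key):
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def __iter__(self):
        node = self.head.forward[0]              # the bottom level holds every key
        while node is not None:
            yield node.key
            node = node.forward[0]


sl = SkipList(seed=1)
for k in [5, 1, 9, 3, 7]:
    sl.insert(k)
print(list(sl), sl.contains(7), sl.contains(4))
```

Searching starts on the sparsest level and drops down only when the next key would overshoot, which gives expected logarithmic lookups while keeping the elements in sorted order for range queries.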

59 Memory structure
lazy rehashing: the more operations you run against a hash table that is rehashing, the more rehashing "steps" are performed, so if the server is idle the rehashing never completes and some extra memory is used by the hash table.
active rehashing: uses 1 millisecond out of every 100 milliseconds of CPU time to help rehash the main Redis hash table (the one mapping top-level keys to values).
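The difference between the two modes can be illustrated with a toy two-table dictionary in Python. The class and method names are assumptions, growth is a simple doubling, and the step budget in `active_rehash` stands in for the time budget described above.

```python
class IncrementalDict:
    """Toy chained hash table that migrates one bucket per operation."""

    def __init__(self, size=4):
        self.old = [[] for _ in range(size)]
        self.new = None       # second table, present only while rehashing
        self.rehash_idx = 0   # next bucket of `old` to migrate
        self.count = 0

    def _step(self):
        """Move one bucket from old to new; swap tables when old is drained."""
        if self.new is None:
            return
        for key, value in self.old[self.rehash_idx]:
            self.new[hash(key) % len(self.new)].append((key, value))
        self.old[self.rehash_idx] = []
        self.rehash_idx += 1
        if self.rehash_idx == len(self.old):
            self.old, self.new, self.rehash_idx = self.new, None, 0

    def set(self, key, value):
        self._step()  # lazy rehashing: every operation pays for one step
        if self.new is None and self.count >= len(self.old):
            self.new = [[] for _ in range(len(self.old) * 2)]
        for table in (self.old, self.new):
            if table is None:
                continue
            bucket = table[hash(key) % len(table)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)  # overwrite existing key
                    return
        target = self.new if self.new is not None else self.old
        target[hash(key) % len(target)].append((key, value))
        self.count += 1

    def get(self, key):
        self._step()
        for table in (self.old, self.new):
            if table is None:
                continue
            for k, v in table[hash(key) % len(table)]:
                if k == key:
                    return v
        return None

    def active_rehash(self, max_steps):
        """Stand-in for the periodic background help: run up to max_steps steps."""
        while self.new is not None and max_steps > 0:
            self._step()
            max_steps -= 1

    def rehashing(self):
        return self.new is not None
```

With only the lazy path, an idle table keeps both arrays allocated indefinitely; a periodic call to `active_rehash` finishes the migration even when no client traffic arrives.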

60 Memory fragmentation
INFO fields:
used_memory: memory allocated by Redis
used_memory_human:20.29M
used_memory_rss: memory as seen by the OS (the figure reported by ps or top)
used_memory_peak:
used_memory_peak_human:20.70M
mem_fragmentation_ratio: used_memory_rss / used_memory
mem_allocator: jemalloc (default on Linux in 2.4 and 2.6)
String:
dictEntry (12 bytes) + sds (stores the key) + redisObject (12 bytes) + sds (stores the value)
SET hello world = 16 (dictEntry) + 16 (redisObject) + 16 ("hello") + 16 ("world")
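The ratio can be recomputed from the raw fields with a few lines of Python. The helper names are made up and the sample numbers below are invented for illustration; they are not output from a real server.

```python
def parse_info(text: str) -> dict:
    """Parse 'key:value' lines as printed by INFO; skips blanks and '#' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def fragmentation_ratio(fields: dict) -> float:
    """used_memory_rss / used_memory, as defined above."""
    return int(fields["used_memory_rss"]) / int(fields["used_memory"])


# Invented sample values, for illustration only.
sample = """\
# Memory
used_memory:1000000
used_memory_rss:1200000
mem_allocator:jemalloc
"""
ratio = fragmentation_ratio(parse_info(sample))
print(round(ratio, 2))
```

A ratio well above 1 suggests the allocator holds more memory from the OS than the dataset needs, while a ratio below 1 usually means part of the dataset has been swapped out.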

61 Redis Security
Redis is designed to be accessed by trusted clients inside trusted environments.
  Firewall on the Redis port
Redis is not optimized for maximum security but for maximum performance and simplicity.
Authentication feature
  The password is in clear text inside the redis.conf file and the client configuration.
  The AUTH command, like every other Redis command, is sent unencrypted.
Data encryption support: none
Disabling of specific commands: rename-command FLUSHALL ""

62 Agenda
Redis Manifesto
Data structure: strings, hashes, lists, sets and sorted sets
Leveraging Redis
Redis Admin and maintenance
The architecture of REDIS: Event library / Memory efficient / Latency / Security
Cases: a simple Twitter clone / Sina Weibo

63 CASE 1: http://lloogg.com/
Why did you start the Redis project? Originally Redis was started in order to scale LLOOGG. But after I got the basic server working I liked the idea of sharing the work with other people, and Redis was turned into an open source project.
Lists (LPUSH, LTRIM): show access history
Strings (INCR): show pageviews
zset: show top referrers, ref as a score

64 A simple Twitter clone
Register:
INCR global:nextUserId => 1000
SET uid:1000:username antirez
SET uid:1000:password p1pp0
SET username:antirez:uid 1000
Circles and posts:
uid:1000:followers => Set of uids of all the users following this user
uid:1000:following => Set of uids of all the users this user follows
uid:1000:posts => a List of post ids; every new post is LPUSHed here
Cookie:
SET uid:1000:auth fea5e81ac8ca77622bed1c2132a021f9
SET auth:fea5e81ac8ca77622bed1c2132a021f9 1000
New post:
INCR global:nextPostId => 10343
SET post:10343 "$owner_id|$time|I'm having fun with Retwis"
Paging:
$posts = $r->lrange($key,$start,$start+$count);
LPUSH to the user's followers:
foreach($followers as $fid) { $r->push("uid:$fid:posts",$postid,false); }
Push to latest news:
$r->push("global:timeline",$postid,false);
$r->ltrim("global:timeline",0,1000);
Making it horizontally scalable:
Split by hash key
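The posting flow above (allocate an id, store the post, fan out to followers, trim the global timeline) can be tried without a server using an in-memory stand-in. `FakeRedis` and `post` are hypothetical names, only the handful of commands used here are modeled, and negative list indexes are not supported; the sketch also pushes the post id to the author's own list, which is an assumption based on the uid:1000:posts description.

```python
from collections import defaultdict


class FakeRedis:
    """In-memory stand-in for the few commands used here (not a real client)."""

    def __init__(self):
        self.strings = {}
        self.lists = defaultdict(list)
        self.sets = defaultdict(set)

    def incr(self, key):
        self.strings[key] = int(self.strings.get(key, 0)) + 1
        return self.strings[key]

    def set(self, key, value):
        self.strings[key] = value

    def get(self, key):
        return self.strings.get(key)

    def sadd(self, key, member):
        self.sets[key].add(member)

    def smembers(self, key):
        return set(self.sets[key])

    def lpush(self, key, value):
        self.lists[key].insert(0, value)

    def ltrim(self, key, start, stop):
        self.lists[key] = self.lists[key][start:stop + 1]  # stop is inclusive

    def lrange(self, key, start, stop):
        return self.lists[key][start:stop + 1]             # stop is inclusive


def post(r, uid, text, now):
    """Store a post and fan its id out to the author and every follower."""
    post_id = r.incr("global:nextPostId")
    r.set(f"post:{post_id}", f"{uid}|{now}|{text}")
    for fid in r.smembers(f"uid:{uid}:followers") | {uid}:
        r.lpush(f"uid:{fid}:posts", post_id)
    r.lpush("global:timeline", post_id)
    r.ltrim("global:timeline", 0, 1000)
    return post_id


r = FakeRedis()
r.sadd("uid:1000:followers", 1001)
first = post(r, 1000, "I'm having fun with Retwis", now=0)
print(r.lrange("uid:1001:posts", 0, 9), r.get(f"post:{first}"))
```

Fanning out on write makes reading a timeline a single LRANGE per user, at the cost of one LPUSH per follower for every new post.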

65 Sina weibo

66 Summary
Redis Manifesto: Memory #1; data structure server
Data structure: strings, hashes, lists, sets and sorted sets, pub/sub
Leveraging Redis: KEYS / Big O / Sort / EXPIRE / Transaction / Optimistic locking / Pipelining
Redis Admin and maintenance: Select / Redis / Persistence / Replication / VM / Upgrade / BackUp / Sharding / Benchmarks
The architecture of REDIS: Event library / Memory efficient / Latency / Security
Cases: a simple Twitter clone / Sina Weibo

67 Redis Q&A

