Appearance
Basic Data Structures
Redis has five basic data structures: string, list, set, hash, and zset. Mastering these five fundamental structures is the most basic and important part of Redis knowledge, and they are commonly featured in Redis interview questions.
This section will guide Redis beginners through these five basic data structures quickly. Given the numerous commands in Redis, only the most common ones are explained here. If any common commands are omitted, readers can leave comments.
String
The string data structure is the simplest in Redis. All data structures in Redis are named with a unique key string, and this key is used to retrieve the corresponding value. The difference between various data structures lies in the structure of the value.
Strings are widely used, with a common application being caching user information. We serialize a user information structure into a JSON string and cache the serialized string in Redis. Similarly, retrieving user information involves a deserialization process.
Redis strings are dynamic and modifiable, similar to Java's ArrayList, using pre-allocated redundant space to reduce frequent memory allocation. The actual space allocated (capacity) is generally higher than the string length (len). When the string length is less than 1M, the space is doubled upon expansion; if it exceeds 1M, only an additional 1M is allocated. The maximum length for a string is 512M.
Key-Value Pairs
plaintext
> set name codehole
OK
> get name
"codehole"
> exists name
(integer) 1
> del name
(integer) 1
> get name
(nil)
Bulk Key-Value Operations
Multiple strings can be read and written in bulk, saving network overhead.
plaintext
> set name1 codehole
OK
> set name2 holycoder
OK
> mget name1 name2 name3 # returns a list
1) "codehole"
2) "holycoder"
3) (nil)
> mset name1 boy name2 girl name3 unknown
> mget name1 name2 name3
1) "boy"
2) "girl"
3) "unknown"
Expiry and Extended set
Commands
You can set an expiry time on keys, allowing automatic deletion, which is commonly used to control cache expiration. However, the "automatic deletion" mechanism is complex; if interested, refer to Section 26, "The Life and Death of Keys — Expiry Strategy."
plaintext
> set name codehole
> get name
"codehole"
> expire name 5 # expires after 5s
... # wait for 5s
> get name
(nil)
> setex name 5 codehole # expires after 5s, equivalent to set+expire
> get name
"codehole"
... # wait for 5s
> get name
(nil)
> setnx name codehole # creates if name does not exist
(integer) 1
> get name
"codehole"
> setnx name holycoder
(integer) 0 # set creation fails because name already exists
> get name
"codehole" # unchanged
Counting
If the value is an integer, you can increment it. Increments are bounded by the range of signed long. Exceeding this limit will cause an error.
plaintext
> set age 30
OK
> incr age
(integer) 31
> incrby age 5
(integer) 36
> incrby age -5
(integer) 31
> set codehole 9223372036854775807 # Long.Max
OK
> incr codehole
(error) ERR increment or decrement would overflow
Strings consist of multiple bytes, each made up of 8 bits, allowing a string to be viewed as a combination of many bits, which forms the basis of the bitmap data structure. The specific use of bitmaps will be discussed in later chapters.
For more on the internal structure of strings, see Section 32, "Deep Dive into 'Strings.'"
List
Redis lists are akin to Java's LinkedList; they are linked lists, not arrays. This means that insertion and deletion operations are very fast, with a time complexity of O(1), but indexing is slow, with a time complexity of O(n), which can be surprising.
When the last element is popped from a list, the data structure is automatically deleted and memory is reclaimed.
Redis lists are often used for asynchronous queues. Tasks that need delayed processing are serialized into strings and added to a Redis list, where another thread polls the data for processing.
Right In, Left Out: Queue
plaintext
> rpush books python java golang
(integer) 3
> llen books
(integer) 3
> lpop books
"python"
> lpop books
"java"
> lpop books
"golang"
> lpop books
(nil)
Right In, Right Out: Stack
plaintext
> rpush books python java golang
(integer) 3
> rpop books
"golang"
> rpop books
"java"
> rpop books
"python"
> rpop books
(nil)
Slow Operations
lindex
is similar to the get(int index)
method in Java linked lists, requiring traversal of the list, which degrades performance as the index increases.
ltrim
may be misleading in its name; it might be more aptly called lretain
, as it retains values within the specified range defined by start_index
and end_index
, discarding values outside this range. This allows for the implementation of a fixed-length list.
Indexes can be negative, with index = -1 representing the last element, and index = -2 representing the second-to-last element.
plaintext
> rpush books python java golang
(integer) 3
> lindex books 1 # O(n) use with caution
"java"
> lrange books 0 -1 # get all elements, O(n) use with caution
1) "python"
2) "java"
3) "golang"
> ltrim books 1 -1 # O(n) use with caution
OK
> lrange books 0 -1
1) "java"
2) "golang"
> ltrim books 1 0 # this actually clears the entire list, as the range length is negative
OK
> llen books
(integer) 0
Fast Lists
If we delve deeper, we find that Redis does not store lists as simple linked lists but as a structure called quicklist.
Initially, when the number of list elements is small, they are stored in contiguous memory using a structure called ziplist. This structure stores all elements closely together in contiguous memory. When the data volume increases, it switches to quicklist. Ordinary linked lists require significant additional pointer space, leading to space waste and increased memory fragmentation. For instance, if a list stores only int types, two additional pointers, prev
and next
, are still required. Therefore, Redis combines linked lists and ziplists to form quicklists, linking multiple ziplists with double pointers. This satisfies the need for fast insertions and deletions while minimizing space overhead.
Hash (Dictionary)
Redis's hash structure is equivalent to Java's HashMap; it is an unordered dictionary. The internal implementation is consistent with Java's HashMap, utilizing an array and linked list structure. When collisions occur in the first dimension's hash array, colliding elements are linked together in a chain.
Unlike Java's HashMap, Redis's hash values can only be strings, and the rehashing approach differs. Java's HashMap requires a time-consuming rehash operation when the dictionary grows large, rehashing everything at once. To maintain high performance without blocking service, Redis employs a progressive rehashing strategy.
Progressive rehashing retains both new and old hash structures during rehashing. Queries simultaneously check both structures, while subsequent scheduled tasks and hash operations gradually migrate the old hash's contents to the new structure. Once migration is complete, the new hash structure replaces the old one.
When the last element is removed from a hash, the data structure is automatically deleted, and memory is reclaimed.
Hash structures can also store user information. Unlike strings, which require full serialization of the entire object, hashes can store each field of a user structure individually, allowing for partial retrieval. Storing user information as a whole string wastes network bandwidth.
However, hashes have downsides; the storage overhead is higher than that of a single string, so the choice between hash and string should be carefully considered based on the situation.
plaintext
> hset books java "think in java" # Enclose strings with spaces in quotes
(integer) 1
> hset books golang "concurrency in go"
(integer) 1
> hset books python "python cookbook"
(integer) 1
> hgetall books # entries(), key and value appear alternately
1) "java"
2) "think in java"
3) "golang"
4) "concurrency in go"
5) "python"
6) "python cookbook"
> hlen books
(integer) 3
> hget books java
"think in java"
> hset books golang "learning go programming" # returns 0 for update operation
(integer) 0
> hget books golang
"learning go programming"
> hmset books java "effective java" python "learning python" golang "modern golang programming" # bulk set
OK
Similar to string objects, individual keys in a hash can also be counted using the hincrby
command, which operates similarly to incr
.
Old Money Gets Older
plaintext
> hincrby user-laoqian age 1
(integer) 30
For details on the internal structure of dictionaries, see Section 33, "Deep Dive into 'Dictionaries.'"
Set
Redis's set structure is akin to Java's HashSet, with unordered and unique key-value pairs. Its internal implementation resembles a special dictionary where all values are NULL.
When the last element is removed from a set, the data structure is automatically deleted, and memory is reclaimed.
Sets can be used to store user IDs of prize winners, ensuring that the same user cannot win multiple times due to the deduplication feature.
plaintext
> sadd books python
(integer) 1
> sadd books python # duplicate
(integer) 0
> sadd books java golang
(integer) 2
> smembers books # order not guaranteed, as set is unordered
1) "java"
2) "python"
3) "golang"
> sismember books java # check if a value exists
(integer) 1
> sismember books rust
(integer) 0
> scard books # get length
(integer) 3
> spop books # pop one element
"java"
Zset (Sorted Set)
Zset might be the most distinctive data structure provided by Redis and is often a favorite topic in interviews. It combines features of Java's SortedSet and HashMap, ensuring unique values while assigning each a score that represents sorting weight. The internal implementation uses a data structure called a "skip list."
When the last value in a zset is removed, the data structure is automatically deleted, and memory is reclaimed.
Zsets can store fan lists, with user IDs as values and follow time as scores, allowing for sorting by follow time. They can also store student scores, with student IDs as values and exam scores as scores, enabling ranking based on scores.
plaintext
> zadd books 9.0 "think in java"
(integer) 1
> zadd books 8.9 "java concurrency"
(integer) 1
> zadd books 8.6 "java cookbook"
(integer) 1
> zrange books 0 -1 # list by score
1) "java cookbook"
2) "java concurrency"
3) "think in java"
> zrevrange books 0 -1 # list by score in reverse order
1) "think in java"
2) "java concurrency"
3) "java cookbook"
> zcard books # get count
(integer) 3
> zscore books "java concurrency" # get score of specified value
"8.9000000000000004" # double precision can lead to decimal precision issues
> zrank books "java concurrency" # get rank
(integer) 1
> zrangebyscore books 0 8.91 # traverse zset by score range
1) "java cookbook"
2) "java concurrency"
> zrangebyscore books -inf 8.91 withscores # traverse zset by score range (-∞, 8.91], with scores
1) "java cookbook"
2) "8.5999999999999996"
3) "java concurrency"
4) "8.9000000000000004"
> zrem books "java concurrency" # delete value
(integer) 1
> zrange books 0 -1
1) "java cookbook"
2) "think in java"
Skip List
The internal sorting feature of zsets is implemented using the "skip list" data structure, which has a unique and complex structure.
Since zsets must support random insertions and deletions, arrays are unsuitable. First, consider a standard linked list structure.
We need the linked list to be sorted by score, meaning we must locate the insertion point for new elements to maintain order. Typically, a binary search is used to find this point, but binary searches require arrays, not linked lists. What to do then?
Imagine a startup: initially, there are only a few people, all equal co-founders. As the company grows, communication costs rise, leading to the introduction of team leaders to divide the teams. During meetings, teams convene separately, and leaders hold their own meetings. As the company expands further, another layer—departments—is added, with representatives from teams selected as department heads. These heads will have their own high-level meetings.
The skip list is similar to this hierarchical system, with all elements in the lowest layer linked together. Every few elements, a representative is selected to connect to the next level, and this process continues up through the levels, creating a pyramid structure.
Consider the location of your hometown on a world map: Asia → China → Anhui Province → Anqing City → Zongyang County → Tanggou Town → Tianjian Village → xxxx No. This structure resembles the hierarchy.
The term "skip list" reflects that elements may belong to multiple levels. For instance, an element may occupy levels L0, L1, and L2, allowing for quick "jumps" between different levels.
To locate an insertion point, searching starts at the top level and descends down through levels until the suitable position is found. You might wonder how a newly inserted element can hold positions in multiple levels.
The skip list employs a random strategy to determine how many levels a new element can occupy.
L0 is guaranteed at 100%, L1 has a 50% chance, L2 a 25% chance, L3 a 12.5% chance, and so on, all the way to the top level, L31. Most elements will occupy only a few levels, while a rare few may reach the top. The more elements in the list, the deeper levels become accessible, increasing the probability of reaching the top.
Universal Rules for Container Data Structures
The four container-type data structures—list, set, hash, and zset—share two universal rules:
- Create If Not Exists : If a container does not exist, Redis will create one automatically before performing operations. For instance, during the
rpush
operation, if the list is nonexistent, Redis creates it and then adds the new element. - Drop If No Elements : If a container has no elements left, it will be deleted immediately, freeing up memory. This means that after the last element is popped using
lpop
, the list will disappear.
Expiration Time
All Redis data structures can have expiration times set, leading to automatic deletion of the corresponding objects when the time elapses. Note that expiration applies at the object level—for example, the expiration of a hash structure affects the entire hash, not just individual sub-keys.
A crucial point to remember is that if a string with an expiration time is modified using the set
method, the expiration time will be removed.
plaintext
127.0.0.1:6379> set codehole yoyo
OK
127.0.0.1:6379> expire codehole 600
(integer) 1
127.0.0.1:6379> ttl codehole
(integer) 597
127.0.0.1:6379> set codehole yoyo
OK
127.0.0.1:6379> ttl codehole
(integer) -1
Reflection & Assignment
- For Java Users: Define a user information structure, then serialize and deserialize it using Fastjson. Use Jedis for storing and retrieving user information in Redis.
- For Python Users: Utilize the built-in JSON package, and use redis-py for storing and retrieving user information.
- Considerations: How would you encapsulate user information using a hash structure?
- Other Commands: Think about other commands you've used that weren't mentioned in this section.
- Data Structures in Community Modules: Reflect on which data structures are used across various functionalities in the community module.