Skip to content

Small Object Compression

Redis is a memory-intensive database that stores all its data in memory. If we don't pay attention to memory usage, Redis may crash due to insufficient memory. The author of Redis has implemented numerous optimizations to reduce memory usage, often at the expense of code readability, which is undoubtedly worthwhile for a database like Redis.

32-bit vs 64-bit

If Redis is compiled using 32-bit, the pointer space used by internal data structures will be halved. If your memory usage in Redis does not exceed 4GB, consider compiling with 32-bit to save a significant amount of memory. A 4GB capacity is sufficient for small sites acting as caching databases; if more is needed, you can increase the number of instances.

Small Object Compression Storage (ziplist)

When the collection data structures managed internally by Redis are small, they will be stored using a compact storage format.

This is similar to how a HashMap can waste space with a two-dimensional structure if it contains few elements. Instead, a one-dimensional array can be used for storage, which allows for fast traversal. For example, we can simulate the add, delete, and modify operations of a HashMap using an array:

java
public class ArrayMap<K, V> {
  private List<K> keys = new ArrayList<>();
  private List<V> values = new ArrayList<>();

  public V put(K k, V v) {
    for (int i = 0; i < keys.size(); i++) {
      if (keys.get(i).equals(k)) {
        V oldv = values.get(i);
        values.set(i, v);
        return oldv;
      }
    }
    keys.add(k);
    values.add(v);
    return null;
  }

  public V get(K k) {
    for (int i = 0; i < keys.size(); i++) {
      if (keys.get(i).equals(k)) {
        return values.get(i);
      }
    }
    return null;
  }

  public V delete(K k) {
    for (int i = 0; i < keys.size(); i++) {
      if (keys.get(i).equals(k)) {
        keys.remove(i);
        return values.remove(i);
      }
    }
    return null;
  }
}

Redis’s ziplist is a compact byte array structure where each element is stored next to each other. We don’t need to worry too much about zlbytes, zltail, and zlend; just understanding them conceptually is enough.

If it stores a hash structure, the key and value will be stored as two adjacent entries:

shell
127.0.0.1:6379> hset hello a 1
(integer) 1
127.0.0.1:6379> hset hello b 2
(integer) 1
127.0.0.1:6379> hset hello c 3
(integer) 1
127.0.0.1:6379> object encoding hello
"ziplist"

If it stores a zset, the value and score will be adjacent:

shell
127.0.0.1:6379> zadd world 1 a
(integer) 1
127.0.0.1:6379> zadd world 2 b
(integer) 1
127.0.0.1:6379> zadd world 3 c
(integer) 1
127.0.0.1:6379> object encoding world
"ziplist"

Intset

Redis's intset is a compact integer array structure used to store small sets of integers.

If the integers can be represented with uint16, the intset will use a 16-bit array. If the new integer exceeds the range of uint16, it will switch to uint32, and if it exceeds the range of uint32, it will switch to uint64. Redis supports dynamic upgrading of the set structure.

shell
127.0.0.1:6379> sadd hello 1 2 3
(integer) 3
127.0.0.1:6379> object encoding hello
"intset"

If the set stores strings, the sadd command will immediately switch to a hashtable structure, similar to how Java’s HashSet is implemented using HashMap.

shell
127.0.0.1:6379> sadd hello yes no
(integer) 2
127.0.0.1:6379> object encoding hello
"hashtable"

Storage Limits

As the number of elements in a collection grows, or if a value becomes too large, these small object storage formats will upgrade to standard structures. The limits are as follows:

  • hash-max-ziplist-entries 512 # For hash, if the number of elements exceeds 512, standard structure is required.
  • hash-max-ziplist-value 64 # For hash, if any key/value length exceeds 64, standard structure is required.
  • list-max-ziplist-entries 512 # For list, if the number of elements exceeds 512, standard structure is required.
  • list-max-ziplist-value 64 # For list, if any element length exceeds 64, standard structure is required.
  • zset-max-ziplist-entries 128 # For zset, if the number of elements exceeds 128, standard structure is required.
  • zset-max-ziplist-value 64 # For zset, if any element length exceeds 64, standard structure is required.
  • set-max-intset-entries 512 # For set, if the number of integer elements exceeds 512, standard structure is required.

Let’s conduct a small experiment to verify these limits.

python
import redis
client = redis.StrictRedis()
client.delete("hello")
for i in range(512):
    client.hset("hello", str(i), str(i))
print(client.object("encoding", "hello"))  # Get object storage structure
client.hset("hello", "512", "512")
print(client.object("encoding", "hello"))  # Get object storage structure again

Output:

ziplist
hashtable

This shows that when the number of elements in the hash structure exceeds 512, the storage structure changes.

Next, we’ll try increasing the length of the values. In Python, multiplying a string by an integer repeats it n times.

python
import redis
client = redis.StrictRedis()
client.delete("hello")
for i in range(64):
    client.hset("hello", str(i), "0" * (i + 1))
print(client.object("encoding", "hello"))  # Get object storage structure
client.hset("hello", "512", "0" * 65)
print(client.object("encoding", "hello"))  # Get object storage structure again

Output:

ziplist
hashtable

This indicates that when any entry's value exceeds 64, the storage structure upgrades to a standard structure.

Memory Reclamation Mechanism

Redis does not immediately return freed memory to the operating system.

If Redis currently has 10GB of memory and you delete 1GB of keys, the memory usage will not significantly change. This is because the operating system reclaims memory in pages. If any key is still in use on a page, it cannot be reclaimed. Redis may have deleted 1GB of keys, but they are scattered across many pages, each containing other keys, preventing immediate reclamation.

However, if you execute flushdb, you will see that memory has been reclaimed, as all keys are removed, leaving most previously used pages clean and ready for immediate recovery by the operating system.

While Redis cannot guarantee immediate reclamation of memory from deleted keys, it does reuse the free memory that has not yet been reclaimed. This is similar to how a cinema retains seats after patrons leave; the next group can directly use those seats without needing to move them.

Memory Allocation Algorithm

Memory allocation is a complex topic that involves appropriate algorithms for page allocation, fragmentation management, and balancing performance with efficiency.

To maintain its simplicity, Redis delegates the details of memory allocation to third-party libraries. Currently, Redis can use the jemalloc (by Facebook) library for memory management or switch to tcmalloc (by Google). Because jemalloc offers slightly better performance than tcmalloc, Redis defaults to using jemalloc.

shell
127.0.0.1:6379> info memory
# Memory
used_memory:809608
used_memory_human:790.63K
used_memory_rss:8232960
used_memory_peak:566296608
used_memory_peak_human:540.06M
used_memory_lua:36864
mem_fragmentation_ratio:10.17
mem_allocator:jemalloc-3.6.0

By using the info memory command, you can see that Redis is using jemalloc for memory allocation.

Small Object Compression has loaded