Redis' slowlog and DEL's surprisingly bad performance
Redis has a useful slowlog facility which, as you can probably guess, lets you inspect slow-running queries. Slowlog will log the last X queries that took more than Y microseconds to run. X and Y can be specified either in redis.conf or, at runtime, via the CONFIG command:
CONFIG SET slowlog-log-slower-than 5000
CONFIG SET slowlog-max-len 25
The slowlog-log-slower-than value is specified in microseconds, so the above will log queries that take more than 5 milliseconds to execute. To get the log entries, you use the SLOWLOG GET X command, where X is the number of entries you want to retrieve:
SLOWLOG GET 10
This'll display a unique id, the timestamp at which the query was run, the time the query took to execute, and the actual command and parameters that were executed. You can erase the log via SLOWLOG RESET.
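If you'd rather inspect the slowlog from code, redis-rb exposes the same commands. Here's a minimal sketch, assuming the classic four-field entry format of id, timestamp, duration in microseconds and the command array (newer Redis versions append extra fields, which this simply ignores):
require 'redis'

r = Redis.new

# log anything slower than 5ms and keep the last 25 entries
r.config(:set, 'slowlog-log-slower-than', 5000)
r.config(:set, 'slowlog-max-len', 25)

# each entry: [id, timestamp, duration_in_microseconds, command, ...]
r.slowlog('get', 10).each do |id, timestamp, duration, command|
  puts "##{id} #{Time.at(timestamp)} #{duration}us: #{command.join(' ')}"
end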
The last time I looked at our slowlog, I was shocked to see that we had DEL commands taking over 20 milliseconds to execute. Remember, Redis is single-threaded, so this blocks (and consequently seriously harms) our system's concurrency. Also, since this is a write operation, it'll block all slaves while it replicates to them. Ugh. What could be going on?
Maybe everyone already knows this but me, but it turns out Redis' DEL command is O(1) for strings and hashes, but O(N) for lists, sets and sorted sets (where N is the number of items in the collection). Deleting a set containing millions of items? Prepare to block.
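If you want to see this for yourself, here's a rough sketch (the key name is a throwaway example, and the timing will depend on your hardware and Redis version) that fills a set with a million members and then times a single DEL:
require 'redis'
require 'benchmark'

r = Redis.new

# build a large throwaway set in batches of 1,000 members
1_000.times do |i|
  r.sadd('demo:bigset', (i * 1_000...(i + 1) * 1_000).to_a)
end

# the single-threaded server is blocked for the full duration of this one command
puts Benchmark.realtime { r.del('demo:bigset') }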
Our solution was simple: rather than deleting these items, we'd rename them and delete them in a background job, in small and interruptible chunks. First, our delayed_delete function:
local key = KEYS[1]
local data_type = redis.call('type', key).ok

if data_type == 'set' or data_type == 'zset' then
  -- rename the collection to a unique temporary key and queue that
  -- name in gc:set or gc:zset for background deletion
  local temp = 'gc:tmp:' .. redis.call('incr', 'gc:ids') .. ':' .. key
  redis.call('rename', key, temp)
  return redis.call('sadd', 'gc:' .. data_type, temp)
end

-- everything else is cheap to delete inline
return redis.call('del', key)
This'll rename the collection and add the new name to the gc:set or gc:zset sets (we don't use lists, but if you do, you should add support for them as well).
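Since it uses KEYS and redis.call, this runs as a server-side Lua script. One way it might be loaded and invoked from redis-rb (the file name and key below are placeholders for this example):
require 'redis'

r = Redis.new

# load the Lua script once, then invoke it by its SHA
sha = r.script(:load, File.read('delayed_delete.lua'))

# instead of r.del('users:42:likes'), we now call:
r.evalsha(sha, keys: ['users:42:likes'])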
Next we scheduled a Ruby script to run every minute:
require 'redis'

r = Redis.new(driver: :hiredis)

# trim a sample of the renamed sets, a chunk at a time
r.srandmember('gc:set', 10000).each do |set|
  items = r.srandmember(set, 5000)
  if items.nil? || items.length == 0
    # the set has been fully drained; stop tracking it
    r.srem('gc:set', set)
    next
  end
  r.srem(set, items)
end

# same idea for sorted sets, removing members by rank
r.srandmember('gc:zset', 10000).each do |zset|
  if r.zremrangebyrank(zset, 0, 5000) < 5000
    r.srem('gc:zset', zset)
  end
end
You'll want to play with the numbers based on your own needs. How big are your collections and how often do they get deleted? Since we don't do these types of deletes too frequently, we can get away with deleting small chunks at a time.
While this approach is much slower than a straight delete, it plays much nicer in a concurrent world.