Description
A bitcask file is not deleted immediately after a merge, but is handed off to `bitcask_merge_delete` for a deferred delete. In the meantime it is marked by turning on its setuid bit.
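As an illustration of the marking scheme, here is a minimal Python sketch of setting and testing the setuid bit on a file. It is not bitcask's Erlang code; the function names `has_setuid`, `mark_for_deletion`, and `is_marked_for_deletion` are hypothetical and chosen only for this example.

```python
import os
import stat


def has_setuid(mode: int) -> bool:
    """Return True if the setuid bit (0o4000) is set in a file mode."""
    return bool(mode & stat.S_ISUID)


def mark_for_deletion(path: str) -> None:
    """Turn on the setuid bit, leaving the other permission bits unchanged."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current | stat.S_ISUID)


def is_marked_for_deletion(path: str) -> bool:
    """Check the on-disk mode of a file for the setuid marker."""
    return has_setuid(os.stat(path).st_mode)
```

Because the marker lives in the file's mode, any process that can stat the file can see it without extra bookkeeping, which is presumably why a permission bit is a convenient flag here.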
Calling `bitcask:merge/1` [1] merges the readable files, a list that does not include those marked for deletion [2]. However, `riak_kv_bitcask_backend` starts a merge by specifying the exact list of files to merge, and that list comes from `bitcask:needs_merge/1` [3], which does not filter out files marked for deletion.
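The missing step can be sketched as a filter over the candidate list that drops any file whose setuid bit is set. This is a Python illustration under stated assumptions, not the bitcask implementation and not necessarily how a fix would be written: `drop_files_marked_for_deletion` and `mode_of` are hypothetical names, and the file names and modes are made up.

```python
import stat
from typing import Callable, Iterable, List


def drop_files_marked_for_deletion(
    candidates: Iterable[str],
    mode_of: Callable[[str], int],
) -> List[str]:
    """Keep only the candidates whose setuid bit is clear.

    mode_of is a hypothetical lookup that returns a file's st_mode,
    for example: lambda p: os.stat(p).st_mode
    """
    return [
        path for path in candidates
        if not (mode_of(path) & stat.S_ISUID)
    ]


# Made-up example: the second file was already merged and carries the marker.
modes = {
    "1.bitcask.data": 0o100644,
    "2.bitcask.data": 0o104644,  # setuid bit set, awaiting deferred delete
    "3.bitcask.data": 0o100644,
}
to_merge = drop_files_marked_for_deletion(modes, modes.__getitem__)
print(to_merge)  # ['1.bitcask.data', '3.bitcask.data']
```

Passing the mode lookup in as a function keeps the sketch testable without touching the file system; a real caller would stat each file instead.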
This means that if a file is not deleted within 3 minutes of being merged, it will be listed for the next merge too. This should not happen often, but the effect is magnified when all merging is delayed until the merge window, at which point a large amount of merge activity can begin at once. Also, `bitcask_merge_delete` is a per-node (not per-vnode) process, so a long-running fold operation in any of the vnodes (e.g. MDC replication) may block the deferred delete queue for a long time.
The result is unnecessary disk I/O and CPU use, but there is no other risk, e.g. no chance of data loss.