3.6. Team Compaction

DB options for tuning

option                          default   explanation
leveldb.write_buffer_size       2MB       single memtable size
leveldb.max_file_size           4MB       sstable size
leveldb.compression             snappy    compression method
leveldb.cache_size              4MB       cache size
leveldb.filter_bits             10        number of filter block bits
leveldb.block_size              4KB       size of data block in one file
leveldb.block_restart_interval  16        number of keys between restart points (delta encoding)
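
The sizes in the table are written in human-readable form. The following is a minimal sketch of converting them to byte counts, assuming binary units (1MB = 1024 * 1024 bytes) and assuming the benchmark's property file expects plain byte values; the helper name `to_bytes` is hypothetical.

```python
# Convert the human-readable sizes in the table to byte counts.
# Assumption: binary units, i.e. 1KB = 1024 bytes and 1MB = 1024 * 1024 bytes.
UNITS = {"KB": 1024, "MB": 1024 ** 2}


def to_bytes(size: str) -> int:
    """Parse a size such as '4KB' or '2MB' into a number of bytes."""
    for suffix, factor in UNITS.items():
        if size.upper().endswith(suffix):
            return int(size[: -len(suffix)]) * factor
    return int(size)  # no suffix: assumed to already be a byte count


# Size-valued defaults from the table above.
DEFAULTS = {
    "leveldb.write_buffer_size": "2MB",
    "leveldb.max_file_size": "4MB",
    "leveldb.cache_size": "4MB",
    "leveldb.block_size": "4KB",
}

for name, size in DEFAULTS.items():
    print(f"{name}={to_bytes(size)}")
```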

Analyze workloads

A: Read/update ratio: 50/50
B: Read/update ratio: 95/5
D: Read/update/insert ratio: 95/0/5
Because reads make up a large share of every workload, we focus on ways to maximize read performance.
Write performance is still taken into account when choosing the best options.
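
To make the three operation mixes concrete, here is a small sketch that encodes the ratios listed above and draws a random operation sequence from them. The names `WORKLOADS`, `sample_ops`, and `write_fraction` are hypothetical and only illustrate the mixes; this is not the benchmark's own generator.

```python
import random

# Operation mixes of the three workloads, as fractions of all operations.
WORKLOADS = {
    "A": {"read": 0.50, "update": 0.50},
    "B": {"read": 0.95, "update": 0.05},
    "D": {"read": 0.95, "insert": 0.05},
}


def sample_ops(workload: str, count: int, seed: int = 0) -> list[str]:
    """Draw `count` operation names according to the workload's mix."""
    mix = WORKLOADS[workload]
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.choices(list(mix), weights=list(mix.values()), k=count)


def write_fraction(workload: str) -> float:
    """Share of operations that modify the database (update or insert)."""
    mix = WORKLOADS[workload]
    return mix.get("update", 0.0) + mix.get("insert", 0.0)


for name in WORKLOADS:
    print(name, f"writes={write_fraction(name):.0%}")
```

Workload A is the only one where half of the operations are writes, which is why it reacts most strongly to write-side options.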

Hypothesis and experiment

Because the workloads access keys at random, every result reported here is the average of three runs.
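
The averaging step can be sketched as follows. The operation count of 100,000 is an inference, not a stated fact: in the result tables of this section, runtime multiplied by throughput is about 100,000 for every row. The three runtimes below are hypothetical placeholders, and whether throughput was derived from the mean runtime or averaged separately is also an assumption.

```python
from statistics import mean


def summarize(runtimes_sec: list[float], op_count: int) -> tuple[float, float]:
    """Return (mean runtime in seconds, throughput in ops/sec) over repeated runs."""
    avg = mean(runtimes_sec)
    return avg, op_count / avg


# Hypothetical runtimes of three runs; these are not measurements from this report.
runs = [6.1, 6.3, 6.2]
avg, throughput = summarize(runs, op_count=100_000)
print(f"runtime={avg:.3f}s throughput={throughput:.1f} ops/sec")
```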

Hypothesis

Based on our earlier study, we expect write performance to increase with the buffer size, because a larger buffer increases the amount of data written at once.
We also expect read performance to improve if the file size is set to 1/4 of the buffer size, so that the four files that make up the Level 0 threshold can be read at once, and if the cache size is increased.
Finally, we expect read performance to improve when the block size is reduced, which increases the number of entries in each file's index, and when the number of bits in the filter block is increased.
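
The file-size rule in the hypothesis is simple arithmetic, sketched below. The constant `L0_FILE_THRESHOLD` and the helper `file_size_for` are hypothetical names; the value 4 is the Level 0 file count named in the hypothesis.

```python
MB = 1024 ** 2
L0_FILE_THRESHOLD = 4  # number of Level 0 files named in the hypothesis


def file_size_for(write_buffer_size: int) -> int:
    """File size chosen so that threshold-many files add up to one buffer."""
    return write_buffer_size // L0_FILE_THRESHOLD


buffer_size = 32 * MB
file_size = file_size_for(buffer_size)
print(f"write_buffer_size={buffer_size // MB}MB -> max_file_size={file_size // MB}MB")
```

For a 32MB buffer this rule gives 8MB, which is the combination tested later in this section.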

Default

workload  runtime(sec)  throughput(ops/sec)
load      6.22135       16073.7
A         2.88199       34698.2
B         0.753103      132784
D         0.545697      183252

write_buffer_size 8MB

workload  runtime(sec)  throughput(ops/sec)  ->  runtime(sec)  throughput(ops/sec)
load      6.22135       16073.7              ->  4.58351       21817.4
A         2.88199       34698.2              ->  2.30625       43360.4
B         0.753103      132784               ->  0.738974      135323
D         0.545697      183252               ->  0.50175       199302

As the write performance improved, the performance of load and A improved.

write_buffer_size 32MB, max_file_size 8MB

workload  runtime(sec)  throughput(ops/sec)  ->  runtime(sec)  throughput(ops/sec)
load      6.22135       16073.7              ->  1.33948       74655.9
A         2.88199       34698.2              ->  1.44349       69276.5
B         0.753103      132784               ->  0.708532      141137
D         0.545697      183252               ->  0.502773      198897

The buffer size was then increased to 32MB (sizes above 32MB brought no further change), and the file size was set to 8MB so that Level 0 would be filled first. Write performance improved further.

cache_size 8MB, block_size 2KB

workload  runtime(sec)  throughput(ops/sec)  ->  runtime(sec)  throughput(ops/sec)
load      6.22135       16073.7              ->  1.33054       75157.5
A         2.88199       34698.2              ->  1.38317       72297.5
B         0.753103      132784               ->  0.690946      144729
D         0.545697      183252               ->  0.484127      206557

To improve read performance, the cache size was increased and the block size was reduced. (A cache larger than 8MB or a block size smaller than 2KB made little further difference.)
The read-heavy workloads B and D improved slightly.
Increasing the number of filter bits or changing block_restart_interval neither degraded performance nor made any measurable difference.
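
To compare the configurations at a glance, the sketch below computes the speedup over the default for each workload from the runtimes in the tables above. The helper name `speedup` is hypothetical. With the final configuration the gain is roughly 4.7x on load and 2.1x on A, but only about 1.1x on B and D.

```python
# Runtimes in seconds, copied from the result tables in this section.
RUNTIME = {
    "default": {"load": 6.22135, "A": 2.88199, "B": 0.753103, "D": 0.545697},
    "write_buffer_size 8MB": {
        "load": 4.58351, "A": 2.30625, "B": 0.738974, "D": 0.50175,
    },
    "write_buffer_size 32MB, max_file_size 8MB": {
        "load": 1.33948, "A": 1.44349, "B": 0.708532, "D": 0.502773,
    },
    "cache_size 8MB, block_size 2KB": {
        "load": 1.33054, "A": 1.38317, "B": 0.690946, "D": 0.484127,
    },
}


def speedup(config: str, workload: str) -> float:
    """How many times faster `config` is than the default for one workload."""
    return RUNTIME["default"][workload] / RUNTIME[config][workload]


for config in RUNTIME:
    if config == "default":
        continue
    row = ", ".join(f"{w}: {speedup(config, w):.2f}x" for w in RUNTIME[config])
    print(f"{config}: {row}")
```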

Therefore, the best options

the best options
leveldb.write_buffer_size       32MB
leveldb.max_file_size           8MB
leveldb.compression             snappy
leveldb.cache_size              8MB
leveldb.filter_bits             10
leveldb.block_size              2KB
leveldb.block_restart_interval  16
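
As a quick check of what actually changed, this sketch compares the default table with the best options listed above and prints only the options that differ. The dictionary names are hypothetical; the values are copied from the two tables.

```python
# Values copied from the default table and the best-options table.
DEFAULT = {
    "leveldb.write_buffer_size": "2MB",
    "leveldb.max_file_size": "4MB",
    "leveldb.compression": "snappy",
    "leveldb.cache_size": "4MB",
    "leveldb.filter_bits": "10",
    "leveldb.block_size": "4KB",
    "leveldb.block_restart_interval": "16",
}
BEST = {
    "leveldb.write_buffer_size": "32MB",
    "leveldb.max_file_size": "8MB",
    "leveldb.compression": "snappy",
    "leveldb.cache_size": "8MB",
    "leveldb.filter_bits": "10",
    "leveldb.block_size": "2KB",
    "leveldb.block_restart_interval": "16",
}

# Keep only the options whose value differs from the default.
changed = {k: (DEFAULT[k], BEST[k]) for k in DEFAULT if DEFAULT[k] != BEST[k]}

for name, (old, new) in changed.items():
    print(f"{name}: {old} -> {new}")
```

Four of the seven options differ from the defaults; compression, filter_bits, and block_restart_interval are unchanged.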

Conclusion and discussion

We selected the best options by increasing the buffer size, file size, and cache size to 32MB, 8MB, and 8MB, respectively, and by halving the block size to 2KB.
Write performance clearly improved. We had also predicted that read performance would improve, because a larger cache should raise the hit rate and a smaller block size should increase the number of entries in the index, but read performance did not improve significantly. Several factors may contribute to this; we believe the main one is that the data is accessed randomly, which makes the access pattern hard to predict.

Using about one-third of the memory as a cache is a reasonable tradeoff, because it leaves a large amount of memory for the OS page cache and keeps the memory budget safe. In terms of raw performance, however, we expect better results if the cache size is increased to make good use of the remaining memory.