Stats from a Western Digital SN750, rated at 3400 MB/s sequential read, 2900 MB/s sequential write. Only reading with io_uring comes close to using the full bandwidth of the SSD on read, while still being fairly slow on write.
Fsync also seems to have a lot of overhead when using
write()
and aio_write()
(causing a 3x
slowdown), but far less overhead with io_uring
(20%
overhead on sequential writes, 50% overhead on random writes).
On the same SSD: ~660μs per fsync, so 1380 fsyncs/second.
On an external SSD, closer to 460 fsyncs/second, or about ~2ms per fsync, so 3x slower.
./sqlite-bench -batch-count 1000 -batch-size 1000 -row-size 1000 -journal-mode wal -synchronous normal ./bench.db
Inserts: 1000000 rows
Elapsed: 7.824s
Rate: 127817.265 insert/sec
File size: 1026584576 bytes
./sqlite-bench -batch-count 1000000 -batch-size 1 -row-size 1000 -journal-mode wal -synchronous normal ./bench.db
Inserts: 1000000 rows
Elapsed: 43.839s
Rate: 22810.910 insert/sec
File size: 1026584576 bytes
./sqlite-bench -batch-count 1000 -batch-size 1000 -row-size 1000 -journal-mode wal -synchronous off ./bench.db
Inserts: 1000000 rows
Elapsed: 4.752s
Rate: 210432.225 insert/sec
File size: 1026584576 bytes
./sqlite-bench -batch-count 1000000 -batch-size 1 -row-size 1000 -journal-mode wal -synchronous off ./bench.db
Inserts: 1000000 rows
Elapsed: 28.624s
Rate: 34936.319 insert/sec
File size: 1026584576 bytes
Load 10M keys sequentially into database
Set seed to 1684413524204146 because --seed was 0
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
Integrated BlobDB: blob cache disabled
Keys: 16 bytes each (+ 0 bytes user-defined timestamp)
Values: 800 bytes each (400 bytes after compression)
Entries: 10000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 7782.0 MB (estimated)
FileSize: 3967.3 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: SkipListFactory
Perf Level: 1
------------------------------------------------
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
Integrated BlobDB: blob cache disabled
DB path: [./db]
fillseq : 5.288 micros/op 189094 ops/sec 52.883 seconds 10000000 operations; 147.2 MB/s
Microseconds per write:
Count: 10000000 Average: 5.2883 StdDev: 128.83
Min: 4 Median: 4.7066 Max: 406752
Percentiles: P50: 4.71 P75: 5.57 P99: 9.89 P99.9: 24.10 P99.99: 49.45
Reading 10M keys in database in random order
Set seed to 1684412888948042 because --seed was 0
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
Integrated BlobDB: blob cache disabled
Keys: 16 bytes each (+ 0 bytes user-defined timestamp)
Values: 800 bytes each (400 bytes after compression)
Entries: 10000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 7782.0 MB (estimated)
FileSize: 3967.3 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: SkipListFactory
Perf Level: 1
------------------------------------------------
DB path: [./db]
readrandom : 4.215 micros/op 3779991 ops/sec 42.328 seconds 160000000 operations; 294.2 MB/s (1000335 of 10000000 found)
Microseconds per read:
Count: 160000000 Average: 4.2148 StdDev: 21.70
Min: 0 Median: 0.7457 Max: 44209
Percentiles: P50: 0.75 P75: 1.50 P99: 44.33 P99.9: 74.35 P99.99: 329.00
Overwriting 10M keys in database in random order
Set seed to 1684413619666146 because --seed was 0
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
Integrated BlobDB: blob cache disabled
Keys: 16 bytes each (+ 0 bytes user-defined timestamp)
Values: 800 bytes each (400 bytes after compression)
Entries: 10000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 7782.0 MB (estimated)
FileSize: 3967.3 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: SkipListFactory
Perf Level: 1
------------------------------------------------
DB path: [./db]
overwrite : 8.553 micros/op 116913 ops/sec 85.534 seconds 10000000 operations; 91.0 MB/s
Microseconds per write:
Count: 10000000 Average: 8.5534 StdDev: 2.73
Min: 4 Median: 8.1227 Max: 4990
Percentiles: P50: 8.12 P75: 9.34 P99: 14.97 P99.9: 30.69 P99.99: 50.85
Read while writing 1M keys in database in random order
Set seed to 1684411784145181 because --seed was 0
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
Integrated BlobDB: blob cache disabled
RocksDB: version 7.8.3
Date: Thu May 18 08:09:44 2023
CPU: 16 * 12th Gen Intel(R) Core(TM) i5-1240P
CPUCache: 12288 KB
Keys: 16 bytes each (+ 0 bytes user-defined timestamp)
Values: 800 bytes each (800 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 778.2 MB (estimated)
FileSize: 778.2 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: SkipListFactory
Perf Level: 1
------------------------------------------------
DB path: [./db]
readwhilewriting : 20.477 micros/op 769319 ops/sec 20.798 seconds 16000000 operations; 598.7 MB/s (1000000 of 1000000 found)
Microseconds per read:
Count: 16000000 Average: 20.4782 StdDev: 109.17
Min: 1 Median: 13.5473 Max: 40204
Percentiles: P50: 13.55 P75: 18.12 P99: 187.27 P99.9: 814.39 P99.99: 5044.78
~200k RPS for an HTTP server.
Running 5s test @ http://localhost:3000
256 goroutine(s) running concurrently
975712 requests in 4.929485946s, 105.15MB read
Requests/sec: 197933.82
Transfer/sec: 21.33MB
Avg Req Time: 1.293361ms
Fastest Request: 21.095µs
Slowest Request: 67.116949ms
Number of Errors: 0
~200k RPS for insert/set/delete.
====== PING_INLINE ======
100000 requests completed in 0.54 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 186915.88 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.142 0.040 0.135 0.231 0.375 1.415
====== PING_MBULK ======
100000 requests completed in 0.52 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 193798.45 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.141 0.032 0.127 0.263 0.391 1.655
====== SET ======
100000 requests completed in 0.53 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 187265.92 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.141 0.032 0.135 0.207 0.335 1.055
====== GET ======
100000 requests completed in 0.54 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 184501.84 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.143 0.032 0.135 0.239 0.367 1.575
====== INCR ======
100000 requests completed in 0.51 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 194931.77 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.140 0.032 0.127 0.255 0.399 1.815
====== LPUSH ======
100000 requests completed in 0.51 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 194174.77 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.135 0.072 0.127 0.207 0.319 1.095
====== RPUSH ======
100000 requests completed in 0.52 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 193423.59 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.140 0.040 0.127 0.263 0.383 3.407
====== LPOP ======
100000 requests completed in 0.51 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 194931.77 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.136 0.064 0.127 0.199 0.367 1.527
====== RPOP ======
100000 requests completed in 0.51 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 194174.77 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.137 0.032 0.127 0.207 0.359 2.055
====== SADD ======
100000 requests completed in 0.52 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 190839.70 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.137 0.064 0.135 0.199 0.343 0.967
====== HSET ======
100000 requests completed in 0.51 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 194552.53 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.137 0.040 0.127 0.215 0.359 0.999
====== SPOP ======
100000 requests completed in 0.52 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 191570.88 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.140 0.032 0.135 0.247 0.359 1.199
====== ZADD ======
100000 requests completed in 0.55 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 183486.23 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.145 0.040 0.135 0.231 0.407 1.191
====== ZPOPMIN ======
100000 requests completed in 0.53 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 189753.31 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.141 0.032 0.127 0.255 0.367 1.279
====== LPUSH (needed to benchmark LRANGE) ======
100000 requests completed in 0.52 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 192307.70 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.140 0.032 0.127 0.239 0.375 2.647
====== LRANGE_100 (first 100 elements) ======
100000 requests completed in 0.90 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 110864.74 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.235 0.112 0.231 0.335 0.495 1.759
====== LRANGE_300 (first 300 elements) ======
100000 requests completed in 2.40 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 41666.66 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.609 0.200 0.599 0.767 1.023 3.127
====== LRANGE_500 (first 500 elements) ======
100000 requests completed in 3.58 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 27932.96 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.911 0.168 0.887 1.135 1.575 4.071
====== LRANGE_600 (first 600 elements) ======
100000 requests completed in 4.11 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 24360.54 requests per second
latency summary (msec):
avg min p50 p95 p99 max
1.038 0.216 1.015 1.279 1.839 4.575
====== MSET (10 keys) ======
100000 requests completed in 0.50 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 3600 1 300 100 60 10000
host configuration "appendonly": no
multi-thread: no
Summary:
throughput summary: 198019.80 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.161 0.040 0.143 0.271 0.391 1.495
Caches:
Type | Size | Latency | Throughput |
---|---|---|---|
L1 | 64KB | 3 cycles | 4TB/s |
L2 | 9MB | 10 cycles | 1.5TB/s |
L3 | 12MB | 40 cycles | 400GB/s |
RAM | 32GB | 200 cycles | 25GB/s |
SSD | 2TB | 3000 cycles | 5GB/s |
HDD | 16TB | 1.5M cycles | 200MB/s |