I noticed a counter-intuitive phenomenon as I tweaked the coalescing
code: the more coalescing we did, the larger the tdb grew! This was
measured using "growtdb-bench 250000 10".
The cause: more coalescing means larger transactions, and every time
we do a larger transaction, we need to allocate a larger recovery
area. The only way to do this is to append to the file, so the file
keeps growing, even though it's mainly unused!
Overallocating by 25% seems reasonable, and gives better results in
such benchmarks.
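A minimal sketch of why the padding helps, not the actual tdb2 code. The helper names `recovery_alloc_size` and `recovery_bytes_appended` are hypothetical, and the model assumes that an outgrown recovery area is simply left behind in the file while a new one is appended at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size to reserve for a recovery area when a
 * transaction needs `needed` bytes.  Pads the request by 25% so that a
 * slightly larger later transaction can reuse the same area. */
static uint64_t recovery_alloc_size(uint64_t needed)
{
	/* Refuse sizes where adding the 25% padding would overflow. */
	assert(needed <= UINT64_MAX - needed / 4);
	return needed + needed / 4;
}

/* Hypothetical model: total bytes appended to the file for recovery
 * areas over a sequence of transactions.  Assumes a new area is
 * appended whenever the current one is too small, and that the old
 * area is never reclaimed. */
static uint64_t recovery_bytes_appended(const uint64_t *needed, size_t n,
					int overallocate)
{
	uint64_t area = 0, appended = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (needed[i] <= area)
			continue;	/* existing area is big enough */
		area = overallocate ? recovery_alloc_size(needed[i])
				    : needed[i];
		appended += area;
	}
	return appended;
}
```

In this model, transactions needing 100, 110, 120 and 130 bytes append 460 bytes in total with exact-size areas, but only 287 bytes (125 + 162) with the 25% padding, because the middle two transactions fit in the first padded area.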
The real fix is to reduce the transaction to a run-length based format
rather than the naive block system used now.
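To make the difference concrete, here is a rough sketch comparing the two approaches. It is one possible reading of "run-length based", not the format tdb2 uses: the 16-byte run header, the function names, and the assumption that the block system saves every touched block whole are all illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed per-run header: a 64-bit offset plus a 64-bit length. */
#define RUN_HEADER_BYTES (2 * sizeof(uint64_t))

/* Recovery bytes needed if the old contents are saved as
 * (offset, length, data) runs, one run per maximal stretch of bytes
 * that differ between the old and new images. */
static size_t runlength_recovery_size(const unsigned char *old_img,
				      const unsigned char *new_img,
				      size_t len)
{
	size_t i = 0, total = 0;

	while (i < len) {
		size_t start;

		if (old_img[i] == new_img[i]) {
			i++;
			continue;
		}
		start = i;
		while (i < len && old_img[i] != new_img[i])
			i++;
		total += RUN_HEADER_BYTES + (i - start);
	}
	return total;
}

/* Recovery bytes needed if every block containing at least one
 * changed byte is saved whole (the assumed "naive block" scheme). */
static size_t block_recovery_size(const unsigned char *old_img,
				  const unsigned char *new_img,
				  size_t len, size_t block_size)
{
	size_t off, total = 0;

	for (off = 0; off < len; off += block_size) {
		size_t n = len - off < block_size ? len - off : block_size;

		if (memcmp(old_img + off, new_img + off, n) != 0)
			total += n;
	}
	return total;
}
```

For a 4096-byte region with three changed bytes (offsets 3, 4 and 2000) and 1024-byte blocks, the block scheme needs 2048 bytes while the run-length sketch needs 35, which is why a smaller recovery area would follow from such a format.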
Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m57.403s
user 0m11.361s
sys 0m4.056s
-rw------- 1 rusty rusty 689536976 2011-04-27 21:10 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real 1m24.901s
user 0m0.380s
sys 0m0.512s
-rw------- 1 rusty rusty 655368 2011-04-27 21:12 torture.tdb
Adding 2000000 records: 941 ns (110551992 bytes)
Finding 2000000 records: 603 ns (110551992 bytes)
Missing 2000000 records: 428 ns (110551992 bytes)
Traversing 2000000 records: 416 ns (110551992 bytes)
Deleting 2000000 records: 741 ns (199517112 bytes)
Re-adding 2000000 records: 819 ns (199517112 bytes)
Appending 2000000 records: 1228 ns (376542552 bytes)
Churning 2000000 records: 2042 ns (553641304 bytes)
After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m59.687s
user 0m11.593s
sys 0m4.100s
-rw------- 1 rusty rusty 752004064 2011-04-27 21:14 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK
real 1m17.738s
user 0m0.348s
sys 0m0.580s
-rw------- 1 rusty rusty 663360 2011-04-27 21:15 torture.tdb
Adding 2000000 records: 926 ns (110556088 bytes)
Finding 2000000 records: 592 ns (110556088 bytes)
Missing 2000000 records: 416 ns (110556088 bytes)
Traversing 2000000 records: 422 ns (110556088 bytes)
Deleting 2000000 records: 741 ns (244003768 bytes)
Re-adding 2000000 records: 799 ns (244003768 bytes)
Appending 2000000 records: 1147 ns (295244592 bytes)
Churning 2000000 records: 1827 ns (568411440 bytes)