This greatly reduces the fragmentation of databases where records
tend to grow slowly by a small amount each time. The case where this
is most seen is the ldb index records. Adding this overallocation
reduced the size of the resulting database by more than 20x when
running a test that adds 10k users.
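
A rough sketch of why overallocation helps records that grow a little at a time. It is a toy model, not the tdb allocator: the 25% slack factor and the names alloc_size() and count_moves() are assumptions made for illustration. It counts how often a growing record outgrows its slot and has to be moved, leaving its old slot on the free list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slack factor for illustration: reserve 25% extra space. */
#define SLACK_NUM 5
#define SLACK_DEN 4

/* Space to reserve for a record of `len` bytes (rounded up). */
static size_t alloc_size(size_t len, int overallocate)
{
	return overallocate ? (len * SLACK_NUM + SLACK_DEN - 1) / SLACK_DEN : len;
}

/*
 * Grow one record from `start` bytes by `step` bytes, `rounds` times.
 * Each time the record outgrows its slot it is moved to a fresh slot,
 * and the old slot becomes a free-list fragment.
 * Returns the number of moves.
 */
static unsigned count_moves(size_t start, size_t step, unsigned rounds,
			    int overallocate)
{
	size_t len = start;
	size_t slot = alloc_size(len, overallocate);
	unsigned moves = 0;

	assert(step > 0);
	for (unsigned i = 0; i < rounds; i++) {
		len += step;
		if (len > slot) {
			slot = alloc_size(len, overallocate);
			moves++;
		}
	}
	return moves;
}
```

With exact-fit slots every growth step forces a move; with slack, the number of moves grows only roughly logarithmically with the final record size in this model.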
The idea behind this is to recover from badly fragmented free
lists. Choosing the point where the file expands is fairly arbitrary,
but seems to work well.
The tdb_repack() function repacks a TDB so that it has a single
freelist entry. The file doesn't shrink, but it does remove all
freelist fragmentation. This code originated in the CTDB vacuuming
code, but will now be used in ldb to cope with fragmentation from
re-indexing.
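
The effect described above (same file size, one freelist entry) can be pictured with a toy model. This is only an illustration of the outcome, not how tdb_repack() is implemented; struct slot, free_entries() and repack() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Toy file layout: each slot is either a live record or free space. */
struct slot {
	size_t len;
	int live;
};

/* Count free-list entries (each free slot is one entry in this model). */
static size_t free_entries(const struct slot *s, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (!s[i].live)
			count++;
	return count;
}

/*
 * Pack live records to the front and merge all free space into a single
 * trailing slot. Returns the new number of slots; the total length of
 * the layout is unchanged, so the "file" does not shrink.
 */
static size_t repack(struct slot *s, size_t n)
{
	size_t out = 0, free_len = 0;

	for (size_t i = 0; i < n; i++) {
		if (s[i].live)
			s[out++] = s[i];
		else
			free_len += s[i].len;
	}
	if (free_len > 0) {
		/* At least one slot was free, so there is room for it. */
		assert(out < n);
		s[out].len = free_len;
		s[out].live = 0;
		out++;
	}
	return out;
}
```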
make TDB_NOSYNC affect all the fsync/msync calls in transactions
During a transaction commit tdb normally uses fsync/msync calls to
make it crash safe. This can be disabled using the TDB_NOSYNC flag,
but it wasn't disabling all the code paths that caused an fsync/msync.
Using tdb_transaction_prepare_commit() gives us 2-phase commits. This
allows us to safely commit across multiple tdb databases at once, with
reasonable transaction semantics.
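
A minimal sketch of the two-phase pattern, using a hypothetical in-memory struct store in place of a real tdb. With tdb itself, the prepare step would presumably be tdb_transaction_prepare_commit() on each database, followed by the normal commit call on each; the store_* helpers and commit_all() below are illustrative assumptions only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one database taking part in the commit. */
struct store {
	int value;        /* committed state */
	int pending;      /* state staged by the open transaction */
	int prepared;     /* set once prepare has succeeded */
	int fail_prepare; /* demo hook: force prepare to fail */
};

/* Phase 1 for one store: do everything that can fail. */
static int store_prepare(struct store *s)
{
	if (s->fail_prepare)
		return -1;
	s->prepared = 1;
	return 0;
}

/* Phase 2 for one store: make the staged state visible. */
static void store_commit(struct store *s)
{
	assert(s->prepared);
	s->value = s->pending;
	s->prepared = 0;
}

/* Abandon the staged state. */
static void store_cancel(struct store *s)
{
	s->pending = s->value;
	s->prepared = 0;
}

/*
 * Prepare every store first; commit only if all of them prepared.
 * If any prepare fails, cancel everywhere so no store changes.
 */
static int commit_all(struct store **stores, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (store_prepare(stores[i]) != 0) {
			for (size_t j = 0; j < n; j++)
				store_cancel(stores[j]);
			return -1;
		}
	}
	for (size_t i = 0; i < n; i++)
		store_commit(stores[i]);
	return 0;
}
```

The point of the split is that the step most likely to fail happens before any database has made its changes permanent.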
Rusty Russell [Fri, 17 Jul 2009 07:29:58 +0000 (16:59 +0930)]
Cleanup variable names.
struct op: serial -> seqnum (and drop use of term "serial" everywhere).
struct op: op -> type
struct op_desc: general replacement for file/op_num pair.
Change add_dependency to take two const op_desc *: means reshuffle since we can't adjust the args any more.
Rusty Russell [Thu, 16 Jul 2009 01:58:33 +0000 (11:28 +0930)]
Implement timeout for the deadlock of traverse & transactions.
This has proven to be intractable: various attempts to eliminate it have failed, so detect it at runtime and cease the traversal (and do the remaining ops outside a traverse).
Rusty Russell [Wed, 15 Jul 2009 05:27:49 +0000 (14:57 +0930)]
Handle transactions!
Note: we can still deadlock on traversal vs transaction corner cases.
We handle a transaction as a single operation, which it logically is.
Rusty Russell [Wed, 15 Jul 2009 03:49:35 +0000 (13:19 +0930)]
Insert (implied) transaction cancel on tdb_close/EOF.
Also changes the first member of a transaction to have a valid start_group field,
and fixes an outdated comment.
Rusty Russell [Tue, 14 Jul 2009 12:34:36 +0000 (22:04 +0930)]
Fix early transaction unlock when traverse done inside transaction.
Generalizes traverse in traverse fix from rusty@rustcorp.com.au-20090629073630-3eduhyypx2tp6u80
Rusty Russell [Mon, 13 Jul 2009 06:19:52 +0000 (15:49 +0930)]
More general solution for serial number misorders.
Make sort_deps more efficient, and also only alter order when necessary. This means by default we run in serial number order, only going outside when we detect a dependency.
Maintain trace file order in original sort, so sort_deps doesn't mess it up.
We still need serial numbers: sort_deps can have multiple solutions for a single key, but these may deadlock with the ordering requirements of other keys. By sticking close to the actual order (i.e. serial order), we minimize the chance of this happening.
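
One way to picture "serial order unless a dependency forces otherwise" is a greedy ordering that always runs the lowest-numbered operation whose dependencies are satisfied. This is a sketch of the idea only, not the actual sort_deps code; order_ops(), MAX_OPS and the dep matrix are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OPS 16

/*
 * Ops are numbered 0..n-1 in serial-number order.
 * dep[i][j] is true when op i must not run until op j has run.
 * Fills order[] with a valid run order that stays as close to serial
 * order as the dependencies allow. Returns false on a dependency cycle.
 */
static bool order_ops(size_t n, bool dep[MAX_OPS][MAX_OPS], size_t order[])
{
	bool done[MAX_OPS] = { false };

	assert(n <= MAX_OPS);
	for (size_t out = 0; out < n; out++) {
		size_t pick = n;

		/* Find the lowest-numbered op that is ready to run. */
		for (size_t i = 0; i < n && pick == n; i++) {
			bool ready = !done[i];

			for (size_t j = 0; ready && j < n; j++)
				if (dep[i][j] && !done[j])
					ready = false;
			if (ready)
				pick = i;
		}
		if (pick == n)
			return false; /* nothing can run: cycle */
		done[pick] = true;
		order[out] = pick;
	}
	return true;
}
```

With no dependencies the result is plain serial order; a single dependency moves only the ops it affects.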
Rusty Russell [Mon, 13 Jul 2009 02:14:40 +0000 (11:44 +0930)]
Optimize to reduce extraneous dependencies.
In my tdbtorture -n 4 example trace, this reduces from 14493 to 3210 dependencies, but doesn't make any measurable improvement in the time. Still, it's simple to do and might make a difference for larger sets.
Rusty Russell [Mon, 13 Jul 2009 01:48:05 +0000 (11:18 +0930)]
Simplify dependencies by passing pointers over the pipe: avoid O(n^2) behaviour for searching.
Also, using a single structure makes talloc_free convenient for when we do optimization.
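
A small sketch of sending a pointer, rather than a searchable key, through a pipe. It only works when the reader can dereference the same address, for example in the same process or in a child forked after the structure was built; that is an assumption here, as are the names struct demo_op, send_ptr() and recv_ptr().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical operation record; stands in for whatever the pipe refers to. */
struct demo_op {
	int file;
	int op_num;
};

/* Write the pointer value itself, not the structure it points to. */
static int send_ptr(int fd, const struct demo_op *op)
{
	assert(op != NULL);
	return write(fd, &op, sizeof(op)) == (ssize_t)sizeof(op) ? 0 : -1;
}

/* Read a pointer value back; the receiver needs no lookup to find the op. */
static const struct demo_op *recv_ptr(int fd)
{
	const struct demo_op *op;

	if (read(fd, &op, sizeof(op)) != (ssize_t)sizeof(op))
		return NULL;
	return op;
}
```

The receiver gets the structure in constant time instead of searching for a matching file/op_num pair.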
Rusty Russell [Sun, 12 Jul 2009 12:48:09 +0000 (22:18 +0930)]
Get more sophisticated with resolving duplicate serial numbers.
Still deadlocks in one case, due to spurious dependencies inside traversals. See next commit.
Rusty Russell [Thu, 28 May 2009 03:56:51 +0000 (13:26 +0930)]
From: Joseph Adams <joeyadams3.14159@gmail.com>
I have given my array module a makeover (see attached
array-0.1.tar.bz2 ). Major changes are:
* All the macros have been renamed to flat array_* names. Instead of
Array, AInit, AAppend, etc., it is now array, array_init,
array_append, etc. This will obviously break any applications
already using the array module (if any); 'renames' is the list of sed
commands I used to make the name changes.
* array (by default) now uses talloc functions instead of regular
malloc/realloc/free.
* All of the array macros have tests now.