Rusty Russell [Thu, 26 Aug 2010 14:38:34 +0000 (00:08 +0930)]
tdb2: more fixes and tests for enlarging hash.
- Neaten I/O function
- Don't use fill in zero_out: it's only for low-level ops.
- Don't mangle arg in tdb_write_convert: it broke write_header.
- More use of tdb_access_read, make it optionally converting.
- Rename unlock_range to unlock_lists.
- Lots of fixes to enlarge_hash now it's being tested.
- More expansion cases tested.
Rusty Russell [Tue, 15 Jun 2010 10:02:55 +0000 (19:32 +0930)]
typesafe_cb: expose _exact and _def variants.
We can't allow NULL with the new variant (needed by talloc's set_destructor
for example), so document that and expose all three variants for different
uses.
Rusty Russell [Fri, 11 Jun 2010 03:36:40 +0000 (13:06 +0930)]
typesafe_cb: fix promotable types being incorrectly accepted by cast_if_type.
cast_if_type() should not try to degrade the expression using 1?(test):0,
as that promotes bool to int, as well as degrading functions to function
pointers: it should be done by the callers.
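The promotion is easy to observe with the compiler's type builtins. The following is a minimal sketch, assuming GCC or Clang (`__typeof__` and `__builtin_types_compatible_p` are compiler extensions); the helper names are illustrative and are not part of typesafe_cb.

```c
#include <stdbool.h>

/* Illustrative helpers only: none of these names come from typesafe_cb. */
static void callback(void) { }

/* Returns 1: the bare expression still has type bool. */
static int bare_bool_is_bool(void)
{
	bool b = true;
	(void)b;
	return __builtin_types_compatible_p(__typeof__(b), bool);
}

/* Returns 1: wrapping in 1?(expr):0 has promoted bool to int. */
static int degraded_bool_is_int(void)
{
	bool b = true;
	(void)b;
	return __builtin_types_compatible_p(__typeof__(1 ? (b) : 0), int);
}

/* Returns 1: the same wrapping turns a function into a function pointer. */
static int degraded_func_is_pointer(void)
{
	return __builtin_types_compatible_p(__typeof__(1 ? (callback) : 0),
					    void (*)(void));
}
```

Because the wrapped bool compares as int, a type test performed after the degrade cannot tell the two apart, which is why the degrade belongs in the callers.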
Rusty Russell [Wed, 9 Jun 2010 14:33:04 +0000 (00:03 +0930)]
alloc: reduce page header further, go down to 64k minimum.
This means we can't have more than 2^25 elements per page; that's
a maximum small page size of about 2^24 (with >8 objects per small page
we move to large pages), meaning a poolsize max of 4G.
We have a tighter limit at the moment anyway, but we should remove it
once we fix this. In particular, counting all-zero and all-one words in
the used field (that's what we care about: full or empty) would give us
another factor of 64 (we only care about larger pool sizes on 64-bit
platforms).
We can also restore the larger number of pages and greater inter-page
spacing once we implement the alternative tiny allocator.
Rusty Russell [Fri, 9 Apr 2010 01:28:21 +0000 (10:58 +0930)]
From: Joseph Adams <joeyadams3.14159@gmail.com>
The ccanlint patch is rather intrusive. First, it adds a new field to
all the ccanlint tests, "key". key is a shorter, still unique
description of the test (e.g. "valgrind"). The names I chose as keys
for all the tests are somewhat arbitrary and often don't reflect the
name of the .c source file (because some of those names are just too
darn long). Second, it adds two new options to ccanlint:
It also adds a consistency check making sure all tests have unique
keys and names.
The primary goal of the ccanlint patch was so I could exclude the
valgrind test, which takes a really long time for some modules (I
think btree takes the longest, at around 2 minutes). I'm not sure I
did it 100% correctly, so you'll want to review it first.
Rusty Russell [Fri, 9 Apr 2010 01:24:31 +0000 (10:54 +0930)]
From: Joseph Adams <joeyadams3.14159@gmail.com>
The btree patch gives the btree module an intuitive frontend
(btree_insert, btree_remove, btree_lookup) and a built-in ordering
function for strings. Together, these make it easy to use the btree
module as a dynamic string map.
Rusty Russell [Wed, 24 Feb 2010 03:38:40 +0000 (14:08 +1030)]
tdb: handle processes dying during transaction commit.
tdb transactions were designed to be robust against the machine
powering off, but interestingly were never designed to handle the case
where an administrator kill -9's a process during commit. Because
recovery is only done on tdb_open, processes with the tdb already
mapped will simply use it despite it being corrupt and needing
recovery.
The solution to this is to check for recovery every time we grab a
data lock: we could have gained the lock because a process just died.
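The idea can be sketched as a toy model. Every name below is hypothetical and none of it is the tdb API; the counter increment merely stands in for the real file lock.

```c
#include <stdbool.h>

/* Toy model: all names are hypothetical, not the tdb API. */
struct toy_db {
	bool recovery_pending; /* a dead committer left a recovery record */
	int locks_held;
	int recoveries_run;
};

static void toy_recover(struct toy_db *db)
{
	db->recovery_pending = false;
	db->recoveries_run++;
}

/* Take a data lock, then check whether the previous holder died
 * in the middle of a commit. */
static void toy_lock_data(struct toy_db *db)
{
	db->locks_held++; /* stands in for the real file lock */
	if (db->recovery_pending)
		toy_recover(db);
}

static void toy_unlock_data(struct toy_db *db)
{
	db->locks_held--;
}
```

In this model the check is a single flag test on every lock acquisition, and recovery runs only when a record was actually left behind.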
This has no measurable cost: here is the time for tdbtorture -s 0 -n 1
-l 10000:
Rusty Russell [Wed, 24 Feb 2010 03:24:31 +0000 (13:54 +1030)]
tdb: remove lock ops
Now the transaction code uses the standard allrecord lock, which stops
us from trying to grab any per-record locks anyway. We don't need to
have special noop lock ops for transactions.
This is a nice simplification: if you see brlock, you know it's really
going to grab a lock.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Tue, 23 Feb 2010 22:22:44 +0000 (08:52 +1030)]
tdb: suppress record write locks when allrecord lock is taken.
Records themselves get (read) locked by the traversal code against delete.
Interestingly, this locking isn't done when the allrecord lock has been
taken, though the allrecord lock until recently didn't cover the actual
records (it now goes to end of file).
The write record lock, grabbed by the delete code, is not suppressed by
the allrecord lock, which causes us to punch a hole in that lock when we
release the write record lock. Make this consistent: *no* record locks
of any kind when the allrecord lock is taken.
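With POSIX byte-range locks, unlocking a sub-range of a region the process already holds releases just that sub-range, which is the "hole" described above. A toy model of the consistent rule follows; all names are hypothetical, and the counter stands in for real lock calls.

```c
#include <stdbool.h>

/* Toy model with hypothetical names; not the tdb locking code. */
struct toy_locks {
	bool allrecord_held;   /* whole-file lock currently held */
	int record_lock_calls; /* real per-record lock/unlock calls issued */
};

/* Returns true if a real lock operation was issued. */
static bool toy_record_lock(struct toy_locks *l)
{
	if (l->allrecord_held)
		return false; /* already covered: do nothing */
	l->record_lock_calls++;
	return true;
}

/* Returns true if a real unlock operation was issued. */
static bool toy_record_unlock(struct toy_locks *l)
{
	if (l->allrecord_held)
		return false; /* nothing was taken, so no hole is punched */
	l->record_lock_calls++;
	return true;
}
```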
Rusty Russell [Mon, 22 Feb 2010 12:33:36 +0000 (23:03 +1030)]
tdb: don't reduce file size on transaction recovery.
There's little point in ever shrinking the file, and it definitely
breaks in the case where a process has died during a transaction
commit and other processes have the tdb mapped.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If a process (or the machine) dies just after writing the
recovery head (pointing at the end of file), the recovery record will be filled
with 0x42. This will not invoke a recovery on open, since rec.magic
!= TDB_RECOVERY_MAGIC.
Unfortunately, the first transaction commit will happily reuse that
area: tdb_recovery_allocate() doesn't check the magic. The recovery
record has length 0x42424242, and it writes that back into the
now-valid-looking transaction header for the next comer (which
happens to be tdb_wipe_all in my tests).
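One possible guard is sketched below; it is not necessarily how this commit fixes the problem. The struct layout, the function name and the magic value are placeholders for illustration, not the real tdb definitions.

```c
#include <stdint.h>

/* Placeholder value and layout: not the real tdb definitions. */
#define TOY_RECOVERY_MAGIC 0x12345678u

struct toy_recovery_rec {
	uint32_t magic;
	uint32_t rec_len;
};

/* Length of the old recovery area that may be reused.
 * rec_len is trusted only when the magic is valid. */
static uint32_t toy_reusable_len(const struct toy_recovery_rec *rec)
{
	if (rec->magic != TOY_RECOVERY_MAGIC)
		return 0; /* e.g. a record still filled with 0x42 */
	return rec->rec_len;
}
```

A record filled with 0x42 then yields a reusable length of 0 instead of 0x42424242.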
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Feb 2010 04:33:23 +0000 (15:03 +1030)]
tdb: cleanup: remove ltype argument from _tdb_transaction_cancel.
Now the transaction allrecord lock is the standard one, and thus is cleaned
up in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to
know what type it is.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Centralize locking of all chains of the tdb; rename _tdb_lockall to
tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and
tdb_brlock_upgrade to tdb_allrecord_upgrade.
Then we use this in the transaction code. Unfortunately, if the transaction
code records that it has grabbed the allrecord lock read-only, write locks
will fail, so we treat this upgradable lock as a write lock, and mark it
as upgradable using the otherwise-unused offset field.
One subtlety: now the transaction code is using the allrecord_lock, the
tdb_release_extra_locks() function drops it for us, so we no longer need
to do it manually in _tdb_transaction_cancel.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 22 Feb 2010 03:57:13 +0000 (14:27 +1030)]
tdb: cleanup: always grab allrecord lock to infinity.
We were previously inconsistent with our "global" lock: the
transaction code grabbed it from FREELIST_TOP to end of file, and the
rest of the code grabbed it from FREELIST_TOP to end of the hash
chains. Change it to always grab to end of file for simplicity.
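With POSIX fcntl locks, a length of 0 means "from the start offset to end of file, however far the file grows". A minimal sketch follows; the function name is hypothetical, and the caller supplies the start offset (FREELIST_TOP in tdb's case).

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: lock from `start` to end of file, including any
 * later growth. l_len == 0 is the POSIX way to say "to infinity". */
static int lock_to_infinity(int fd, off_t start, short type)
{
	struct flock fl = { 0 };

	fl.l_type = type; /* F_RDLCK, F_WRLCK or F_UNLCK */
	fl.l_whence = SEEK_SET;
	fl.l_start = start;
	fl.l_len = 0;
	return fcntl(fd, F_SETLKW, &fl);
}
```

Using the same range everywhere means the unlock always matches the lock exactly, so no partial range is left held.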
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>