git.ozlabs.org Git

tdb2: fix intermittent failure in run-50-multiple-freelists-fail.c

layout.c's TDB creation functions were incorrect in case of a hash
collision, causing occasional failure. Make it always use the
(previously-failing) seed value, and fix it.

Joey Adams [Sat, 20 Aug 2011 02:29:44 +0000 (22:29 -0400)]

bdelta: new module for binary diff/patch

Rusty Russell [Tue, 16 Aug 2011 06:04:14 +0000 (15:34 +0930)]

block_pool, ccan_tokenizer, stringmap: add ccanlint license suppressions.

Rusty Russell [Tue, 16 Aug 2011 05:47:54 +0000 (15:17 +0930)]

array_size: relicense under public domain.

It's just a header, I don't care what's done with it.

Rusty Russell [Tue, 16 Aug 2011 05:43:37 +0000 (15:13 +0930)]

check_type: remove erroneous license line (it's now public domain)

Joey Adams [Mon, 15 Aug 2011 06:45:22 +0000 (02:45 -0400)]

charset: Updated copyright year, and set to version 0.3

Joey Adams [Mon, 15 Aug 2011 06:41:40 +0000 (02:41 -0400)]

ccan_tokenizer: Corrected LICENSE link so it points to BSD-3CLAUSE.

Joey Adams [Mon, 15 Aug 2011 06:36:33 +0000 (02:36 -0400)]

btree: Changed license from BSD-3 to MIT, and set version to 0.2

NOTE: btree was originally copyright 2010, and has not been
touched by me since then. I don't know if changing the license
to something more permissive requires updating the copyright year
or not.

Joey Adams [Mon, 15 Aug 2011 06:29:05 +0000 (02:29 -0400)]

stringmap: Corrected LICENSE link so it points to BSD-3CLAUSE.

Joey Adams [Mon, 15 Aug 2011 06:26:17 +0000 (02:26 -0400)]

avl: Added LICENSE link, and set version to 0.1

Joey Adams [Mon, 15 Aug 2011 06:06:17 +0000 (02:06 -0400)]

block_pool: Changed license from BSD-3 to MIT, and set version to 0.1

NOTE: block_pool was originally copyright 2009, and has not been
touched by me since then. I don't know if changing the license
to something more permissive requires updating the copyright year
or not.

Joey Adams [Mon, 15 Aug 2011 05:47:58 +0000 (01:47 -0400)]

darray: Changed license from BSD-3 to MIT, and updated copyright year.

Rusty Russell [Sun, 14 Aug 2011 01:04:44 +0000 (10:34 +0930)]

opt: complete coverage, enhance opt_free_table.

No point checking malloc failure in usage(), since we don't elsewhere.
We get 100% coverage with -O (due to code elimination) or 64 bit.

Rusty Russell [Sun, 14 Aug 2011 00:35:41 +0000 (10:05 +0930)]

opt: fix warnings in test, fix endian assumptions.

In particular, handing a pointer to ULL where a pointer to UL is expected
won't work on big endian.
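A minimal sketch of why the pointer mismatch matters, assuming fixed-width
types stand in for UL and ULL on a 32 bit target. Both function names are
hypothetical and are not part of ccan/opt:

```c
#include <stdint.h>
#include <string.h>

/* Read the first four bytes of a 64-bit object as if it were 32-bit,
 * which is effectively what a callee does when it is handed a pointer
 * to the wider type. */
static uint32_t first_half(const uint64_t *wide)
{
	uint32_t narrow;

	memcpy(&narrow, wide, sizeof(narrow));
	return narrow;
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int is_little_endian(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

On little endian the low half comes first, so the mistake goes unnoticed;
on big endian the callee sees the high half, which is zero for small values.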

Douglas Bagnall [Sat, 13 Aug 2011 12:31:42 +0000 (22:01 +0930)]

opt: functions to show integer values with kMGTPE suffixes

As with the set_ functions, there are twelve permutations of integer size,
base, and signedness. The supported sizes are int, long, and long long.

For example, this:

char buf1[OPT_SHOW_LEN];
char buf2[OPT_SHOW_LEN];
unsigned i = 1024000;
opt_show_uintval_bi(buf1, &i);
opt_show_uintval_si(buf2, &i);

will put "1000k" in buf1, and "1024k" in buf2.

Unlike the opt_set_ functions, these use unsigned arithmetic for unsigned values.

(32 bit bug using sizeof(suffixes) instead of strlen(suffixes) fixed by Rusty)
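A minimal sketch of the formatting rule, assuming the value is divided only
while it divides exactly. show_with_suffix() is a hypothetical helper, not
the ccan/opt function, and its signature is an assumption:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the ccan/opt API: format v with the largest
 * k/M/G/T/P/E suffix that divides it exactly in the given base. */
static void show_with_suffix(char *buf, size_t len,
			     unsigned long long v, unsigned long long base)
{
	const char *suffixes = "kMGTPE";
	size_t i = 0;

	/* suffixes is a pointer, so sizeof(suffixes) would be the pointer
	 * size (4 on typical 32 bit targets), not the number of letters. */
	while (v != 0 && i < strlen(suffixes) && v % base == 0) {
		v /= base;
		i++;
	}
	if (i == 0)
		snprintf(buf, len, "%llu", v);
	else
		snprintf(buf, len, "%llu%c", v, suffixes[i - 1]);
}
```

With 1024000 this yields "1000k" for base 1024 and "1024k" for base 1000,
matching the example above.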

Douglas Bagnall [Sat, 13 Aug 2011 12:19:59 +0000 (21:49 +0930)]

opt: incidental comment and whitespace repair

This comment occurred in a couple of places:

/* Set an integer value, various forms. Sets to 1 on arg == NULL. */

One instance was clearly spurious, while the other was misleading.

Another resolution to this mismatch would be to add
"if (arg == NULL) { *l = 1; return NULL; }" somewhere, but I suspect
it may have been left out/removed because someone thought better.

Douglas Bagnall [Sat, 13 Aug 2011 12:19:59 +0000 (21:49 +0930)]

opt: add integer helpers that accept k, M, G, T, P, E suffixes

These functions come in two flavours: those ending with "_si", which
have 1000-based interpretations of the suffixes; and those ending with
"_bi", which use base 1024.  There are versions for signed and
unsigned int, long, and long long destinations, with tests for all 12
new functions.  The tests get a bit repetitive, I am afraid.

As an example, if the -x option were using the opt_set_intval_bi
function, then all of these would do the same thing:

$ foo -x 5M
$ foo -x $((5 * 1024 * 1024))
$ foo -x 5242880
$ foo -x 5120k

Exactly what that thing is depends on the size of your int -- people
with 16 bit ints would see an "out of range" error message.

The arithmetic for unsigned variations is actually done using signed
long long integers, so the maximum possible value is LLONG_MAX, not
ULLONG_MAX.  This follows the practice of existing functions, and
avoids tedious work.
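A rough sketch of the suffix parsing, assuming a single optional suffix
letter and signed long long arithmetic as described above.
parse_with_suffix() is hypothetical and is not the opt_set_* API; it also
skips the per-type range check that the real functions need:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser, not the ccan/opt API.  Returns 0 on success and
 * stores the scaled value in *out; returns -1 on bad input or overflow. */
static int parse_with_suffix(const char *arg, long long base, long long *out)
{
	const char *suffixes = "kMGTPE";
	const char *p;
	char *end;
	long long v;
	size_t steps;

	errno = 0;
	v = strtoll(arg, &end, 10);
	if (end == arg || errno == ERANGE)
		return -1;
	if (*end == '\0') {
		*out = v;
		return 0;
	}
	/* Exactly one suffix letter is accepted. */
	p = strchr(suffixes, *end);
	if (p == NULL || end[1] != '\0')
		return -1;
	/* 'k' multiplies once, 'M' twice, and so on. */
	for (steps = (size_t)(p - suffixes) + 1; steps > 0; steps--) {
		if (v > LLONG_MAX / base || v < LLONG_MIN / base)
			return -1;
		v *= base;
	}
	*out = v;
	return 0;
}
```

With base 1024, "5M", "5120k" and "5242880" all produce 5242880.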

Rusty Russell [Mon, 1 Aug 2011 08:29:09 +0000 (17:59 +0930)]

failtest: fix silent exit when top-level returns FAIL_PROBE

We were missing failed tests: if the top-level returns FAIL_PROBE, we would
exit; this should only apply to children.

Rusty Russell [Mon, 1 Aug 2011 08:29:08 +0000 (17:59 +0930)]

tdb2: fix line numbers for tests.

Rusty Russell [Fri, 22 Jul 2011 12:13:39 +0000 (21:43 +0930)]

cast: downgrade license from LGPL3+ to LGPLv2.1+

Kirill A. Shutemov asked for libgit. I would say they should upgrade their
license, but libhx on which these are based is also LGPLv2.1 or later, so
I prefer to match that.

Rusty Russell [Thu, 21 Jul 2011 05:20:00 +0000 (14:50 +0930)]

isaac, crcsync: acknowledge licensing issues.

The recently added ccanlint licensing checks revealed several cases
where the published license of a module is misleading: a dependency of
that module has a stricter license (eg. a public domain module which
depends on a GPL one).

Where these are my modules, I've fixed them. Otherwise I'm overriding
the checks for the moment, and asking the authors what they want to do.

Rusty Russell [Thu, 21 Jul 2011 05:20:00 +0000 (14:50 +0930)]

ccan/noerr: fix compiler warning with const strings.

Rusty Russell [Thu, 21 Jul 2011 05:19:56 +0000 (14:49 +0930)]

various: add LICENSE comments.

Rusty Russell [Thu, 21 Jul 2011 05:14:50 +0000 (14:44 +0930)]

ccanlint: handle DOS-style \r\n lines when parsing.

We don't correctly detect pure-comment lines in ccan/ttxml/ttxml.c
without this.
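One way to normalise such lines, as a sketch only; chomp_cr() is a
hypothetical helper and the log does not show how ccanlint does it:

```c
#include <string.h>

/* Hypothetical helper: drop one trailing '\r' so a DOS-style line looks
 * the same as a Unix one to later checks.  Returns the new length. */
static size_t chomp_cr(char *line)
{
	size_t len = strlen(line);

	if (len > 0 && line[len - 1] == '\r')
		line[--len] = '\0';
	return len;
}
```

This assumes the line has already been split on '\n', so only the stray
'\r' remains at the end.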

Rusty Russell [Thu, 21 Jul 2011 05:14:50 +0000 (14:44 +0930)]

tdb2: add full LGPL headers

This is for SAMBA, so we follow their rules and do full license
headers. Two files were missing them.

Rusty Russell [Thu, 21 Jul 2011 05:14:50 +0000 (14:44 +0930)]

container_of: relicense to Public domain

Too trivial to deserve LGPL, and all my code.

Rusty Russell [Thu, 21 Jul 2011 05:14:49 +0000 (14:44 +0930)]

check_type: relicense to Public domain

Too trivial to deserve LGPL, and all my code.

Rusty Russell [Thu, 21 Jul 2011 05:14:49 +0000 (14:44 +0930)]

htable: relicense under LGPL

Various LGPL components depend on it, via ccan/likely. ccan/likely
really only needs it when CCAN_LIKELY_DEBUG is set, but making it a
conditional dependency is a bit nasty if defining that changes the
license.

So this is the simplest fix. I might relicense under PD or BSD later,
since the likely module should probably have an even more liberal
license.

Rusty Russell [Thu, 21 Jul 2011 05:14:46 +0000 (14:44 +0930)]

ccanlint: license_depends_compat checks dependencies are compatible.

We don't check external dependencies, but internal ccan deps are
pretty easy.

Rusty Russell [Thu, 21 Jul 2011 04:59:06 +0000 (14:29 +0930)]

ccanlint: move license tag matching into common code.

Refactoring helps the next patch.

Rusty Russell [Thu, 21 Jul 2011 04:59:06 +0000 (14:29 +0930)]

wwviaudio: fix license in _info, symlink (LGPL -> GPL)

Comments in code indicate this is actually GPL version 2 or later.

Rusty Russell [Thu, 21 Jul 2011 04:59:06 +0000 (14:29 +0930)]

ogg_to_pcm: fix license in _info, symlink (LGPL -> GPLv2)

Comments in code indicate this is actually GPL version 2 only.

Rusty Russell [Thu, 21 Jul 2011 04:59:06 +0000 (14:29 +0930)]

md4: fix license

As ccanlint now says:
Source files don't contain incompatible licenses (license_file_compat): FAIL
/home/rusty/devel/cvs/ccan/ccan/md4/md4.c:Found boilerplate for license 'GPLv2+' which is incompatible with 'LGPLv2+'

This is actually GPL code!

Add LICENSE link, too.

Rusty Russell [Thu, 21 Jul 2011 04:59:03 +0000 (14:29 +0930)]

ccanlint: check for incompatible license boilerplates within subfiles.

This checks to make sure you're not accidentally relicensing code;
eg. it's OK (though a bit impolite) to turn a BSD-licensed file into a
GPL module, but not the other way around.
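The one-way rule can be sketched with a simple ordering. This is an
illustration only: the enum, its order and file_fits_module() are
assumptions, and real license compatibility (versions, "or later" clauses)
is more involved than a single ranking:

```c
/* Hypothetical ordering from most permissive to most restrictive; the
 * real ccanlint table is not shown in this log and may differ. */
enum license { LIC_PUBLIC_DOMAIN, LIC_BSD, LIC_LGPL, LIC_GPL };

/* A file may be folded into a module whose license is at least as
 * restrictive as the file's own boilerplate. */
static int file_fits_module(enum license file, enum license module)
{
	return file <= module;
}
```

Under this ordering a BSD file in a GPL module passes, while a GPL file in
an LGPL module is flagged.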

Rusty Russell [Thu, 21 Jul 2011 03:32:27 +0000 (13:02 +0930)]

ccanlint: add simple check for comment referring to LICENSE file.

After discussion with various developers (particularly the Samba
team), there's a consensus that a reference to the license in each
source file is useful. Since CCAN modules are designed to be cut and
paste, this helps avoid any confusion should the LICENSE file go
missing.

We also detect standard boilerplates, in which case a one-line summary
isn't necessary.

Rusty Russell [Thu, 21 Jul 2011 03:32:04 +0000 (13:02 +0930)]

noerr: relicense to public domain.

We really want everyone to be using these; establishing conventions
helps all code, so make it the most liberal license possible. It's
all my code, so I can do this unilaterally.

Rusty Russell [Thu, 21 Jul 2011 03:31:54 +0000 (13:01 +0930)]

alignof: relicense to public domain.

Trivial code, all mine.

Rusty Russell [Thu, 21 Jul 2011 03:31:49 +0000 (13:01 +0930)]

build_assert: relicense to public domain.

Trivial code, all mine.

Rusty Russell [Thu, 21 Jul 2011 03:31:45 +0000 (13:01 +0930)]

short_types: relicense to public domain.

We really want everyone to be using these; establishing conventions
helps all code, so make it the most liberal license possible. It's
all my code, so I can do this unilaterally.

Rusty Russell [Thu, 21 Jul 2011 03:31:39 +0000 (13:01 +0930)]

compiler: relicense to public domain.

We really want everyone to be using these; establishing conventions
helps all code, so make it the most liberal license possible. It's
all my code, so I can do this unilaterally.

Rusty Russell [Thu, 21 Jul 2011 02:26:15 +0000 (11:56 +0930)]

ccanlint: tighten license check.

Now we've made GPL wording uniform, use it everywhere. There's no
point allowing variants which might be unclear.

We still have some non-conformant licenses in the tree (eg. just "BSD"),
so we only warn on unknown license strings for now.

Rusty Russell [Tue, 19 Jul 2011 08:02:40 +0000 (17:32 +0930)]

various: make the _info License: wording uniform for GPL variants.

GPL versions 2 and 3 both specifically mention "any later version" as
the phrase which allows the user to choose to upgrade the license.
Make sure we use that phrase, and make the format consistent across
modules.

Rusty Russell [Tue, 19 Jul 2011 08:00:49 +0000 (17:30 +0930)]

ccanlint: make a license enum, and parse the license string to set it.

This improves on the current ad-hoc methods, and also fixes a bug where
we mapped "GPLv2" to the GPLv3 symlink.

Rusty Russell [Wed, 6 Jul 2011 05:21:54 +0000 (14:51 +0930)]

Merge branch 'master' of ozlabs.org:ccan

Rusty Russell [Wed, 6 Jul 2011 05:11:17 +0000 (14:41 +0930)]

tally: don't use SIZE_MAX.

Turns out it's not standard (thanks Samba build farm!)
And the previous test had a hole in it anyway. This one is more conservative.
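One conservative way to test for overflow without the SIZE_MAX macro, as a
sketch; the log does not show the check tally actually uses, and
MY_SIZE_MAX and mul_overflows() are hypothetical names:

```c
#include <stddef.h>

/* Largest size_t value without relying on SIZE_MAX: converting -1 to an
 * unsigned type yields that type's maximum. */
#define MY_SIZE_MAX ((size_t)-1)

/* Returns 1 if a * b would not fit in a size_t. */
static int mul_overflows(size_t a, size_t b)
{
	return a != 0 && b > MY_SIZE_MAX / a;
}
```

Dividing first means the check itself never overflows.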

Rusty Russell [Mon, 4 Jul 2011 07:27:03 +0000 (16:57 +0930)]

tap: WANT_PTHREAD not HAVE_PTHREAD

I'm not sure that a "pthread-safe" tap library is very useful; how many
people have multiple threads calling ok()?

Kirill Shutemov noted that it gives a warning with -Wundef; indeed, we
should ask in this case whether they want pthread support, not whether the
system has pthread support to offer.
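A sketch of the opt-in pattern, assuming the user defines WANT_PTHREAD when
building. The LOCK/UNLOCK macros, the mutex name and next_test_number() are
illustrative and are not taken from the tap sources:

```c
/* With -Wundef, "#if HAVE_PTHREAD" warns when the macro is not defined
 * at all.  "#ifdef" asks only whether the user opted in. */
#ifdef WANT_PTHREAD
#include <pthread.h>
static pthread_mutex_t M = PTHREAD_MUTEX_INITIALIZER;
# define LOCK   pthread_mutex_lock(&M)
# define UNLOCK pthread_mutex_unlock(&M)
#else
# define LOCK   ((void)0)
# define UNLOCK ((void)0)
#endif

static unsigned int test_count;

/* Hypothetical counter bump guarded by the optional lock. */
static unsigned int next_test_number(void)
{
	unsigned int n;

	LOCK;
	n = ++test_count;
	UNLOCK;
	return n;
}
```

Built without -DWANT_PTHREAD, the locking compiles away and no pthread
header is needed.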

Joey Adams [Sat, 2 Jul 2011 16:33:00 +0000 (12:33 -0400)]

json: Deleted the "notes" file.

This file contains my private ramblings about the JSON module,
and was not meant to be included in the public release.

Joey Adams [Thu, 30 Jun 2011 06:39:16 +0000 (02:39 -0400)]

json: new module for parsing and generating JSON

Rusty Russell [Tue, 21 Jun 2011 01:13:31 +0000 (10:43 +0930)]

tally: fix FreeBSD compile, memleak in tests.

POSIX says ssize_t is in sys/types.h; on Linux stdlib.h is enough.

Russell Steicke [Fri, 17 Jun 2011 07:42:13 +0000 (15:42 +0800)]

antithread: patch to antithread arabella example

I've been using the antithread arabella example to generate some
"arty" portraits for decoration. I've made a few changes to it
(triangle sizes and number of generations before giving up), and may
send those as patches later.

Because some of the images I'm generating have taken quite a while
(many days) I've needed to restart the run after rebooting machines
for other reasons, and noticed that arabella restarted the generation
count from zero. I wanted to continue the generation count, so here's
a patch to do just that.

Rusty Russell [Fri, 17 Jun 2011 05:13:25 +0000 (14:43 +0930)]

tdb2: Add tools/tdb2dump, tools/tdb2restore, use "tdb2.h" includes.

Simple port from the TDB1 versions. Also, change to "tdb2.h" includes
so they can be built even in other directories in future.

Rusty Russell [Fri, 17 Jun 2011 05:11:55 +0000 (14:41 +0930)]

tdb2: rename the tools to tdb2torture, tdb2tool and mktdb2

This means they can be installed in parallel with tdb1's tools.

Rusty Russell [Fri, 17 Jun 2011 02:57:44 +0000 (12:27 +0930)]

tdb2: use ccan/endian

This is where we should be getting bswap_64 from.

Rusty Russell [Thu, 16 Jun 2011 03:03:23 +0000 (12:33 +0930)]

tools: trim leading whitespace in documentation extract.

Take some care to preserve formatting, even with mixed tabs and spaces.

Joey Adams [Wed, 15 Jun 2011 02:13:01 +0000 (22:13 -0400)]

charset: Added utf8_validate_char (factored out of utf8_validate).

Joey Adams [Sat, 11 Jun 2011 07:58:10 +0000 (03:58 -0400)]

charset: Rewrote utf8_validate, and added four new functions:

* utf8_read_char
* utf8_write_char
* from_surrogate_pair
* to_surrogate_pair
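For reference, the standard UTF-16 surrogate arithmetic can be sketched as
follows. The names and signatures here are hypothetical; the actual
ccan/charset prototypes may differ:

```c
#include <stdint.h>

/* Split a code point in U+10000..U+10FFFF into a UTF-16 surrogate pair.
 * Returns 0 on success, -1 if cp is outside that range. */
static int to_pair(uint32_t cp, uint16_t *hi, uint16_t *lo)
{
	if (cp < 0x10000 || cp > 0x10FFFF)
		return -1;
	cp -= 0x10000;
	*hi = (uint16_t)(0xD800 + (cp >> 10));
	*lo = (uint16_t)(0xDC00 + (cp & 0x3FF));
	return 0;
}

/* Combine a high and a low surrogate.  Returns 0 and stores the code
 * point, or -1 if either half is outside its range. */
static int from_pair(uint16_t hi, uint16_t lo, uint32_t *cp)
{
	if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
		return -1;
	*cp = 0x10000 + (((uint32_t)(hi - 0xD800) << 10)
			 | (uint32_t)(lo - 0xDC00));
	return 0;
}
```

For example, U+1F600 maps to the pair 0xD83D, 0xDE00 and back again.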

Rusty Russell [Wed, 8 Jun 2011 07:44:36 +0000 (17:14 +0930)]

hash: remove VALGRIND #ifdef - always run clean.

My simple test program on my laptop showed that with modern 32 bit Intel
CPUs and modern GCC, there's no measurable penalty for the clean version.

Andrew Bartlett complained that the valgrind noise was grating. Agreed.

Rusty Russell [Sun, 5 Jun 2011 00:42:41 +0000 (10:12 +0930)]

lbalance: add examples.

Rusty Russell [Tue, 31 May 2011 04:14:48 +0000 (13:44 +0930)]

lbalance: new module for load balancing

Rusty Russell [Tue, 31 May 2011 04:14:36 +0000 (13:44 +0930)]

time: new module for dealing with time.

Joey Adams [Sun, 29 May 2011 02:23:55 +0000 (22:23 -0400)]

cast, container_of, tlist: Fix warning with GCC 4.6: -Wunused-but-set-variable

Rusty Russell [Fri, 20 May 2011 07:12:45 +0000 (16:42 +0930)]

ttxml: new module.

Rusty Russell [Fri, 20 May 2011 06:23:12 +0000 (15:53 +0930)]

tdb2: fix O_RDONLY opens.

We tried to get a F_WRLCK on the open lock; we shouldn't do that for a
read-only tdb. (TDB1 gets away with it because a read-only open skips
all locking).

We also avoid leaking the fd in two tdb_open() failure paths revealed
by this extra testing.
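A sketch of choosing the lock type from the open flags, assuming a shared
lock is acceptable for read-only opens. open_lock_type() is hypothetical;
the log does not show the exact change made in tdb2:

```c
#include <fcntl.h>

/* Hypothetical helper: a read-only open takes a shared (read) lock,
 * anything else takes an exclusive (write) lock. */
static int open_lock_type(int open_flags)
{
	return (open_flags & O_ACCMODE) == O_RDONLY ? F_RDLCK : F_WRLCK;
}
```

This matters because fcntl() refuses F_WRLCK on a descriptor that is not
open for writing.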

Rusty Russell [Fri, 20 May 2011 06:21:33 +0000 (15:51 +0930)]

failtest: failtest_has_failed()

Allows tests to explicitly avoid continuing when a failure has been
injected.

Rusty Russell [Fri, 20 May 2011 06:20:58 +0000 (15:50 +0930)]

failtest: override getpid() as well.

TDB2 tracks locks using getpid(), and gets upset when we fork behind
its back.

Rusty Russell [Tue, 10 May 2011 04:38:59 +0000 (14:08 +0930)]

typesafe_cb: don't use HAVE_CAST_TO_UNION in tests.

This crept in, it should be the same as the tests in typesafe_cb.h.

Rusty Russell [Tue, 10 May 2011 00:33:50 +0000 (10:03 +0930)]

tdb2: more stats

More recording of interesting events. As we don't have an ABI yet, we
don't need to put these at the end.

Rusty Russell [Tue, 10 May 2011 01:37:21 +0000 (11:07 +0930)]

tdb2: check pid before unlock.

The original code assumed that unlocking would fail if we didn't have a lock;
this isn't true (at least, on my machine). So we have to always check the
pid before unlocking.
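The check can be sketched like this, assuming each lock record remembers
the pid that took it. struct lock_rec and may_unlock() are hypothetical and
do not reflect the tdb2 data structures:

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of one held lock. */
struct lock_rec {
	pid_t owner;	/* pid that acquired the lock */
	int held;	/* non-zero while the lock is held */
};

/* Returns 1 only if this process is the one that took the lock, so a
 * forked child never releases its parent's lock by accident. */
static int may_unlock(const struct lock_rec *l)
{
	return l->held && l->owner == getpid();
}
```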

Rusty Russell [Wed, 27 Apr 2011 13:30:25 +0000 (23:00 +0930)]

tdb2: fix msync() arg

PAGESIZE used to be defined to getpagesize(); we changed it to a
constant in b556ef1f, which broke the msync() call.
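msync() wants an address aligned to the real system page size, so one
plausible approach is to query it at run time. This is a sketch with
hypothetical helper names; the actual fix is not shown in the log:

```c
#include <stddef.h>
#include <unistd.h>

/* Round an offset down to the start of its page.  page_size must be a
 * power of two. */
static size_t page_floor(size_t off, size_t page_size)
{
	return off & ~(page_size - 1);
}

/* Ask the system for its real page size at run time. */
static size_t system_page_size(void)
{
	long ps = sysconf(_SC_PAGESIZE);

	return ps > 0 ? (size_t)ps : 4096; /* 4096 is a fallback guess */
}
```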

Rusty Russell [Wed, 27 Apr 2011 13:41:02 +0000 (23:11 +0930)]

tdb2: use direct access functions when creating recovery blob

We don't need to copy into a buffer to examine the old data: in the
common case, it's mmaped already.  It's made a bit trickier because
the tdb_access_read() function uses the current I/O methods, so we
need to restore that temporarily.

The difference was in the noise, however (the sync no doubt
dominates).

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m45.021s
user 0m16.261s
sys 0m2.432s
-rw------- 1 rusty rusty 364469344 2011-04-27 22:55 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m10.144s
user 0m0.480s
sys 0m0.460s
-rw------- 1 rusty rusty 391992 2011-04-27 22:56 torture.tdb
Adding 2000000 records:  863 ns (110601144 bytes)
Finding 2000000 records:  565 ns (110601144 bytes)
Missing 2000000 records:  383 ns (110601144 bytes)
Traversing 2000000 records:  409 ns (110601144 bytes)
Deleting 2000000 records:  676 ns (225354680 bytes)
Re-adding 2000000 records:  784 ns (225354680 bytes)
Appending 2000000 records:  1191 ns (247890168 bytes)
Churning 2000000 records:  2166 ns (423133432 bytes)

After:
real 0m47.141s
user 0m16.073s
sys 0m2.460s
-rw------- 1 rusty rusty 364469344 2011-04-27 22:58 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m4.207s
user 0m0.416s
sys 0m0.504s
-rw------- 1 rusty rusty 313576 2011-04-27 22:59 torture.tdb
Adding 2000000 records:  874 ns (110601144 bytes)
Finding 2000000 records:  565 ns (110601144 bytes)
Missing 2000000 records:  393 ns (110601144 bytes)
Traversing 2000000 records:  404 ns (110601144 bytes)
Deleting 2000000 records:  684 ns (225354680 bytes)
Re-adding 2000000 records:  792 ns (225354680 bytes)
Appending 2000000 records:  1212 ns (247890168 bytes)
Churning 2000000 records:  2191 ns (423133432 bytes)

Rusty Russell [Wed, 27 Apr 2011 13:40:24 +0000 (23:10 +0930)]

tdb2: enlarge transaction pagesize to 64k

We don't need to use 4k for our transaction pages; we can use any
value.  For the tools/speed benchmark, any value between about 4k and
64M makes no difference, but that's probably because the entire
database is touched in each transaction.

So instead, I looked at tdbtorture to try to find an optimum value, as
it uses smaller transactions.  4k and 64k were equivalent.  16M was
almost three times slower, 1M was 5-10% slower.  1024 was also 5-10%
slower.

There's a slight advantage of having larger pages, both for allowing
direct access to the database (if it's all in one page we can sometimes
grant direct access even inside a transaction) and for the compactness
of our recovery area (since our code is naive and won't combine one
run across pages).

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m47.127s
user 0m17.125s
sys 0m2.456s
-rw------- 1 rusty rusty 366680288 2011-04-27 21:34 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m16.049s
user 0m0.300s
sys 0m0.492s
-rw------- 1 rusty rusty 244472 2011-04-27 21:35 torture.tdb
Adding 2000000 records:  894 ns (110551992 bytes)
Finding 2000000 records:  564 ns (110551992 bytes)
Missing 2000000 records:  398 ns (110551992 bytes)
Traversing 2000000 records:  399 ns (110551992 bytes)
Deleting 2000000 records:  711 ns (225633208 bytes)
Re-adding 2000000 records:  819 ns (225633208 bytes)
Appending 2000000 records:  1252 ns (248196544 bytes)
Churning 2000000 records:  2319 ns (424005056 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m45.021s
user 0m16.261s
sys 0m2.432s
-rw------- 1 rusty rusty 364469344 2011-04-27 22:55 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m10.144s
user 0m0.480s
sys 0m0.460s
-rw------- 1 rusty rusty 391992 2011-04-27 22:56 torture.tdb
Adding 2000000 records:  863 ns (110601144 bytes)
Finding 2000000 records:  565 ns (110601144 bytes)
Missing 2000000 records:  383 ns (110601144 bytes)
Traversing 2000000 records:  409 ns (110601144 bytes)
Deleting 2000000 records:  676 ns (225354680 bytes)
Re-adding 2000000 records:  784 ns (225354680 bytes)
Appending 2000000 records:  1191 ns (247890168 bytes)
Churning 2000000 records:  2166 ns (423133432 bytes)

Rusty Russell [Wed, 27 Apr 2011 13:26:27 +0000 (22:56 +0930)]

tdb2: try to fit transactions in existing space before we expand.

Currently we use the worst-case-possible size for the recovery area.
Instead, prepare the recovery data, then see whether it's too large.

Note that this currently works out to make the database *larger* on
our speed benchmark, since we happen to need to enlarge the recovery
area at the wrong time now, rather than the old case where it's already
hugely oversized.

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m50.366s
user 0m17.109s
sys 0m2.468s
-rw------- 1 rusty rusty 564215952 2011-04-27 21:31 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m23.818s
user 0m0.304s
sys 0m0.508s
-rw------- 1 rusty rusty 669856 2011-04-27 21:32 torture.tdb
Adding 2000000 records:  887 ns (110556088 bytes)
Finding 2000000 records:  556 ns (110556088 bytes)
Missing 2000000 records:  385 ns (110556088 bytes)
Traversing 2000000 records:  401 ns (110556088 bytes)
Deleting 2000000 records:  710 ns (244003768 bytes)
Re-adding 2000000 records:  825 ns (244003768 bytes)
Appending 2000000 records:  1255 ns (268404160 bytes)
Churning 2000000 records:  2299 ns (268404160 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m47.127s
user 0m17.125s
sys 0m2.456s
-rw------- 1 rusty rusty 366680288 2011-04-27 21:34 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m16.049s
user 0m0.300s
sys 0m0.492s
-rw------- 1 rusty rusty 244472 2011-04-27 21:35 torture.tdb
Adding 2000000 records:  894 ns (110551992 bytes)
Finding 2000000 records:  564 ns (110551992 bytes)
Missing 2000000 records:  398 ns (110551992 bytes)
Traversing 2000000 records:  399 ns (110551992 bytes)
Deleting 2000000 records:  711 ns (225633208 bytes)
Re-adding 2000000 records:  819 ns (225633208 bytes)
Appending 2000000 records:  1252 ns (248196544 bytes)
Churning 2000000 records:  2319 ns (424005056 bytes)

Rusty Russell [Wed, 27 Apr 2011 12:17:58 +0000 (21:47 +0930)]

tdb2: reduce transaction before writing to recovery area.

We don't need to write the whole page to the recovery area if it
hasn't all changed.  Simply skipping the start and end of the pages
which are similar saves us about 20% on growtdb-bench 250000, and 45%
on tdbtorture.  The more thorough examination of page differences
gives us a saving of 90% on growtdb-bench and 98% on tdbtorture!
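The simpler of the two ideas, trimming the unchanged start and end of a
page, can be sketched as below. changed_range() is a hypothetical helper
and not the tdb2 code; the more thorough comparison is not shown:

```c
#include <stddef.h>

/* Hypothetical helper: find the smallest [*start, *end) range covering
 * every byte that differs between two equal-sized pages.  Returns 0 if
 * the pages are identical, 1 otherwise. */
static int changed_range(const unsigned char *old_page,
			 const unsigned char *new_page, size_t len,
			 size_t *start, size_t *end)
{
	size_t s = 0, e = len;

	/* Skip the common prefix. */
	while (s < len && old_page[s] == new_page[s])
		s++;
	if (s == len)
		return 0;
	/* Skip the common suffix; byte s differs, so this stops above s. */
	while (e > s && old_page[e - 1] == new_page[e - 1])
		e--;
	*start = s;
	*end = e;
	return 1;
}
```

Only the bytes in [start, end) would then need to go to the recovery area.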

And we do win a bit on timings for transaction commit:

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 1m4.844s
user 0m15.537s
sys 0m3.796s
-rw------- 1 rusty rusty 626693096 2011-04-27 21:28 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m17.021s
user 0m0.272s
sys 0m0.540s
-rw------- 1 rusty rusty 458800 2011-04-27 21:29 torture.tdb
Adding 2000000 records:  894 ns (110556088 bytes)
Finding 2000000 records:  569 ns (110556088 bytes)
Missing 2000000 records:  390 ns (110556088 bytes)
Traversing 2000000 records:  403 ns (110556088 bytes)
Deleting 2000000 records:  710 ns (244003768 bytes)
Re-adding 2000000 records:  825 ns (244003768 bytes)
Appending 2000000 records:  1262 ns (268404160 bytes)
Churning 2000000 records:  2311 ns (268404160 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m50.366s
user 0m17.109s
sys 0m2.468s
-rw------- 1 rusty rusty 564215952 2011-04-27 21:31 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m23.818s
user 0m0.304s
sys 0m0.508s
-rw------- 1 rusty rusty 669856 2011-04-27 21:32 torture.tdb
Adding 2000000 records:  887 ns (110556088 bytes)
Finding 2000000 records:  556 ns (110556088 bytes)
Missing 2000000 records:  385 ns (110556088 bytes)
Traversing 2000000 records:  401 ns (110556088 bytes)
Deleting 2000000 records:  710 ns (244003768 bytes)
Re-adding 2000000 records:  825 ns (244003768 bytes)
Appending 2000000 records:  1255 ns (268404160 bytes)
Churning 2000000 records:  2299 ns (268404160 bytes)

Rusty Russell [Thu, 21 Apr 2011 01:46:35 +0000 (11:16 +0930)]

tdb2: handle non-transaction-page-aligned sizes in recovery.

tdb1 always makes the tdb a multiple of the transaction page size,
tdb2 doesn't. This means that if a transaction hits the exact end of
the file, we might need to save off a partial page.

So that we don't have to rewrite tdb_recovery_size() too, we simply do
a short read and memset the unused section to 0 (to keep valgrind
happy).
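A sketch of the short read plus zero fill, using an in-memory buffer in
place of the file. fill_page() and its parameters are hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy what is left of the file into a full-page
 * buffer and zero the remainder so no uninitialised bytes are saved. */
static void fill_page(unsigned char *page, size_t page_size,
		      const unsigned char *file, size_t file_len, size_t off)
{
	size_t avail = off < file_len ? file_len - off : 0;

	if (avail > page_size)
		avail = page_size;
	if (avail > 0)
		memcpy(page, file + off, avail);
	memset(page + avail, 0, page_size - avail);
}
```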

Rusty Russell [Thu, 21 Apr 2011 02:10:25 +0000 (11:40 +0930)]

tdb2: remove tailer from transaction record.

We don't have tailers in tdb2, so it's just 8 bytes of confusing wastage.

Rusty Russell [Wed, 27 Apr 2011 12:16:20 +0000 (21:46 +0930)]

tdb2: limit coalescing based on how successful we are.

Instead of walking the entire free list, walk 8 entries, or more if we
are successful: the reward is scaled by the size coalesced.

We also move previously-examined records to the end of the list.

This reduces file size with very little speed penalty.

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 1m17.022s
user 0m27.206s
sys 0m3.920s
-rw------- 1 rusty rusty 570130576 2011-04-27 21:17 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m27.355s
user 0m0.296s
sys 0m0.516s
-rw------- 1 rusty rusty 617352 2011-04-27 21:18 torture.tdb
Adding 2000000 records:  890 ns (110556088 bytes)
Finding 2000000 records:  565 ns (110556088 bytes)
Missing 2000000 records:  390 ns (110556088 bytes)
Traversing 2000000 records:  410 ns (110556088 bytes)
Deleting 2000000 records:  8623 ns (244003768 bytes)
Re-adding 2000000 records:  7089 ns (244003768 bytes)
Appending 2000000 records:  33708 ns (244003768 bytes)
Churning 2000000 records:  2029 ns (268404160 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 1m7.096s
user 0m15.637s
sys 0m3.812s
-rw------- 1 rusty rusty 561270928 2011-04-27 21:22 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m13.850s
user 0m0.268s
sys 0m0.492s
-rw------- 1 rusty rusty 429768 2011-04-27 21:23 torture.tdb
Adding 2000000 records:  892 ns (110556088 bytes)
Finding 2000000 records:  570 ns (110556088 bytes)
Missing 2000000 records:  390 ns (110556088 bytes)
Traversing 2000000 records:  407 ns (110556088 bytes)
Deleting 2000000 records:  706 ns (244003768 bytes)
Re-adding 2000000 records:  822 ns (244003768 bytes)
Appending 2000000 records:  1262 ns (268404160 bytes)
Churning 2000000 records:  2320 ns (268404160 bytes)

Rusty Russell [Wed, 27 Apr 2011 12:14:16 +0000 (21:44 +0930)]

tdb2: use counters to decide when to coalesce records.

This simply uses a 7 bit counter which gets incremented on each addition
to the list (but not decremented on removals).  When it wraps, we walk the
entire list looking for things to coalesce.
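The counter logic can be sketched as follows. struct flist and note_add()
are hypothetical; where the 7 bits actually live in the tdb2 free list
header is not shown here:

```c
#include <stdint.h>

/* Hypothetical free-list header carrying a 7-bit add counter. */
struct flist {
	uint8_t adds;	/* only the low 7 bits are used */
};

/* Record one addition.  Returns 1 when the counter wraps back to zero,
 * which is the cue to walk the list and coalesce. */
static int note_add(struct flist *fl)
{
	fl->adds = (uint8_t)((fl->adds + 1) & 0x7f);
	return fl->adds == 0;
}
```

Starting from zero, the wrap happens on every 128th addition.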

This causes performance problems, especially when appending records, so
we limit it in the next patch:

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m59.687s
user 0m11.593s
sys 0m4.100s
-rw------- 1 rusty rusty 752004064 2011-04-27 21:14 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m17.738s
user 0m0.348s
sys 0m0.580s
-rw------- 1 rusty rusty 663360 2011-04-27 21:15 torture.tdb
Adding 2000000 records:  926 ns (110556088 bytes)
Finding 2000000 records:  592 ns (110556088 bytes)
Missing 2000000 records:  416 ns (110556088 bytes)
Traversing 2000000 records:  422 ns (110556088 bytes)
Deleting 2000000 records:  741 ns (244003768 bytes)
Re-adding 2000000 records:  799 ns (244003768 bytes)
Appending 2000000 records:  1147 ns (295244592 bytes)
Churning 2000000 records:  1827 ns (568411440 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 1m17.022s
user 0m27.206s
sys 0m3.920s
-rw------- 1 rusty rusty 570130576 2011-04-27 21:17 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m27.355s
user 0m0.296s
sys 0m0.516s
-rw------- 1 rusty rusty 617352 2011-04-27 21:18 torture.tdb
Adding 2000000 records:  890 ns (110556088 bytes)
Finding 2000000 records:  565 ns (110556088 bytes)
Missing 2000000 records:  390 ns (110556088 bytes)
Traversing 2000000 records:  410 ns (110556088 bytes)
Deleting 2000000 records:  8623 ns (244003768 bytes)
Re-adding 2000000 records:  7089 ns (244003768 bytes)
Appending 2000000 records:  33708 ns (244003768 bytes)
Churning 2000000 records:  2029 ns (268404160 bytes)

commit | commitdiff | tree

Rusty Russell [Wed, 27 Apr 2011 12:12:58 +0000 (21:42 +0930)]

tdb2: overallocate the recovery area.

I noticed a counter-intuitive phenomenon as I tweaked the coalescing
code: the more coalescing we did, the larger the tdb grew!  This was
measured using "growtdb-bench 250000 10".

The cause: more coalescing means larger transactions, and every time
we do a larger transaction, we need to allocate a larger recovery
area.  The only way to do this is to append to the file, so the file
keeps growing, even though it's mainly unused!

Overallocating by 25% seems reasonable, and gives better results in
such benchmarks.

The real fix is to reduce the transaction to a run-length based format
rather than the naive block system used now.
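
A minimal sketch of the 25% overallocation, assuming the recovery area is only regrown when a transaction needs more than is already there. Both helper names are hypothetical; this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes to allocate for a recovery area that must
 * hold `needed` bytes, padded by 25% so slightly larger transactions
 * can reuse it instead of appending a new area to the file. */
static uint64_t recovery_alloc_size(uint64_t needed)
{
	assert(needed <= UINT64_MAX - needed / 4);	/* no overflow */
	return needed + needed / 4;
}

/* Returns the size of the recovery area after a transaction that needs
 * `needed` bytes: unchanged if the current area is big enough. */
static uint64_t maybe_grow(uint64_t current, uint64_t needed)
{
	if (needed <= current)
		return current;
	return recovery_alloc_size(needed);
}
```

With these helpers, an area sized for a 1000-byte transaction also absorbs an 1100-byte one without the file growing.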

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m57.403s
user 0m11.361s
sys 0m4.056s
-rw------- 1 rusty rusty 689536976 2011-04-27 21:10 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m24.901s
user 0m0.380s
sys 0m0.512s
-rw------- 1 rusty rusty 655368 2011-04-27 21:12 torture.tdb
Adding 2000000 records:  941 ns (110551992 bytes)
Finding 2000000 records:  603 ns (110551992 bytes)
Missing 2000000 records:  428 ns (110551992 bytes)
Traversing 2000000 records:  416 ns (110551992 bytes)
Deleting 2000000 records:  741 ns (199517112 bytes)
Re-adding 2000000 records:  819 ns (199517112 bytes)
Appending 2000000 records:  1228 ns (376542552 bytes)
Churning 2000000 records:  2042 ns (553641304 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m59.687s
user 0m11.593s
sys 0m4.100s
-rw------- 1 rusty rusty 752004064 2011-04-27 21:14 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m17.738s
user 0m0.348s
sys 0m0.580s
-rw------- 1 rusty rusty 663360 2011-04-27 21:15 torture.tdb
Adding 2000000 records:  926 ns (110556088 bytes)
Finding 2000000 records:  592 ns (110556088 bytes)
Missing 2000000 records:  416 ns (110556088 bytes)
Traversing 2000000 records:  422 ns (110556088 bytes)
Deleting 2000000 records:  741 ns (244003768 bytes)
Re-adding 2000000 records:  799 ns (244003768 bytes)
Appending 2000000 records:  1147 ns (295244592 bytes)
Churning 2000000 records:  1827 ns (568411440 bytes)

commit | commitdiff | tree

Rusty Russell [Wed, 27 Apr 2011 12:13:23 +0000 (21:43 +0930)]

tdb2: don't start again when we coalesce a record.

We currently start walking the free list again when we coalesce any record;
this is overzealous, as we only care about the next record being blatted,
or the record we currently consider "best".

We can also opportunistically try to add the coalesced record into the
new free list: if it fails, we go back to the old "mark record,
unlock, re-lock" code.

Before:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 1m0.243s
user 0m13.677s
sys 0m4.336s
-rw------- 1 rusty rusty 683302864 2011-04-27 21:03 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m24.074s
user 0m0.344s
sys 0m0.468s
-rw------- 1 rusty rusty 836040 2011-04-27 21:04 torture.tdb
Adding 2000000 records:  1015 ns (110551992 bytes)
Finding 2000000 records:  641 ns (110551992 bytes)
Missing 2000000 records:  445 ns (110551992 bytes)
Traversing 2000000 records:  439 ns (110551992 bytes)
Deleting 2000000 records:  807 ns (199517112 bytes)
Re-adding 2000000 records:  851 ns (199517112 bytes)
Appending 2000000 records:  1301 ns (376542552 bytes)
Churning 2000000 records:  2423 ns (553641304 bytes)

After:
$ time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
real 0m57.403s
user 0m11.361s
sys 0m4.056s
-rw------- 1 rusty rusty 689536976 2011-04-27 21:10 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m24.901s
user 0m0.380s
sys 0m0.512s
-rw------- 1 rusty rusty 655368 2011-04-27 21:12 torture.tdb
Adding 2000000 records:  941 ns (110551992 bytes)
Finding 2000000 records:  603 ns (110551992 bytes)
Missing 2000000 records:  428 ns (110551992 bytes)
Traversing 2000000 records:  416 ns (110551992 bytes)
Deleting 2000000 records:  741 ns (199517112 bytes)
Re-adding 2000000 records:  819 ns (199517112 bytes)
Appending 2000000 records:  1228 ns (376542552 bytes)
Churning 2000000 records:  2042 ns (553641304 bytes)

commit | commitdiff | tree

Rusty Russell [Fri, 25 Mar 2011 04:23:23 +0000 (14:53 +1030)]

tdb2: make internal coalesce() function return length coalesced.

This makes life easier for the next patch.

commit | commitdiff | tree

Rusty Russell [Wed, 27 Apr 2011 12:09:27 +0000 (21:39 +0930)]

tdb2: expand more slowly.

We took the original expansion heuristic from TDB1, and they just
fixed theirs, so copy that.

Before:

After:
time ./growtdb-bench 250000 10 > /dev/null && ls -l /tmp/growtdb.tdb && time ./tdbtorture -s 0 && ls -l torture.tdb && ./speed --transaction 2000000
growtdb-bench.c: In function ‘main’:
growtdb-bench.c:74:8: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result
growtdb-bench.c:108:9: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result

real 1m0.243s
user 0m13.677s
sys 0m4.336s
-rw------- 1 rusty rusty 683302864 2011-04-27 21:03 /tmp/growtdb.tdb
testing with 3 processes, 5000 loops, seed=0
OK

real 1m24.074s
user 0m0.344s
sys 0m0.468s
-rw------- 1 rusty rusty 836040 2011-04-27 21:04 torture.tdb
Adding 2000000 records:  1015 ns (110551992 bytes)
Finding 2000000 records:  641 ns (110551992 bytes)
Missing 2000000 records:  445 ns (110551992 bytes)
Traversing 2000000 records:  439 ns (110551992 bytes)
Deleting 2000000 records:  807 ns (199517112 bytes)
Re-adding 2000000 records:  851 ns (199517112 bytes)
Appending 2000000 records:  1301 ns (376542552 bytes)
Churning 2000000 records:  2423 ns (553641304 bytes)

commit | commitdiff | tree

Rusty Russell [Tue, 19 Apr 2011 07:24:41 +0000 (16:54 +0930)]

tdb2: use 64 bit file offsets on 32 bit systems if available.

And testing reveals a latent bug on 32 bit systems.
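
For context, the usual way to request 64 bit file offsets on 32 bit glibc systems is to define _FILE_OFFSET_BITS before any system header. The sketch below assumes that mechanism; the commit message does not say how tdb2 enables it, and the helper names are hypothetical.

```c
/* Must be defined before the first system header is included. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* True if off_t can hold offsets beyond 2GB on this build. */
static bool offsets_are_64bit(void)
{
	return sizeof(off_t) >= 8;
}

/* True if a 64 bit database offset survives conversion to off_t. */
static bool can_address(uint64_t off)
{
	off_t o = (off_t)off;
	return o >= 0 && (uint64_t)o == off;
}

/* Hypothetical sanity check a test program could run at startup. */
static void check_large_file_support(void)
{
	assert(offsets_are_64bit());
}
```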

commit | commitdiff | tree

Rusty Russell [Thu, 21 Apr 2011 04:04:13 +0000 (13:34 +0930)]

tdb2: test lock timeout plugin code.

commit | commitdiff | tree

Rusty Russell [Thu, 7 Apr 2011 01:29:45 +0000 (10:59 +0930)]

tdb2: allow transaction to nest.

This is definitely a bad idea in general, but SAMBA uses nested transactions
in many and varied ways (some of them probably reflect real bugs) and it's
far easier to support them inside tdb2 with a flag.

We already have part of the TDB1 infrastructure in place, so this patch
just completes it and fixes one place where I'd messed it up.
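
Roughly, flag-controlled nesting can be pictured as a depth counter where only the outermost commit does real work. This is a toy model with hypothetical names, not the tdb2 implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy transaction state; `allow_nesting` stands in for the flag. */
struct txn_sketch {
	bool allow_nesting;
	int nesting;		/* 0 means no transaction is active */
};

/* Returns 0 on success, -1 if nesting was attempted but not allowed. */
static int txn_start(struct txn_sketch *t)
{
	assert(t->nesting >= 0);
	if (t->nesting > 0 && !t->allow_nesting)
		return -1;
	t->nesting++;
	return 0;
}

/* Returns 1 if this was the outermost commit (real work happens here),
 * 0 if an inner commit was merely unwound, -1 if nothing was active. */
static int txn_commit(struct txn_sketch *t)
{
	if (t->nesting == 0)
		return -1;
	if (--t->nesting > 0)
		return 0;
	return 1;
}
```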

commit | commitdiff | tree

Rusty Russell [Wed, 27 Apr 2011 11:18:39 +0000 (20:48 +0930)]

tdb2: allow multiple chain locks.

It's probably not a good idea, because it's a recipe for deadlocks if
anyone else grabs any *other* two chainlocks, or the allrecord lock,
but SAMBA definitely does it, so allow it as TDB1 does.

commit | commitdiff | tree

Rusty Russell [Wed, 27 Apr 2011 13:51:32 +0000 (23:21 +0930)]

tdb2: TDB_ATTRIBUTE_STATS access via tdb_get_attribute.

Now that we have tdb_get_attribute, it makes sense to make that the method
of accessing statistics. That way they are always available, and the
direct increment is probably cheaper than even the unlikely()
branch.
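
To make the comparison concrete, here is a sketch of the two styles: an optional stats pointer guarded by unlikely(), versus stats embedded in the context and incremented unconditionally. The struct layouts and names are hypothetical, and unlikely() is defined locally via the GCC/Clang builtin as a stand-in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)	/* GCC/Clang builtin */
#endif

struct stats_sketch {
	uint64_t allocs;
};

/* Old style: stats only exist if the caller asked for them. */
struct ctx_optional {
	struct stats_sketch *stats;
};

/* New style: stats live inside the context and are always kept. */
struct ctx_embedded {
	struct stats_sketch stats;
};

static void count_alloc_optional(struct ctx_optional *c)
{
	assert(c != NULL);
	if (unlikely(c->stats))
		c->stats->allocs++;
}

static void count_alloc_embedded(struct ctx_embedded *c)
{
	assert(c != NULL);
	c->stats.allocs++;	/* no branch at all */
}
```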

commit | commitdiff | tree

Rusty Russell [Thu, 7 Apr 2011 11:18:33 +0000 (20:48 +0930)]

tdb2: make tdb_name() valid early in tdb_open()

Otherwise tdb_name() can be NULL in log functions. We might as
well allocate it together with the tdb, too.

commit | commitdiff | tree

Rusty Russell [Thu, 7 Apr 2011 11:18:27 +0000 (20:48 +0930)]

tdb2: fix an error message misspelling.

commit | commitdiff | tree

Rusty Russell [Thu, 7 Apr 2011 07:22:35 +0000 (16:52 +0930)]

tdb2: tdb_set_attribute, tdb_unset_attribute and tdb_get_attribute

It makes sense for some attributes to be manipulated after tdb_open,
so allow that.

commit | commitdiff | tree

Rusty Russell [Thu, 7 Apr 2011 07:18:43 +0000 (16:48 +0930)]

tdb2: TDB_ATTRIBUTE_FLOCK support

This allows overriding of low-level locking functions, which enables
special effects such as non-blocking operations and lock proxying.
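
As a sketch of the non-blocking case, a replacement lock function could use fcntl() with F_SETLK rather than F_SETLKW. The callback parameter lists below are an assumption about the shape of the hooks, not copied from tdb2.h, and the function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical lock hook: rw is F_RDLCK or F_WRLCK. It deliberately
 * ignores waitflag and never blocks; on contention fcntl() returns -1
 * with errno set to EAGAIN or EACCES, so the caller can retry later. */
static int nonblocking_lock(int fd, int rw, off_t off, off_t len,
			    bool waitflag, void *unused)
{
	struct flock fl;

	(void)waitflag;
	(void)unused;
	assert(rw == F_RDLCK || rw == F_WRLCK);

	memset(&fl, 0, sizeof(fl));
	fl.l_type = rw;
	fl.l_whence = SEEK_SET;
	fl.l_start = off;
	fl.l_len = len;
	return fcntl(fd, F_SETLK, &fl);
}

/* Matching unlock hook: releases the same byte range. */
static int plain_unlock(int fd, int rw, off_t off, off_t len, void *unused)
{
	struct flock fl;

	(void)rw;
	(void)unused;
	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_UNLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = off;
	fl.l_len = len;
	return fcntl(fd, F_SETLK, &fl);
}
```

A lock proxy would follow the same pattern, forwarding the request to another process instead of calling fcntl() itself.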

We rename some local lock vars to l to avoid -Wshadow warnings.

commit | commitdiff | tree

Rusty Russell [Wed, 6 Apr 2011 23:00:39 +0000 (08:30 +0930)]

tdb2: don't cancel transaction when tdb_transaction_prepare_commit fails

And don't double-log. Both of these cause problems if we want to do
tdb_transaction_prepare_commit non-blocking (and have it fail so we can
try again).

commit | commitdiff | tree

Rusty Russell [Thu, 7 Apr 2011 04:21:54 +0000 (13:51 +0930)]

tdb2: open hook for implementing TDB_CLEAR_IF_FIRST

This allows the caller to implement clear-if-first semantics as per
TDB1. The flag was removed for good reasons (performance and
unreliability), but SAMBA3 still uses it widely, so this allows them to
reimplement it themselves.

(There is no way to do it without help like this from tdb2, since it has
to be done under the open lock).

commit | commitdiff | tree

Rusty Russell [Tue, 10 May 2011 01:45:04 +0000 (11:15 +0930)]

tdb2: cleanups for tools/speed.c

1) The logging function needs to append a \n.
2) The transaction start code should be after the comment and print.
3) We should run tdb_check to make sure the database is OK after each op.

commit | commitdiff | tree

Rusty Russell [Thu, 7 Apr 2011 03:46:35 +0000 (13:16 +0930)]

tdb2: rearrange log function to put data arg at the end.

Also, rename private logfn to log_fn for consistency with other members.

The Comprehensive C Archive Network