Re: [PATCH v2 0/4] improve fault-tolerance of rhashtable runtime-test

From: Herbert Xu
Date: Mon Nov 30 2015 - 05:19:16 EST


On Mon, Nov 30, 2015 at 11:14:01AM +0100, Phil Sutter wrote:
> On Mon, Nov 30, 2015 at 05:37:55PM +0800, Herbert Xu wrote:
> > Phil Sutter <phil@xxxxxx> wrote:
> > > The following series aims to improve lib/test_rhashtable in different
> > > situations:
> > >
> > > Patch 1 allows the kernel to reschedule so the test does not block too
> > > long on slow systems.
> > > Patch 2 fixes behaviour under pressure, retrying inserts in non-permanent
> > > error case (-EBUSY).
> > > Patch 3 auto-adjusts the upper table size limit according to the number
> > > of threads (in concurrency test). In fact, the current default is
> > > already too small.
> > > Patch 4 makes it possible to retry inserts even in supposedly permanent
> > > error case (-ENOMEM) to expose rhashtable's remaining problem of
> > > -ENOMEM being not as permanent as it is expected to be.
> >
> > I'm sorry but this patch series is simply bogus.
>
> The whole series?!

Well, at least patches two and four seem clearly wrong, because no
rhashtable user should need to retry insertions.

> Did you try with my bogus patch series applied? How many CPUs does your
> test system actually have?
>
> > So can someone please help me reproduce this? Because just loading
> > test_rhashtable isn't doing it.
>
> As said, maybe you need to increase the number of spawned threads
> (tcount=50 or so).

OK that's better. I think I see the problem. The test in
rhashtable_insert_rehash is racy and if two threads both try
to grow the table one of them may be tricked into doing a rehash
instead.
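
For illustration only, here is a toy userspace model of this kind of
check-then-act race. It is not the code in lib/rhashtable.c: struct
toy_table, toy_above_75(), toy_pick_size() and toy_replay_race() are
hypothetical names made up for this sketch, and the 75% growth threshold
is an assumption of the model. The point is just that the load is sampled
again after another thread has already grown the table, so the second
thread's "grow" silently turns into a same-size rehash.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a hash table; NOT the kernel's struct rhashtable. */
struct toy_table {
	unsigned int size;	/* number of buckets */
	unsigned int nelems;	/* number of stored elements */
};

/* Hypothetical growth test: is the table more than 75% full? */
static bool toy_above_75(const struct toy_table *t)
{
	assert(t->size > 0);
	return t->nelems > t->size / 4 * 3;
}

/*
 * Racy size selection: the load is sampled here, at the time of the
 * call, not at the point where the caller first saw the table as full.
 */
static unsigned int toy_pick_size(const struct toy_table *t)
{
	return toy_above_75(t) ? t->size * 2 : t->size;
}

/*
 * Replay one fixed interleaving of two inserters, A and B.
 * Returns the table size that B ends up rehashing to.
 */
static unsigned int toy_replay_race(struct toy_table *t)
{
	/* Both A and B notice the table is too full and want to grow. */
	bool a_wants_grow = toy_above_75(t);
	bool b_wants_grow = toy_above_75(t);

	/* A runs first and doubles the table. */
	if (a_wants_grow)
		t->size = toy_pick_size(t);

	/*
	 * B runs second. It re-samples the load against the table A just
	 * grew, sees no pressure, and picks the current size: a same-size
	 * rehash instead of the growth it set out to do.
	 */
	if (b_wants_grow)
		return toy_pick_size(t);

	return t->size;
}
```

Starting from 16 buckets holding 13 elements, A doubles the table to 32
buckets, and B then also selects 32, i.e. a rehash at the current size
rather than a further grow. One way to avoid this in the toy model would
be to hand B's decision the table it originally observed instead of
re-reading the shared state; whether the real fix takes that shape is not
stated in this mail.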

I'm working on a fix.

Thanks,
--
Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt