Re: [announce] "kill the Big Kernel Lock (BKL)" tree
From: Arjan van de Ven
Date: Thu May 15 2008 - 17:22:38 EST
On Thu, 15 May 2008 22:45:55 +0200
Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> On Thu, 2008-05-15 at 13:27 -0700, Arjan van de Ven wrote:
> > On Thu, 15 May 2008 10:41:54 -0700 (PDT)
> > Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > >
> > > So looking a bit more at your trivial fixups, I'd suggest strongly
> > > that they be re-organized a bit.
> >
> > > >
> > > > diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> > > > index 6eab9bf..e12e571 100644
> > > > --- a/net/sunrpc/sched.c
> > > > +++ b/net/sunrpc/sched.c
> > > > @@ -224,9 +224,15 @@ EXPORT_SYMBOL_GPL(rpc_destroy_wait_queue);
> > > >
> > > > static int rpc_wait_bit_killable(void *word)
> > > > {
> > > > + int bkl = kernel_locked();
> > > > +
> > > > if (fatal_signal_pending(current))
> > > > return -ERESTARTSYS;
> > > > + if (bkl)
> > > > + unlock_kernel();
> > > > schedule();
> > > > + if (bkl)
> > > > + lock_kernel();
> > > > return 0;
> > > > }
> > >
> > > The above doesn't even work in general. It depends on having just
> > > a single level of locking, and is ugly to boot. So wow about we
> > > just expose some version of
> > >
> > > depth = release_kernel_lock()
> > > ..
> > > reacquire_kernel_lock(depth);
> > >
> > > to existing BKL users as a way to safely release and re-aquire it
> > > regardless of depth. That makes the code more generic, but it
> > > *also* makes it more readable than that "if (bkl)
> > > [un]lock_kernel()" sequence.
> >
> >
> > can we make this even more specific/restricted? Like having
> > something like
> >
> > call_bkl_unlocked(function_pointer, argument);
> >
> > or something that will internally do the full unlock and then the
> > function call. The last thing we need is another nailgun that BKL
> > using code can use to staple themselves to something big and fast
> > moving. By having a more restricted interface... less likely.
> > Maybe we can even get away with only a
> >
> > drop_bkl_and_schedule();
> >
> > and nothing else.
>
> No, that would defeat the whole purpose of the exercise. This drop on
> schedule property makes it possible to have inverse lock order and not
> deadlock.
I would totally agree with you, except that all these patches
effectively do it manually again ANYWAY :(
so what I propose is make it explicit drop_bkl_and_schedule() call only,
and only do them as a very very last resort.
For 99% of the rest it does give exactly the regular benefits you
describe. And we can then prioritize these ugly cases to get de-bkl'd
first.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/