Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

From: Ingo Molnar
Date: Wed Feb 03 2016 - 04:02:21 EST



* Sergey Senozhatsky <sergey.senozhatsky.work@xxxxxxxxx> wrote:

> On (02/03/16 09:04), Ingo Molnar wrote:
> > * Sergey Senozhatsky <sergey.senozhatsky.work@xxxxxxxxx> wrote:
> >
> > > On (02/03/16 08:28), Ingo Molnar wrote:
> > > [..]
> > > > So why not move printk away from semaphores? Semaphores are classical constructs
> > > > that have legacies and are somewhat non-obvious to use, compared to modern,
> > > > simpler locking primitives. I'd not touch their implementation, unless we are
> > > > absolutely sure this is a safe optimization.
> > >
> > > semaphore's spin_lock is not the only spin lock that printk acquires. it also
> > > takes the logbuf_lock (and various locks in console drivers, depending on the
> > > console driver).
> > >
> > > Jan Kara posted a patch that offloads printing job
> > > (console_trylock()-console_unlock()) from printk() call (when printk can offload
> > > it). so semaphore and console driver's locks will go away (mostly) with Jan's
> > > patch. logbuf spin_lock, however, will stay.
> >
> > Well, but this patch of yours only affects the semaphore code, so it does not
> > change the logbuf_lock situation.
>
> yes, correct. I just said for the info that there is already 'move printk away from
> console_sem' work in progress. Well, the reason for that work is entirely different,
> though, but this console_sem recursion and console driver's lock recursion can be
> 'fixed as a side effect'.
>
> > Furthermore, logbuf_lock already has recursion protection:
> >
> >         /*
> >          * Ouch, printk recursed into itself!
> >          */
> >         if (unlikely(logbuf_cpu == this_cpu)) {
>
> it's good, no doubt. but it doesn't work in all of the cases. a simple one is:
>
> vprintk_emit()
>         ...
>         raw_spin_lock(&logbuf_lock);
>         logbuf_cpu = this_cpu;

Yes, I'm aware of that, and as I said:

> > (There are other ways to get the logbuf_lock - if those are still triggerable
> > then they should be fixed.)

The proper way to fix it would be to factor out the recursion-safe logbuf_lock
taking code into logbuf_lock()/logbuf_unlock() primitives and use those
consistently in printk.c.

>         ...
>         logbuf_cpu = UINT_MAX;
>         raw_spin_unlock(&logbuf_lock);   << SPIN_BUG_ON
>         ...
>
> if raw_spin_unlock() calls SPIN_BUG_ON, then logbuf_lock recursion detection can't
> help. we recurse into vprintk_emit() with logbuf_lock locked and logbuf_cpu != this_cpu.

If recursion-safe logbuf_lock taking is factored out and extended to other
functions in printk.c then this problem is solved.
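
To make the idea concrete, here is a rough userspace sketch of what such a
pair of primitives could look like. It is not the printk.c code and not a
proposed patch: all demo_* names are hypothetical, a C11 atomic_flag stands in
for the raw spinlock, and a thread-local flag stands in for per-CPU recursion
state. The one design assumption worth noting is that the recursion flag is
set before the lock is taken and cleared only after it is released, so a
message generated from inside the unlock path would still be seen as
recursion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for logbuf_lock (hypothetical userspace sketch). */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Stand-in for per-CPU recursion state: set while this context is
 * anywhere between the start of lock and the end of unlock. */
static _Thread_local bool demo_in_section;

static int demo_emitted;  /* messages stored under the lock */
static int demo_dropped;  /* messages refused because of recursion */

/* Returns false, without spinning, if this context already holds
 * (or is in the middle of taking/releasing) the lock. */
static bool demo_logbuf_lock(void)
{
    if (demo_in_section)
        return false;
    demo_in_section = true;
    while (atomic_flag_test_and_set_explicit(&demo_lock,
                                             memory_order_acquire))
        ; /* spin until the flag is free */
    return true;
}

static void demo_logbuf_unlock(void)
{
    assert(demo_in_section);
    atomic_flag_clear_explicit(&demo_lock, memory_order_release);
    /* Cleared last, so the release itself is still covered. */
    demo_in_section = false;
}

/* A toy emit path: depth > 0 simulates an error path that tries to
 * print again while the lock is held. */
static void demo_emit(int depth)
{
    if (!demo_logbuf_lock()) {
        demo_dropped++;   /* fallback path: do not deadlock */
        return;
    }
    demo_emitted++;
    if (depth > 0)
        demo_emit(depth - 1);
    demo_logbuf_unlock();
}
```

In this sketch demo_emit(1) stores one message and refuses the nested one
instead of spinning on a lock it already owns. What the fallback path should
really do with the refused message (drop it, buffer it, etc.) is left open
here.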

> Peter Hurley also posted the following case (I'll quote):
>
> serial8250_do_set_termios()
>   spin_lock_irqsave()    ** claim port lock **
>   ...
>   serial_port_out(port, UART_LCR, ....);
>     dw8250_serial_out()
>       dev_err()
>         vprintk_emit()
>           console_trylock()
>           call_console_drivers()
>             serial8250_console_write()
>               spin_lock_irqsave()    ** port lock **
>                 ** DEADLOCK **

That too seems to be avoided if vprintk_emit() does not take logbuf_lock naively.

So I repeat my point:

> > In any case, recursion protection is generally done in the debugging facilities
> > trying to behave lockless.

I.e. to address these deadlocks, printk() should be made fundamentally more
robust, extending its already existing recursion protection logic to remaining
parts of printk.c.

That was my expectation when I committed the first variant of the printk()
recursion protection code a few years ago; it just never happened.

Thanks,

Ingo