Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes
From: Steven Rostedt
Date: Fri Nov 03 2017 - 07:21:30 EST
On Thu, 2 Nov 2017 21:09:32 -0700
John Hubbard <jhubbard@xxxxxxxxxx> wrote:
> On 11/02/2017 03:16 PM, Vlastimil Babka wrote:
> > On 11/02/2017 06:45 PM, Steven Rostedt wrote:
> > ...> __DEVKMSG_LOG_BIT_ON = 0,
> >> __DEVKMSG_LOG_BIT_OFF,
> >> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> >> * semaphore. The release will print out buffers and wake up
> >> * /dev/kmsg and syslog() users.
> >> */
> >> - if (console_trylock())
> >> + if (console_trylock()) {
> >> console_unlock();
> >> + } else {
> >> + struct task_struct *owner = NULL;
> >> + bool waiter;
> >> + bool spin = false;
> >> +
> >> + printk_safe_enter_irqsave(flags);
> >> +
> >> + raw_spin_lock(&console_owner_lock);
> >> + owner = READ_ONCE(console_owner);
> >> + waiter = READ_ONCE(console_waiter);
> >> + if (!waiter && owner && owner != current) {
> >> + WRITE_ONCE(console_waiter, true);
> >> + spin = true;
> >> + }
> >> + raw_spin_unlock(&console_owner_lock);
> >> +
> >> + /*
> >> + * If there is an active printk() writing to the
> >> + * consoles, instead of having it write our data too,
> >> + * see if we can offload that load from the active
> >> + * printer, and do some printing ourselves.
> >> + * Go into a spin only if there isn't already a waiter
> >> + * spinning, and there is an active printer, and
> >> + * that active printer isn't us (recursive printk?).
> >> + */
> >> + if (spin) {
> >> + /* We spin waiting for the owner to release us */
> >> + spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
> >> + /* Owner will clear console_waiter on hand off */
> >> + while (!READ_ONCE(console_waiter))
> >
> > This should not be negated, right? We should spin while it's true, not
> > false.
> >
>
> Vlastimil's right about the polarity problem above, but while I was trying
> to verify that, I noticed another problem: the "handoff" of the console lock
> is broken.
>
> For example, if there are 3 or more threads, you can do the following:
>
> thread A: holds the console lock, is printing, then moves into the console_unlock
> phase
>
> thread B: goes into the waiter spin loop above, and (once the polarity is corrected)
> waits for console_waiter to become 0
>
> thread A: finishing up, sets console_waiter --> 0
>
> thread C: before thread B notices, thread C goes into the "else" section, sees that
> console_waiter == 0, and sets console_waiter --> 1. So thread C now
> becomes the waiter
But console_waiter only gets set to 1 if console_waiter is 0 *and*
console_owner is not NULL and is not current. console_owner is only
updated under a spin lock and console_waiter is only set under a spin
lock when console_owner is not NULL.
This means this scenario can not happen.
>
> thread B: gets *very* unlucky and never sees the 1 --> 0 --> 1 transition of
> console_waiter, so it continues waiting. And now we have both B
> and C in the same spin loop, and this is now broken.
>
> At the root, this is really due to the absence of a pre-existing "hand-off this lock"
> mechanism. And this one here is not quite correct.
>
> Solution ideas: for a true hand-off, there needs to be a bit more information
> exchanged. Conceptually, a (lock-protected) list of waiters (which would
> only ever have zero or one entries) is a good way to start thinking about it.
As stated above, the console owner check will prevent this issue.
-- Steve
>
> I talked it over with Mark Hairgrove here, he suggested a more sophisticated
> way of doing that sort of hand-off, using compare-and-exchange. I can turn that
> into a patch if you like (I'm not as fast as some folks, so I didn't attempt to
> do that right away), although I'm sure you have lots of ideas on how to do it.
>