Re: [PATCH] printk: fix one circular lockdep warning about console_lock

From: Daniel Vetter
Date: Tue Feb 11 2014 - 16:35:38 EST


Aside: I've suggested this fix (but was too lazy to write the actual
patch), hence also my sob on it.

On Tue, Feb 11, 2014 at 10:19 PM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -1893,6 +1893,20 @@ void resume_console(void)
>> }
>>
>> /**
>> + * console_flush - flush dmesg if console isn't suspended
>> + *
>> + * console_unlock always flushes the dmesg buffer, so just try to
>> + * grab&drop the console lock. If that fails we know that the current
>> + * holder will eventually drop the console lock and so flush the dmesg
>> + * buffers at the earliest possible time.
>> + */
>
> The comment should describe why we added this code, please: talk about
> cpu_hotplug.lock and console_lock.

I've suggested this since it's the generic way to flush out dmesg. The
printk functions also use this trick, but with the complication that
console_trylock_for_printk also drops the logbuf log. So imo this is a
general helper to flush the console buffer in odd circumstances and
not specific to the cpu lockdep issue at hand. At least my idea was
that this could be useful to untangle other issues around
console_lock.

>> +void console_flush(void)
>> +{
>> + if (console_trylock())
>> + console_unlock();
>> +}
>> +
>> +/**
>> * console_cpu_notify - print deferred console messages after CPU hotplug
>> * @self: notifier struct
>> * @action: CPU hotplug event
>> @@ -1911,8 +1925,7 @@ static int console_cpu_notify(struct notifier_block *self,
>> case CPU_DEAD:
>> case CPU_DOWN_FAILED:
>> case CPU_UP_CANCELED:
>> - console_lock();
>> - console_unlock();
>> + console_flush();
>> }
>> return NOTIFY_OK;
>
> Well, this is a bit hacky and makes the already-far-too-complex code
> even more complex. If it is indeed the case that the deadlock cannot
> really occur then let's try to find a way of suppressing the lockdep
> warning without making runtime changes.
>
> What I'm struggling with is what *should* the ranking of these locks be?
> From a conceptual high-level design standpoint, which is the
> "innermost" lock? I tend to think that it is console_lock, because
> blocking CPU hotplug is a quite high-level operation.
>
> But console_lock is such a kooky special-case in the way it is used to
> control the printk corking that it is hard to take general rules and
> apply them here.

console_lock is one giant disaster imo, at least from my experience
with it's interactions in fbdev/drm/gpu code - due to bonghits around
how the fbcon code is structured we need to run the entire initial
modeset under the console_lock, which means if anything goes wrong we
can't even dump to net/serial console.

Hence why I've proposed the trylock "fix", prettied up in the
console_flush helper to duct-tape over the lockdep splat at hand. It
actually achieves exactly what the code sets out to do (flush dmesg as
soon as possible) without extending the reach of insane console_lock
interactions into even more code. Because that way lies madness, at
least when I look at our vain attempts to make fbcon and drm work
together properly ...

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/