Re: locking API: was: [PATCH printk v1 00/18] serial: 8250: implement non-BKL console

From: John Ogness
Date: Tue Mar 28 2023 - 17:49:04 EST


On 2023-03-28, Petr Mladek <pmladek@xxxxxxxx> wrote:
>> If an atomic context loses ownership while doing certain activities,
>> it may need to re-acquire ownership in order to finish or cleanup
>> what it started.
>
> This sounds suspicious. If a console/writer context has lost the lock
> then all shared/locked resources might already be used by the new
> owner.

Correct.

> I would expect that the context could touch only non-shared resources
> after loosing the lock.

Correct.

> If it re-acquires the lock then the shared resource might be in
> another state. So, doing any further changes might be dangerous.

That is the responsibility of the driver to implement it safely if it is
needed.

> I could imagine that incrementing/decrementing some counter might
> make sense but setting some value sounds strange.

The 8250 driver must disable interrupts before writing to the TX
FIFO. After writing it re-enables the interrupts. However, it might be
the case that the interrupts were already disabled, in which case after
writing they are left disabled.

IOW, whatever context disabled the interrupts is the context that is
expected to re-enable them. This simple rule makes it really easy to
handle nested printing because a context is only concerned with
restoring the IER state that it originally saw.

Using counters or passing around interrupt re-enabling responsibility
would be considerably trickier.

>>> And why we need to reacquire it?
>>
>> In this particular case the context has disabled interrupts. No other
>> context will re-enable interrupts because the driver is implemented
>> such that the one who disables is the one who enables. So this
>> context must re-acquire ownership in order to re-enable interrupts.
>
> My understanding is that the driver might lose the lock only
> during hostile takeover. Is it safe to re-enable interrupts
> in this case?

Your understanding is incorrect. If a more important outputting context
should arrive, the non-important outputting context will happily and
kindly handover to the higher priority. From the perspective of the
atomic console driver, it lost ownership.

Simple example: The kthread printer is printing and some WARN_ON() is
triggered on another CPU. The warning will be output at a higher
priority and print from the context/CPU of the WARN_ON(). The kthread
printer will lose its ownership by handing over to the warning CPU.

Note that we are _not_ talking about when the unsafe bit is set. We are
talking about a printer that owns the console, is in a safe section, and
loses ownership. If that context was the one that disabled interrupts,
it needs to re-acquire the console in order to safely re-enable the
interrupts. The context that tookover ownership saw that interrupts are
disabled and does _not_ re-enable them when it is finished printing.

John