Re: locking API: was: [PATCH printk v1 00/18] serial: 8250: implement non-BKL console

From: Petr Mladek
Date: Tue Mar 28 2023 - 11:12:39 EST


On Tue 2023-03-28 16:03:36, John Ogness wrote:
> On 2023-03-28, Petr Mladek <pmladek@xxxxxxxx> wrote:
> >> + if (!__serial8250_clear_IER(up, wctxt, &ier))
> >> + return false;
> >> +
> >> + if (console_exit_unsafe(wctxt)) {
> >> + can_print = atomic_print_line(up, wctxt);
> >> + if (!can_print)
> >> + atomic_console_reacquire(wctxt, &wctxt_init);
> >
> > I am trying to review the 9th patch adding console_can_proceed(),
> > console_enter_unsafe(), console_exit_unsafe() API. And I wanted
> > to see how the struct cons_write_context was actually used.
>
> First off, I need to post the latest version of the 8250-POC patch. It
> is not officially part of this series and is still going through changes
> for the PREEMPT_RT tree. I will post the latest version directly after
> answering this email.

Sure. I know that it is just a kind of POC.

> > I am confused now. I do not understand the motivation for the extra
> > @wctxt_init copy and atomic_console_reacquire().
>
> If an atomic context loses ownership while doing certain activities, it
> may need to re-acquire ownership in order to finish or cleanup what it
> started.

This sounds suspicious. If a console/writer context has lost the lock
then all shared/locked resources might already be used by the new
owner.

I would expect that the context could touch only non-shared resources after
loosing the lock.

If it re-acquires the lock then the shared resource might be in
another state. So, doing any further changes might be dangerous.

I could imagine that incrementing/decrementing some counter might
make sense but setting some value sounds strange.


> > Why do we need a copy?
>
> When ownership is lost, the context is cleared. In order to re-acquire,
> an original copy of the context is needed. There is no technical reason
> to clear the context, so maybe the context should not be cleared after a
> takeover. Otherwise, many drivers will need to implement the "backup
> copy" solution.

It might make sense to clear values that are not longer valid, e.g.
some state values or .len of the buffer. But I would keep the values
that might still be needed to re-acquire the lock. It might be
needed when the context want to re-start the entire operation,

I guess that you wanted to clean the structure to catch potential
misuse. It makes some sense but the copying is really weird.

I think that we might/should add some paranoid checks into all
functions manipulating the shared state instead.


> > And why we need to reacquire it?
>
> In this particular case the context has disabled interrupts. No other
> context will re-enable interrupts because the driver is implemented such
> that the one who disables is the one who enables. So this context must
> re-acquire ownership in order to re-enable interrupts.

My understanding is that the driver might lose the lock only
during hostile takeover. Is it safe to re-enable interrupts
in this case?

Well, it actually might make sense if the interrupts should
be enabled when the port is unused.

Well, I guess that they will get enabled by the other hostile
owner. It should leave the serial port in a good state when
it releases the lock a normal way.

Anyway, thanks a lot for the info. I still have to scratch my
head around this.

Best Regards,
Petr