Re: [RFC 2/2] printk: Add more information about the printk caller
From: Petr Mladek
Date: Fri Sep 25 2020 - 06:20:52 EST
On Fri 2020-09-25 09:54:00, Sergey Senozhatsky wrote:
> On (20/09/24 15:38), Petr Mladek wrote:
> [..]
> >
> > Grrrr, I wonder why I thought that in_irq() covered also the situation
> > when IRQ was disabled. It was likely my wish because disabled
> > interrupts are problem for printk() because the console might
> > cause a softlockup.
>
> preempt_disable() can also trigger softlockup.
>
> > in_irq() actually behaves like in_serving_softirq().
> >
> > I am confused and puzzled now. I wonder what contexts are actually
> > interesting for developers. It goes back to the ideas from Sergey
> > about preemption disabled, ...
>
> Are we talking about context tracking for LOG_CONT or context on
> the serial console and /dev/kmsg?

OK, it is clear that LOG_CONT needs to know when different code
suddenly starts running. I mean task code vs. an interrupt handler.

But that was actually also the original purpose of the caller_id.
AFAIK, people wanted to sort related messages when they were mixed
with ones from other CPUs.
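
For reference, a rough userspace sketch of how the caller_id keeps task and
non-task callers apart so that mixed lines can be sorted. It is modeled on
the "T<pid>"/"C<cpu>" prefix printed with CONFIG_PRINTK_CALLER. The helper
names make_caller_id() and format_caller_id() are made up for this example,
and the pid/cpu values are passed in by hand because there is no "current"
outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Top bit marks "not task context"; the low bits then hold a CPU number. */
#define CALLER_ID_CPU_FLAG 0x80000000u

/* Hypothetical stand-in for the kernel's caller id computation. */
static uint32_t make_caller_id(bool in_task, uint32_t pid, uint32_t cpu)
{
	return in_task ? pid : (CALLER_ID_CPU_FLAG + cpu);
}

/* Render the id as "T<pid>" for task context or "C<cpu>" otherwise. */
static int format_caller_id(char *buf, size_t len, uint32_t id)
{
	return snprintf(buf, len, "%c%u",
			(id & CALLER_ID_CPU_FLAG) ? 'C' : 'T',
			(unsigned int)(id & ~CALLER_ID_CPU_FLAG));
}
```

Note that every non-task context on a given CPU collapses into the same
"C<cpu>" id here, which is exactly why more context information was
proposed.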
> If the latter, then my 5 cents, is that something like preemptible(),
> which checks
>
> (preempt_count() == 0 && !irqs_disabled())
>
> does not look completely unreasonable.
>
> We had a rather OK context tracking in printk() before, but for a
> completely different purpose:
>
> 	console_may_schedule = !oops_in_progress &&
> 			preemptible() &&
> 			!rcu_preempt_depth();
>
> We know that printk() can cause RCU stalls [0]. Tracking this part
> of the context state is sort of meaningful.
>
> Let's look at this from this POV - why do we add in_irq()/etc tracking
> info? Perhaps because we want to connect the dots between printk() caller
> state and watchdog reports. Do we cover all watchdogs? No, I don't think
> so. RCU stalls, local_irq_disable(), preempt_disable() are not covered.

I agree that it would be handy to see this context as well. It might
make it easier to hunt down various lockups and stalls. But
I have some concerns.

First, the information is not always reliable (PREEMPT_NONE). I wonder
if it might cause more harm than good. People might get confused
or they might want to fix it with some crazy printk code.

Second, the information might not be detailed enough. Many lockups
depend on the fact that a particular lock is held. Backtraces
are likely more important. Or people would need to distinguish
many contexts. That would require yet more complex code.

I am not sure that this is worth it. After all, it might be enough
to distinguish the 4 basic contexts just to allow sorting mixed
messages.
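
To make "the 4 basic contexts" concrete, here is a minimal userspace
sketch, assuming they are task, softirq, hardirq and NMI. The masks are
modeled on the preempt_count() layout in include/linux/preempt.h around
v5.9 and are only an assumption here. The enum and classify_context() are
names invented for this example, not existing printk code.

```c
#include <stdint.h>

/* Assumed preempt_count()-style bit layout, for illustration only. */
#define PREEMPT_MASK   0x000000ffu	/* preempt_disable() nesting depth */
#define SOFTIRQ_MASK   0x0000ff00u	/* softirq serving + bh-disable depth */
#define HARDIRQ_MASK   0x000f0000u	/* hardirq nesting depth */
#define NMI_MASK       0x00100000u	/* set while handling an NMI */
#define SOFTIRQ_OFFSET 0x00000100u	/* set only while serving a softirq */

enum printk_ctx { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ, CTX_NMI };

/* Pick the innermost of the four basic contexts from a count word. */
static enum printk_ctx classify_context(uint32_t pc)
{
	if (pc & NMI_MASK)
		return CTX_NMI;
	if (pc & HARDIRQ_MASK)
		return CTX_HARDIRQ;
	if (pc & SOFTIRQ_OFFSET)	/* serving a softirq, not just bh disabled */
		return CTX_SOFTIRQ;
	/* Preemption or bh disabled still counts as plain task context. */
	return CTX_TASK;
}
```

In this sketch a caller with preemption disabled is still reported as
task context, and disabled interrupts are not visible in the count word
at all. That is the information Sergey would like to see, and it would
need extra checks on top of this.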
Best Regards,
Petr