Re: [PATCH] printk: Ratelimit messages printed by console drivers

From: Sergey Senozhatsky
Date: Thu Apr 19 2018 - 22:15:20 EST


On (04/19/18 14:53), Petr Mladek wrote:
> > > >
> > > > Besides, 100 lines is absolutely not enough for any real lockdep splat.
> > > > My call would be - up to 1000 lines in a 1 minute interval.
>
> But this would break the intention of this patch.

You picked an arbitrary value and now you are saying that any other
value will not work?

> Come on guys! The first reaction how to fix the infinite loop was
> to fix the console drivers and remove the recursive messages. We are
> talking about messages that should not be there or they should
> get replaced by WARN_ONCE(), printk_once() or so. This patch only
> gives us a chance to see the problem and does not blow up immediately.
>
> I am fine with increasing the number of lines. But we need to keep
> the timeout long. In fact, 1 hour is still rather short from my POV.

Disagree.

I saw 3 or 4 lockdep reports coming from console drivers. "100 lines"
is way too restrictive. I want to have a complete report; not the first
50 lines, not the first 103 lines, which would merely "hint" to me that
"hey, there is something wrong there, but you are on your own to figure
out the rest".
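
To put numbers on this, here is a minimal user-space sketch of an
interval/burst limit. It is not the kernel's ratelimit code: struct
rl_state, rl_allow() and rl_feed() are made-up names, and time is just a
counter in seconds. It only shows how a report longer than the burst
gets cut off.

```c
#include <stdbool.h>

/* Hypothetical user-space model of an interval/burst rate limit. */
struct rl_state {
	long interval;	/* window length, in seconds */
	int burst;	/* lines allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* lines accepted in the current window */
	int missed;	/* lines dropped in the current window */
};

/* Returns true when one more line may be printed at time 'now'. */
bool rl_allow(struct rl_state *rs, long now)
{
	/* Open a new window once the interval has elapsed. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/* Feed 'lines' lines at the same instant; return how many get through. */
int rl_feed(struct rl_state *rs, int lines, long now)
{
	int passed = 0;

	for (int i = 0; i < lines; i++)
		if (rl_allow(rs, now))
			passed++;

	return passed;
}
```

For an assumed 300-line report, rl_feed() on a state initialized with
{ .interval = 3600, .burst = 100 } lets only the first 100 lines through,
while { .interval = 60, .burst = 1000 } lets the whole report through.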

> > > Well, if we want to basically turn printk_safe() into printk_safe_ratelimited().
> > > I'm not so sure about it.
>
> No, it is not about printk_safe(). The ratelimit is active when
> console_owner == current. It triggers when printk() is called
> inside

"console_owner == current" is exactly the point when we call console
drivers and add scheduler, networking, timekeeping, etc. locks to the
picture. And so far all of the lockdep reports that we had were from
call_console_drivers(). So it very much is about printk_safe().
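
A rough model of the condition being discussed, assuming the limit is
checked only while the printing context owns the console. This is a
user-space sketch, not the patch itself: struct task, struct ctx_limit
and should_store() are hypothetical, and console_owner here is a plain
global that merely stands in for the printk one.

```c
#include <stdbool.h>
#include <stddef.h>

struct task { int id; };	/* stand-in for a task; hypothetical */

/* Models printk's console_owner; NULL when nobody calls the drivers. */
struct task *console_owner;

struct ctx_limit {
	int burst;	/* lines allowed while owning the console */
	int printed;	/* lines accepted so far */
};

/* Returns true when a message from 'current' should be stored. */
bool should_store(struct ctx_limit *lim, const struct task *current)
{
	/* Contexts that do not own the console are never limited. */
	if (console_owner != current)
		return true;

	/* Messages printed from inside the console drivers are limited. */
	if (lim->printed < lim->burst) {
		lim->printed++;
		return true;
	}

	return false;
}
```

In this model only messages printed by the console owner are ever
dropped, and that is the same path the lockdep reports came from.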

> > > Besides, the patch also rate limits the printk_nmi->logbuf path - the
> > > PRINTK_NMI_DEFERRED_CONTEXT_MASK bypass to the logbuf - which is way
> > > too important to rate limit for no reason.
>
> Again. It has the effect only when console_owner == current. It means
> that it affects "only" NMIs that interrupt console_unlock() when calling
> console drivers.

What is your objection here? NMIs can come anytime.

> > One more thing,
> > I'd really prefer to rate limit the function which flushes per-CPU
> > printk_safe buffers; not the function that appends new messages to
> > the per-CPU printk_safe buffers.
>
> I wonder if this opinion is still valid after explaining the
> dependency on printk_safe(). In any case, it sounds weird
> to block printk_safe buffers with some "unwanted" messages.
> Or maybe I am missing something.

I'm not following.

The fact that some consoles under some circumstances can add unwanted
messages to the buffer does not look like a good enough reason to start
rate limiting _all_ messages and to potentially discard the _important_
ones.
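
The difference between limiting the append side and limiting the flush
side can be sketched with a toy model. Everything here is hypothetical
(struct safe_buf, BUF_LINES, the function names); it counts lines only
and does not reflect the real per-CPU printk_safe buffers.

```c
#define BUF_LINES 512	/* assumed buffer capacity, in lines */

struct safe_buf {
	int pending;	/* lines stored and not yet flushed */
	int lost;	/* lines that were discarded */
};

/* Variant A: limit at append time; lines over 'burst' are never stored. */
void append_limited(struct safe_buf *b, int lines, int burst)
{
	for (int i = 0; i < lines; i++) {
		if (i < burst && b->pending < BUF_LINES)
			b->pending++;
		else
			b->lost++;
	}
}

/* Variant B: store everything that fits; only the flush is limited. */
void append_unlimited(struct safe_buf *b, int lines)
{
	for (int i = 0; i < lines; i++) {
		if (b->pending < BUF_LINES)
			b->pending++;
		else
			b->lost++;
	}
}

/* Flush at most 'burst' lines per call; returns the number flushed. */
int flush_limited(struct safe_buf *b, int burst)
{
	int n = b->pending < burst ? b->pending : burst;

	b->pending -= n;
	return n;
}
```

With an assumed 300-line report and a burst of 100, variant A discards
200 lines at append time. Variant B keeps all 300 in the buffer and
flush_limited() hands them out 100 at a time, so nothing is lost as long
as the buffer does not overflow.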

-ss