Re: printk: what is going on with additional newlines?

From: Petr Mladek
Date: Wed Sep 06 2017 - 03:56:07 EST


On Tue 2017-09-05 22:42:28, Sergey Senozhatsky wrote:
> On (09/05/17 14:21), Petr Mladek wrote:
> [..]
> > > that's why I want buffered printk to re-use the printk-safe buffer
> > > on that particular CPU [ if buffered printk will ever land ].
> > > printk-safe buffer is not allocated on stack, or kmalloc-ed for
> > > temp usage, and, more importantly, we flush it from panic().
> > >
> > > and I'm not sure that lost messages due to missing panic flush()
> > > can really be an option even for a single cont line buffer. well,
> > > maybe it can. printk has a sort of guarantee that messages will
> > > be at some well known location when pr_foo or printk function
> > > returns. buffered printk kills it. and I don't want to have
> > > several "flavors" of printk. printk-safe buffer seems to be the
> > > way to preserve that guarantee.
> >
> > But the well known locations would help only when they are flushed
> > in panic() or when a crashdump is created. They do not help
> > in other cases, especially where there is a sudden death.
>
> if the system locked up and there is no panic()->flush_on_panic(),
> no console_unlock(), crashdump, no nothing - then even having
> messages in the logbuf is probably not really helpful. you can't
> reach them anyway :)
> so yes, I'm speaking here about the cases when we flush_on_panic()
> or/and generate crash dump.

Why are we so paranoid about a locked-up system when
discussing the console handling offload (printk kthread)?
Why should we be more relaxed when talking about pushing
messages from extra buffers?


> > There are many fears that printk offloading does not have enough
> > guarantees to actually happen. IMHO, there must be similar fears
> > that the messages in a temporary buffer will never get flushed.
> >
> > And there are more risks with this approach:
> >
> > + soft-lockups caused by disabled preemption; we would
> > need this to stay on the same CPU and use the same buffer
>
> well, yes. like any control path that disables IRQs there are
> rules to follow. so printk-safe based solution has limitations.
> I mentioned them probably every time I speak about printk-safe
> buffering. but those limitations come with a bonus - flush on
> panic and well known location of the messages.
>
> one thing to notice, is that
> printk-safe is usually faster than printk() or at least as fast as
> the fastest printk() path. because, unlike printk, it does not
> spin on the logbuf lock; it does not console_trylock(), it does not
> do console_unlock().
>
>
> > + broken preempt-count and missing message when one forgets
> > to close the buffered section or do it twice
>
> yes, coding errors are possible.
>
>
> > + lost messages because of per-CPU buffer size limitations
>
> which is true for any type of buffers. including logbuf. and
> stack allocated buffers, any buffer. printk-safe buffer is at
> least much-much bigger than any stack allocated buffer.
>
>
> > + races in printk_safe(), which is not recursion-safe
> >
> > + not to mention the problems raised by Linus in his reply
> > to Tetsuo's proposal, see
> > https://lkml.kernel.org/r/CA+55aFx+5R-vFQfr7+Ok9Yrs2adQ2Ma4fz+S6nCyWHY_-2mrmw@xxxxxxxxxxxxxx
>
> like "limited in where you can actually expect buffering to happen"?
>
> sure. it does not come for free, it's not all beautiful and shiny.

It is great that we see the risks and limitations.

>
> [..]
> > I wonder if all this is worth the effort, complexity, and risk.
> > We are talking about cosmetic problems after all.
>
> the thing about printk-safe buffering is that _mostly_ everything
> is already in the kernel. especially if we talk about single cont
> line buffering. just add public API printk_buffering_begin() and
> printk_buffering_end() that will __printk_safe_enter() and
> __printk_safe_exit(). and that's it. unless I'm missing something.
>
> but I'm not super eager to have printk-safe based buffering.
> that's why I never posted a patch set. this approach has its
> limitations.

Ah, I am happy to read this. From the previous mails,
I got the feeling that you were eager to go this way.

I personally do not feel comfortable with taking all the risks
and limitations just to avoid mixed messages.

To be more precise, I am more and more pessimistic about
getting a safe buffer-based solution for multiple lines.

Well, it might make some sense for continuation (cont) lines. The
entire line should get printed within a few lines of code
and limited time. Otherwise people could hardly expect
to see the pieces together. Then all the above risks and
limitations might be small and acceptable.
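
To make the trade-off concrete, here is a rough userspace model of the
begin/end idea quoted above. It is only a sketch: every model_* name and
MODEL_BUF_SIZE are made up for illustration and none of this is kernel
API. It shows the three points raised in the list of risks: text stays
out of the main log until the section ends (or an explicit flush),
messages are lost when the fixed buffer is full, and an unbalanced end
is a coding error.

```c
/* Userspace model only: none of these names are kernel API. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_BUF_SIZE 128   /* hypothetical per-CPU buffer size */

struct model_cpu_buf {
	char data[MODEL_BUF_SIZE];
	size_t used;        /* bytes currently buffered */
	unsigned int depth; /* nesting level of begin/end sections */
	unsigned int lost;  /* messages dropped because the buffer was full */
};

/* Stand-in for the main log buffer. */
static char model_logbuf[1024];

static void model_emit(const char *s)
{
	strncat(model_logbuf, s, sizeof(model_logbuf) - strlen(model_logbuf) - 1);
}

static void model_buffering_begin(struct model_cpu_buf *b)
{
	b->depth++;
}

static void model_printk(struct model_cpu_buf *b, const char *s)
{
	size_t len = strlen(s);

	if (!b->depth) {		/* not buffering: store right away */
		model_emit(s);
		return;
	}
	if (len > MODEL_BUF_SIZE - 1 - b->used) {	/* no room: message is lost */
		b->lost++;
		return;
	}
	memcpy(b->data + b->used, s, len);
	b->used += len;
	b->data[b->used] = '\0';
}

/* A panic path would also have to call this to rescue pending text. */
static void model_flush(struct model_cpu_buf *b)
{
	if (b->used)
		model_emit(b->data);
	b->used = 0;
	b->data[0] = '\0';
}

static void model_buffering_end(struct model_cpu_buf *b)
{
	assert(b->depth > 0);	/* calling end twice is a coding error */
	if (--b->depth == 0)
		model_flush(b);
}
```

With a zero-initialized struct model_cpu_buf, calling begin, two
model_printk() calls and end leaves the joined text in model_logbuf only
after the end call. The model ignores preemption and per-CPU placement,
which is where the real difficulties are.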


> > Well, what do you think about the extra printed information?
> > For example:
> >
> > <timestamp> <PID> <context> message
> >
> > It looks straightforward to me. This information
> > might be helpful on its own. So, it might be a
> > win-win solution.
>
> hm... don't know. frankly, I never found PID useful. I mostly look
> at the serial logs postmortem. so lines
> 12231 foo
> 21331 bar
>
> are not much better than just
> foo
> bar

Sure, the main intention is to allow grepping.
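
As a rough illustration of what "allow grepping" means here, the sketch
below formats a line as "<timestamp> <PID> <context> message" and then
picks lines by PID. It is a userspace model under assumptions: struct
model_record, the single-character context encoding, and both helper
functions are hypothetical and not taken from the kernel.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record; the field names are not taken from the kernel. */
struct model_record {
	unsigned long long ts_usec;	/* timestamp in microseconds */
	int pid;			/* task that called printk() */
	char context;			/* assumed encoding: 'T' task, 'I' irq */
	const char *msg;
};

/* Format "[secs.usecs] <pid> <context> message"; returns snprintf()'s result. */
static int model_format(char *out, size_t size, const struct model_record *r)
{
	return snprintf(out, size, "[%5llu.%06llu] %d %c %s",
			r->ts_usec / 1000000, r->ts_usec % 1000000,
			r->pid, r->context, r->msg);
}

/* Return 1 if the formatted line was printed by the given PID, else 0. */
static int model_line_has_pid(const char *line, int pid)
{
	unsigned long long secs, usecs;
	int p;

	if (sscanf(line, "[%llu.%llu] %d", &secs, &usecs, &p) != 3)
		return 0;
	return p == pid;
}
```

Filtering interleaved output with model_line_has_pid() is the same thing
one would do with grep on a serial log: the lines stay mixed in the log
itself, but each writer's lines can be pulled out afterwards.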


> I prepend every line with the CPU number that has printk()-ed it.
> and that's helpful because one can grep and filter out messages
> from other CPUs. it's quite an OK thing to have given that messages
> can be really mixed sometimes.
>
> so adding extra information to `struct printk_log' could be helpful.
> I think we had this discussion before and you didn't want to change
> the size of `struct printk_log' because that might break gdb/crash/etc
> user space tools. has it changed?

Yup, there should be a serious reason to change 'struct printk_log'.
I am not sure if this is the case. But I am sure that there will
be a need to change the structure sooner or later.

Anyway, it seems that we will need to update all the tools
for the different time stamps, see
https://lkml.kernel.org/r/1504613201-23868-1-git-send-email-prarit@xxxxxxxxxx
Then we will know better how painful it is.


> maybe we can #ifdef CONFIG_PRINTK_ABC them.

I agree that this kind of change should be optional.
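
A minimal sketch of what "optional" could look like, assuming a
made-up switch MODEL_PRINTK_CALLER_INFO in place of a real config
option. struct model_log_hdr is a simplified stand-in, not the real
'struct printk_log', and the pid/cpu fields are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Define to model the optional extra fields; the name is made up here. */
/* #define MODEL_PRINTK_CALLER_INFO */

/* Simplified stand-in for a log record header, not the real struct printk_log. */
struct model_log_hdr {
	uint64_t ts_nsec;	/* timestamp */
	uint16_t len;		/* length of the whole record */
	uint16_t text_len;	/* length of the message text */
	uint8_t  level;		/* log level */
#ifdef MODEL_PRINTK_CALLER_INFO
	uint32_t pid;		/* hypothetical: caller's PID */
	uint16_t cpu;		/* hypothetical: CPU that printed the line */
#endif
};

/* Tools that parse a dump need to know which layout they are looking at. */
static size_t model_hdr_size(void)
{
	return sizeof(struct model_log_hdr);
}
```

The header layout depends on the build-time switch, so any tool that
walks the records in a dump has to learn which variant was built. That
is the cost for the user space tools mentioned above.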

Best Regards,
Petr