Re: [PATCH 2/2] printk: always report lost messages on serial console

From: Petr Mladek
Date: Wed Jan 04 2017 - 10:26:54 EST


On Wed 2017-01-04 22:34:48, Sergey Senozhatsky wrote:
> On (01/04/17 11:52), Petr Mladek wrote:
> [..]
> > > this is from the real serial logs I'm looking at right now. we attach
> > > "bad news" to 'critical' messages only:
> > >
> > > ...
> > > [ 32.941061] bc00: b65dc0d8 b65dc6d0 ae1fbc7c b65c11c5 b563c9c4 00000001 b65dc6d0 b0a62f74
> > > ** 150 printk messages dropped ** [ 32.941614] cee0: 00000081 ae1fcef0 00000038 00000000 b5369000 ae1fcf00 ae1fd0f0 00000000
> > > ** 75 printk messages dropped ** [ 32.941892] d860: 00056608 af203848 00000000 0004088c 000000d0 00000000 00000000 00000000
> > > ** 12 printk messages dropped ** [ 32.941940] ..
> > > ** 2 printk messages dropped ** [ 32.941951] ..
> > > ** 10 printk messages dropped ** [ 32.941992] ..
> > > ** 1 printk messages dropped ** [ 32.941999] ..
> > > ...

OK, it is possible that I misinterpreted the message. It looked like
a random memory dump that did not make sense without context.


> > Do you see how useless the above messages are, please?
>
> what... these lost messages were of extreme importance. I can't tell
> the exactly the loglevel, but I'm sure it was at least pr_err() level.
> these were like really important messages, unlike the ones that got
> suppressed/filtered-out.

It means that you were lucky and you saw critical messages instead
of some random debugging ones.


> along with these lost messages that were supposed to be printed, I have
> regions of lost kernel messages (?) with no reports of lost messages from
> console_unlock() (!). and that's the only thing I'm fixing here.

My patch fixes it as well. But it also keeps the console_level
filtering working.


> and the
> only thing we can fix. no permutation of console_unlock() lines will make
> the buffer bigger or printk flooding CPUs nicer or serial console driver
> faster.

We should stay on a constructive note. I never wrote that my patch
would make the buffer bigger or the serial console faster.


> once we unlocked the logbuf lock in console_unlock() we lost the race
> against the printk() flooding CPUs.

My patch did not aim to solve this problem.


> and the options here are
> "print 1 random message out of XXX or XXXX lost messages"
> vs
> "print 1 random message out of XXX or XXXX lost messages"

This is not fully correct, and it is probably the root of
the misunderstanding. The difference between your patch
and my patch is:

"always print '%u printk messages dropped'" +
"print 1 random message out of XXX or XXXX lost messages"

vs

"always print '%u printk messages dropped'" +
"print 1 random message with level under console_level
out of XXX or XXXX lost messages"

and that's it. I am sorry if I was not able to explain this
more clearly.
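
To make the difference concrete, here is a minimal user-space sketch
in C. It is not code from either patch. struct msg, emit_after_loss()
and the filter flag are hypothetical names used only for illustration,
and the real console_unlock() logic is more involved. The sketch only
assumes that the "dropped" notice is always emitted and that the
message following it is either printed unconditionally (filter == 0)
or only when its level is below console_level (filter != 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a log record, not the kernel's own struct. */
struct msg {
	int level;		/* lower value = more severe, as with KERN_* */
	const char *text;
};

/*
 * Always emit the "dropped" notice when messages were lost, then emit
 * the next surviving message.  With filter != 0 the message is emitted
 * only if its level is below console_level.
 * Returns the number of characters stored in out (excluding the NUL).
 */
static int emit_after_loss(char *out, size_t size, unsigned int dropped,
			   const struct msg *m, int console_level, int filter)
{
	int len = 0;
	int n;

	assert(out != NULL && m != NULL && size > 0);
	out[0] = '\0';

	if (dropped) {
		n = snprintf(out, size,
			     "** %u printk messages dropped ** ", dropped);
		if (n < 0 || (size_t)n >= size)
			return (int)strlen(out);	/* error or truncated */
		len = n;
	}

	if (!filter || m->level < console_level) {
		n = snprintf(out + len, size - (size_t)len, "%s\n", m->text);
		if (n < 0 || (size_t)n >= size - (size_t)len)
			return (int)strlen(out);	/* error or truncated */
		len += n;
	}

	return len;
}
```

With console_level set to 4, a level 7 debugging message that follows
a loss is printed after the notice when filter is 0. When filter is 1,
only the notice itself reaches the console.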

I think that we both should take a deep breath and calm down
a bit. I am afraid that I used some formulations that made
you angry and pushed us into a confrontational mode. Maybe I was
not able to clearly describe my concerns and their severity.
Maybe you feel offended because I produced an alternative
patch and did not give you enough credit. I am sorry
for this. I will try to do better next time.

Best Regards,
Petr