Re: ratelimit API: was: [RFC PATCH] printk: Introduce "store now but print later" prefix.

From: Tetsuo Handa
Date: Thu Mar 21 2019 - 04:14:06 EST


On 2019/03/21 0:25, Petr Mladek wrote:
>> This requires serialization among threads using "rs". I already
>> proposed ratelimit_reset() for memcg's OOM problem at
>> https://lkml.kernel.org/r/201810180246.w9I2koi3011358@xxxxxxxxxxxxxxxxxxx
>> but it was not accepted.
>
> IMHO, the main problem was that the patch tried to work around
> the ratelimit API weakness by a custom code.
>
> I believe that using an improved/extended ratelimit API with
> a sane semantic would be more acceptable.
>

Michal, are you OK to use ratelimit_reset() in out_of_memory()
if ratelimit_reset() is accepted?

>
>>> It means that it makes sense to enable the related
>>> ratelimited messages again because they would describe
>>> another problem.
>>
>> ___ratelimit() could also check number of not-yet-flushed
>> printk() records (e.g. log_next_seq - console_seq <= $some_threshold).
>
> The number is almost useless without more information, for example,
> how fast the consoles are, how many lines will get filtered
> by a console_loglevel, if the console_sem owner is sleeping,
> how many messages are being added by other CPUs.
>
> I believe that we do not really need it. The ratelimit_reset()
> user should know when the messages can get skipped because
> they describe the same situation again and again.

If printk() becomes asynchronous (either my "synchronous by default +
console_writer kernel thread" or John's "asynchronous by default +
printk kernel thread" is accepted), there will be little delay between
___ratelimit() and ratelimit_reset() enough to make ratelimit_reset()
unnecessary. Also, "number of not-yet-flushed printk() records" will
become meaningful if printk() becomes asynchronous.

But "how fast the consoles" depends on what consoles will be added/removed
during messages are flushed, "how many lines will get filtered by a
console_loglevel" depends on whether someone will change it during
messages are flushed, "how many messages are being added by other CPUs
during the console_sem owner is sleeping" (or "how many bytes which will
need to be written to consoles") depends on stress at that moment. None of
these example information will be reliable for making ___ratelimit() decision.