Re: [PATCH 1/2] random: Omit double-printing ratelimit messages

From: Dmitry Safonov
Date: Wed May 16 2018 - 11:13:41 EST


On Fri, 2018-05-11 at 13:41 +0100, Dmitry Safonov wrote:
> On Thu, 2018-05-10 at 23:51 -0400, Theodore Y. Ts'o wrote:
> > On Thu, May 10, 2018 at 08:50:07PM +0100, Dmitry Safonov wrote:
> > > random uses __ratelimit() which calls ___ratelimit() with a
> > > function
> > > name. Depending on !RATELIMIT_MSG_ON_RELEASE it prints how many
> > > messages were suppressed every ratelimit interval (1 second for
> > > random)
> > > and flushes ratelimit_state::missed:
> >
> > So the thing about the ratelimit system is that if you have a burst
> > of
> > 1,000,000 within the one second burst window, and then nothing ever
> > again, you will never see a message accounting for those 1,000,000
> > "callbacks" (which is a terrible wording; it just confuses people).
> >
> > If you have a burst of 1,000,000 calls to ratelimit, and then a
> > month
> > goes by, and *then* a single call to __ratelimit is called by
> > printk,
> > only *the* does the message about the suppressed "callback" get
> > printed.
>
> Yes, if the initialization time of driver is lesser than a second.
> If it's 1.1 sec - you'll have prints from ___ratelimit() and from
> your
> driver too. And they differ in format.
>
> > So in the case of the random driver, once the random driver is
> > fully
> > intialized, there will never be a call to __ratelimit() for the
> > urandom ratelimit structures, so we manually print out the final
> > number of suppressed message so there is proper accounting for
> > them.
>
> So, there is this flag, AFAICS, which perfectly fits your purpose
> RATELIMIT_MSG_ON_RELEASE which will omit printing how many messages
> were suppressed every interval of ___ratelimit(). And that will allow
> you to print total in the end.
>
> > It is not a double-count. If we didn't do that, those suppressed
> > warnings would never be mentioned by the kernel.
>
> Yeah, but what you print is not total sum, it's since the last
> interval
> because without mentioned flag ___ratelimit() will flush missed
> counter
> and print "suppressed" message.
> They might even double if say other proccess has called
> get_random_bytes() got to ___ratelimit() and got preempted. This
> thread
> finishes initializing random driver and prints this not-proper-sum
> statistics, and when the code flow is back in the first context, it
> will print statistics again from ___ratelimit() function.

So, does it make sense to you, Theodore?
If not - I'll just resend second patch rebasing and dropping this one.

--
Thanks,
Dmitry