Re: [PATCH] mm: don't warn about allocations which stall for too long

From: Tetsuo Handa
Date: Thu Nov 09 2017 - 05:19:53 EST


Michal Hocko wrote:
> On Thu 09-11-17 10:34:46, peter enderborg wrote:
> > On 11/09/2017 09:52 AM, Michal Hocko wrote:
> > > I am not sure. I would rather see a tracepoint to mark the allocator
> > > entry. This would allow both 1) measuring the allocation latency (to
> > > compare it to the trace_mm_page_alloc) and 2) checking for stalls with
> > > an arbitrary user-defined timeout (just print all allocations which haven't
> > > passed trace_mm_page_alloc for the given amount of time).
> >
> > Traces are not that expensive, but there are more than a few calls
> > in this path. And I'm trying to keep it small enough that it can be
> > used for maintenance versions too.
> >
> > This suggestion is a quick way of keeping the current solution for
> > the ones that are interested in the slow allocations. If we are going
> > for a solution with a time-out parameter from the user, what interface
> > do you suggest for this configuration? A filter parameter for the
> > event?
>
> I meant to do all that in postprocessing. So no specific API is needed,
> just parse the output. Anyway, it seems that the printk will be put in
> shape in the foreseeable future so we might preserve the stall warning
> after all. It is the show_mem part which is interesting during that
> warning.

I don't know whether printk() will be put in shape in the foreseeable future.
The rule "do not try to printk() faster than the kernel can write to
consoles" will remain no matter how printk() changes. Unless an asynchronous
approach like https://lwn.net/Articles/723447/ is used, I think we can't
obtain useful information.
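
For reference, the postprocessing approach quoted above (pair an allocator
entry event with trace_mm_page_alloc and report whatever exceeds a
user-defined timeout) could be sketched roughly as below. This is only an
illustration under assumptions: the entry event name "mm_page_alloc_enter"
is hypothetical (no such tracepoint is proposed by name in this thread), the
line layout is a simplified ftrace-like text format, and allocations are
paired per PID, which ignores allocations from interrupt context.

```python
import re

# Assumed ftrace-like line shape: "comm-PID [CPU] flags TIMESTAMP: EVENT: args".
LINE_RE = re.compile(r"-(\d+)\s+\[\d+\]\s+\S+\s+(\d+\.\d+):\s+(\w+):")

ENTRY_EVENT = "mm_page_alloc_enter"  # hypothetical entry tracepoint (assumption)
EXIT_EVENT = "mm_page_alloc"         # tracepoint named in the thread


def find_stalls(lines, timeout):
    """Return (pid, start_time, duration) tuples for allocations that took at
    least `timeout` seconds. duration is None for allocations that had not
    reached EXIT_EVENT by the last timestamp seen in the trace."""
    pending = {}   # pid -> timestamp of the unmatched entry event
    stalls = []
    last_ts = 0.0
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not an event line; skip it
        pid, ts, event = int(m.group(1)), float(m.group(2)), m.group(3)
        last_ts = ts
        if event == ENTRY_EVENT:
            pending[pid] = ts
        elif event == EXIT_EVENT:
            start = pending.pop(pid, None)
            if start is not None and ts - start >= timeout:
                stalls.append((pid, start, ts - start))
    # Allocations still in flight when the trace ends.
    for pid, start in pending.items():
        if last_ts - start >= timeout:
            stalls.append((pid, start, None))
    return stalls


if __name__ == "__main__":
    import sys
    # Usage (assumed): script.py TRACE_FILE [TIMEOUT_SECONDS]
    limit = float(sys.argv[2]) if len(sys.argv) > 2 else 10.0
    with open(sys.argv[1]) as f:
        for pid, start, duration in find_stalls(f, limit):
            state = "still pending" if duration is None else f"{duration:.3f}s"
            print(f"pid {pid}: allocation entered at {start:.6f}, {state}")
```

Note that this only helps when the trace buffer can be read out afterwards;
it does not address the console throughput problem described above.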