Re: [PATCH] mm: be more verbose for alloc_contig_range failures

From: Michal Hocko
Date: Thu Feb 18 2021 - 06:42:15 EST


On Thu 18-02-21 10:43:21, David Hildenbrand wrote:
> On 18.02.21 10:35, Michal Hocko wrote:
> > On Thu 18-02-21 10:02:43, David Hildenbrand wrote:
> > > On 18.02.21 09:56, Michal Hocko wrote:
> > > > On Wed 17-02-21 08:36:03, Minchan Kim wrote:
> > > > > alloc_contig_range is usually used on cma area or movable zone.
> > > > > It's critical if the page migration fails on those areas so
> > > > > > dump more debugging messages, as memory_hotplug does, unless the user
> > > > > > specifies __GFP_NOWARN.
> > > >
> > > > I agree with David that this has a potential to generate a lot of output
> > > > and it is not really clear whether it is worth it. Page isolation code
> > > > > already has a REPORT_FAILURE mode which is currently used only for memory
> > > > hotplug because this was just too noisy from the CMA path - d381c54760dc
> > > > ("mm: only report isolation failures when offlining memory").
> > > >
> > > > > Maybe migration failures are less likely but still.
> > >
> > > > Side note: I really dislike the uncontrolled error reporting on the memory
> > > > offlining path that we have enabled by default. Yeah, it might be useful for
> > > ZONE_MOVABLE in some cases, but otherwise it's just noise.
> > >
> > > Just do a "sudo stress-ng --memhotplug 1" and see the log getting flooded
> >
> > Anyway we can discuss this in a separate thread but I think this is not
> > a representative workload.
>
> Sure, but the essence is "this is noise", and we'll have more noise on
> alloc_contig_range() as we see these calls more frequently. There should be
> an explicit way to enable such *debug* messages.

There is a dynamic debugging framework available. I do not have much
experience with it but maybe that is the way to go.
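
To make the idea concrete, here is a minimal user-space sketch of an opt-in
debug message. It is not kernel code and not a proposed patch: struct
debug_site and report_migration_failure() are hypothetical names invented
for illustration. In the kernel, the equivalent would presumably be a
pr_debug() call site, which dynamic debug (CONFIG_DYNAMIC_DEBUG) leaves
silent until it is enabled through the dynamic_debug/control file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for one dynamic-debug call site. */
struct debug_site {
	const char *name;	/* label printed in front of the message */
	bool enabled;		/* off by default, must be switched on explicitly */
	unsigned long hits;	/* number of messages actually emitted */
};

/*
 * Report a failed migration only if the call site has been enabled.
 * Returns true when a message was printed, false when it was suppressed.
 */
static bool report_migration_failure(struct debug_site *site, unsigned long pfn)
{
	assert(site != NULL);

	if (!site->enabled)
		return false;

	site->hits++;
	fprintf(stderr, "%s: migration failed for pfn %#lx\n", site->name, pfn);
	return true;
}
```

With the flag left at its default the call is a cheap no-op, so frequent
alloc_contig_range() callers would not flood the log; somebody debugging a
failure flips the flag and gets the messages.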
--
Michal Hocko
SUSE Labs