Re: [PATCH v2] xfs: remove kmem_zalloc_greedy
From: Michal Hocko
Date: Wed Mar 15 2017 - 04:35:41 EST
On Wed 15-03-17 01:14:27, Luis R. Rodriguez wrote:
> On Tue, Mar 14, 2017 at 11:07:38AM -0700, Darrick J. Wong wrote:
> > On Tue, Mar 14, 2017 at 05:57:45PM +0100, Luis R. Rodriguez wrote:
> > > On Tue, Mar 07, 2017 at 04:35:28PM -0800, Darrick J. Wong wrote:
> > > > The sole remaining caller of kmem_zalloc_greedy is bulkstat, which uses
> > > > it to grab 1-4 pages for staging of inobt records. The infinite loop in
> > > > the greedy allocation function is causing hangs[1] in generic/269, so
> > > > just get rid of the greedy allocator in favor of kmem_zalloc_large.
> > > > This makes bulkstat somewhat more likely to ENOMEM if there are really no
> > > > pages to spare, but eliminates a source of hangs.
> > > >
> > > > [1] http://lkml.kernel.org/r/20170301044634.rgidgdqqiiwsmfpj%40XZHOUW.usersys.redhat.com
> > > >
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > ---
> > > > v2: remove single-page fallback
> > > > ---
> > >
> > > Since this fixes a hang how about *at the very least* a respective Fixes tag ?
> > > This fixes an existing hang so what are the stable considerations here ? I
> > > realize the answer is not easy but figured it's worth asking.
> >
> > I didn't think it was appropriate to "Fixes: 77e4635ae1917" since we're
> > not fixing _greedy so much as we are killing it. The patch fixes an
> > infinite retry hang when bulkstat tries a memory allocation that cannot
> > be satisfied; and having done that, realizes there are no remaining
> > callers of _greedy and garbage collects it. The code that was there
> > before also seems capable of sleeping forever, I think.
> >
> > So the minimally invasive fix is to apply the allocation conversion in
> > bulkstat, and if there aren't any other callers of _greedy then you can
> > get rid of it too.
>
> For the sake of stable XFS users then why not do the less invasive change
> first, Cc stable, and then move on to the less backward portable solution ?

The thing is that the permanent failures for vmalloc were so unlikely
prior to 5d17a73a2ebe ("vmalloc: back off when the current task is
killed") that this was basically a non-issue before this (4.11) merge
window.
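
To illustrate why a permanently failing vmalloc matters here, below is a
rough userspace sketch of the greedy retry pattern next to a single-attempt
allocation. It is not the XFS code: fake_vzalloc(), task_killed,
greedy_alloc(), single_alloc() and the max_tries bound are all made up for
the example (the loop described in the patch has no such bound, which is
where the hang comes from).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag: models a task that has been killed. */
static bool task_killed;

/*
 * Hypothetical stand-in for vzalloc(): once the task is marked killed,
 * every attempt fails, i.e. the failure is permanent rather than transient.
 */
static void *fake_vzalloc(size_t size)
{
	if (task_killed)
		return NULL;
	return calloc(1, size);
}

/*
 * Greedy pattern: start at maxsize, halve on failure down to minsize,
 * then keep retrying at minsize.  max_tries exists only so that this
 * demo terminates; without it a permanent failure would spin forever.
 */
static void *greedy_alloc(size_t *size, size_t minsize, size_t maxsize,
			  unsigned int max_tries, unsigned int *tries)
{
	size_t kmsize = maxsize;
	void *ptr = NULL;

	assert(minsize > 0 && minsize <= maxsize);

	*tries = 0;
	while (*tries < max_tries) {
		(*tries)++;
		ptr = fake_vzalloc(kmsize);
		if (ptr)
			break;
		kmsize >>= 1;
		if (kmsize <= minsize)
			kmsize = minsize;
	}
	if (ptr)
		*size = kmsize;
	return ptr;
}

/* Single attempt: a failure is reported to the caller (think ENOMEM). */
static void *single_alloc(size_t size)
{
	return fake_vzalloc(size);
}
```

With task_killed clear, both variants succeed on the first attempt. With it
set, greedy_alloc() burns through every allowed retry and still returns
NULL, while single_alloc() fails immediately and lets the caller bail out.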
--
Michal Hocko
SUSE Labs