Re: [patch 1/6] md: remove dependency on __GFP_NOFAIL

From: Andrew Morton
Date: Mon Aug 23 2010 - 15:52:09 EST


On Mon, 23 Aug 2010 12:35:22 -0700 (PDT)
David Rientjes <rientjes@xxxxxxxxxx> wrote:

> On Mon, 23 Aug 2010, Andrew Morton wrote:
>
> > > diff --git a/drivers/md/dm-region-hash.c b/drivers/md/dm-region-hash.c
> > > --- a/drivers/md/dm-region-hash.c
> > > +++ b/drivers/md/dm-region-hash.c
> > > @@ -289,8 +289,12 @@ static struct dm_region *__rh_alloc(struct dm_region_hash *rh, region_t region)
> > > struct dm_region *reg, *nreg;
> > >
> > > nreg = mempool_alloc(rh->region_pool, GFP_ATOMIC);
> > > - if (unlikely(!nreg))
> > > - nreg = kmalloc(sizeof(*nreg), GFP_NOIO | __GFP_NOFAIL);
> > > + if (unlikely(!nreg)) {
> > > + /* FIXME: this may potentially loop forever */
> > > + do {
> > > + nreg = kmalloc(sizeof(*nreg), GFP_NOIO);
> > > + } while (!nreg);
> > > + }
> > >
> > > nreg->state = rh->log->type->in_sync(rh->log, region, 1) ?
> > > DM_RH_CLEAN : DM_RH_NOSYNC;
> >
> > erm.
> >
> > The reason for adding GFP_NOFAIL in the first place was my observation
> > that the kernel contained lots of open-coded retry-for-ever loops.
> >
> > All of these are wrong, bad, buggy and mustfix. So we consolidated the
> > wrongbadbuggymustfix concept into the core MM so that miscreants could
> > be easily identified and hopefully fixed.
> >
>
> That consolidation would have been unnecessary, then, since all
> allocations with order < PAGE_ALLOC_COSTLY_ORDER automatically loop
> indefinitely in the page allocator.

The difference is that an order-0 !__GFP_NOFAIL allocation attempt can
fail due to oom-killing. Unless someone broke that.

> struct dm_region allocations would
> already do that.
>
> So this retry loop doesn't actually do anything that the page allocator
> already doesn't, with or without __GFP_NOFAIL. The difference here is
> that
>
> - it doesn't depend on the page allocator's implementation, which may
> change over time, and
>
> - it adds documentation so that the subsystems doing these loops can
> (hopefully) fix these problems later, although their appear to be
> geniune cases where little other options are available.
>
> > I think that simply undoing that change is a bad idea - it allows the
> > wrongbadbuggymustfix code to hide from view.
> >
>
> It removes several branches from the page allocator.

None on the fast path.

> > The correct way to remove __GFP_NOFAIL is to fix the
> > wrongbadbuggymustfix code properly.
> >
>
> If the prerequisite for removing __GFP_NOFAIL is that nobody must ever
> loop indefinitely looking for memory or smaller order allocations don't
> implicitly retry, then there's little chance it'll ever get removed since
> they've existed for years without anybody cleaning them up.

The JBD one is hard - I haven't looked at the others.

We should fix them.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/