Re: [PATCH] Add nid sanity on alloc_pages_node
From: Hugh Dickins
Date: Tue Jul 17 2007 - 13:27:54 EST
On Tue, 17 Jul 2007, Andrew Morton wrote:
> On Tue, 17 Jul 2007 16:04:54 +0100 (BST) Hugh Dickins <hugh@xxxxxxxxxxx> wrote:
> > On Thu, 12 Jul 2007, Andrew Morton wrote:
> > >
> > > It'd be much better to fix the race within alloc_fresh_huge_page(). That
> > > function is pretty pathetic.
> > >
> > > Something like this?
> > >
> > > --- a/mm/hugetlb.c~a
> > > +++ a/mm/hugetlb.c
> > > @@ -105,13 +105,20 @@ static void free_huge_page(struct page *
> > >
> > > static int alloc_fresh_huge_page(void)
> > > {
> > > - static int nid = 0;
> > > + static int prev_nid;
> > > + static DEFINE_SPINLOCK(nid_lock);
> > > struct page *page;
> > > - page = alloc_pages_node(nid, htlb_alloc_mask|__GFP_COMP|__GFP_NOWARN,
> > > - HUGETLB_PAGE_ORDER);
> > > - nid = next_node(nid, node_online_map);
> > > + int nid;
> > > +
> > > + spin_lock(&nid_lock);
> > > + nid = next_node(prev_nid, node_online_map);
> > > if (nid == MAX_NUMNODES)
> > > nid = first_node(node_online_map);
> > > + prev_nid = nid;
> > > + spin_unlock(&nid_lock);
> > > +
> > > + page = alloc_pages_node(nid, htlb_alloc_mask|__GFP_COMP|__GFP_NOWARN,
> > > + HUGETLB_PAGE_ORDER);
> > > if (page) {
> > > set_compound_page_dtor(page, free_huge_page);
> > > spin_lock(&hugetlb_lock);
> >
> > Now that it's gone into the tree, I look at it and wonder, does your
> > nid_lock really serve any purpose? We're just doing a simple assignment
> > to prev_nid, and it doesn't matter if occasionally two racers choose the
> > same node, and there's no protection here against a node being offlined
> > before the alloc_pages_node anyway (unsupported? I'm ignorant).
>
> umm, actually, yes, the code as it happens to be structured does mean that
> ther is no longer a way in which a race can cause us to pass MAX_NUMNODES
> into alloc_pages_node().
>
> Or not. We can call next_node(MAX_NUMNODES, node_online_map) in that race
> window, with perhaps bad results.
>
> I think I like the lock ;)
I hate to waste your time, but I'm still puzzled. Wasn't the race fixed
by your changeover from use of "static int nid" throughout, to setting
local "int nid" from "static int prev_nid", working with nid, then
setting prev_nid from nid at the end? What does the lock add to that?
Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/