MAP_HUGETLB and MPOL_PREFERRED = SIGBUS

From: Andy Lutomirski
Date: Thu Jul 18 2013 - 20:51:33 EST


When I mmap anonymous hugepages with MAP_HUGETLB and there are
pre-reserved hugepages available, but only on the wrong
node, things blow up. The mmap succeeds, as it should (the accounting
here is wrong -- known issue AFAIK, but that's only relevant to
MPOL_BIND or cpusets). But writing to the resulting page causes a
SIGBUS.
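
Something like the following should reproduce it. This is only a sketch:
it assumes 2 MiB hugepages, that hugepages are reserved only on a node
other than the one passed in, and it issues set_mempolicy via a raw
syscall so it does not need libnuma. The function name, the enum and
the MPOL_*_RAW constants are mine, made up for illustration; the SIGBUS
handler is there only so the result can be reported instead of killing
the process.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MPOL_DEFAULT_RAW   0            /* MPOL_DEFAULT in <linux/mempolicy.h> */
#define MPOL_PREFERRED_RAW 1            /* MPOL_PREFERRED in <linux/mempolicy.h> */
#define HUGE_LEN (2UL << 20)            /* assumption: 2 MiB hugepage size */

/* Hypothetical result codes for this sketch. */
enum touch_result { TOUCH_OK, TOUCH_SIGBUS, TOUCH_MMAP_FAILED, TOUCH_POLICY_FAILED };

static sigjmp_buf bus_jmp;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(bus_jmp, 1);             /* unwind out of the faulting store */
}

/* Prefer `node`, map one hugepage, write to it, report what happened. */
enum touch_result touch_hugepage_preferring(int node)
{
    assert(node >= 0 && node < (int)(8 * sizeof(unsigned long)));
    unsigned long mask = 1UL << node;

    /* maxnode is one more than the number of bits in the mask. */
    if (syscall(SYS_set_mempolicy, MPOL_PREFERRED_RAW, &mask,
                8 * sizeof(mask) + 1) != 0)
        return TOUCH_POLICY_FAILED;

    enum touch_result r = TOUCH_MMAP_FAILED;
    void *p = mmap(NULL, HUGE_LEN, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        struct sigaction sa = {0}, old;
        sa.sa_handler = on_sigbus;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGBUS, &sa, &old);

        if (sigsetjmp(bus_jmp, 1) == 0) {
            *(volatile char *)p = 1;    /* first touch faults the page in */
            r = TOUCH_OK;
        } else {
            r = TOUCH_SIGBUS;           /* the failure described above */
        }

        sigaction(SIGBUS, &old, NULL);
        munmap(p, HUGE_LEN);
    }

    /* Restore the default policy so the caller is not affected. */
    syscall(SYS_set_mempolicy, MPOL_DEFAULT_RAW, NULL, 0);
    return r;
}
```

With free hugepages only on node 1, I would expect
touch_hugepage_preferring(0) to return TOUCH_SIGBUS on an affected
kernel.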

AFAICS the issue is that dequeue_huge_page_vma is calling
huge_zonelist, which returns a single-entry nodemask. The loop over
allowable zones* will never try other NUMA zones, and the function
fails.

I'm not sure whether it would be better to try other nodes first or to
try to get a page from the buddy allocator on the preferred node
first, but currently the other nodes' reserved lists are never
checked. The result is a crash.
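
To illustrate the difference, here is a toy userspace model. It is not
kernel code: the per-node free counts, NR_NODES and both function names
are hypothetical. The first function mimics a search restricted to a
nodemask, which is roughly what happens today with a single-entry mask;
the second shows the "try other nodes" option by retrying with every
node allowed.

```c
#define NR_NODES 4                      /* hypothetical node count */

/* Take one free hugepage from the first node allowed by mask.
 * Returns the node id used, or -1 if no allowed node has a free page. */
static int dequeue_from_mask(int free_pages[NR_NODES], unsigned mask)
{
    for (int nid = 0; nid < NR_NODES; nid++) {
        if (!(mask & (1u << nid)) || free_pages[nid] == 0)
            continue;
        free_pages[nid]--;
        return nid;
    }
    return -1;
}

/* Preferred semantics with fallback: try the preferred node first,
 * then any node that still has a free page. */
static int dequeue_preferred(int free_pages[NR_NODES], int preferred)
{
    int nid = dequeue_from_mask(free_pages, 1u << preferred);

    if (nid < 0)
        nid = dequeue_from_mask(free_pages, ~0u);
    return nid;
}
```

With free pages only on node 1 and node 0 preferred, the mask-only
search fails while the fallback version takes the page from node 1.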

Working around this in userspace is going to be a real PITA. Grr.

* Why is this iterating zones instead of nodes?

--Andy