On Wed 21-09-16 14:35:34, Gerald Schaefer wrote:
> dissolve_free_huge_pages() will either run into the VM_BUG_ON() or a
> list corruption and addressing exception when trying to set a memory
> block offline that is part (but not the first part) of a hugetlb page
> with a size > memory block size.
>
> When no other smaller hugetlb page sizes are present, the VM_BUG_ON()
> will trigger directly. In the other case we will run into an addressing
> exception later, because dissolve_free_huge_page() will not work on the
> head page of the compound hugetlb page which will result in a NULL
> hstate from page_hstate().
>
> To fix this, first remove the VM_BUG_ON() because it is wrong, and then
> use the compound head page in dissolve_free_huge_page().

OK, so dissolve_free_huge_page() will now also work on tail pages, which
makes some sense. I would also appreciate a few words on why we want to
sacrifice something as precious as a gigantic page rather than fail the
page block offline. Dave pointed out the DIMM offline use case, for example.
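
Just to spell out the tail page part, here is a minimal userspace sketch
of the idea as I understand it. This is not kernel code: struct model_page
and the model_*() helpers are made up for illustration, and -1 merely
stands in for the NULL hstate you would get from page_hstate() on a tail
page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model, NOT the kernel's struct page. */
struct model_page {
	struct model_page *head;	/* points to itself on the head page */
	int hstate_id;			/* only valid on the head page, -1 on tails */
};

#define PAGES_PER_HUGE 8

/* Build one compound page; only pages[0] carries the per-hugepage state. */
static void model_init_compound(struct model_page *pages, int hstate_id)
{
	for (size_t i = 0; i < PAGES_PER_HUGE; i++) {
		pages[i].head = &pages[0];
		pages[i].hstate_id = (i == 0) ? hstate_id : -1;
	}
}

/* Rough analogue of compound_head(): map any subpage to its head page. */
static struct model_page *model_compound_head(struct model_page *page)
{
	return page->head;
}

/* Buggy variant: reads the state from whatever page was passed in. */
static int model_hstate_buggy(struct model_page *page)
{
	return page->hstate_id;
}

/* Fixed variant: always resolve to the head page first. */
static int model_hstate_fixed(struct model_page *page)
{
	return model_compound_head(page)->hstate_id;
}

/* Look up the state through a tail page, as an offlined block might. */
static int model_demo(void)
{
	struct model_page pages[PAGES_PER_HUGE];
	struct model_page *tail;

	model_init_compound(pages, 3);
	tail = &pages[5];
	assert(model_hstate_buggy(tail) == -1);	/* tail has no state */
	return model_hstate_fixed(tail);	/* head page's state */
}
```

The memory block being offlined can start in the middle of the gigantic
page, so the pfn handed in is a tail page, and only going through the head
page gives a usable hstate.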

> Also change locking in dissolve_free_huge_page(), so that it only takes
> the lock when actually removing a hugepage.

From a quick look it seems this has been broken since it was introduced by
c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle
hugepage"). Do we want to have this backported to stable? In any case, a
Fixes: SHA1 tag would be really nice.