From: Oscar Salvador <osalvador@xxxxxxx>
This patch changes the way we set and handle in-use poisoned pages.
Until now, poisoned pages were released to the buddy allocator, trusting
that the checks that take place prior to hand the page would act as a
safe net and would skip that page.
This has proved to be wrong, as we got some pfn walkers out there, like
compaction, that all they care is the page to be PageBuddy and be in a
freelist.
Although this might not be the only user, having poisoned pages
in the buddy allocator seems a bad idea as we should only have
free pages that are ready and meant to be used as such.
Before explaining the taken approach, let us break down the kind
of pages we can soft offline.
- Anonymous THP (after the split, they end up being 4K pages)
- Hugetlb
- Order-0 pages (that can be either migrated or invalited)
* Normal pages (order-0 and anon-THP)
- If they are clean and unmapped page cache pages, we invalidate
then by means of invalidate_inode_page().
- If they are mapped/dirty, we do the isolate-and-migrate dance.
Either way, do not call put_page directly from those paths.
Instead, we keep the page and send it to page_set_poison to perform the
right handling.
page_set_poison sets the HWPoison flag and does the last put_page.
This call to put_page is mainly to be able to call __page_cache_release,
since this function is not exported.
Down the chain, we placed a check for HWPoison page in
free_pages_prepare, that just skips any poisoned page, so those pages
do not end up in any pcplist/freelist.
After that, we set the refcount on the page to 1 and we increment
the poisoned pages counter.
We could do as we do for free pages:
1) wait until the page hits buddy's freelists
2) take it off
3) flag it
The problem is that we could race with an allocation, so by the time we
want to take the page off the buddy, the page is already allocated, so we
cannot soft-offline it.
This is not fatal of course, but if it is better if we can close the race
as does not require a lot of code.
* Hugetlb pages
- We isolate-and-migrate them
After the migration has been successful, we call dissolve_free_huge_page,
and we set HWPoison on the page if we succeed.
Hugetlb has a slightly different handling though.
While for non-hugetlb pages we cared about closing the race with an
allocation, doing so for hugetlb pages requires quite some additional
code (we would need to hook in free_huge_page and some other places).
So I decided to not make the code overly complicated and just fail
normally if the page we allocated in the meantime.
Because of the way we handle now in-use pages, we no longer need the
put-as-isolation-migratetype dance, that was guarding for poisoned pages
to end up in pcplists.
Signed-off-by: Oscar Salvador <osalvador@xxxxxxx>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>