Re: amusing SLUB compaction bug when CC_OPTIMIZE_FOR_SIZE

From: Vlastimil Babka
Date: Mon Oct 24 2022 - 11:58:17 EST


On 10/3/22 19:00, Matthew Wilcox wrote:
> On Sun, Oct 02, 2022 at 02:48:02PM +0900, Hyeonggon Yoo wrote:
>> Just one more thing, rcu_leak_callback too. RCU seem to use it
>> internally to catch double call_rcu().
>>
>> And some suggestions:
>> - what about adding runtime WARN() on slab init code to catch
>> unexpected arch/toolchain issues?
>> - instead of 4, we may use macro definition? like (PAGE_MAPPING_FLAGS + 1)?
>
> I think the real problem here is that isolate_movable_page() is
> insufficiently paranoid. Looking at the gyrations that GUP and the
> page cache do to convince themselves that the page they got really is
> the page they wanted, there are a few missing pieces (eg checking that
> you actually got a refcount on _this_ page and not some random other
> page you were temporarily part of a compound page with).
>
> This patch does three things:
>
> - Turns one of the comments into English. There are some others
> which I'm still scratching my head over.
> - Uses a folio to help distinguish which operations are being done
> to the head vs the specific page (this is somewhat an abuse of the
> folio concept, but it's acceptable)
> - Add the aforementioned check that we're actually operating on the
> page that we think we want to be.
> - Add a check that the folio isn't secretly a slab.
>
> We could put the slab check in PageMapping and call it after taking
> the folio lock, but that seems pointless. It's the acquisition of
> the refcount which stabilises the slab flag, not holding the lock.
>

I would like to have a working safe version in -next, even if we are able
simplify it later thanks to frozen refcounts. I've made a formal patch of
yours, but I'm still convinced the slab check needs to be more paranoid so
it can't observe a false positive __folio_test_movable() while missing the
folio_test_slab(), hence I added the barriers as in my previous attempt [1].
Does that work for you and can I add your S-o-b?

[1] https://lore.kernel.org/all/aec59f53-0e53-1736-5932-25407125d4d4@xxxxxxx/

----8<----