Re: amusing SLUB compaction bug when CC_OPTIMIZE_FOR_SIZE

From: Matthew Wilcox
Date: Tue Oct 04 2022 - 10:40:57 EST


On Tue, Oct 04, 2022 at 11:26:33PM +0900, Hyeonggon Yoo wrote:
> > It's the acquisition of
> > the refcount which stabilises the slab flag, not holding the lock.
>
> But can you please elaborate how this prevents race between
> allocation & initialization of a slab and isolate_movable_page()?
>
> Or maybe we can handle it with frozen folio as Vlastimil suggested? ;-)

Yes, we discussed that a little yesterday. I'm hoping to have a
refreshed patchset for frozen folios out today. Some of this patch
is still needed, even if we go that route.

> > @@ -91,8 +99,8 @@ int isolate_movable_page(struct page *page, isolate_mode_t mode)
> > * lets be sure we have the page lock
> > * before proceeding with the movable page isolation steps.
> > */
> > - if (unlikely(!trylock_page(page)))
> > - goto out_putpage;
> > + if (unlikely(!folio_trylock(folio)))
> > + goto out_put;
>
> I don't know much about callers that this is trying to avoid race aginst...
>
> But for this to make sense, I think *every users* that doing their stuff with
> sub-page of a compound page should acquire folio lock and not page lock
> of sub-page, right?

There is no page lock per se. If you try to acquire the lock on a tail
page, it acquires the lock on its head page. It's been that way for a
very long time. A lot of people are confused by this, which was part of
the motivation for making it explicit with folios.