Re: mm: race in put_and_wait_on_page_locked()

From: Linus Torvalds
Date: Tue Feb 05 2019 - 02:16:29 EST


On Mon, Feb 4, 2019 at 8:43 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> Something I shall not be doing, is verifying the correctness of the
> low-level get_page_unless_zero() versus page_ref_freeze() protocol
> on arm64 and power - nobody has reported on x86, and I do wonder if
> there's a barrier missing somewhere, that could manifest in this way -
> but I'm unlikely to be the one to find that (and also think that any
> weakness there should have shown up long before now).

Remind me what the page_ref_freeze() rules even _are_?

It's a very special thing, setting the page count down to zero if it
matches the "expected" count.

Now, if another CPU does a put_page() at that point, that certainly
will hit the "oops, we dropped the ref to something that was zero".

So the "expected" count had better be only references we have and own
100%, but some of those references aren't really necessarily private
to our thread.

For example, what happens if

(a) one CPU is doing migration_entry_wait() (counting expected page
refs etc, before doing page_ref_freeze)

(b) another CPU is dirtying a page that was in the swap cache and
takes a reference to it, but drops it from the swap cache

Note how (b) does not change the refcount on the page at all, because
it just moves the ref-count from "swap cache entry" to "I own the page
in my page tables". Which means that when (a) does the "count expected
count, and match it", it happily matches, and the page_ref_freeze()
succeeds and makes the page count be zero.

But now (b) has a private reference to that page, and can drop it, so
the "freeze" isn't a freeze at all.

Ok, so clearly the above cannot happen, and there's something I'm
missing with the freezing. I think we hold the page lock while this is
going on, which means those two things cannot happen at the same time.
But maybe there is something else that does the above kind of "move
page ref from one owner to another"?

The page_ref_freeze() rules don't seem to be documented anywhere.

Linus