Re: [PATCH 3/7] mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED
From: Vlastimil Babka
Date: Fri Dec 04 2020 - 12:26:15 EST
On 12/1/20 12:35 PM, Oscar Salvador wrote:
> On Wed, Nov 25, 2020 at 07:20:33PM +0100, Vlastimil Babka wrote:
>> On 11/19/20 11:57 AM, Oscar Salvador wrote:
>> > From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
>> >
>> > The call to get_user_pages_fast is only to get the pointer to a struct
>> > page of a given address, pinning it is memory-poisoning handler's job,
>> > so drop the refcount grabbed by get_user_pages_fast().
>> >
>> > Note that the target page is still pinned after this put_page() because
>> > the current process should have refcount from mapping.
>>
>> Well, but can't it go away due to reclaim, migration or whatever?
>
> Yes, it can.
>
>> > @@ -900,20 +900,23 @@ static int madvise_inject_error(int behavior,
>> > */
>> > size = page_size(compound_head(page));
>> > + /*
>> > + * The get_user_pages_fast() is just to get the pfn of the
>> > + * given address, and the refcount has nothing to do with
>> > + * what we try to test, so it should be released immediately.
>> > + * This is racy but it's intended because the real hardware
>> > + * errors could happen at any moment and memory error handlers
>> > + * must properly handle the race.
>>
>> Sure they have to. We might just be unexpectedly messing with other process'
>> memory. Or does anything else prevent that?
>
> No, nothing does, and I have to confess that I managed to confuse myself here.
> If we release such page and that page ends up in buddy, nothing prevents someone
> else to get that page, and then we would be messing with other process memory.
>
> I guess the right thing to do is just to make sure we got that page and that
> that page remains pinned as long as the memory failure handling goes.
OK, so that means we don't introduce this race for MADV_SOFT_OFFLINE, but it's
already (and still) there for MADV_HWPOISON since Dan's 23e7b5c2e271 ("mm,
madvise_inject_error: Let memory_failure() optionally take a page reference") no?
> I will remove those patches from the patchset and re-submit with only the
> refactoring and pcp-disabling.
>
> Thanks Vlastimil
>