Re: [PATCH] hugetlbfs: change put_page/unlock_page order in hugetlbfs_fallocate()
From: Nadav Amit
Date: Sun Aug 27 2017 - 16:09:42 EST
Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
> On 08/26/2017 12:11 PM, Nadav Amit wrote:
>> hugetlfs_fallocate() currently performs put_page() before unlock_page().
>> This scenario opens a small time window, from the time the page is added
>> to the page cache, until it is unlocked, in which the page might be
>> removed from the page-cache by another core. If the page is removed
>> during this time windows, it might cause a memory corruption, as the
>> wrong page will be unlocked.
>>
>> It is arguable whether this scenario can happen in a real system, and
>> there are several mitigating factors. The issue was found by code
>> inspection (actually grep), and not by actually triggering the flow.
>> Yet, since putting the page before unlocking is incorrect it should be
>> fixed, if only to prevent future breakage or someone copy-pasting this
>> code.
>>
>> Fixes: 70c3547e36f5c ("hugetlbfs: add hugetlbfs_fallocate()")
>>
>> cc: Eric Biggers <ebiggers3@xxxxxxxxx>
>> cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>
>> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
>
> Thank you Nadav.
No problem.
>
> Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>
> Since hugetlbfs is an in memory filesystem, the only way one 'should' be
> able to remove a page (file content) is through an inode operation such as
> truncate, hole punch, or unlink. That was the basis for my response that
> the inode lock would be required for page freeing.
>
> Eric's question about sys_fadvise64(POSIX_FADV_DONTNEED) is interesting.
> I was expecting to see a check for hugetlbfs pages and exit (without
> modification) if encountered. A quick review of the code did not find
> any such checks.
>
> I'll take a closer look to determine exactly how hugetlbfs files are
> handled. IMO, there should be something similar to the DAX check where
> the routine quickly exits.
I did not cc stable when submitting the patch, based on your previous
response. Let me know if you want me to send v2 which does so.
Thanks,
Nadav