Re: [PATCH 1/2] hugetlb: fix update_and_free_page contig page struct assumption

From: Mike Kravetz
Date: Thu Feb 18 2021 - 16:44:52 EST


On 2/18/21 9:34 AM, Mike Kravetz wrote:
> On 2/18/21 9:25 AM, Jason Gunthorpe wrote:
>> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>>>>> page structs are not guaranteed to be contiguous for gigantic pages. The
>>>>
>>>> June 2014. That's a long lurk time for a bug. I wonder if some later
>>>> commit revealed it.
>>>
>>> I would suggest that gigantic pages have not seen much use. Certainly
>>> performance with Intel CPUs on benchmarks that I've been involved with
>>> showed lower performance with 1GB pages than with 2MB pages until quite
>>> recently.
>>
>> I suggested in another thread that maybe it is time to consider
>> dropping this "feature"
>>
>> If it has been slightly broken for 7 years it seems a good bet it
>> isn't actually being used.
>>
>> The cost to fix GUP to be compatible with this will hurt normal
>> GUP performance - and again, that nobody has hit this bug in GUP
>> further suggests the feature isn't used..
>
> I was thinking that we could detect these 'unusual' configurations and only
> do the slower page struct walking in those cases. However, we would need to
> do some research to make sure we have taken into account all possible config
> options which can produce non-contiguous page structs. That should have zero
> performance impact in the 'normal' cases.

What about something like the following patch, and making all code that
wants to scan gigantic page subpages use mem_map_next()?