Re: [RFC PATCH] mm: khugepaged: don't carry huge page to the next loop for !CONFIG_NUMA

From: Yang Shi
Date: Wed Sep 22 2021 - 23:08:07 EST


On Wed, Sep 22, 2021 at 4:49 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> On Wed, 1 Sep 2021, Yang Shi wrote:
> > On Wed, Sep 1, 2021 at 3:26 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> > > On 9/1/21 05:46, Yang Shi wrote:
> > > > On Tue, Aug 31, 2021 at 4:38 PM Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
> > > >> On Mon, Aug 30, 2021 at 11:49:43AM -0700, Yang Shi wrote:
> > > >> > Gentle ping...
> > > >> >
> > > >> > Does this patch make sense? BTW, I have a couple of other khugepaged
> > > >> > related patches in my queue. I plan to send them with this patch
> > > >> > together. It would be great to hear some feedback before resending
> > > >> > this one.
> > > >>
> > > >> I don't really care for the !NUMA optimization. I believe that most setups
> > > >> that benefit from THP have NUMA enabled at compile time.
> > > >
> > > > Agreed.
> > > >
> > > >>
> > > >> But if you want to go down this path, make an effort to clean up the other
> > > >> artifacts of the !NUMA optimization: the ifdef has to be gone and all
> > > >> callers of these helpers have to be revisited. There are more opportunities
> > > >> for cleanup. For example, it is very odd that khugepaged_prealloc_page()
> > > >> frees the page.
> > > >
> > > > Yes, they are gone in this patch. The only remaining !NUMA piece is
> > > > khugepaged_find_target_node(), which just returns 0.
> > >
> > > As Kirill pointed out, there's also khugepaged_prealloc_page(), where the
> > > only remaining variant actually does no preallocation, just the freeing of
> > > an unused page and some kind of "sleep after first alloc fail, break after
> > > second alloc fail" logic.
> > > This could now be moved to khugepaged_do_scan() loop itself and maybe it
> > > will be easier to follow.
> >
> > Aha, I see. I misunderstood him. I suppose you mean moving it into
> > khugepaged_scan_mm_slot().
>
> It may not be possible, but I'd always imagined that a cleanup of this
> kind would get rid of all those "struct page **hpage" artifacts.

It seems we need to find another way to do "sleep for the first alloc
failure, break loop for the second alloc failure" or just remove the
heuristic.

I will take a closer look once I find some time.
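
For reference, the heuristic can be modeled without passing a page pointer
around at all. The following is only a userspace sketch, not the khugepaged
code: scan_pass(), struct scan_stats, alloc_fn and scripted_alloc() are all
hypothetical names made up for illustration, and the "sleep" is just a
counter. It assumes the only state the heuristic needs is a single "wait"
flag that lives for one scan pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters so the behaviour of one pass can be inspected. */
struct scan_stats {
	int attempts;	/* allocation attempts made */
	int collapsed;	/* attempts where the allocation succeeded */
	int sleeps;	/* back-offs taken after a failed allocation */
};

/* Hypothetical allocator hook: returns true when the allocation succeeds. */
typedef bool (*alloc_fn)(void *ctx);

/*
 * One scan pass: the first allocation failure backs off once (modelled as
 * a counter bump instead of a real sleep), the second failure ends the pass.
 */
static struct scan_stats scan_pass(alloc_fn try_alloc, void *ctx, int budget)
{
	struct scan_stats st = { 0, 0, 0 };
	bool wait = true;	/* one back-off is allowed per pass */

	assert(budget >= 0);

	while (st.attempts < budget) {
		st.attempts++;
		if (try_alloc(ctx)) {
			st.collapsed++;
			continue;
		}
		if (!wait)
			break;		/* second failure: give up this pass */
		wait = false;
		st.sleeps++;		/* first failure: back off, then retry */
	}
	return st;
}

/* Example allocator that replays a fixed list of results, then succeeds. */
struct scripted {
	const bool *results;
	size_t len;
	size_t pos;
};

static bool scripted_alloc(void *ctx)
{
	struct scripted *s = ctx;

	if (s->pos >= s->len)
		return true;
	return s->results[s->pos++];
}
```

With a result sequence of success, failure, success, failure, the pass backs
off once after the first failure and stops at the second one. Whether the
flag should stay local to the loop like this is exactly the open question.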

>
> Hugh