Re: [PATCH v4] sched/numa, mm: do not try to migrate memory to memoryless nodes

From: Byungchul Park
Date: Mon Feb 19 2024 - 23:10:20 EST


On Mon, Feb 19, 2024 at 07:28:41PM -0800, Andrew Morton wrote:
> On Tue, 20 Feb 2024 11:33:04 +0900 Byungchul Park <byungchul@xxxxxx> wrote:
>
> > > Yes, this changelog is missing rather a lot of important information.
> > >
> > > I pulled together the below, please check.
> >
> > To make this clearer, let me explain further. I posted the following
> > two patches while resolving the oops issue. However, the two serve
> > different purposes.
> >
> > 1) https://lkml.kernel.org/r/20240219041920.1183-1-byungchul@xxxxxx
> >
> > I started this patch as the fix for the oops. However, I found that
> > the root cause is the use of -1 as an array index, so the root-cause
> > fix goes in a separate thread, 2). Nevertheless, 1) is still
> > worthwhile as a *reasonable optimization*, though it is no longer
> > the real fix.
>
> Well I altered this patch's changelog to tell readers that it is an
> optimization. But one does wonder why it isn't simply a bugfix.
> Attempting to migrate to a memoryless node is clearly an error.

I agree with what Oscar Salvador said:

"As this is not a bug fix but an optimization, as we will fail anyways
in migrate_misplaced_folio() when migrate_balanced_pgdat() notices
that we do not have any memory on that node."

https://lore.kernel.org/lkml/ZdG1yO29WTyRiw8Q@localhost.localdomain/

So assuming all the related code works correctly, the migration will
safely fail even without this optimization patch.
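
To illustrate the difference, here is a rough user-space sketch. It is
not kernel code: node_present_pages[], node_has_memory(),
migrate_slow_path() and try_migrate() are hypothetical names made up
for this example, and the counter only stands in for the cost of
entering the migration path. Both variants refuse a memoryless target;
the early check merely avoids the wasted call.

```c
#include <stdbool.h>

#define MAX_NODES 4

/* Hypothetical per-node state: nodes 1 and 3 have no memory. */
static const unsigned long node_present_pages[MAX_NODES] = { 1024, 0, 2048, 0 };

/* Counts entries into the (notionally expensive) migration path. */
static int slow_path_calls;

/* True only for a valid node id that actually has memory. */
static bool node_has_memory(int nid)
{
	return nid >= 0 && nid < MAX_NODES && node_present_pages[nid] > 0;
}

/*
 * Models the late failure: the migration path is entered first and
 * only then notices that the target node has no memory.
 */
static bool migrate_slow_path(int dst_nid)
{
	slow_path_calls++;
	return node_has_memory(dst_nid);
}

/*
 * With early_check set, a memoryless target is rejected before the
 * slow path is entered. Without it, the result is the same but the
 * slow path is paid for.
 */
static bool try_migrate(int dst_nid, bool early_check)
{
	if (early_check && !node_has_memory(dst_nid))
		return false;

	return migrate_slow_path(dst_nid);
}
```

In this model, try_migrate(1, true) and try_migrate(1, false) both
return false; only the second one increments slow_path_calls.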

Byungchul

> Presumably the called code handles it somehow, but in what fashion and
> at what cost?
>
> > 2) https://lkml.kernel.org/r/20240216111502.79759-1-byungchul@xxxxxx
> >
> > I found that the root cause of the oops is the use of -1 as an
> > array index, so I moved the oops message, the Fixes: tag, and the
> > cc to stable there. Long story short, 2) is the *real fix* for the oops.
> >