Re: Is it possible to implement the per-node page cache for programs/libraries?

From: Linus Torvalds
Date: Fri Sep 03 2021 - 15:08:28 EST


On Fri, Sep 3, 2021 at 12:02 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> Was there a reason you chose to do it that way instead of having per-node
> i_mapping pointers?

You can't have per-node i_mapping pointers without huge coherence issues.

If you don't care about coherence, that's fine - but that has to be a
user-space decision (ie "I will just replicate this file").

You can't just have the kernel decide "I'll map this set of pages on
this node, and that other set of pages on that other node", in case
there's MAP_SHARED things going on.

Anyway, I think very fundamentally this is one of those things where
99.9% of all people don't care, and DO NOT WANT the complexity.

And the 0.1% that _does_ care really could and should do this in user
space, because they know they care.

Asking the kernel to do complex things in critical core functions for
something that is very very rare and irrelevant to most people, and
that can and should just be done in user space for the people who care
is the wrong approach.

Because the question here really should be "is this truly important,
and does this need kernel help because user space simply cannot do it
itself".

And the answer is a fairly simple "no".

Linus
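
For reference, here is a minimal sketch of the user-space approach
mentioned above ("I will just replicate this file"). It is not from the
original message. The ".node<N>" naming scheme and the helper names
current_node(), replica_path() and copy_file() are all hypothetical. The
sketch assumes a Linux system where page cache pages are allocated on the
node of the task that first touches them, so each replica would need to be
created (or first read) by a task running on the target node, for example
under "numactl --cpunodebind". The replicas are not kept coherent with the
original file, which is exactly the trade-off user space is opting into.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Return the NUMA node the calling thread is currently running on.
 * Falls back to node 0 if the getcpu system call fails. */
unsigned current_node(void)
{
    unsigned cpu = 0, node = 0;

    if (syscall(SYS_getcpu, &cpu, &node, NULL) != 0)
        return 0;
    return node;
}

/* Build the replica name "<path>.node<N>" (hypothetical naming scheme).
 * Returns 0 on success, -1 if the result does not fit in buf. */
int replica_path(char *buf, size_t len, const char *path, unsigned node)
{
    int n = snprintf(buf, len, "%s.node%u", path, node);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Copy src to dst with plain stdio. Returns 0 on success, -1 on error.
 * No attempt is made to keep dst in sync with later changes to src. */
int copy_file(const char *src, const char *dst)
{
    char buf[65536];
    size_t n;
    int rc = 0;
    FILE *in, *out;

    in = fopen(src, "rb");
    if (!in)
        return -1;
    out = fopen(dst, "wb");
    if (!out) {
        fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            rc = -1;
            break;
        }
    }
    if (ferror(in))
        rc = -1;
    if (fclose(out) != 0)
        rc = -1;
    fclose(in);
    return rc;
}
```

A launcher that cares about locality could call replica_path() with the
result of current_node() and open or exec that replica instead of the
original, creating it with copy_file() if it does not exist yet.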