Re: [PATCH] ARM: mm: fix no-MMU ZERO_PAGE() implementation

From: Giulio Benetti
Date: Wed Oct 19 2022 - 12:38:05 EST

Next message: Vinod Koul: "Re: [PATCH 07/33] dmaengine: at_hdmac: Fix at_lli struct definition"
Previous message: Sean Christopherson: "Re: [V4 6/8] KVM: selftests: add library for creating/interacting with SEV guests"
In reply to: Arnd Bergmann: "Re: [PATCH] ARM: mm: fix no-MMU ZERO_PAGE() implementation"
Next in thread: Russell King (Oracle): "Re: [PATCH] ARM: mm: fix no-MMU ZERO_PAGE() implementation"

On 19/10/22 09:00, Arnd Bergmann wrote:

On Wed, Oct 19, 2022, at 00:32, Giulio Benetti wrote:

On 18/10/22 20:35, Arnd Bergmann wrote:

On Tue, Oct 18, 2022, at 19:44, Giulio Benetti wrote:

On 18/10/22 09:03, Arnd Bergmann wrote:

In addition to your fix, I see that arm is the only architecture
that defines 'empty_zero_page' as a pointer to the page, when
everything else just makes it a pointer to the data itself,
or an 'extern char empty_zero_page[]' array, which we may want
to change for consistency.
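
A minimal user-space sketch of the two conventions contrasted above, for illustration only. The names `sketch_page`, `zero_array`, `zero_page_ptr` and the 4KB page size are assumptions made for this example; they are not the kernel's actual definitions. It shows that the array form is usable as soon as .bss is zeroed, while the pointer form needs a run-time setup step and one extra dereference.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4KB pages */

/* Hypothetical stand-in for the kernel's struct page; not the real layout. */
struct sketch_page {
    void *data; /* address of the memory this descriptor covers */
};

/* Convention 1: the symbol is the zeroed data itself, a static array
 * that the toolchain places in .bss. */
static char zero_array[SKETCH_PAGE_SIZE];

/* Convention 2: the symbol is a pointer to a page descriptor, which has
 * to be filled in at run time before it can be used. */
static struct sketch_page zero_desc;
static struct sketch_page *zero_page_ptr;

/* Run-time setup that only convention 2 needs. */
static void sketch_init_zero_page(void)
{
    zero_desc.data = zero_array;
    zero_page_ptr = &zero_desc;
}

/* Data address under convention 1: the array itself. */
static const char *zero_data_from_array(void)
{
    return zero_array;
}

/* Data address under convention 2: one extra dereference. */
static const char *zero_data_from_pointer(void)
{
    assert(zero_page_ptr != NULL); /* sketch_init_zero_page() must run first */
    return zero_page_ptr->data;
}

/* Returns 1 if every byte of the page is zero, 0 otherwise. */
static int sketch_page_is_zero(const char *p)
{
    size_t i;

    for (i = 0; i < SKETCH_PAGE_SIZE; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

After calling `sketch_init_zero_page()`, both accessors return the same address, and the page reads as all zeroes without any explicit clearing in the code.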

I was about to do it, but then I thought it better to move one piece at a time.

Right, it would definitely be a separate patch, but it
can be a series of two patches. We probably wouldn't need to
backport the second patch that turns it into a static allocation.

I've sent the two-patch series:
https://lore.kernel.org/all/20221018222503.90118-1-giulio.benetti@xxxxxxxxxxxxxxxxxxxxxx/T/#t

I'm wondering if it makes sense to send a patchset for all those
architectures that have only one zero page. I've seen that loongarch,
for example, has more than one. But for the others I find the array
approach more straightforward, with less code all around and slightly
faster execution (only marginally, but better than nothing): the array
is in .bss, so it is zeroed early, as part of the one long "memset" of
.bss, which uses assembly instructions that zero 8 bytes at a time.
What do you think?

The initial zeroing should not matter at all in terms of performance.
I think the only question is whether one wants a single zero page
to be used everywhere or one per NUMA node to give better locality
for a cache miss.

My guess is that for a system with 4KB pages, all the data
in the zero page are typically available in a CPU cache already,
so it doesn't matter, but it's possible that some machines benefit
from having per-node pages when the page size isn't tiny compared
to the typical cache sizes.
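
The trade-off between one shared zero page and one zero page per NUMA node can be sketched as two lookup policies. This is a user-space illustration only: the function names, `SKETCH_MAX_NODES`, and the 4KB page size are assumptions for the example and do not correspond to any architecture's real code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4KB pages */
#define SKETCH_MAX_NODES 4     /* assumption: small fixed node count */

/* Policy 1: a single zero page shared by every node. */
static char shared_zero_page[SKETCH_PAGE_SIZE];

/* Policy 2: one zero page per node, so a cache miss can be served
 * from node-local memory. Costs SKETCH_MAX_NODES pages instead of one. */
static char node_zero_pages[SKETCH_MAX_NODES][SKETCH_PAGE_SIZE];

/* Shared policy: the node argument is ignored. */
static const char *zero_page_shared(int node)
{
    (void)node;
    return shared_zero_page;
}

/* Per-node policy: each node gets its own copy. */
static const char *zero_page_per_node(int node)
{
    assert(node >= 0 && node < SKETCH_MAX_NODES);
    return node_zero_pages[node];
}
```

With the shared policy every node resolves to the same address; with the per-node policy each node gets a distinct page, which only pays off if the page is not already sitting in a CPU cache.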

We should probably not touch this for any of the other architectures.

Ok, thanks for the explanation!

Best regards
--
Giulio Benetti
CEO/CTO@Benetti Engineering sas
