Re: [PATCH] arm64/mm: adds soft dirty page tracking

From: David Hildenbrand
Date: Fri Mar 15 2024 - 05:30:45 EST


On 12.03.24 23:32, Shivansh Vij wrote:
Hi David,

On Tue, Mar 12, 2024 at 09:22:25AM +0100, David Hildenbrand wrote:
On 12.03.24 02:16, Shivansh Vij wrote:

Hi,

Checkpoint-Restore in Userspace (CRIU) needs to be able
to track a memory page's changes if we want to enable
pre-dumping, which is important for live migrations.

The PTE_DIRTY bit (defined in pgtable-prot.h) is already
used to track software dirty pages, and the PTE_WRITE and
PTE_READ bits are used to track hardware dirty pages.

This patch enables full soft dirty page tracking
(including swap PTE support) for arm64 systems, and is
based very closely on the x86 implementation.

It is based on an unfinished patch by
Bin Lu (bin.lu@xxxxxxx) from 2017
(https://patchwork.kernel.org/project/linux-arm-kernel/patch/1512029649-61312-1-git-send-email-bin.lu@xxxxxxx/),
but has been updated for newer 6.x kernels as well as
tested on various 5.x kernels.

There has also been more recently:

https://lore.kernel.org/lkml/20230703135526.930004-1-npache@xxxxxxxxxx/#r

I recall that we are short on SW PTE bits:

"
So if you need software dirty, it can only be done with another software
PTE bit. The problem is that we are short of such bits (only one left if
we move PTE_PROT_NONE to a different location). The userfaultfd people
also want such bit.

Personally I'd reuse the four PBHA bits but I keep hearing that they may
be used with some out of tree patches.
"

https://lore.kernel.org/lkml/ZLQIaSMI74KpqsQQ@xxxxxxx/

If I'm understanding the previous discussion (https://patchwork.kernel.org/project/linux-arm-kernel/patch/20230703135526.930004-1-npache@xxxxxxxxxx/) correctly, the core issue is that we actually do need to use a special SW PTE bit (like the PTE_SOFT_DIRTY that's in this patch) - but at the same time, the PTE bits are highly contentious so it would be ideal if we could reuse an existing bit (maybe one of the PBHA bits like you suggested) instead of creating a new one.
Is my understanding correct?

Yes, that matches my understanding. As Joey noted, the bit you chose is defined by HW and might soon get used.

As Catalin wrote, some OOT patches might use the PBHA bits, although I am not sure what the latest state on that is, or whether we should really care about OOT patches. Maybe it would be good enough to allow driver use only in PFNMAP mappings, and simply not use the bit for softdirty/uffd-wp there.

I don't know much about PBHA; [1] never got merged but is an interesting read. We are certainly short on SW bits in any case.

There were recently some discussions about why soft-dirty tracking is unsuitable (and unfixable) for some cases, buried in previous iterations of [2]. The outcome was a new UFFD_FEATURE_WP_ASYNC mode as a replacement for soft-dirty tracking.

So long-term, not introducing soft-dirty tracking and supporting uffd-wp instead might be the better choice on arm64.
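
Whichever mechanism is used, userspace can see the per-page state through /proc/<pid>/pagemap. Below is a minimal sketch of decoding one pagemap entry, assuming the bit layout described in Documentation/admin-guide/mm/pagemap.rst (bit 55 soft-dirty, bit 57 uffd-wp write-protected, bit 62 swapped, bit 63 present). The helper names decode_pagemap_entry and read_pagemap_entry are hypothetical and chosen only for this illustration; this is not how CRIU itself is implemented.

```python
import os
import struct

# Bit positions as documented in Documentation/admin-guide/mm/pagemap.rst.
PFN_MASK = (1 << 55) - 1      # bits 0-54: PFN when the page is present
SOFT_DIRTY_BIT = 1 << 55      # PTE is soft-dirty
UFFD_WP_BIT = 1 << 57         # PTE is write-protected via userfaultfd-wp
SWAPPED_BIT = 1 << 62         # page is swapped out
PRESENT_BIT = 1 << 63         # page is present in RAM

ENTRY_BYTES = 8               # each pagemap entry is one native-endian u64


def decode_pagemap_entry(raw: bytes) -> dict:
    """Decode one 8-byte pagemap entry into its flag fields (hypothetical helper)."""
    (entry,) = struct.unpack("=Q", raw)
    present = bool(entry & PRESENT_BIT)
    return {
        "present": present,
        "swapped": bool(entry & SWAPPED_BIT),
        "soft_dirty": bool(entry & SOFT_DIRTY_BIT),
        "uffd_wp": bool(entry & UFFD_WP_BIT),
        # The PFN field is only meaningful for present pages, and it reads
        # as zero without CAP_SYS_ADMIN on recent kernels.
        "pfn": (entry & PFN_MASK) if present else None,
    }


def read_pagemap_entry(pid: int, vaddr: int) -> dict:
    """Read and decode the pagemap entry covering vaddr in process pid."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * ENTRY_BYTES)
        raw = f.read(ENTRY_BYTES)
    if len(raw) != ENTRY_BYTES:
        raise ValueError("address is outside the readable pagemap range")
    return decode_pagemap_entry(raw)
```

With soft-dirty tracking, a tool would typically write "4" to /proc/<pid>/clear_refs to reset the soft-dirty bits and later rescan pagemap for entries with soft_dirty set. On a kernel without soft-dirty support for the architecture, that field carries no useful information, which is where the uffd_wp bit becomes the relevant one.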

[1] https://lkml.kernel.org/r/20211015161416.2196-8-james.morse@xxxxxxx
[2] https://lore.kernel.org/all/20230821141518.870589-1-usama.anjum@xxxxxxxxxxxxx/

--
Cheers,

David / dhildenb