Re: [PATCH 5/5] mm: Soft-dirty bits for user memory changes tracking

From: Andrew Morton
Date: Mon Apr 15 2013 - 17:46:25 EST


On Fri, 12 Apr 2013 17:14:03 +0400 Pavel Emelyanov <xemul@xxxxxxxxxxxxx> wrote:

> On 04/12/2013 01:24 AM, Andrew Morton wrote:
> > On Thu, 11 Apr 2013 15:30:00 +0400 Pavel Emelyanov <xemul@xxxxxxxxxxxxx> wrote:
> >
> >> The soft-dirty is a bit on a PTE which helps to track which pages a task
> >> writes to. In order to do this tracking one should
> >>
> >> 1. Clear soft-dirty bits from PTEs ("echo 4 > /proc/PID/clear_refs")
> >> 2. Wait some time.
> >> 3. Read soft-dirty bits (bit 55 in /proc/PID/pagemap2 entries)
> >>
> >> To do this tracking, the writable bit is cleared from PTEs when the
> >> soft-dirty bit is. Thus, after this, when the task tries to modify a page
> >> at some virtual address the #PF occurs and the kernel sets the soft-dirty
> >> bit on the respective PTE.
> >>
> >> Note that although the task's whole address space is marked as r/o after
> >> the soft-dirty bits are cleared, the #PFs that occur after that are
> >> processed fast. This is so because the pages are still mapped to physical
> >> memory, and thus all the kernel does is find this fact out and put the
> >> writable, dirty and soft-dirty bits back on the PTE.
> >>
> >> Another thing to note is that when mremap moves PTEs they are marked as
> >> soft-dirty as well, since from the user's perspective mremap modifies the
> >> virtual memory at mremap's new address.
> >>
> >> ...
> >>
> >> +config MEM_SOFT_DIRTY
> >> + bool "Track memory changes"
> >> + depends on CHECKPOINT_RESTORE && X86
> >
> > I guess we can add the CHECKPOINT_RESTORE dependency for now, but it is
> > a general facility and I expect others will want to get their hands on
> > it for unrelated things.
>
> OK. Just tell me when you need the dependency removing patch.
>
> > From that perspective, the dependency on X86 is awful. What's the
> > problem here and what do other architectures need to do to be able to
> > support the feature?
>
> The problem here is that I don't know what free bits are available on
> page table entries on other architectures. I was about to resolve this
> for ARM very soon, but for the rest of them I need help from other people.

Well, this is also a thing arch maintainers can do when they feel a
need to support the feature on their architecture. To support them at
that time we should provide them with a) adequate information in an
easy-to-find place (eg, a nice comment at the site of the reference x86
implementation) and b) a userspace test app.

> > You have a test application, I assume. It would be helpful if we could
> > get that into tools/testing/selftests.
>
> If a very stupid 10-lines test is OK, then I can cook a patch with it.

I think that would be good. As a low-priority thing, please.
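
For reference, the three-step procedure quoted above can be sketched in
userspace roughly as follows. This is only an illustration, not the test
application discussed in this thread. It assumes that each pagemap entry is a
64-bit word in native byte order, indexed by virtual page number, and that
soft-dirty is bit 55 as stated in the patch description. The helper names
(is_soft_dirty, entry_offset, clear_soft_dirty, read_entry) are hypothetical,
and the default file name "pagemap" is an assumption; the patch description
refers to /proc/PID/pagemap2.

```python
import os
import struct

SOFT_DIRTY_BIT = 55   # bit position quoted in the patch description
ENTRY_SIZE = 8        # assumption: one 64-bit entry per virtual page


def is_soft_dirty(entry: int) -> bool:
    """Return True if the soft-dirty bit is set in a pagemap entry."""
    return bool((entry >> SOFT_DIRTY_BIT) & 1)


def entry_offset(vaddr: int, page_size: int) -> int:
    """File offset of the entry that covers virtual address vaddr."""
    return (vaddr // page_size) * ENTRY_SIZE


def clear_soft_dirty(pid="self"):
    """Step 1: the equivalent of "echo 4 > /proc/PID/clear_refs"."""
    with open(f"/proc/{pid}/clear_refs", "w") as f:
        f.write("4")


def read_entry(vaddr, pid="self", pagemap="pagemap"):
    """Step 3: read the 64-bit entry for the page containing vaddr."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/{pagemap}", "rb") as f:
        f.seek(entry_offset(vaddr, page_size))
        data = f.read(ENTRY_SIZE)
    # "=Q" is an unsigned 64-bit integer in native byte order.
    return struct.unpack("=Q", data)[0]
```

A test along these lines would call clear_soft_dirty(), write to a known
page, and then check is_soft_dirty(read_entry(addr)) for that page and for
an untouched one. Whether the files can be opened depends on the kernel
configuration and on the caller's privileges.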