Re: [PATCH V7 1/4] perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE

From: Dave Hansen
Date: Thu Sep 17 2020 - 15:01:42 EST


On 9/17/20 6:52 AM, kan.liang@xxxxxxxxxxxxxxx wrote:
> + mm = current->mm;
> + if (!mm) {
> + /*
> + * For kernel threads and the like, use init_mm so that
> + * we can find kernel memory.
> + */
> + mm = &init_mm;
> + }

I think it might be better to use current->active_mm instead of
current->mm. Kernel threads can "take over" the mm of the threads that
switched to them, so they're not actually using all of the page tables
from the init_mm all the time.

It's not _common_, thought. The only time that I think they can diverge
is when vmalloc PGD sync'ing needs to be done, and there's even an
effort to remove that.

But, it's probably more more precise to just use ->active_mm since
that's what will actually be more consistent with the values loaded into
CR3.

I _think_ ->active_mm is always non-NULL, too.

One last concern as I look at this: I wish it was a bit more
future-proof. There are lots of weird things folks are trying to do
with the page tables, like Address Space Isolation. For instance, if
you get a perf NMI when running userspace, current->mm->pgd is
*different* than the PGD that was in use when userspace was running.
It's close enough today, but it might not stay that way. But I can't
think of any great ways to future proof this code, other than spitting
out an error message if too many of the page table walks fail when they
shouldn't.