On 9/17/20 6:52 AM, kan.liang@xxxxxxxxxxxxxxx wrote:
+ mm = current->mm;
+ if (!mm) {
+ /*
+ * For kernel threads and the like, use init_mm so that
+ * we can find kernel memory.
+ */
+ mm = &init_mm;
+ }
I think it might be better to use current->active_mm instead of
current->mm. Kernel threads can "take over" the mm of the threads that
switched to them, so they're not actually using all of the page tables
from the init_mm all the time.
It's not _common_, thought. The only time that I think they can diverge
is when vmalloc PGD sync'ing needs to be done, and there's even an
effort to remove that.
But, it's probably more more precise to just use ->active_mm since
that's what will actually be more consistent with the values loaded into
CR3.
I _think_ ->active_mm is always non-NULL, too.
One last concern as I look at this: I wish it was a bit more
future-proof. There are lots of weird things folks are trying to do
with the page tables, like Address Space Isolation. For instance, if
you get a perf NMI when running userspace, current->mm->pgd is
*different* than the PGD that was in use when userspace was running.
It's close enough today, but it might not stay that way. But I can't
think of any great ways to future proof this code, other than spitting
out an error message if too many of the page table walks fail when they
shouldn't.