Re: your "x86: mm: convert dump_pagetables to use walk_page_range" change

From: Steven Price
Date: Tue May 12 2020 - 09:03:01 EST


On 12/05/2020 10:39, Jan Beulich wrote:
Steven,

Hi Jan,

in the description of this change you say:

"The effective permissions are passed down the chain using new fields in
struct pg_state."

I don't see how this works, and I suppose this part of the change is
(part of) the reason why a W+X warning has magically disappeared in
5.6.x (compared to 5.5.x) when running a 32-bit kernel under Xen.

Quoting the relevant piece of code:

	if (level > 0) {
		new_eff = effective_prot(st->prot_levels[level - 1],
					 new_prot);
	} else {
		new_eff = new_prot;
	}

	if (level >= 0)
		st->prot_levels[level] = new_eff;

The generic framework calls note_page() only for leaf pages or holes
afaics. The protections for a leaf page found at a level other than
the numerically highest one have no meaning at all for a mapping at
a later address mapped with a numerically higher level mapping.
Instead it's the non-leaf page tables for that specific address
which determine the effective protection for any particular mapping.

To take an example, suppose the first present leaf page is found
at level 4. st->prot_levels[] will be all zero at this time, from
which it follows that new_eff will be zero then, too.

I don't think the intended effect can be achieved without either
retaining the original behavior of passing the effective protection
into note_page(), or calling note_page() also for non-leaf pages
(indicating to it which case it is, and adjusting it accordingly).

Am I overlooking something?

Sadly I don't think you are - your reasoning seems correct. It looks like the computation of effective permissions will need to be done in ptdump.c rather than dump_pagetables.c, since, as you point out, only ptdump.c deals with the non-leaf entries.
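
To make the problem concrete, here is a rough userspace sketch (not kernel code, and not a proposed patch). The combining rule mirrors effective_prot() in dump_pagetables.c; everything prefixed sk_ or SK_ is a hypothetical name made up for illustration, and the flag values are the usual x86 bit positions. It compares recording the effective protection at every level on the way down with recording it only at the leaf while the upper slots are still zero:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for _PAGE_RW, _PAGE_USER and _PAGE_NX. */
#define SK_PAGE_RW	(UINT64_C(1) << 1)
#define SK_PAGE_USER	(UINT64_C(1) << 2)
#define SK_PAGE_NX	(UINT64_C(1) << 63)

#define SK_LEVELS	5	/* levels 0..4, 4 being the PTE level */

struct sk_state {
	uint64_t eff[SK_LEVELS];	/* effective protection per level */
};

/* RW/USER must be set at every level; NX at any level is enough. */
static uint64_t sk_effective_prot(uint64_t upper, uint64_t lower)
{
	return (upper & lower & (SK_PAGE_USER | SK_PAGE_RW)) |
	       ((upper | lower) & SK_PAGE_NX);
}

/* Hypothetical hook: record the effective protection for one entry. */
static uint64_t sk_note_entry(struct sk_state *st, int level, uint64_t prot)
{
	assert(level >= 0 && level < SK_LEVELS);
	st->eff[level] = level > 0 ?
		sk_effective_prot(st->eff[level - 1], prot) : prot;
	return st->eff[level];
}

/* Hook called for every entry on the path, non-leaf entries included. */
static uint64_t sk_walk_all_levels(const uint64_t *prots, int depth)
{
	struct sk_state st = { { 0 } };
	uint64_t eff = 0;
	int level;

	for (level = 0; level < depth; level++)
		eff = sk_note_entry(&st, level, prots[level]);
	return eff;
}

/* Hook called for the leaf only; the upper slots were never filled in. */
static uint64_t sk_walk_leaf_only(const uint64_t *prots, int depth)
{
	struct sk_state st = { { 0 } };

	return sk_note_entry(&st, depth - 1, prots[depth - 1]);
}
```

For a writable, executable mapping at level 4 the first variant keeps the RW bit, so a W+X check would fire. The second variant loses RW because it is ANDed with a zeroed parent slot, which would be consistent with the warning going away.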

Additionally I'd like to note that note_page()'s "unsigned long val"
parameter isn't wide enough for 32-bit PAE PTEs, and hence the NX
flag will always be seen as clear in new_prot in such configs.

Ah, interesting. I'm not sure what type is actually guaranteed to be correct. pgprotval_t is x86-specific, but it might be necessary to extend it to other architectures. I think I got the "unsigned long" from the generic page.h (and because it happens to work on most architectures) - but I hadn't noticed that that file is specifically only for NOMMU architectures.
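
As a small illustration of the truncation (again a userspace sketch with made-up sk_ names, not the kernel's types): on 32-bit PAE the PTE is 64 bits wide with NX in bit 63, while unsigned long is 32 bits. Here uint32_t stands in for the 32-bit unsigned long so the effect shows up on any host:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for _PAGE_PRESENT, _PAGE_RW and _PAGE_NX. */
#define SK_PAGE_PRESENT	(UINT64_C(1) << 0)
#define SK_PAGE_RW	(UINT64_C(1) << 1)
#define SK_PAGE_NX	(UINT64_C(1) << 63)

/* Assumption: models "unsigned long" as seen by a 32-bit kernel. */
typedef uint32_t sk_ulong32_t;

/* Callee taking the full 64-bit PTE value: NX is visible. */
static int sk_nx_seen_wide(uint64_t val)
{
	return (val & SK_PAGE_NX) != 0;
}

/* Callee taking a 32-bit value: bit 63 was dropped by the caller. */
static int sk_nx_seen_narrow(sk_ulong32_t val)
{
	return ((uint64_t)val & SK_PAGE_NX) != 0;
}

/* Small self-check showing the difference for one NX, writable PTE. */
static void sk_check_truncation(void)
{
	uint64_t pte = SK_PAGE_NX | SK_PAGE_RW | SK_PAGE_PRESENT;

	assert(sk_nx_seen_wide(pte) == 1);
	/* The conversion to 32 bits keeps only the low flag bits. */
	assert(sk_nx_seen_narrow((sk_ulong32_t)pte) == 0);
	assert((sk_ulong32_t)pte == (SK_PAGE_RW | SK_PAGE_PRESENT));
}
```

Any parameter type at least 64 bits wide would avoid this in the sketch; which type is right for the generic code is the open question above.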

I'll see if I can come up with fixes, but if you've got anything ready already then please jump in.

Steve