Re: [PATCH RFC 2/2] mm/x86/pat: Do proper PAT bit shift for large mappings
From: Peter Xu
Date: Fri May 24 2024 - 19:56:03 EST
On Thu, May 23, 2024 at 08:30:19PM -0700, Dave Hansen wrote:
> On 5/23/24 16:07, Peter Xu wrote:
> > Probably not.. I think I can define a pgprot_to_large() globally, pointing
> > that to pgprot_4k_2_large() on x86 and make the fallback to be noop. And
> > if there's a new version I'll guarantee to run over my cross compilers.
>
> I guess that would be functional, but it would be a bit mean to
> everybody else.
>
> > Any comments on the idea itself? Do we have a problem, or maybe I
> > overlooked something?
>
> I think it's probably unnecessary to inflict this particular x86-ism on
> generic code. The arch-generic 'prot' should have PAT at its 4k
> (_PAGE_BIT_PAT) position and then p*d_mkhuge() can shift it into the
> _PAGE_BIT_PAT_LARGE spot.
Right, that's another option indeed.
It's just that I've found that in many cases it's better to keep the APIs
separate and make the pairs match each other.
For example, it could be clearer if pxx_mkhuge() does exactly what
pxx_leaf() would check against.
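
To make the pairing concrete, here is a rough userspace model (not kernel
code). The bit positions mirror x86's _PAGE_BIT_PSE, _PAGE_BIT_PAT and
_PAGE_BIT_PAT_LARGE; every MODEL_/model_ name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions mirroring x86; all MODEL_/model_ names are hypothetical. */
#define MODEL_BIT_PSE        7  /* leaf marker in a PMD/PUD entry */
#define MODEL_BIT_PAT        7  /* PAT bit in a 4k PTE */
#define MODEL_BIT_PAT_LARGE 12  /* PAT bit in a PMD/PUD leaf entry */

#define MODEL_PSE       (UINT64_C(1) << MODEL_BIT_PSE)
#define MODEL_PAT       (UINT64_C(1) << MODEL_BIT_PAT)
#define MODEL_PAT_LARGE (UINT64_C(1) << MODEL_BIT_PAT_LARGE)

/* The 4k PAT bit aliases PSE, which is why a shift is needed at all. */
static_assert(MODEL_BIT_PAT == MODEL_BIT_PSE, "4k PAT bit aliases PSE");

/* Separate step: move a 4k-format PAT bit to its large-entry position. */
static inline uint64_t model_prot_4k_to_large(uint64_t prot)
{
	uint64_t pat = prot & MODEL_PAT;

	prot &= ~(MODEL_PAT | MODEL_PAT_LARGE);
	return prot | (pat << (MODEL_BIT_PAT_LARGE - MODEL_BIT_PAT));
}

/* mkhuge sets exactly the bit that leaf tests, nothing more. */
static inline uint64_t model_mkhuge(uint64_t entry)
{
	return entry | MODEL_PSE;
}

/* leaf tests exactly the bit that mkhuge sets. */
static inline int model_leaf(uint64_t entry)
{
	return (entry & MODEL_PSE) != 0;
}
```

The other option would fold model_prot_4k_to_large() into model_mkhuge().
In this model that is only safe while the input is still in 4k format: once
PSE is set, bit 7 can no longer be told apart from a 4k PAT bit.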
PS: I'd hoped it would be called pxx_huge() by now so that the names pair
up; AFAICT we named it pxx_leaf() only because pxx_huge() used to be
"abused" by hugetlbfs, but that's gone now.
The other thing is that we mostly need these knobs only for special
mappings like pfnmaps, am I right? OTOH we use WB for RAM, and maybe we
don't want to bother with any PAT stuff when the kernel is installing an
anonymous THP?
IMHO having pgprot_to_large() is fine even if only x86 implements it; it's
really like pfn tracking itself, which is a noop for !x86. But I'll follow
your advice if you still insist; I don't really have a strong opinion.
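
A rough userspace sketch of that arrangement, i.e. an arch override with a
noop generic fallback. The MODEL_/model_ names and MODEL_ARCH_X86 are made
up for illustration; only the bit positions (7 and 12) are x86's, and
model_pgprot_4k_2_large() merely stands in for pgprot_4k_2_large().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pgprot_t. */
typedef struct { uint64_t val; } model_pgprot_t;

#define MODEL_BIT_PAT        7  /* PAT bit in a 4k PTE */
#define MODEL_BIT_PAT_LARGE 12  /* PAT bit in a PMD/PUD leaf entry */
#define MODEL_PAT       (UINT64_C(1) << MODEL_BIT_PAT)
#define MODEL_PAT_LARGE (UINT64_C(1) << MODEL_BIT_PAT_LARGE)

static_assert(MODEL_BIT_PAT_LARGE > MODEL_BIT_PAT, "shift must be positive");

/* Small constructor, in the spirit of __pgprot(). */
static inline model_pgprot_t model_mk_pgprot(uint64_t v)
{
	model_pgprot_t p = { v };

	return p;
}

/* What an x86 version would do: move PAT from bit 7 to bit 12. */
static inline model_pgprot_t model_pgprot_4k_2_large(model_pgprot_t prot)
{
	uint64_t pat = prot.val & MODEL_PAT;

	prot.val &= ~(MODEL_PAT | MODEL_PAT_LARGE);
	prot.val |= pat << (MODEL_BIT_PAT_LARGE - MODEL_BIT_PAT);
	return prot;
}

/* "Arch header": only x86 would provide the override. */
#ifdef MODEL_ARCH_X86
#define model_pgprot_to_large model_pgprot_4k_2_large
#endif

/* "Generic header": everyone else gets a noop. */
#ifndef model_pgprot_to_large
static inline model_pgprot_t model_pgprot_to_large(model_pgprot_t prot)
{
	return prot;
}
#endif
```

Built without MODEL_ARCH_X86, model_pgprot_to_large() returns its argument
unchanged, so generic callers would pay nothing on other architectures.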
But if so, I'd also like to mention a 3rd option, which is to have
pxx_mkhuge_prot(), falling back to pxx_mkhuge() for !x86. That'll leave
pxx_mkhuge() untainted for x86. I'm not sure whether that would ease the
same concern, though.
In any case, thanks for confirming this issue; I appreciate it. Let me
know if you have any comments on patch 1 too; that one isn't a problem so
far IIUC, but it could become one soon.
Thanks,
--
Peter Xu