Re: [RESEND PATCH v1 1/5] mm: vmalloc must set pte via arch code

From: Ryan Roberts
Date: Mon May 15 2023 - 04:32:54 EST

Hi Lorenzo,

Thanks for the review - I appreciate it!

On 13/05/2023 14:14, Lorenzo Stoakes wrote:
> You've not cc'd the vmalloc reviewers, including the author of 3e9a9e256b1e
> whose patch you purport to fix. Please remember to run
> on all files you patch and cc them at least on relevant patches.
> Have added Christoph + Uladzislau as cc.

I did run, but it gave me 82 names. I assumed I wouldn't be
making any friends by CCing everyone, so tried to choose what I thought was a
sensible base. I guess I didn't quite get it right. Sorry about that. Thanks for
noticing and adding the right people.

> You'll definitely want an ack from Christoph on this!
> On Thu, May 11, 2023 at 02:21:09PM +0100, Ryan Roberts wrote:
>> It is bad practice to directly set pte entries within a pte table.
>> Instead all modifications must go through arch-provided helpers such as
>> set_pte_at() to give the arch code visibility and allow it to validate
>> (and potentially modify) the operation.
> This does make sense, and I see for example in xtensa that an arch-specific
> instruction is issued under certain circumstances so I do suspect we should
> do this.

arm64 provides another example, where barriers are required to ensure the page
table walker sees the new pte and no fault is raised. See
arch/arm64/include/asm/pgtable.h:set_pte() (which is called by its
implementation of set_pte_at()).

> As for validation, the function never indicates an error, so only in the
> sense that a WARN_ON() could _in theory_ trigger is it being
> validated. This might be quite a nitty point :) as set_pte_at() has no
> means of indicating an error. But maybe to be pedantic 'check' rather than
> 'validate'?

I'm sorry, I'm not sure what you are asking here? set_pte_at() forms part of the
contract with he arch code and is defined never to return an error. Some
implementations might have code enabled in debug configs to detect incorrect
usage and emit warnings (see arm64's implementation).

>> Fixes: 3e9a9e256b1e ("mm: add a vmap_pfn function")
> Not sure if this is really 'fixing' anything, I mean ostensibly, but not
> sure if the tag is relevant here, that is more so for a bug being
> introduced, and unless an issue has arisen not sure if it's
> appropriate. But this might be a nit, again!

Well I'm happy to remove it if that's the concensus. But I do believe there is a
real bug here. At least on arm64, the barriers are needed to prevent a race with
the page table walker. That said, the only place in the tree I can see
vmap_pfn() used, is in the i915 driver, which I guess has never been used on an
arm64 platform.

>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>> ---
>> mm/vmalloc.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index 9683573f1225..d8d2fe797c55 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -2899,10 +2899,13 @@ struct vmap_pfn_data {
>> static int vmap_pfn_apply(pte_t *pte, unsigned long addr, void *private)
>> {
>> struct vmap_pfn_data *data = private;
>> + pte_t ptent;
>> if (WARN_ON_ONCE(pfn_valid(data->pfns[data->idx])))
>> return -EINVAL;
>> - *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
>> +
>> + ptent = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
>> + set_pte_at(&init_mm, addr, pte, ptent);>
> While we're refactoring, it'd be nice to stash data->pfns[data->idx] into a
> local pfn variable.

OK, I'll do this for v2.


>> return 0;
>> }
>> --
>> 2.25.1