Re: [PATCH v2 4/5] mm: Add new ptep_deref() helper to fully encapsulate pte_t

From: Yu Zhao
Date: Thu May 25 2023 - 22:03:19 EST


On Fri, May 19, 2023 at 3:12 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
>
> On 18/05/2023 20:28, Yu Zhao wrote:
> > On Thu, May 18, 2023 at 5:07 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> >>
> >> There are many call sites that directly dereference a pte_t pointer.
> >> This makes it very difficult to properly encapsulate a page table in the
> >> arch code without having to allocate shadow page tables. ptep_deref()
> >> aims to solve this by replacing all direct dereferences with a call to
> >> this function.
> >>
> >> The default implementation continues to just dereference the pointer
> >> (*ptep), so generated code should be exactly the same. However, it is
> >> possible for the architecture to override the default with their own
> >> implementation, that can (e.g.) hide certain bits from the core code, or
> >> determine young/dirty status by mixing in state from another source.
> >>
> >> While ptep_get() and ptep_get_lockless() already exist, these are
> >> implemented as atomic accesses (e.g. READ_ONCE() in the default case).
> >> So rather than using ptep_get() and risking performance regressions,
> >> introduce an new variant.
> >
> > We should reuse ptep_get():
> > 1. I don't think READ_ONCE() can cause measurable regressions in this case.
> > 2. It's technically wrong without it.
>
> Can you clarify what you mean by technically wrong? Are you saying that the
> current code that does direct dereferencing is buggy?

Sorry for not being clear.

I think we can agree that *ptep is volatile. Not being treated as such
seems a bad idea to me. I don't think it'd cause any real problems --
most warnings KCSAN reported didn't either, but we fixed them anyway.
So should we fix this case as well while we are at it.

> I previously convinced myself that the potential for the compiler generating
> multiple loads was safe because the code in question is under the PTL so there
> are no concurrent stores. And we shouldn't see any tearing for the same reason.
>
> That said, if there is concensus that we can just use ptep_get() (==
> READ_ONCE()) everywhere, then I agree that would be cleaner. Does anyone object?

(No objection to NOT using it either. Just a recommendation, since
it's already there.)