Re: [PATCH 10/17] prmem: documentation

From: Andy Lutomirski
Date: Thu Nov 22 2018 - 15:54:13 EST


On Thu, Nov 22, 2018 at 12:04 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Thu, Nov 22, 2018 at 09:27:02PM +0200, Igor Stoppa wrote:
> > I have studied the code involved with Nadav's patchset.
> > I am perplexed about these sentences you wrote.
> >
> > More to the point (to the best of my understanding):
> >
> > poking_init()
> > -------------
> > 1. it gets one random poking address and ensures to have at least 2
> > consecutive PTEs from the same PMD
> > 2. it then proceeds to map/unmap an address from the first of the 2
> > consecutive PTEs, so that, later on, there will be no need to
> > allocate pages, which might fail, if poking from atomic context.
> > 3. at this point, the page tables are populated, for the address that
> > was obtained at point 1, and this is ok, because the address is fixed
> >
> > write_rare
> > ----------
> > 4. it can happen on any available core / thread at any time, therefore
> > each of them needs a different address
>
> No? Each CPU has its own CR3 (eg each CPU might be running a different
> user task). If you have _one_ address for each allocation, it may or
> may not be mapped on other CPUs at the same time -- you simply don't care.
>
> The writable address can even be a simple formula to calculate from
> the read-only address, you don't have to allocate an address in the
> writable mapping space.
>

Agreed. I suggest the formula:

  writable_address = readable_address - rare_write_offset

For starters, rare_write_offset can just be a constant. If we want to
get fancy later on, it can be randomized.

If we do it like this, then we don't need to modify any pagetables at
all when we do a rare write. Instead we can set up the mapping at
boot or when we allocate the rare write space, and the actual rare
write code can just switch mms and do the write.
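
To make the arithmetic concrete, here is a minimal user-space sketch of
the formula. It is an illustration, not kernel code: the names
RARE_WRITE_OFFSET, rare_write_alias() and rare_write_readable() are
hypothetical, the offset value is made up, and 64-bit addresses are
assumed. It models only the address calculation, not the mm switch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical constant: distance between a read-only address and its
 * writable alias. The value is arbitrary and chosen for illustration;
 * it could later be randomized at boot.
 */
#define RARE_WRITE_OFFSET UINT64_C(0x100000000)

/* writable_address = readable_address - rare_write_offset */
static inline uint64_t rare_write_alias(uint64_t readable_address)
{
	/* In this sketch, refuse addresses that would wrap below zero. */
	assert(readable_address >= RARE_WRITE_OFFSET);
	return readable_address - RARE_WRITE_OFFSET;
}

/* Inverse mapping: recover the read-only address from the alias. */
static inline uint64_t rare_write_readable(uint64_t writable_address)
{
	return writable_address + RARE_WRITE_OFFSET;
}
```

Because the alias is a pure function of the read-only address, nothing
needs to be allocated or looked up per write; the only state is the
offset itself.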