On Thu, Nov 22, 2018 at 12:04 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> On Thu, Nov 22, 2018 at 09:27:02PM +0200, Igor Stoppa wrote:
>> I have studied the code involved with Nadav's patchset.
>> I am perplexed about these sentences you wrote.
>>
>> More to the point (to the best of my understanding):
>>
>> poking_init()
>> -------------
>> 1. it gets one random poking address and ensures that there are at
>>    least 2 consecutive PTEs from the same PMD
>> 2. it then proceeds to map/unmap an address from the first of the 2
>>    consecutive PTEs, so that, later on, there will be no need to
>>    allocate pages, which might fail, if poking from atomic context.
>> 3. at this point, the page tables are populated, for the address that
>>    was obtained at point 1, and this is ok, because the address is fixed
>>
>> write_rare
>> ----------
>> 4. it can happen on any available core / thread at any time, therefore
>>    each of them needs a different address
>
> No?  Each CPU has its own CR3 (e.g. each CPU might be running a different
> user task).  If you have _one_ address for each allocation, it may or
> may not be mapped on other CPUs at the same time -- you simply don't care.
>
> The writable address can even be a simple formula to calculate from
> the read-only address, you don't have to allocate an address in the
> writable mapping space.

Agreed. I suggest the formula:
writable_address = readable_address - rare_write_offset. For
starters, rare_write_offset can just be a constant. If we want to get
fancy later on, it can be randomized.
If we do it like this, then we don't need to modify any pagetables at
all when we do a rare write. Instead we can set up the mapping at
boot or when we allocate the rare write space, and the actual rare
write code can just switch mms and do the write.
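
To make the arithmetic concrete, here is a rough user-space sketch of
the formula above. It is only an illustration: RARE_WRITE_OFFSET, its
value, PAGE_MASK_4K and both helper names are made up for this example
and are not taken from any patch. It also assumes the offset is
page-aligned, so that the alias keeps the same offset within the page.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant offset, chosen only for illustration.
 * It is page-aligned so the in-page offset survives the translation. */
#define RARE_WRITE_OFFSET 0x0000100000000000ULL
#define PAGE_MASK_4K      0xfffULL

/* writable_address = readable_address - rare_write_offset */
static uint64_t rare_write_alias(uint64_t readable_address)
{
	/* The subtraction must not wrap around. */
	assert(readable_address >= RARE_WRITE_OFFSET);
	return readable_address - RARE_WRITE_OFFSET;
}

/* Inverse translation: recover the read-only address from the alias. */
static uint64_t rare_read_alias(uint64_t writable_address)
{
	return writable_address + RARE_WRITE_OFFSET;
}

/* True if both addresses have the same offset within a 4 KiB page. */
static int same_page_offset(uint64_t a, uint64_t b)
{
	return (a & PAGE_MASK_4K) == (b & PAGE_MASK_4K);
}
```

Since the translation is pure arithmetic, nothing has to be allocated
or looked up at write time; the only per-write work left is switching
mms and doing the store.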