Re: [PATCH] mm: Add new vma flag VM_LOCAL_CPU

From: Boaz Harrosh
Date: Tue May 22 2018 - 11:12:07 EST


On 18/05/18 17:14, Christopher Lameter wrote:
> On Tue, 15 May 2018, Boaz Harrosh wrote:
>
>>> I don't think page tables work the way you think they work.
>>>
>>> + err = vm_insert_pfn_prot(zt->vma, zt_addr, pfn, prot);
>>>
>>> That doesn't just insert it into the local CPU's page table. Any CPU
>>> which directly accesses or even prefetches that address will also get
>>> the translation into its cache.
>>>
>>
>> Yes I know, but that is exactly the point of this flag. I know that this
>> address is only ever accessed from a single core. Because it is an mmap (vma)
>> of an O_TMPFILE-exclusive file created in a core-pinned thread and I allow
>> only that thread any kind of access to this vma. Both the filehandle and the
>> mmaped pointer are kept on the thread stack and have no access from outside.
>>
>> So the all point of this flag is the kernel driver telling mm that this
>> address is enforced to only be accessed from one core-pinned thread.
>
> But there are no provisions for probhiting accesses from other cores?
>
> This means that a casual accidental write from a thread executing on
> another core can lead to arbitrary memory corruption because the cache
> flushing has been bypassed.
>

No, this is not accurate. A "casual accidental write" will not do any harm.
Only a well-orchestrated malicious server can exploit this. A different thread
on a different core would need to read from the exact pointer during the
narrow window while the IO is in flight, so that it faults in a TLB entry
while the mapping is still valid. Then later, after the IO has ended and
before any of the threads were scheduled out, it would have to maliciously
write. All this while the App has freed its buffers and the buffer has been
reused for something else.
Please bear in mind that this is only possible as root, in an /sbin/
executable signed by the Kernel's key. I think that anyone who has gained
such access to the system (i.e. compiled and installed an /sbin server) can
just walk in the front door. He does not need to exploit this narrow random
hole. Hell, he can easily just modprobe a Kernel module.

And I do not understand. Everyone seems motivated to say "no, this cannot be
solved". So let's start from the beginning.

How can we implement "Private memory"?

You know how, in the fork days, we had APIs for "shared memory"?

I.e. all read/write memory defaults to private, except for specially set up
"shared memory".
This is in contrast to threads, where all memory regions are shared.
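
To make the fork-era model concrete, here is a minimal userspace sketch,
assuming Linux/POSIX. The helper name fork_private_vs_shared() is made up
for illustration and is not part of the patch. It only shows that an
ordinary variable stays private across fork() while a MAP_SHARED mapping
does not.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper: fork a child that writes to one ordinary
 * variable and to one MAP_SHARED word, then report what the parent sees.
 * Returns 0 on success, -1 on failure.
 */
int fork_private_vs_shared(int *private_seen, int *shared_seen)
{
	int private_val = 1;	/* ordinary memory: copy-on-write after fork() */
	int *shared_val;
	pid_t pid;
	int status;

	/* The "special setup": an anonymous mapping explicitly marked shared. */
	shared_val = mmap(NULL, sizeof(*shared_val), PROT_READ | PROT_WRITE,
			  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (shared_val == MAP_FAILED)
		return -1;
	*shared_val = 1;

	pid = fork();
	if (pid < 0) {
		munmap(shared_val, sizeof(*shared_val));
		return -1;
	}
	if (pid == 0) {
		/* Child: write to both locations, then exit immediately. */
		private_val = 2;
		*shared_val = 2;
		_exit(0);
	}

	if (waitpid(pid, &status, 0) < 0) {
		munmap(shared_val, sizeof(*shared_val));
		return -1;
	}

	*private_seen = private_val;	/* parent's copy is untouched */
	*shared_seen = *shared_val;	/* child's write is visible */
	munmap(shared_val, sizeof(*shared_val));
	return 0;
}
```

With threads instead of fork() both writes would be visible to everyone,
which is exactly the default I am asking how to invert.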

[Q] How can we implement a "private memory" region?
I.e. all read/write memory defaults to shared, except for specially set up
"private memory".

Can this be done? If so, how? Please advise.

Thanks
Boaz