Re: xen_exit_mmap() questions
From: Andy Lutomirski
Date: Thu Apr 27 2017 - 12:46:48 EST
On Thu, Apr 27, 2017 at 6:21 AM, Boris Ostrovsky
<boris.ostrovsky@xxxxxxxxxx> wrote:
>
>>>>>
>>>>>> Also, this code in drop_other_mm_ref() looks dubious to me:
>>>>>>
>>>>>> /* If this cpu still has a stale cr3 reference, then make sure
>>>>>> it has been flushed. */
>>>>>> if (this_cpu_read(xen_current_cr3) == __pa(mm->pgd))
>>>>>> load_cr3(swapper_pg_dir);
>>>>>>
>>>>>> If cr3 hasn't been flushed to the hypervisor because we're in a lazy
>>>>>> mode, why would load_cr3() help? Shouldn't this be xen_mc_flush()
>>>>>> instead?
>>>>>
>>>>> load_cr3() actually ends with xen_mc_flush() by way of xen_write_cr3()
>>>>> -> xen_mc_issue().
>>>>
>>>> xen_mc_issue() does:
>>>>
>>>> if ((paravirt_get_lazy_mode() & mode) == 0)
>>>> xen_mc_flush();
>>>>
>>>> I assume the load_cr3() is intended to deal with the case where we're
>>>> in lazy mode, but we'll still be in lazy mode, right? Or does it
>>>> serve some other purpose?
>>>
>>> Of course. I can't read (I ignored the "== 0" part).
>>>
>>> Apparently the early version had an explicit flush but then it disappeared
>>> (commit 9f79991d4186089e228274196413572cc000143b).
>>>
>>> The point of CR3 loading here, I believe, is to make sure the hypervisor
>>> knows that the (v)CPU is no longer using the mm's cr3 (we are loading
>>> swapper_pg_dir here).
>> But that's what leave_mm() does. To be fair, the x86 lazy TLB
>> management is a big mess, and this came up because I'm trying to clean
>> it up without removing it.
>
> True. I don't know though if you can guarantee that leave_mm() (or
> load_cr3() inside it) is actually called if we are in lazy mode.
The code just before that makes these calls.
Anyway, I propose to rewrite the whole thing like this:
https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/tlbflush_cleanup&id=ff143a54bb3bafaaad6e32145a9cfbc112e8584f
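For anyone following the xen_mc_issue() exchange quoted above, here is a
toy user-space model of why load_cr3() does not force a flush while we
are in lazy mode. It is a sketch, not kernel code: every model_* name is
made up, and I am assuming that the cr3 write issues with the CPU lazy
mode bit, which should be checked against the real xen_write_cr3().

```c
/* Toy user-space model, NOT kernel code. All model_* names are hypothetical. */

/* Shaped like the paravirt lazy modes; the values are bit-tested below. */
enum model_lazy_mode {
	MODEL_LAZY_NONE = 0,
	MODEL_LAZY_MMU  = 1,
	MODEL_LAZY_CPU  = 2,
};

static enum model_lazy_mode current_lazy_mode = MODEL_LAZY_NONE;
static int pending_ops;	/* multicall entries queued but not yet sent */
static int flush_count;	/* how many times the queue was flushed */

/* Stand-in for xen_mc_flush(): send everything that is queued. */
static void model_mc_flush(void)
{
	pending_ops = 0;
	flush_count++;
}

/* Stand-in for xen_mc_issue(): flush only if NOT in the given lazy mode. */
static void model_mc_issue(unsigned int mode)
{
	if (((unsigned int)current_lazy_mode & mode) == 0)
		model_mc_flush();
}

/* Stand-in for the cr3 write: queue one op, then "issue" it.
 * Assumption: the real code issues with the CPU lazy mode bit. */
static void model_write_cr3(void)
{
	pending_ops++;
	model_mc_issue(MODEL_LAZY_CPU);
}
```

With current_lazy_mode set to MODEL_LAZY_CPU, model_write_cr3() leaves the
op queued until something calls model_mc_flush() explicitly, which is the
behavior I was asking about.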
>
>>
>> I suppose I can try to keep xen_exit_mmap() working. Is there a
>> simple way to try to unpin but to not treat it as an error if the
>> hypervisor rejects it?
>
> Even if we managed to craft a call in Linux to do this (current
> xen_pgd_unpin() will result in a WARNing in xen_mc_flush()) this will
> still cause a bunch of warnings in the hypervisor (if it is built as
> DEBUG, but bad nevertheless).
>
> But even without that, it is an error for a reason so how are you
> planning to continue if you ignore it?
>
I was imagining that we'd just try to unpin and carry on if it fails.
We can always unpin later in xen_pgd_free().
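Roughly, the control flow I have in mind looks like the toy model below.
This is only a user-space sketch: the model_* names and the
"hypervisor_refuses" flag are invented for illustration, and it says
nothing about how such a tolerant unpin call would actually be built on
top of the existing multicall code.

```c
/* Toy user-space model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct model_pgd {
	bool pinned;			/* pgd is currently pinned */
	bool hypervisor_refuses;	/* assumption: stands in for whatever
					 * makes the hypervisor reject the unpin */
};

/* Try to unpin. Returns 0 on success, -1 if the request is rejected. */
static int model_try_unpin(struct model_pgd *pgd)
{
	if (!pgd->pinned)
		return 0;	/* nothing to do */
	if (pgd->hypervisor_refuses)
		return -1;	/* rejected; pgd stays pinned */
	pgd->pinned = false;
	return 0;
}

/* exit_mmap path: best effort, a failure here is tolerated. */
static void model_exit_mmap(struct model_pgd *pgd)
{
	(void)model_try_unpin(pgd);
}

/* pgd_free path: the last chance to unpin, so failure is a real error. */
static void model_pgd_free(struct model_pgd *pgd)
{
	int rc = model_try_unpin(pgd);

	assert(rc == 0);
	(void)rc;	/* keep the build quiet when NDEBUG is defined */
}
```

The point is only that the early attempt is allowed to fail quietly and
the later one is not.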
--Andy