Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots
From: Paolo Bonzini
Date: Mon Sep 02 2013 - 11:56:27 EST
On 02/09/2013 12:07, Gleb Natapov wrote:
> On Mon, Sep 02, 2013 at 06:00:39PM +0800, Xiao Guangrong wrote:
>> On 09/02/2013 05:25 PM, Gleb Natapov wrote:
>>> On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
>>>> On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
>>>>> Page tables in a read-only memory slot will currently cause a triple
>>>>> fault because the page walker uses gfn_to_hva and it fails on such a slot.
>>>>>
>>>>> OVMF uses such a page table; however, real hardware seems to be fine with
>>>>> that as long as the accessed/dirty bits are set. Save whether the slot
>>>>> is readonly, and later check it when updating the accessed and dirty bits.
>>>>
>>>> Paolo, do you know why OVMF is using readonly memory like this?
>>>>
>>> Just a guess, but perhaps they want to move to paging mode as early as
>>> possible even before memory controller is fully initialized.
>>>
>>>> AFAIK, the fault triggered by this kind of access can hardly be fixed by
>>>> userspace since the fault is triggered by pagetable walking, not by the current
>>>> instruction. Do you have any idea how to let userspace emulate it properly?
>>> Not sure what userspace you mean here, but there shouldn't be a fault in the
>>
>> I just wonder how to fix this kind of fault. The current patch returns -EACCES
>> but that will crash the guest. I think we'd better let userspace fix this
>> error (let userspace set the D/A bit.)
>>
> Ugh, this is not good. Missed that. Don't know what real HW will do
> here, but the easy thing for us to do would be to just return success.

Real hardware would just do a memory write. What happens depends on
what is on the bus, i.e. on what the ROM is used for.

QEMU uses read-only slots for two things: actual read-only memory where
writes go to the bitbucket, and "ROMD" memory where writes are treated
as MMIO.

So, in the first case we would ignore the write. In the second we would
do an MMIO exit to userspace. But ignoring the write isn't always
correct, and doing an MMIO exit is complicated, so I would just kill the
guest.
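
To make the options concrete, here is a rough sketch of the two policies for a page-walker write that lands in a read-only slot. All names (ro_slot_kind, ro_write_action, accurate_action, simple_action) are made up for illustration; none of this is existing KVM or QEMU code.

```c
/* Hypothetical: the two ways QEMU uses a read-only slot. */
enum ro_slot_kind {
    RO_SLOT_ROM,   /* plain ROM: writes go to the bitbucket */
    RO_SLOT_ROMD   /* ROMD: writes are treated as MMIO */
};

/* Hypothetical: what to do with an A/D-bit write into such a slot. */
enum ro_write_action {
    RO_WRITE_IGNORE,     /* drop the write, let the walk continue */
    RO_WRITE_MMIO_EXIT,  /* hand the write to userspace */
    RO_WRITE_KILL_GUEST  /* give up on the guest */
};

/* The "accurate" policy: mimic what would happen on the bus. */
enum ro_write_action accurate_action(enum ro_slot_kind kind)
{
    return kind == RO_SLOT_ROM ? RO_WRITE_IGNORE : RO_WRITE_MMIO_EXIT;
}

/* The simple policy: do not try to tell the two cases apart. */
enum ro_write_action simple_action(enum ro_slot_kind kind)
{
    (void)kind; /* deliberately unused */
    return RO_WRITE_KILL_GUEST;
}
```

The simple variant avoids having to tell the two kinds of slot apart at all, which is the reason for preferring it here.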

EPT will probably write to the read-only slots without thinking much
about it.

My patch injects a page fault, which is very likely to escalate to a
triple fault. This is probably never what you want; on the other hand,
I wasn't sure what level of accuracy we want in this case, given that
EPT does it wrong too.
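
For reference, the check described in the patch description (remember whether the slot is read-only, and look at that flag when updating the accessed and dirty bits) amounts to roughly the following. This is a simplified standalone sketch, not the actual walker code: update_ad_bits and walk_result are hypothetical names, and it ignores that the dirty bit only exists in the final-level entry.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-table entry flag bits. */
#define PTE_ACCESSED (1ull << 5)
#define PTE_DIRTY    (1ull << 6)

/* Hypothetical result of one A/D update step in the walk. */
enum walk_result { WALK_OK, WALK_FAULT };

/*
 * Hypothetical helper: set the accessed bit (and the dirty bit for a
 * write access) in a guest PTE, unless the PTE lives in a read-only slot.
 */
enum walk_result update_ad_bits(uint64_t *pte, bool slot_readonly,
                                bool write_access)
{
    uint64_t wanted = PTE_ACCESSED | (write_access ? PTE_DIRTY : 0);

    /* Bits already set (the OVMF case): nothing to write, walk is fine. */
    if ((*pte & wanted) == wanted)
        return WALK_OK;

    /* A write is needed but the slot is read-only: report a fault. */
    if (slot_readonly)
        return WALK_FAULT;

    *pte |= wanted;
    return WALK_OK;
}
```

With page tables that already have both bits set, the first branch is always taken and the read-only flag never matters.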
Paolo
>>> first place if ROM page tables have access/dirty bit set and they do.
>>
>> Yes, so we cannot call x86_emulate_instruction() to fix this fault (that function
>> emulates the access in the first place). Do we need to directly return an MMIO exit to
>> userspace when we hit this fault? What happens if this fault on pagetable walking
>> is triggered in x86_emulate_instruction()?
> I think we should not return MMIO-exit to userspace. Either ignore write attempt
> or kill a guest.
>
> --
> Gleb.
>