Re: [PATCH v3] x86: svm: use kvm_fast_pio_in()

From: Paolo Bonzini
Date: Tue Apr 07 2015 - 08:56:07 EST




On 03/03/2015 21:42, Radim Krčmář wrote:
> 2015-03-03 13:48-0600, Joel Schopp:
>>>> + unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
>>> Shouldn't we handle writes to EAX differently than writes to AX and AL,
>>> because of implicit zero extension?
>> I don't think the implicit zero extension hurts us here, but maybe there
>> is something I'm missing that I need to understand. Could you explain this
>> further?
>
> According to APM vol.2, 2.5.3 Operands and Results, when using EAX,
> we should zero upper 32 bits of RAX:
>
> Zero Extension of Results. In 64-bit mode, when performing 32-bit
> operations with a GPR destination, the processor zero-extends the 32-bit
> result into the full 64-bit destination. Both 8-bit and 16-bit
> operations on GPRs preserve all unwritten upper bits of the destination
> GPR. This is consistent with legacy 16-bit and 32-bit semantics for
> partial-width results.
>
> Is IN not covered?

It is. You need to zero the upper 32 bits.
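
To spell out the three cases, here is a rough sketch only. The helper
name merge_in_result() is made up for illustration and is not part of
the patch or of KVM; size is the IN operand size in bytes.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch: merge the result of an IN
 * instruction into the old RAX value. size is the operand size in
 * bytes (1 for AL, 2 for AX, 4 for EAX).
 */
uint64_t merge_in_result(uint64_t old_rax, uint32_t data, unsigned int size)
{
	switch (size) {
	case 1:
		/* AL: bits 63:8 of RAX are preserved. */
		return (old_rax & ~0xffULL) | (data & 0xff);
	case 2:
		/* AX: bits 63:16 of RAX are preserved. */
		return (old_rax & ~0xffffULL) | (data & 0xffff);
	case 4:
		/* EAX: the result is zero-extended, bits 63:32 are cleared. */
		return data;
	default:
		/* IN has no other operand sizes; leave RAX untouched. */
		return old_rax;
	}
}
```

With old_rax = 0x1122334455667788, a 1-byte read of 0xAB keeps the
upper seven bytes, while a 4-byte read of 0xDEADBEEF clears the upper
half of RAX.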

>>>> + BUG_ON(!vcpu->arch.pio.count);
>>>> + BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
>>> (Looking at it again, a check for 'vcpu->arch.pio.count == 1' would be
>>> sufficient.)
>> I prefer the checks that are there now after your last review,
>> especially since surrounded by BUG_ON they only run on debug kernels.
>
> BUG_ON is checked on essentially all kernels that run KVM.
> (All distribution-based configs should have it.)

Correct.

> If we wanted to validate the size, then this is strictly better:
> BUG_ON(vcpu->arch.pio.count != 1 || vcpu->arch.pio.size > sizeof(new_rax))

That would be a very weird assertion considering that
vcpu->arch.pio.size will architecturally be at most 4.

The first arm of the || is sufficient.

>>>> + memcpy(&new_rax, vcpu->arch.pio_data,
>>>> + vcpu->arch.pio.count * vcpu->arch.pio.size);
>>>> + trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
>>>> + vcpu->arch.pio.count, vcpu->arch.pio_data);
>>>> + kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>>>> + vcpu->arch.pio.count = 0;
>>> I think it is better to call emulator_pio_in_emulated directly, like
>>>
>>> emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
>>> vcpu->arch.pio.port, &new_rax, 1);
>>> kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>>>
>>> because we know that vcpu->arch.pio.count != 0.
>
> Pasting the same code creates bug opportunities when we forget to modify
> all places. This class of problems can be harder to deal with than (c)
> and (d), because we can't simply print all callers.

I agree with this and prefer calling emulator_pio_in_emulated in
complete_fast_pio_in, indeed.

>>> Refactoring could avoid the weird vcpu->ctxt->vcpu conversion.
>>> (A better name is always welcome.)

No need for that.

>> The pointer chasing is making me dizzy. I'm not sure why
>> emulator_pio_in_emulated takes a x86_emulate_ctxt when all it does is
>> immediately translate that to a vcpu and never use the x86_emulate_ctxt,
>> why not pass the vcpu in the first place?

Because the emulator is written to be usable outside the Linux kernel as
well.

Also, the fast path (used if kernel_pio returns 0) doesn't read
VCPU_REGS_RAX first, so val is used uninitialized here:

>>> + unsigned long val;
>>> + int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
>>> + port, &val, 1);
>>> +
>>> + if (ret)
>>> + kvm_register_write(vcpu, VCPU_REGS_RAX, val);
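
One possible shape for the fix, as a sketch under assumptions rather
than a tested patch: seed val with the old RAX before the read, so
the bytes the PIO path does not write are defined. fake_pio_in() and
fast_in_seeded() are hypothetical stand-ins, not KVM functions, and
the byte layout assumes a little-endian host, as on x86.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the device read: like the PIO path
 * discussed above, it writes only `size` bytes into the caller's
 * buffer. size must be 1, 2 or 4.
 */
void fake_pio_in(void *val, unsigned int size)
{
	static const uint8_t data[4] = { 0xef, 0xbe, 0xad, 0xde };

	memcpy(val, data, size);
}

/*
 * Seed val with the old RAX so that the bytes fake_pio_in() does not
 * write hold a defined value. Assumes a little-endian host.
 */
uint64_t fast_in_seeded(uint64_t old_rax, unsigned int size)
{
	uint64_t val = old_rax;

	fake_pio_in(&val, size);
	if (size == 4)
		val = (uint32_t)val;	/* EAX write: zero-extend */
	return val;
}
```

Without the initialisation, everything above the low `size` bytes of
val would be whatever happened to be on the stack, and that would end
up in the guest's RAX.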

Thanks,

Paolo