Re: [kvmtool test PATCH 22/24] kvmtool: arm64: Add support for guest physical address size

From: Suzuki K Poulose
Date: Thu Jul 05 2018 - 10:12:59 EST


On 05/07/18 14:46, Auger Eric wrote:
Hi Marc,

On 07/05/2018 03:20 PM, Marc Zyngier wrote:
On 05/07/18 13:47, Julien Grall wrote:
Hi Will,

On 04/07/18 16:52, Will Deacon wrote:
On Wed, Jul 04, 2018 at 04:00:11PM +0100, Julien Grall wrote:
On 04/07/18 15:09, Will Deacon wrote:
On Fri, Jun 29, 2018 at 12:15:42PM +0100, Suzuki K Poulose wrote:
Add an option to specify the physical address size used by this
VM.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
---
arm/aarch64/include/kvm/kvm-config-arch.h | 5 ++++-
arm/include/arm-common/kvm-config-arch.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h
index 04be43d..dabd22c 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -8,7 +8,10 @@
"Create PMUv3 device"), \
OPT_U64('\0', "kaslr-seed", &(cfg)->kaslr_seed, \
"Specify random seed for Kernel Address Space " \
- "Layout Randomization (KASLR)"),
+ "Layout Randomization (KASLR)"), \
+ OPT_INTEGER('\0', "phys-shift", &(cfg)->phys_shift, \
+ "Specify maximum physical address size (not " \
+ "the amount of memory)"),

Given that this is a shift value, I think the help message could be more
informative. Something like:

"Specify maximum number of bits in a guest physical address"

I think I'd actually leave out any mention of memory, because this does
actually have an effect on the amount of addressable memory in a way that I
don't think we want to describe in half of a usage message line :)
Is there any particular reasons to expose this option to the user?

I have recently sent a series to allow the user to specify the position
of the RAM [1]. With that series in mind, I think the user would not really
need to specify the maximum physical shift. Instead we could automatically
find it.

Marc makes a good point that it doesn't help for MMIO regions, so I'm trying
to understand whether we can do something differently there and avoid
sacrificing the type parameter.

I am not sure to understand this. kvmtools knows the memory layout
(including MMIOs) of the guest, so couldn't it guess the maximum
physical shift for that?

That's exactly what Will was trying to avoid, by having KVM to compute
the size of the IPA space based on the registered memslots. We've now
established that it doesn't work, so what we need to define is:

- whether we need another ioctl(), or do we carry on piggy-backing on
the CPU type,
kvm type I guess

machine type is more appropriate, going by the existing users.

- assuming the latter, whether we can reduce the number of bits used in
the ioctl parameter by subtly encoding the IPA size.
Getting benefit from your Freudian slip, how should guest CPU PARange
and maximum number of bits in a guest physical address relate?

My understanding is they are not correlated at the moment and our guest
PARange is fixed at the moment. But shouldn't they?

On Intel there is
qemu-system-x86_64 -M pc,accel=kvm -cpu SandyBridge,phys-bits=36
or
qemu-system-x86_64 -M pc,accel=kvm -cpu SandyBridge,host-phys-bits=true

where phys-bits, as far as I understand has a a similar semantics as the
PARange.


AFAIT, PARange tells you the maximum (I)Physcial Address that can be handled
by the CPU. But your IPA limit tells you where the guest RAM is placed.
So they need not be the same. e.g, on Juno, A57's have a PARange of 42 if I am
not wrong (but definitely > 40), while A53's have it at 40 and the system RAM
is at 40bits.

So, if we were to only use the A57s on Juno, we could run a KVM instance with 42
bits IPA or anything lower. So, PARange can be inferred as the maximum limit
of the CPU's capability while the IPA is where the RAM is placed for a given
system.
One could keep them in sync for a VM by emulating, but then nobody
uses the PARange, except the KVM. The other problem with capping PARange in the VM
to IPA is restricting the IPA size of a nested VM. So, I don't think this is
really beneficial.

Cheers
Suzuki



Thanks

Eric

Thanks,

M.