On Sat, Aug 12, 2017 at 11:56 PM, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
On 11-08-17 23:23, Arnd Bergmann wrote:
On Fri, Aug 11, 2017 at 3:23 PM, Hans de Goede <hdegoede@xxxxxxxxxx>
wrote:
Most of the issues I would normally comment on are already moot based
on the assumption that we won't be able to make substantial changes to
either the user space portion or the hypervisor side.
+/**
+ * Inflate the balloon by one chunk.
+ *
+ * The caller owns the balloon mutex.
+ *
+ * @returns VBox status code
+ * @param gdev The Guest extension device.
+ * @param chunk_idx Index of the chunk.
+ */
+static int vbg_balloon_inflate(PVBOXGUESTDEVEXT gdev, u32 chunk_idx)
+{
+ VMMDevChangeMemBalloon *req = gdev->mem_balloon.change_req;
+ struct page **pages;
+ int i, rc;
+
+ pages = kmalloc(sizeof(*pages) *
VMMDEV_MEMORY_BALLOON_CHUNK_PAGES,
+ GFP_KERNEL | __GFP_NOWARN);
+ if (!pages)
+ return VERR_NO_MEMORY;
+
+ req->header.size = sizeof(*req);
+ req->inflate = true;
+ req->pages = VMMDEV_MEMORY_BALLOON_CHUNK_PAGES;
The balloon section seems to be rather simplistic with its ioctl
interface.
Ideally this should use CONFIG_BALLOON_COMPACTION and an
oom notifier like the virtio_balloon driver does. Yes, I realize that only
one of the six (or more) balloon drivers we have in the kernel does it ;-)
The way the balloon code works is that the baloon-size is fully controlled
by the host, all the guest-additions do is wait for an event indicating that
the host wants to change it and then call this ioctl which will get the
desired balloon size from the host and tries to adjust the size accordingly.
Hooking into the oom killer is not really useful because there is no way we
can indicate to the host we are running low on mem and I don't think that
trying to shrink the balloon to be smaller then the host requested size
is a good idea.
Are you sure the guest can't just claim the memory back by using it?
Usually that is how the balloon drivers work: the host asks the guest
to free up some memory, and the guest frees up memory that it
tells to the host, but if the guest runs out of memory, it just starts using
it again and the host faults in empty pages.
For CONFIG_BALLOON_COMPACTION, you don't even need that,
you just put some memory into the balloon before you take some
other page out that is required for compaction. This might be less
of an issue here when the balloon always operates on 1MB chunks.
+ */
+ if (cbData <= sizeof(au64Buf) &&
+ VBOXGUEST_IOCTL_STRIP_SIZE(uCmd) !=
+ VBOXGUEST_IOCTL_STRIP_SIZE(VBOXGUEST_IOCTL_VMMREQUEST(0))) {
+ pvBufFree = NULL;
+ pvBuf = &au64Buf[0];
+ } else {
+ /* __GFP_DMA32 for VBOXGUEST_IOCTL_VMMREQUEST */
+ pvBufFree = pvBuf = kmalloc(cbData, GFP_KERNEL |
__GFP_DMA32);
+ if (!pvBuf)
+ return -ENOMEM;
+ }
+ if (copy_from_user(pvBuf, (void *)ulArg, cbData) == 0) {
I'd also change the commands
to not always do both read and write, but only whichever applies. This
function
would then do the copy_from_user/copy_to_user conditionally.
This is actually an upstream TODO item I believe. One which would
probably be best fixed by using _IOR/_IOW/_IORW IOCTL macros, but
that needs to be coordinated with upstream.
I don't see the difference between changing the command codes andIt would be good to clean this up and always use the same structure here.
The HGCMFunctionParameter struct describes function-parameters
in a Host-Guest-Control-Method function call, so this is part of the
host ABI interface and we cannot change this.
Any 32 bit apps running on a 64 bit kernel will be using the 32 bit
version of this struct and we need to translate. >
changing the structure layout here. In both cases, that is an incompatible
ABI change but shouldn't much impact the source-level ABI if you do
it right.
+ * The layout of VMMDEV RAM region that contains information for guest.
+ */
+typedef struct VMMDevMemory {
+ /** The size of this structure. */
+ u32 u32Size;
+ /** The structure version. (VMMDEV_MEMORY_VERSION) */
+ u32 u32Version;
+
+ union {
+ struct {
+ /** Flag telling that VMMDev has events pending.
*/
+ bool fHaveEvents;
+ } V1_04;
+
As this is a hardware interface, maybe use u32 instead of 'bool'.
I guess that should work here, since we only every do "if (fHaveEvents)"
but in other cases where we also set bool variables in the hardware
interface structs we also need to know which bit(s) get set on true
to move this over to non bool use.
I agree that the choice of bool in a (virtual) hardware interface
is unfortunate, but we should not be trying to change the (virtual)
hardware definition IMHO.
According to the x86 psABI (https://www.uclibc.org/docs/psABI-x86_64.pdf),
a _Bool should be represented in memory like an 'unsigned char' but only
contain the values 0 or 1 (the upper 7 bits are zero).
Before linux-2.6.18, we defined 'bool' as 'typedef enum { false, true}
__packed bool;', which would make it a 32-bit integer that can take the
values 0 and 1.
Inside of the union, the two have the same binary layout
on little-endian architectures, as the information is still stored in
the low bit of the first of four bytes.
I think if you do
union {
struct {
/** Flag telling that VMMDev has events pending. */
u8 fHaveEvents;
u8 pad[3];
} V1_04;
that will be a portable representation if the existing ABI.
+#define VBOXGUEST_IOCTL_CODE(function, size) \
+ VBOXGUEST_IOCTL_CODE_((function) | VBOXGUEST_IOCTL_FLAG, size)
+/* Define 32 bit codes to support 32 bit applications in 64 bit guest
driver. */
+#define VBOXGUEST_IOCTL_CODE_32(function, size) \
+VBOXGUEST_IOCTL_CODE_(function, size)
If the command numbers can be changed wihtout causing too many
problems, I'd just do away with these wrappers and use _IOR/_IOW/_IORW
as needed.
The upstream vboxguest.h header defines VBOXGUEST_IOCTL_CODE for many
different guest platforms. I guess not all of them have the
_IOR/_IOW/_IORW flags embedded in the ioctl-code concept and I believe
some of them only have a small amount of bits which can actually be
used for the code, so we cannot just cram these flags in there.
Do you have a specific example? I think the BSDs and derived operating
systems are actually much stricter with those flags than Linux is, and the
macros I suggested originate in BSD.
I think that instead we need an ioctl_direction_flags table which can
be used to determine if we need todo the copy_from_user and friends,
without changing the codes.
I would not bother with that. Either we fix the command codes to correctly
reflect the direction, or we should actually copy the data both ways as the
current implementation does.