Re: [PATCH v3] xen/balloon: add late_initcall_sync() for initial ballooning done

From: Juergen Gross
Date: Mon Nov 01 2021 - 03:21:15 EST


On 30.10.21 01:44, Boris Ostrovsky wrote:

On 10/29/21 6:18 PM, Marek Marczykowski-Górecki wrote:
On Fri, Oct 29, 2021 at 05:46:18PM -0400, Boris Ostrovsky wrote:
On 10/29/21 10:20 AM, Juergen Gross wrote:
--- a/Documentation/ABI/stable/sysfs-devices-system-xen_memory
+++ b/Documentation/ABI/stable/sysfs-devices-system-xen_memory
@@ -84,3 +84,13 @@ Description:
           Control scrubbing pages before returning them to Xen for others domains
           use. Can be set with xen_scrub_pages cmdline
           parameter. Default value controlled with CONFIG_XEN_SCRUB_PAGES_DEFAULT.
+
+What:        /sys/devices/system/xen_memory/xen_memory0/boot_timeout
+Date:        November 2021
+KernelVersion:    5.16
+Contact:    xen-devel@xxxxxxxxxxxxxxxxxxxx
+Description:
+        The time (in seconds) to wait before giving up to boot in case
+        initial ballooning fails to free enough memory. Applies only
+        when running as HVM or PVH guest and started with less memory
+        configured than allowed at max.

How is this going to be used? We only need this during boot.


-        state = update_schedule(state);
+        balloon_state = update_schedule(balloon_state);

Now that balloon_state has whole file scope it can probably be updated inside update_schedule().
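
To illustrate the suggestion, here is a minimal user-space sketch of update_schedule() modifying the file-scope variable directly instead of taking and returning the state. It is not the real balloon.c code: the retry bookkeeping is simplified, and retry_count and max_retry_count (with its limit of 3) are stand-ins assumed for this example.

```c
/* Simplified stand-ins for the kernel definitions; not the real balloon.c. */
enum bp_state { BP_DONE, BP_WAIT, BP_EAGAIN, BP_ECANCELED };

static enum bp_state balloon_state = BP_DONE;	/* file scope, as in the patch */
static unsigned int retry_count = 1;		/* assumed bookkeeping */
static unsigned int max_retry_count = 3;	/* hypothetical limit */

/* Updates balloon_state in place; callers no longer assign the result. */
static void update_schedule(void)
{
	/* Nothing to reschedule in these states. */
	if (balloon_state == BP_WAIT || balloon_state == BP_ECANCELED)
		return;

	/* Success resets the retry bookkeeping. */
	if (balloon_state == BP_DONE) {
		retry_count = 1;
		return;
	}

	/* BP_EAGAIN: count the retry and cancel once past the limit. */
	if (++retry_count > max_retry_count) {
		retry_count = 1;
		balloon_state = BP_ECANCELED;
	}
}
```

With this shape the call site shrinks to a plain update_schedule(); statement.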


+    while ((credit = current_credit()) < 0) {
+        if (credit != last_credit) {
+            last_changed = jiffies;
+            last_credit = credit;
+        }
+        if (balloon_state == BP_ECANCELED) {

What about other states? We are really waiting for BP_DONE, aren't we?

BP_DONE is also set as an intermediate step:

                        balloon_state = decrease_reservation(n_pages, GFP_BALLOON);
                        if (balloon_state == BP_DONE && n_pages != -credit &&
                             n_pages < totalreserve_pages)
                                balloon_state = BP_EAGAIN;

It would be bad to finish waiting in this case.
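
The quoted condition can be modeled in isolation as below. This is only a user-space sketch: after_decrease() is a hypothetical helper name, and totalreserve_pages is passed as a parameter here rather than read from the kernel global.

```c
enum bp_state { BP_DONE, BP_WAIT, BP_EAGAIN, BP_ECANCELED };

/*
 * Hypothetical helper mirroring the quoted check: a BP_DONE from a
 * partial decrease is turned into BP_EAGAIN, so seeing BP_DONE right
 * after decrease_reservation() does not mean the target was reached.
 */
static enum bp_state after_decrease(enum bp_state state, long n_pages,
				    long credit, long totalreserve_pages)
{
	if (state == BP_DONE && n_pages != -credit &&
	    n_pages < totalreserve_pages)
		return BP_EAGAIN;

	return state;
}
```

For example, releasing 10 pages while still owing 100 (credit == -100) with a reserve of 50 pages yields BP_EAGAIN, whereas releasing all 100 pages in one step keeps BP_DONE.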


Right, but if we were to say 'if (balloon_state != BP_DONE)', the worst that can happen is that we continue on to the next iteration without warning and/or panicking. Of course, there is a chance that on the next iteration the same thing will happen, but I think the chances of hitting this race every time are vanishingly small. We can also check current_credit() again.


The question is whether we want to continue waiting if we are in BP_EAGAIN. I don't think BP_WAIT is possible in this case, although this may change in the future and we will forget to update this code.

BP_EAGAIN should not stop waiting, as it might be intermediate in case
some caches or buffers are freed.
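
A rough user-space sketch of the resulting wait policy, for illustration only: wait_step(), enum wait_verdict, and the idle/timeout parameters are hypothetical names, and the timeout comparison is an assumption based on the boot_timeout description, since that part of the loop is not quoted above.

```c
enum bp_state { BP_DONE, BP_WAIT, BP_EAGAIN, BP_ECANCELED };

enum wait_verdict { WAIT_FINISHED, WAIT_CONTINUE, WAIT_GIVE_UP };

/*
 * Hypothetical helper: decide what the boot-time wait loop does next.
 * credit < 0 means the guest still has to release pages.
 * idle_secs is the time since the credit last changed (assumed to be
 * derived from last_changed in the real loop).
 */
static enum wait_verdict wait_step(enum bp_state state, long credit,
				   unsigned long idle_secs,
				   unsigned long timeout_secs)
{
	if (credit >= 0)
		return WAIT_FINISHED;	/* target reached */

	if (state == BP_ECANCELED)
		return WAIT_GIVE_UP;	/* balloon worker gave up */

	if (idle_secs > timeout_secs)
		return WAIT_GIVE_UP;	/* no progress for too long (assumed) */

	/* BP_DONE, BP_WAIT and BP_EAGAIN may all be intermediate. */
	return WAIT_CONTINUE;
}
```

Only BP_ECANCELED ends the wait early; BP_EAGAIN keeps waiting because the credit may still improve once caches or buffers are freed.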


Juergen
