Re: [PATCH v2 5/7] KVM-GST: KVM Steal time accounting

From: Glauber Costa
Date: Sun Jun 19 2011 - 22:39:01 EST


On 06/19/2011 07:04 AM, Avi Kivity wrote:
On 06/17/2011 01:20 AM, Glauber Costa wrote:
This patch accounts steal time time in kernel/sched.
I kept it from last proposal, because I still see advantages
in it: Doing it here will give us easier access from scheduler
variables such as the cpu rq. The next patch shows an example of
usage for it.

Since functions like account_idle_time() can be called from
multiple places, not only account_process_tick(), steal time
grabbing is repeated in each account function separatedely.

/*
+ * We have to at flush steal time information every time something else
+ * is accounted. Since the accounting functions are all visible to
the rest
+ * of the kernel, it gets tricky to do them in one place. This helper
function
+ * helps us.
+ *
+ * When the system is idle, the concept of steal time does not apply.
We just
+ * tell the underlying hypervisor that we grabbed the data, but skip
steal time
+ * accounting
+ */
+static inline bool touch_steal_time(int is_idle)
+{
+ u64 steal, st = 0;
+
+ if (static_branch(&paravirt_steal_enabled)) {
+
+ steal = paravirt_steal_clock(smp_processor_id());
+
+ steal -= this_rq()->prev_steal_time;
+ if (is_idle) {
+ this_rq()->prev_steal_time += steal;
+ return false;
+ }
+
+ while (steal>= TICK_NSEC) {
+ /*
+ * Inline assembly required to prevent the compiler
+ * optimising this loop into a divmod call.
+ * See __iter_div_u64_rem() for another example of this.
+ */

Why not use said function?

because here we want to do work during each loop. The said function
would have to be adapted for that, possibly using a macro, to run arbitrary code during each loop iteration, in a way that I don't think
it is worthy given the current number of callers (2 counting this new one)

+ asm("" : "+rm" (steal));
+
+ steal -= TICK_NSEC;
+ this_rq()->prev_steal_time += TICK_NSEC;
+ st++;

Suppose a live migration or SIGSTOP causes lots of steal time. How long
will we spend here?
Silly me. I actually used this same argument with Peter to cap it with "delta" in the next patch in this series. So I think you are 100 % right. Here, however, we do want to account all that time, I believe.

How about we do a slow division if we're > 10 sec (unlikely), and account everything as steal time in this scenario ?

+ }
+
+ account_steal_time(st);
+ return !!st;

!! !needed, you're returning a bool.

ah, sure thing.

+ }
+ return false;
+}
+

I'll need Peter's (or another sched maintainer's) review to apply this.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/