Re: [Xen-devel] [PATCH] xen: add steal_clock support on x86

From: Tony S
Date: Wed May 18 2016 - 14:00:12 EST


On Wed, May 18, 2016 at 11:20 AM, Boris Ostrovsky
<boris.ostrovsky@xxxxxxxxxx> wrote:
> On 05/18/2016 12:10 PM, Dario Faggioli wrote:
>> On Wed, 2016-05-18 at 16:53 +0200, Juergen Gross wrote:
>>> On 18/05/16 16:46, Boris Ostrovsky wrote:
>>>>
>>>> Won't we be accounting for stolen cycles twice now --- once from
>>>> steal_account_process_tick()->steal_clock() and second time from
>>>> do_stolen_accounting()?
>>> Uuh, yes.
>>>
>>> I guess I should rip do_stolen_accounting() out, too? It is a
>>> Xen-specific hack, so I guess nobody will cry. Maybe it would be a
>>> good idea to select CONFIG_PARAVIRT_TIME_ACCOUNTING for XEN then?
>>>
>> So, config options aside, if I understand this correctly, it looks like
>> we were actually already doing steal time accounting, although in a
>> non-standard way.
>>
>> And yet, people seem to have issues relating to lack of (proper?) steal
>> time accounting (Cc-ing Tony).
>>
>> I guess this means that, either:
>> - the issue being reported is actually not caused by the lack of
>> steal time accounting,
>> - our current (Xen specific) steal time accounting solution is flawed,
>> - the issue is caused by the lack of the bit of steal time accounting
>> that we do not support yet,
>
> I believe it's this one.
>
> Tony narrowed the problem down to update_curr() where vruntime is
> calculated, based on runqueue's clock_task value. That value is computed
> in update_rq_clock_task(), which needs paravirt_steal_rq_enabled.
>

Hi Boris,

You are right.

The real problem is that steal_clock in pv_time_ops is implemented for
KVM but not for Xen.

arch/x86/include/asm/paravirt_types.h
struct pv_time_ops {
	unsigned long long (*sched_clock)(void);
	unsigned long long (*steal_clock)(int cpu);
	unsigned long (*get_tsc_khz)(void);
};


(1) KVM implements both sched_clock and steal_clock.

arch/x86/kernel/kvmclock.c
pv_time_ops.sched_clock = kvm_clock_read;

arch/x86/kernel/kvm.c
pv_time_ops.steal_clock = kvm_steal_clock;


(2) Xen, however, only implements sched_clock; steal_clock is still
native_steal_clock(), which simply returns 0.

arch/x86/xen/time.c
.sched_clock = xen_clocksource_read;

arch/x86/kernel/paravirt.c
static u64 native_steal_clock(int cpu)
{
	return 0;
}


Therefore, even though update_rq_clock_task() does the calculation and
paravirt_steal_rq_enabled is set, the steal value it reads is always 0,
so no stolen time is ever subtracted from clock_task. This causes the
problem I mentioned.

update_rq_clock_task
  --> paravirt_steal_clock
    --> pv_time_ops.steal_clock
      --> native_steal_clock (if in Xen)
        --> 0
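
To make the chain above concrete, here is a small user-space model (not
kernel code). Every name ending in _model is made up for illustration,
and the arithmetic is my simplified reading of what
update_rq_clock_task() does with the steal value. With the native hook
installed, the steal delta is always 0, so clock_task advances by the
full delta.

```c
#include <stdint.h>

/* User-space model only; the *_model names are hypothetical. */
struct pv_time_ops_model {
	uint64_t (*steal_clock)(int cpu);
};

struct rq_model {
	int cpu;
	uint64_t clock_task;         /* time the scheduler charges to tasks */
	uint64_t prev_steal_time_rq; /* steal time already subtracted */
};

/* Stand-in for the native hook: it never reports any stolen time. */
static uint64_t native_steal_clock_model(int cpu)
{
	(void)cpu;
	return 0;
}

static struct pv_time_ops_model pv_ops_model = {
	.steal_clock = native_steal_clock_model,
};

/* Advance clock_task by delta minus the steal accrued since last call. */
static void update_rq_clock_task_model(struct rq_model *rq, uint64_t delta)
{
	uint64_t steal = pv_ops_model.steal_clock(rq->cpu) -
			 rq->prev_steal_time_rq;

	if (steal > delta) /* never subtract more than elapsed time */
		steal = delta;

	rq->prev_steal_time_rq += steal;
	rq->clock_task += delta - steal;
}
```

Calling update_rq_clock_task_model(&rq, 1000) on a zeroed rq_model
leaves clock_task at 1000 and prev_steal_time_rq at 0, no matter how
long the vCPU was actually descheduled.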

The fundamental solution is to implement a steal_clock in Xen (following
the KVM implementation) instead of using the native one.
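
A rough user-space sketch of the idea, not a proposed patch. It assumes
the hypervisor exposes per-vCPU time-in-state counters and that time
spent runnable (waiting for a physical CPU) or offline counts as stolen.
All names below are hypothetical, and which states to count is a policy
choice I am assuming here.

```c
#include <stdint.h>

/* Hypothetical per-vCPU states; counters are in nanoseconds. */
enum vcpu_state_model {
	ST_RUNNING,
	ST_RUNNABLE, /* wanted to run but had no physical CPU */
	ST_BLOCKED,
	ST_OFFLINE,
	ST_COUNT
};

struct vcpu_times_model {
	uint64_t time[ST_COUNT];
};

#define NR_CPUS_MODEL 4
static struct vcpu_times_model vcpu_times[NR_CPUS_MODEL];

struct time_ops_model {
	uint64_t (*steal_clock)(int cpu);
};

/* Default hook, equivalent in effect to a native hook returning 0. */
static uint64_t zero_steal_clock(int cpu)
{
	(void)cpu;
	return 0;
}

/* Assumption: runnable + offline time is what was "stolen". */
static uint64_t hv_steal_clock_model(int cpu)
{
	const struct vcpu_times_model *t = &vcpu_times[cpu];

	return t->time[ST_RUNNABLE] + t->time[ST_OFFLINE];
}

static struct time_ops_model time_ops = {
	.steal_clock = zero_steal_clock,
};

/* Replace the default hook with the hypervisor-backed one. */
static void install_hv_steal_clock(void)
{
	time_ops.steal_clock = hv_steal_clock_model;
}
```

Once install_hv_steal_clock() has run, callers that go through
time_ops.steal_clock() see a cumulative, non-zero value and can subtract
its delta from the task clock.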

Tony

> -boris
>
>> - other ideas? Tony?
>>
>> Dario
>
>