Re: [Xen-devel] [PATCH] xen: add steal_clock support on x86
From: Tony S
Date: Wed May 18 2016 - 14:04:16 EST
On Wed, May 18, 2016 at 11:59 AM, Tony S <suokunstar@xxxxxxxxx> wrote:
> On Wed, May 18, 2016 at 11:20 AM, Boris Ostrovsky
> <boris.ostrovsky@xxxxxxxxxx> wrote:
>> On 05/18/2016 12:10 PM, Dario Faggioli wrote:
>>> On Wed, 2016-05-18 at 16:53 +0200, Juergen Gross wrote:
>>>> On 18/05/16 16:46, Boris Ostrovsky wrote:
>>>>>
>>>>> Won't we be accounting for stolen cycles twice now --- once from
>>>>> steal_account_process_tick()->steal_clock() and second time from
>>>>> do_stolen_accounting()?
>>>> Uuh, yes.
>>>>
>>>> I guess I should rip do_stolen_accounting() out, too? It is a
>>>> Xen-specific hack, so I guess nobody will cry. Maybe it would be a
>>>> good idea to select CONFIG_PARAVIRT_TIME_ACCOUNTING for XEN then?
>>>>
>>> So, config options aside, if I understand this correctly, it looks like
>>> we were actually already doing steal time accounting, although in a
>>> non-standard way.
>>>
>>> And yet, people seem to have issues relating to lack of (proper?) steal
>>> time accounting (Cc-ing Tony).
>>>
>>> I guess this means that, either:
>>> - the issue being reported is actually not caused by the lack of
>>> steal time accounting,
>>> - our current (Xen specific) steal time accounting solution is flawed,
>>> - the issue is caused by the lack of the bit of steal time accounting
>>> that we do not support yet,
>>
>> I believe it's this one.
>>
>> Tony narrowed the problem down to update_curr() where vruntime is
>> calculated, based on runqueue's clock_task value. That value is computed
>> in update_rq_clock_task(), which needs paravirt_steal_rq_enabled.
>>
>
> Hi Boris,
>
> You are right.
>
> The real problem is that steal_clock in pv_time_ops is implemented for
> KVM but not for Xen.
>
> arch/x86/include/asm/paravirt_types.h
> struct pv_time_ops {
>         unsigned long long (*sched_clock)(void);
>         unsigned long long (*steal_clock)(int cpu);
>         unsigned long (*get_tsc_khz)(void);
> };
>
>
> (1) KVM implemented both sched_clock and steal_clock.
>
> arch/x86/kernel/kvmclock.c
> pv_time_ops.sched_clock = kvm_clock_read;
>
> arch/x86/kernel/kvm.c
> pv_time_ops.steal_clock = kvm_steal_clock;
>
>
> (2) However, Xen implemented only sched_clock, while steal_clock is
> still native_steal_clock(). The function native_steal_clock() simply
> returns 0.
>
> arch/x86/xen/time.c
> .sched_clock = xen_clocksource_read;
>
> arch/x86/kernel/paravirt.c
> static u64 native_steal_clock(int cpu)
> {
>         return 0;
> }
>
>
> Therefore, even though update_rq_clock_task() calculates the value and
> the paravirt_steal_rq_enabled option is enabled, the steal clock always
> returns 0. This causes the problem I mentioned.
>
> update_rq_clock_task
> --> paravirt_steal_clock
> --> pv_time_ops.steal_clock
> --> native_steal_clock (if in Xen)
> --> 0
>
> The fundamental solution is to implement a steal_clock in Xen (modeled
> on the KVM implementation) instead of using the native one.
>
> Tony
>
Also, I tried the latest long-term version, Linux 4.4, and this issue
still exists there. I hope the next version can include this patch.
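
To make the effect of the call chain quoted above concrete, here is a
rough user-space model. It is not kernel code and not the proposed
patch: every name with a _model suffix, the demo_stolen_ns table and
its values are made up for illustration, and the clock update is a
simplified version of what update_rq_clock_task() does with the steal
value.

```c
#include <stdint.h>

/* Model of the one pv_time_ops hook that matters here. */
struct pv_time_ops_model {
        uint64_t (*steal_clock)(int cpu);
};

/* Mirrors the native fallback quoted above: reports no stolen time. */
static uint64_t native_steal_clock_model(int cpu)
{
        (void)cpu;
        return 0;
}

/* Hypothetical per-CPU cumulative stolen time in ns (made-up values). */
static const uint64_t demo_stolen_ns[2] = { 3000000, 0 };

/* Hypothetical hypervisor-backed hook: returns the cumulative value. */
static uint64_t hv_steal_clock_model(int cpu)
{
        return (cpu >= 0 && cpu < 2) ? demo_stolen_ns[cpu] : 0;
}

/* Only the runqueue fields needed for the illustration. */
struct rq_model {
        int cpu;
        uint64_t clock_task;
        uint64_t prev_steal_time_rq;
};

/*
 * Simplified clock update: advance clock_task by delta minus the time
 * stolen since the previous update, clamping steal to delta.
 */
static void update_rq_clock_task_model(struct rq_model *rq,
                                       const struct pv_time_ops_model *ops,
                                       uint64_t delta)
{
        uint64_t steal = ops->steal_clock(rq->cpu) - rq->prev_steal_time_rq;

        if (steal > delta)
                steal = delta;
        rq->prev_steal_time_rq += steal;
        delta -= steal;
        rq->clock_task += delta;
}
```

With the native hook, a 10 ms delta advances clock_task by the full
10 ms, so the task is charged for time it never ran. With the
hypothetical hypervisor-backed hook on CPU 0, the same delta advances
clock_task by only 7 ms.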
Tony
>> -boris
>>
>>> - other ideas? Tony?
>>>
>>> Dario
>>
>>