Re: [PATCH 2/6] sched/vtime: Bring all-in-one kcpustat accessor for vtime fields

From: Ingo Molnar
Date: Wed Nov 20 2019 - 07:04:56 EST



* Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:

> Many callsites want to fetch the values of system, user, user_nice, guest
> or guest_nice kcpustat fields altogether or at least a pair of these.
>
> In that case calling kcpustat_field() for each requested field brings
> unnecessary overhead when we could fetch all of them in a row.
>
> So provide kcpustat_cputime() that fetches all vtime sensitive fields
> under the same RCU and seqcount block.
>
> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Yauheni Kaliuta <yauheni.kaliuta@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> ---
> include/linux/kernel_stat.h | 23 ++++++
> kernel/sched/cputime.c | 139 ++++++++++++++++++++++++++++++------
> 2 files changed, 142 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
> index 79781196eb25..6bd70e464c61 100644
> --- a/include/linux/kernel_stat.h
> +++ b/include/linux/kernel_stat.h
> @@ -78,15 +78,38 @@ static inline unsigned int kstat_cpu_irqs_sum(unsigned int cpu)
> return kstat_cpu(cpu).irqs_sum;
> }
>
> +
> +static inline void kcpustat_cputime_raw(u64 *cpustat, u64 *user, u64 *nice,
> + u64 *system, u64 *guest, u64 *guest_nice)
> +{
> + *user = cpustat[CPUTIME_USER];
> + *nice = cpustat[CPUTIME_NICE];
> + *system = cpustat[CPUTIME_SYSTEM];
> + *guest = cpustat[CPUTIME_GUEST];
> + *guest_nice = cpustat[CPUTIME_GUEST_NICE];

Could the 'cpustat' pointer be constified?

Also, please:

> + *user       = cpustat[CPUTIME_USER];
> + *nice       = cpustat[CPUTIME_NICE];
> + *system     = cpustat[CPUTIME_SYSTEM];
> + *guest      = cpustat[CPUTIME_GUEST];
> + *guest_nice = cpustat[CPUTIME_GUEST_NICE];

More pleasing to look at and easier to verify as well.
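
Roughly what I have in mind for the constified, vertically aligned helper, as a
standalone userspace sketch. The u64 typedef and the enum are only stand-ins
that mimic the kernel definitions so the snippet builds outside the tree; they
are assumptions for illustration, not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 */

/* Stand-in mirroring the kernel's cpustat indices (assumed order). */
enum cpu_usage_stat {
	CPUTIME_USER,
	CPUTIME_NICE,
	CPUTIME_SYSTEM,
	CPUTIME_SOFTIRQ,
	CPUTIME_IRQ,
	CPUTIME_IDLE,
	CPUTIME_IOWAIT,
	CPUTIME_STEAL,
	CPUTIME_GUEST,
	CPUTIME_GUEST_NICE,
	NR_STATS,
};

static_assert(NR_STATS == 10, "stand-in enum should have ten cpustat slots");

/* The source array is only read, so the pointer can be const-qualified. */
static inline void kcpustat_cputime_raw(const u64 *cpustat, u64 *user, u64 *nice,
					u64 *system, u64 *guest, u64 *guest_nice)
{
	*user		= cpustat[CPUTIME_USER];
	*nice		= cpustat[CPUTIME_NICE];
	*system		= cpustat[CPUTIME_SYSTEM];
	*guest		= cpustat[CPUTIME_GUEST];
	*guest_nice	= cpustat[CPUTIME_GUEST_NICE];
}
```

With the const qualifier the compiler rejects any accidental store through
'cpustat', and callers holding a const array can pass it without a cast.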

> +static int vtime_state_check(struct vtime *vtime, int cpu)
> +{
> + /*
> + * We raced against context switch, fetch the
> + * kcpustat task again.
> + */

s/against context switch/against a context switch/

> +void kcpustat_cputime(struct kernel_cpustat *kcpustat, int cpu,
> + u64 *user, u64 *nice, u64 *system,
> + u64 *guest, u64 *guest_nice)
> +{
> + u64 *cpustat = kcpustat->cpustat;
> + struct rq *rq;
> + int err;
> +
> + if (!vtime_accounting_enabled_cpu(cpu)) {
> + kcpustat_cputime_raw(cpustat, user, nice,
> + system, guest, guest_nice);
> + return;
> + }
> +
> + rq = cpu_rq(cpu);
> +
> + for (;;) {
> + struct task_struct *curr;
> +
> + rcu_read_lock();
> + curr = rcu_dereference(rq->curr);
> + if (WARN_ON_ONCE(!curr)) {
> + rcu_read_unlock();
> + kcpustat_cputime_raw(cpustat, user, nice,
> + system, guest, guest_nice);
> + return;
> + }
> +
> + err = kcpustat_cputime_vtime(cpustat, curr, cpu, user,
> + nice, system, guest, guest_nice);
> + rcu_read_unlock();
> +
> + if (!err)
> + return;
> +
> + cpu_relax();
> + }
> +}
> +EXPORT_SYMBOL_GPL(kcpustat_cputime);

I'm wondering whether it's worth introducing a helper structure for this
train of parameters: user, nice, system, guest, guest_nice?

We also have similar constructs in other places:

+ u64 cpu_user, cpu_nice, cpu_sys, cpu_guest, cpu_guest_nice;

But more broadly, what do we gain by passing along a quintet of pointers,
while we could also just use a 'struct kernel_cpustat' and store the
values there naturally?

Yes, it's larger, because it also has 5 other fields - but we lose much
of the space savings due to always passing along the 5 pointers already.

So I really think the parameter passing should be organized better here.
This probably affects similar cpustat functions as well.
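
To make the idea concrete, here is a rough userspace sketch of the
struct-based variant. The function name kcpustat_cputime_raw_struct() is
hypothetical, and the typedef, enum and struct below are stand-ins that only
mimic the kernel definitions so the snippet compiles on its own:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 */

/* Stand-in mirroring the kernel's cpustat indices (assumed order). */
enum cpu_usage_stat {
	CPUTIME_USER,
	CPUTIME_NICE,
	CPUTIME_SYSTEM,
	CPUTIME_SOFTIRQ,
	CPUTIME_IRQ,
	CPUTIME_IDLE,
	CPUTIME_IOWAIT,
	CPUTIME_STEAL,
	CPUTIME_GUEST,
	CPUTIME_GUEST_NICE,
	NR_STATS,
};

/* Stand-in for the kernel structure: one u64 slot per stat. */
struct kernel_cpustat {
	u64 cpustat[NR_STATS];
};

static_assert(sizeof(struct kernel_cpustat) == NR_STATS * sizeof(u64),
	      "stand-in struct is expected to be a plain array of u64");

/*
 * Hypothetical helper: copy only the vtime sensitive fields from src to
 * dst. The other five slots of dst are left untouched.
 */
static inline void kcpustat_cputime_raw_struct(struct kernel_cpustat *dst,
					       const struct kernel_cpustat *src)
{
	dst->cpustat[CPUTIME_USER]	 = src->cpustat[CPUTIME_USER];
	dst->cpustat[CPUTIME_NICE]	 = src->cpustat[CPUTIME_NICE];
	dst->cpustat[CPUTIME_SYSTEM]	 = src->cpustat[CPUTIME_SYSTEM];
	dst->cpustat[CPUTIME_GUEST]	 = src->cpustat[CPUTIME_GUEST];
	dst->cpustat[CPUTIME_GUEST_NICE] = src->cpustat[CPUTIME_GUEST_NICE];
}
```

A caller would then declare a single 'struct kernel_cpustat' on the stack and
read e.g. snap.cpustat[CPUTIME_USER] afterwards, instead of carrying five
separate u64 locals and five pointer arguments through every layer.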

Thanks,

Ingo