Re: [PATCH 2/6] posix-cpu-timers: Don't start process wide cputime counter if timer is disabled
From: Peter Zijlstra
Date: Wed Jun 16 2021 - 04:51:40 EST

On Fri, Jun 04, 2021 at 01:31:55PM +0200, Frederic Weisbecker wrote:
> If timer_settime() is called with a 0 expiration on a timer that is
> already disabled, the process wide cputime counter will be started
> and won't ever get a chance to be stopped by stop_process_timer() since
> no timer is actually armed to be processed.
>
> This process wide counter might bring some performance hit due to the
> concurrent atomic additions at the thread group scope.
>
> The following snippet is enough to trigger the issue.
>
> void trigger_process_counter(void)
> {
> 	timer_t id;
> 	struct itimerspec val = { };
>
> 	timer_create(CLOCK_PROCESS_CPUTIME_ID, NULL, &id);
> 	timer_settime(id, TIMER_ABSTIME, &val, NULL);
> 	timer_delete(id);
> }
>
> So make sure we don't needlessly start it.
>
> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> ---
> kernel/time/posix-cpu-timers.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
> index aa52fc85dbcb..132fd56fb1cd 100644
> --- a/kernel/time/posix-cpu-timers.c
> +++ b/kernel/time/posix-cpu-timers.c
> @@ -632,10 +632,15 @@ static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
>  	 * times (in arm_timer). With an absolute time, we must
>  	 * check if it's already passed. In short, we need a sample.
>  	 */
> -	if (CPUCLOCK_PERTHREAD(timer->it_clock))
> +	if (CPUCLOCK_PERTHREAD(timer->it_clock)) {
>  		val = cpu_clock_sample(clkid, p);
> -	else
> -		val = cpu_clock_sample_group(clkid, p, true);
> +	} else {
> +		/*
> +		 * Sample group but only start the process wide cputime counter
> +		 * if the timer is to be enabled.
> +		 */
> +		val = cpu_clock_sample_group(clkid, p, !!new_expires);
> +	}

The cpu_timer_enqueue() is in arm_timer() and the condition for calling
that is:

	'new_expires != 0 && val < new_expires'

Which is not the same as the one you add.

I'm thinking the fundamental problem here is the disconnect between
cpu_timer_enqueue() and pct->timers_active ?
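
To make the mismatch between the two conditions concrete, here is a
minimal userspace sketch. It is not kernel code: the helper names
patch_starts_counter(), timer_gets_armed() and counter_left_running()
are made up for illustration, and plain uint64_t values stand in for the
kernel's sample and expiry. It only models the two predicates quoted
above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the value the patch passes as the 'start'
 * argument of cpu_clock_sample_group(), i.e. !!new_expires.
 */
static bool patch_starts_counter(uint64_t new_expires)
{
	return !!new_expires;
}

/*
 * Hypothetical helper: the condition quoted in the mail for reaching
 * arm_timer(), and therefore cpu_timer_enqueue().
 */
static bool timer_gets_armed(uint64_t new_expires, uint64_t val)
{
	return new_expires != 0 && val < new_expires;
}

/*
 * True when the process wide counter would be started although no
 * timer is enqueued, which is the case the two conditions disagree on.
 */
static bool counter_left_running(uint64_t new_expires, uint64_t val)
{
	return patch_starts_counter(new_expires) &&
	       !timer_gets_armed(new_expires, val);
}
```

Under these assumptions, counter_left_running(0, 100) is false (the
case the patch fixes), counter_left_running(200, 100) is false (the
timer really is armed), but counter_left_running(50, 100) is true: a
non-zero expiry that has already passed still starts the counter while
nothing gets enqueued.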