Re: Divide-by-zero in post_init_entity_util_avg

From: Peter Zijlstra
Date: Thu Jun 16 2016 - 12:16:33 EST


On Thu, Jun 16, 2016 at 02:25:04PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 16, 2016 at 10:50:40AM +0200, Peter Zijlstra wrote:
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index f75930bdd326..3fd3d903e6b6 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -2878,6 +2878,20 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
> > }
> > }
> >
> > +/*
> > + * Explicitly do a load-store to ensure the temporary value never hits memory.
> > + * This allows lockless observations without ever seeing the negative values.
> > + *
> > + * Incidentally, this also generates much saner code for x86.
> > + */
> > +#define sub_positive(type, ptr, val) do { \
> > + type tmp = READ_ONCE(*ptr); \
> > + tmp -= (val); \
> > + if (tmp < 0) \
> > + tmp = 0; \
> > + WRITE_ONCE(*ptr, tmp); \
> > +} while (0)
> > +
> > /* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */
> > static inline int
> > update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
> > @@ -2887,15 +2901,15 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
> >
> > if (atomic_long_read(&cfs_rq->removed_load_avg)) {
> > s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
> > - sa->load_avg = max_t(long, sa->load_avg - r, 0);
> > - sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
> > + sub_positive(long, &sa->load_avg, r);
> > + sub_positive(s64, &sa->load_sum, r * LOAD_AVG_MAX);
>
> Hmm, so either we should change these variables to signed types as
> forced here, or this logic (along with the former) is plain wrong.
>
> As it stands any unsigned value with the MSB set will wipe the field
> after this subtraction.
>
> I suppose instead we'd want something like:
>
> tmp = READ_ONCE(*ptr);
> if (tmp > val)
> tmp -= val;
> else
> tmp = 0;
> WRITE_ONCE(*ptr, tmp);

Stackoverflow suggested this pattern for (unsigned) underflow checking:

r = a - b;
if ((r = a - b) > a)
underflow()

should generate the right asm, but no, that doesn't work either.

http://stackoverflow.com/questions/24958469/subtract-and-detect-underflow-most-efficient-way-x86-64-with-gcc