Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW

From: Sebastian Reichel
Date: Wed Oct 25 2023 - 13:28:49 EST


Hi,

On Tue, Oct 24, 2023 at 09:32:27AM -1000, Linus Torvalds wrote:
> [ Since you're cc'ing the s390 people, I assume that means that this
> all ended up being a follow-up to the s390 issue discussion ]

Actually I've noticed the issue in the clock framework some time
ago and Andy Shevchenko pointed me towards the s390 thread, which
is more or less about the same fundamental problem. So I added
everyone in Cc, to make sure this is solved properly and not just
s390 specific.

> On Tue, 24 Oct 2023 at 06:19, Sebastian Reichel
> <sebastian.reichel@xxxxxxxxxxxxx> wrote:
> >
> > Add a new DIV_ROUND_UP helper, which cannot overflow when
> > big numbers are being used.
>
> This is really horrendously bad on some architectures (ie it might
> require *two* divisions):

I came up with this helper for the issue in the clock framework,
which currently use DIV_ROUND_UP_ULL() as a workaround. That solves
the issue on 32 bit systems (input arguments are of type 'unsigned
long'). But it also means doing u64 / u32 and I expect that do be
much more more expensive on 32 bit platforms than two divisons.

On 64 bit platforms the workaround is wrong, since the second
argument is reduced from 64 bit to 32 bit. It effectively breaks
platforms trying to request a frequency above 4.3 GHz from a clock
divider. Since that's not yet a common thing I guess this has not
yet been found and fixed by somebody else. I also just found it as
bycatch.

> > +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d)))
>
> but David Laight at least had a suggestion for that: when allowing
> multiple uses, you can just do
>
> #define DIV_ROUND_UP(n,d) ((n) ? ((n)-1)/(d)+1 : 0)
>
> so now you're back to just one division and no horrible extra expense.
>
> But we can't allow those multiple uses in general, sadly.
>
> I would really prefer to just make our regular DIV_ROUND_UP() DTRT. But:
>
> - people do use it with complex first arguments (ie function calls
> etc) that we don't want to evaluate twice
>
> - we can't make it an inline function, because the types aren't fixed
>
> - we can't even use a statement expression and __auto_type, because
> these things are used in type definitions etc and need to be constant
> expressions
>
> That last thing means that even the "__builtin_choose_expr()" doesn't
> work, because the statement expression still needs to be _parsed_ as
> such, and that's not something we can do in those contexts.
>
> Very annoying. Let me think about this.

Thanks.

-- Sebastian

Attachment: signature.asc
Description: PGP signature