Re: [PATCH RESEND] sched/fair: Fix overflow in vruntime_eligible()

From: Peter Zijlstra

Date: Mon May 04 2026 - 11:03:43 EST


On Mon, May 04, 2026 at 03:16:09PM +0200, Heiko Carstens wrote:
> On Mon, May 04, 2026 at 01:22:39PM +0200, Peter Zijlstra wrote:
> > On Fri, May 01, 2026 at 12:40:06PM +0200, Peter Zijlstra wrote:
> >
> > > Anyway, I had a poke around with godbolt, and the below seems to
> > > generate the best code for things like x86_64 and arm64.
> > >
> > > Specifically, the __builtin_mul_overflow() already has to compute the
> > > 128 bit product anyway for most architectures, so using that directly
> > > then leads to saner asm and easier to understand code.
> > >
> > > AFAICT HPPA64 is the only 64bit architecture that doesn't implement
> > > __int128 and will thus be demoted to doing what we do on 32bit.
> >
> > I forgot we had ARCH_SUPPORTS_INT128, and I suppose this had better
> > check that. Now, s390 is a bit weird and excludes GCC even though that
> > definitely supports __int128. Supposedly there was an issue, but perhaps
> > modern GCC has this fixed?
>
> The reason was not a bug (in terms of incorrect code), but gcc generated a
> larger than 6kb stack frame for one of the crypto functions - see commit
> fbac266f095d ("s390: select ARCH_SUPPORTS_INT128"). That's just too large to
> be acceptable.
>
> If I remember correctly, gcc generated code which did not reuse stack slots
> known to be unused, but created a new stack slot for every variable, for
> whatever reason, which then resulted in such a huge stack frame. With clang
> the stack frame size was only 1.5kb.
>
> I just checked: with gcc 15.2.0 we are down to 4.5kb. Still too large :)
>
> Adding s390 compiler folks; but I seem to remember I discussed that back then
> with them.

Right, I had indeed found that commit, but since it was from 2023, I had
some hope that gcc-1[56] might have it resolved already. Oh well.
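
For reference, a minimal user-space sketch of the two approaches discussed
above: comparing against the full 128-bit product where __int128 exists, and
falling back to __builtin_mul_overflow() where it does not. This is not the
patch itself. The helper names (eligible_int128, eligible_checked) and the
avg/key/load parameters are hypothetical, and the __SIZEOF_INT128__ check
stands in for whatever the kernel would key off (e.g. ARCH_SUPPORTS_INT128).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the functions from the patch. */

#ifdef __SIZEOF_INT128__
/* Compare avg against the exact 128-bit product key * load. */
static bool eligible_int128(int64_t avg, int64_t key, int64_t load)
{
	return (__int128)avg >= (__int128)key * load;
}
#endif

/* Fallback without __int128: detect overflow of the 64-bit product. */
static bool eligible_checked(int64_t avg, int64_t key, int64_t load)
{
	int64_t prod;

	if (__builtin_mul_overflow(key, load, &prod)) {
		/*
		 * The true product lies outside the int64_t range, so its
		 * sign alone decides the comparison: a negative product is
		 * below any avg, a positive one is above any avg.
		 */
		return (key < 0) != (load < 0);
	}

	return avg >= prod;
}
```

Both helpers should agree for all inputs; the second one only needs the
overflow flag and the operand signs, at the cost of an extra branch.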