Re: [GIT PULL] s390 fixes for 6.6-rc7

From: Linus Torvalds
Date: Sat Oct 21 2023 - 13:56:58 EST


On Sat, 21 Oct 2023 at 02:44, Vasily Gorbik <gor@xxxxxxxxxxxxx> wrote:
>
> please pull s390 fixes for 6.6-rc7.

Pulled. HOWEVER.

> - Fix IOMMU bitmap allocation in s390 PCI to avoid out of bounds access
> when IOMMU pages aren't a multiple of 64.

Please don't do this kind of thing.

And I quote:

static unsigned long *bitmap_vzalloc(size_t bits, gfp_t flags)
{
        size_t n = BITS_TO_LONGS(bits);
        size_t bytes;

        if (unlikely(check_mul_overflow(n, sizeof(unsigned long), &bytes)))
                return NULL;

        return vzalloc(bytes);
}

the above overflow handling is *not* "defensive and good programming".

The above is just "unreadable mindless boilerplate".

Seriously, you're taking a 'size_t' of number of bits, turning it into
number of longs, and you're then turning *that* into number of bytes,
AND YOU ADD OVERFLOW CHECKING?!??!!!

Now, to make matters worse, the above calculation can actually
overflow in theory - but not in the place where you added the
protection!

Because the "longs to bytes" sure as hell can't overflow. We know
that, because the number of longs is guaranteed to have a much smaller
range, since it came from a calculation of bits.

But what can actually overflow? BITS_TO_LONGS(bits) will overflow, and
turn ~0ul to 0, because it does the __KERNEL_DIV_ROUND_UP thing, which
is the simplistic

#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

so that code added overflow protection that doesn't make sense, in all
the wrong places.
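
To see the wrap concretely, here's a trivial userspace demo (the macros
are re-spelled locally to mirror the kernel ones, for illustration only):

#include <stdio.h>
#include <stddef.h>

/* Local copies mirroring the kernel definitions, for the demo only. */
#define BITS_PER_LONG (8 * sizeof(unsigned long))
#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define BITS_TO_LONGS(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_LONG)

int main(void)
{
        size_t bits = ~0ul;     /* the pathological bit count */

        /* (~0ul + 63) wraps around to 62, and 62 / 64 is 0 */
        printf("BITS_TO_LONGS(~0ul) = %zu\n", (size_t)BITS_TO_LONGS(bits));
        return 0;
}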

You need to verify the sanity of the number of bits first anyway.

Of course, in your use-case, the number of bits is also not unlimited,
because the source is

zdev->iommu_pages = zdev->iommu_size >> PAGE_SHIFT;

so it turns out that no, the BITS_TO_LONGS() won't overflow either,
but at least in some other situations - and only looking at that
bitmap_vzalloc() in a vacuum - it *could* have.

Now, I will argue that you always need range checking on the number of
bits *anyway*, for other reasons - blindly allocating some random
amount of memory isn't acceptable, so there should be a sanity check
on the range beforehand regardless.
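
Something along these lines, purely as a sketch - the bound, the
function name, all of it is made up here for illustration:

#include <linux/bitops.h>
#include <linux/vmalloc.h>

/* Purely illustrative cap - pick whatever is sane for the hardware. */
#define IOMMU_BITMAP_MAX_BITS   (1UL << 32)

static unsigned long *iommu_bitmap_alloc(size_t bits)
{
        /* Reject nonsense sizes before doing any arithmetic on them. */
        if (!bits || bits > IOMMU_BITMAP_MAX_BITS)
                return NULL;

        /* With bits bounded, BITS_TO_LONGS() and the multiply cannot wrap. */
        return vzalloc(BITS_TO_LONGS(bits) * sizeof(unsigned long));
}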

But the code as submitted gets this wrong, because the overflow it
checks for is simply not an issue. Adding that overflow handling code
is literally only actively misleading, making the code harder to read,
for no reason, and making people *think* it's being careful when it is
anything *but* careful.

I suspect that the compiler actually sees "that is stupid" and turns
the checked multiply back into just a single left-shift, because it
has seen the (bigger) right-shift and knows the result cannot
overflow. But the problem I'm ranting against is mindlessly adding
boilerplate that makes the code harder to read for *humans*.

If you *do* want to add proper overflow handling, you'd need to either
fix BITS_TO_LONGS() some way (which is actually non-trivial, since it
needs to stay a constant expression and use its argument only once),
or you do something like

        if (!bits)
                return ZERO_SIZE_PTR;
        longs = BITS_TO_LONGS(bits);
        if (!longs)
                return NULL;
        return vzalloc(longs * sizeof(long));

and I'd suggest maybe we should

(a) do the above checking in our bitmap_alloc() routines

(b) also change our bitmap_alloc() routines to take 'size_t' instead
of 'unsigned int' bit counts

(c) and finally, add that vzalloc() case, but simply using

kvmalloc_array(n, size, flags | __GFP_ZERO);

instead.

(And yes, kvmalloc_array() will actually then also do that
check_mul_overflow() thing, but now it's not pointless boilerplate
any more, now it's actually meaningful for *other* cases than the
bitmap allocation one that cannot overflow).
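
Put together, (a) through (c) would land somewhere around this
(a sketch, untested, naming and error handling mine):

#include <linux/bitops.h>
#include <linux/mm.h>
#include <linux/slab.h>

/* Sketch only: size_t bit count, explicit wrap check, kvmalloc_array(). */
static unsigned long *bitmap_vzalloc(size_t bits, gfp_t flags)
{
        size_t longs;

        if (!bits)
                return ZERO_SIZE_PTR;

        longs = BITS_TO_LONGS(bits);
        if (!longs)     /* bits was close enough to ~0ul to wrap */
                return NULL;

        /* kvmalloc_array() does the multiply overflow check for us. */
        return kvmalloc_array(longs, sizeof(unsigned long),
                              flags | __GFP_ZERO);
}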

Hmm?

Linus