Re: [GIT PULL] dma-mapping fix for 5.10

From: Linus Torvalds
Date: Sat Oct 31 2020 - 15:51:11 EST


On Sat, Oct 31, 2020 at 2:40 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> dma-mapping fix for 5.10:
>
> - fix an integer overflow on 32-bit platforms in the new DMA range code
> (Geert Uytterhoeven)

So this is just a stylistic nit, and has no impact on this pull (which
I've done). But looking at the patch, it triggers one of my "this is
wrong" patterns.

In particular, this:

u64 dma_start = 0;
...
for (dma_start = ~0ULL; r->size; r++) {

is actually completely bogus in theory, and it's a horribly horribly
bad pattern to have.

The thing that I hate about that parttern is "~0ULL", which is simply wrong.

The correct pattern for "all bits set" is ~0. NOTHING ELSE. No extra
letters at the end.

Why? Because using an unsigned type is wrong, and will not extend the
bits up to a potentially bigger size.

So adding that "ULL" is not just three extra characters to type, it
actually _detracts_ from the code and makes it more fragile and
potentially wrong.

It so happens, that yes, in the kernel, "ull" us 64-bit, and you get
the right results. So the code _works_. But it's wrong, and it now
requires that the types match exactly (ie it would not be broken if
somebody ever were to say "I want to use use 128-bit dma addresses and
u128").

Another example is using "~0ul", which would give different results on
a 32-bit kernel and a 64-bit kernel. Again: DON'T DO THAT.

I repeat: the right thing to do for "all bits set" is just a plain ~0
or -1. Either of those are fine (technically assumes a 2's complement
machine, but let's just be honest: that's a perfectly fine assumption,
and -1 might be preferred by some because it makes that sign extension
behavior of the integer constant more obvious).

Don't try to do anything clever or anything else, because it's going
to be strictly worse.

The old code that that patch removed was "technically correct", but
just pointless, and actually shows the problem:

for (dma_start = ~(dma_addr_t)0; r->size; r++) {

the above is indeed a correct way to say "I want all bits set in a
dma_addr_t", but while correct, it is - once again - strictly inferior
to just using "~0".

Why? Because "~0" works regardless of type. IOW, exactly *because*
people used the wrong pattern for "all bits set", that patch was now
(a) bigger than necessary and (b) much more ilkely to cause bugs (ie I
could have imagined people changing just the type of the variable
without changing the initialization).

So in that tiny three-line patch there were actually several examples
of why "~0" is the right pattern to use for "all bits set". Because it
JustWorks(tm) in ways other patterns do not.

And if you have a compiler that complains about assigning -1 or ~0 to
an unsigned variable, get rid of that piece of garbage. You're almost
certainly either using some warning flag that you shouldn't be using,
or the compiler writer didn't know what they were doing.

Linus