Re: [PATCH] usercopy: use unsigned long instead of uintptr_t

From: Matthew Wilcox
Date: Thu Jun 16 2022 - 12:46:02 EST


On Thu, Jun 16, 2022 at 08:59:51AM -0700, Linus Torvalds wrote:
> On Thu, Jun 16, 2022 at 8:21 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > I don't know why people call uintptr_t a "userspace type". It's a type
> > invented by C99 that is an integer type large enough to hold a pointer.
> > Which is exactly what we want here.
>
> On the other hand, "unsigned long" has existed since the first version
> of C, and is an integer type large enough to hold a pointer.
>
> Which is exactly what we want here, and what we use everywhere else too.
>
> The whole "uintptr_t handles the historical odd cases with pointers
> that are smaller than a 'long'" is entirely irrelevant, since those
> odd cases are just not a factor.

I don't care about the odd historical cases either.

> And the "pointers are _larger_ than a 'long'" case is similarly
> irrelevant, since we very much want to use arithmetic on these things,
> and they are 'long' later. They aren't used as pointers, they are used
> as integer indexes into the virtual address space that we do odd
> operations on.
>
> Honestly, even if you believe in the 128-bit pointer thing, changing
> one cast in one random place to be different from everything else is
> *not* productive. We're never going to do some "let's slowly migrate
> from one to the other".
>
> And honestly, we're never going to do that using "uintptr_t" anyway,
> since it would be based on a kernel config variable and be a very
> specific type, and would need to be type-safe for any sane conversion.
>
> IOW, in a hypothetical word where the address size is larger than the
> word-size, it would have to be something like out "pte_t", which is
> basically wrapped in a struct so that C implicit type conversion
> doesn't bite you in the arse.

I don't want to support an address space larger than word size. I can't
imagine any CPU vendor saying "So we have these larger registers that
you can only use for pointers and then these smaller registers that you
can use for data". We haven't had A/D register splits since the m68k.
Perhaps I haven't talked to enough crazy CPU people. But if anyone does
propose something that bad, we should laugh at them.

So how do you think we should solve the 128-bit-word-size problem?
Leave int at 32-bit, promote long to 128-bit and get the compiler to
add __int64 to give us a 64-bit type?

> We use the user-space types in a few places, and they have caused
> problems, but at least they are really traditional and the compiler
> actually enforces them for some really standard functions. I'm looking
> at 'size_t' in particular, which caused problems exactly because it's
> a type that is strictly speaking not under our control.
>
> 'size_t' is actually a great example of why 'uintptr_t' is a horrid
> thing. It's effectively a integer type that is large enough to hold a
> pointer difference. On unsegmented architectures, that ends up being a
> type large enough to hold a pointer.

The only reason I like size_t is that it's good _documentation_.
It says "This integer is a byte count of something that's in memory".
As opposed to being a count of sectors, blocks, pages, pointers or
turtles.

As an example:
extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
unsigned int, unsigned int);

What the hell are those two ints? Based on experience, they're probably
offset & length, but who even knows what order they're in.

> And does it sound familiar how on some architectures it's "unsigned
> int", and on others it is "unsigned long"? It's very annoying, and
> it's been annoying over the years. The latest annoyance was literally
> less than a week ago in 1c27f1fc1549 ("iov_iter: fix build issue due
> to possible type mis-match").

Yes, ARM is just crazy here. We should get the compiler people to give
us an option to make size_t unsigned long. Like we have -mcmodel=kernel
on x86.