Re: [PATCH v6 1/4] sgl_alloc_order: remove 4 GiB limit, sgl_free() warning

From: Douglas Gilbert
Date: Mon Jan 18 2021 - 15:10:36 EST

On 2021-01-18 1:28 p.m., Jason Gunthorpe wrote:
On Mon, Jan 18, 2021 at 11:30:03AM -0500, Douglas Gilbert wrote:

After several flawed attempts to detect overflow, take the fastest
route by stating as a pre-condition that the 'order' function argument
cannot exceed 16 (2^16 * 4k = 256 MiB).

That doesn't help, the point of the overflow check is similar to
overflow checks in kcalloc: to prevent the routine from allocating
less memory than the caller might assume.

For instance ipr_store_update_fw() uses request_firmware() (which is
controlled by userspace) to drive the length argument to
sgl_alloc_order(). If userpace gives too large a value this will
corrupt kernel memory.

So this math:

nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);

But that check itself overflows if order is too large (e.g. 65).
A pre-condition says that the caller must know or check a value
is sane, and if the user space can have a hand in the value passed
the caller _must_ check pre-conditions IMO. A pre-condition also
implies that the function's implementation will not have code to
check the pre-condition.

My "log of both sides" proposal at least got around the overflowing
left shift problem. And one reviewer, Bodo Stroesser, liked it.

Needs to be checked, add a precondition to order does not help. I
already proposed a straightforward algorithm you can use.

It does help, it stops your proposed check from being flawed :-)

Giving a false sense of security seems more dangerous than a
pre-condition statement IMO. Bart's original overflow check (in
the mainline) limits length to 4GB (due to wrapping inside a 32
bit unsigned).

Also note there is another pre-condition statement in that function's
definition, namely that length cannot be 0.

So perhaps you, Bart Van Assche and Bodo Stroesser, should compare
notes and come up with a solution that you are _all_ happy with.
The pre-condition works for me and is the fastest. The 'length'
argument might be large, say > 1 GB [I use 1 GB in testing but
did try 4GB and found the bug I'm trying to fix] but having
individual elements greater than say 32 MB each does not
seem very practical (and fails on the systems that I test with).
In my testing the largest element size is 4 MB.

Doug Gilbert