Re: [PATCH v2] gpu: nova-core: gsp: fix undefined behavior in command queue code

From: Gary Guo

Date: Thu Mar 26 2026 - 08:06:12 EST


On Thu Mar 26, 2026 at 4:51 AM GMT, Alexandre Courbot wrote:
> On Thu Mar 26, 2026 at 1:30 PM JST, Alexandre Courbot wrote:
>> On Wed Mar 25, 2026 at 12:15 AM JST, Gary Guo wrote:
>>> On Tue Mar 24, 2026 at 2:44 PM GMT, Alexandre Courbot wrote:
>>>> On Tue Mar 24, 2026 at 1:44 AM JST, Gary Guo wrote:
>>>>> On Mon Mar 23, 2026 at 5:40 AM GMT, Alexandre Courbot wrote:
>>>>>> `driver_read_area` and `driver_write_area` are internal methods that
>>>>>> return slices containing the area of the command queue buffer that the
>>>>>> driver has exclusive read or write access, respectively.
>>>>>>
>>>>>> While their returned value is correct and safe to use, internally they
>>>>>> temporarily create a reference to the whole command-buffer slice,
>>>>>> including GSP-owned regions. These regions can change without notice,
>>>>>> and thus creating a slice to them is undefined behavior.
>>>>>>
>>>>>> Fix this by replacing the slice logic with pointer arithmetic and
>>>>>> creating slices to valid regions only. It adds unsafe code, but should
>>>>>> be mostly replaced by `IoView` and `IoSlice` once they land.
>>>>>>
>>>>>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
>>>>>> Reported-by: Danilo Krummrich <dakr@xxxxxxxxxx>
>>>>>> Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@xxxxxxxxxx/
>>>>>> Signed-off-by: Alexandre Courbot <acourbot@xxxxxxxxxx>
>>>>>> ---
>>>>>> I didn't apply Eliot's Reviewed-by because the code has changed
>>>>>> drastically. The logic should remain identical though.
>>>>>> ---
>>>>>> Changes in v2:
>>>>>> - Use `u32_as_usize` consistently.
>>>>>> - Reduce the number of `unsafe` blocks by computing the end offset of
>>>>>> the returned slices and creating them at the end, in one step.
>>>>>> - Take advantage of the fact that both slices have the same start index
>>>>>> regardless of the branch chosen.
>>>>>> - Improve safety comments.
>>>>>> - Link to v1: https://patch.msgid.link/20260319-cmdq-ub-fix-v1-1-0f9f6e8f3ce3@xxxxxxxxxx
>>>>>
>>>>> Here's the diff that fixes the issue using I/O projection
>>>>> https://lore.kernel.org/rust-for-linux/20260323153807.1360705-1-gary@xxxxxxxxxx/
>>>>
>>>> Should we apply or drop this patch meanwhile? I/O projections are still
>>>> undergoing review, but I'm fine with dropping it if Danilo thinks we can
>>>> live a bit longer with that UB. It's not like the driver is actively
>>>> doing anything useful yet anyway.
>>>
>>> I want to avoid big changes back and forth. We could use raw pointer projection
>>> today, which could be fairly easy to convert to I/O projection:
>>
>> Thanks for the diff. I have adapted it to work on top of Danilo's
>> suggestion to compute the end indices first as it works just as well and
>> is cleaner. I have been running into a link error with this conversion
>> applied though - let's discuss that on v3.
>
> Mmm, I guess this was because the optimizer could not prove that the
> slices were within the bounds of the command queue as the expressions
> passed to `ptr::project` were too complex with that version and this
> makes the `ProjectIndex` check fail. I have better luck when doing
> something closer to the diff you pasted.

I'm considering switching the projectiong `[]` syntax to become panicking
instead, given that the slicing use case quite often is indeed hard to prove
(and also, we already have panicking comments).

One option is to just change `[]` to do that, another option is adding a new
`[]!` syntax to denote panicking projections. I'm more inclined to just the
first one to keep consistency with Rust slicing syntax, but the second one is
okay to me too.

Thoughts?

Best,
Gary