Re: [PATCH v2] gpu: nova-core: gsp: fix undefined behavior in command queue code
From: Alice Ryhl
Date: Thu Mar 26 2026 - 11:56:48 EST
On Thu, Mar 26, 2026 at 12:03:31PM +0000, Gary Guo wrote:
> On Thu Mar 26, 2026 at 4:51 AM GMT, Alexandre Courbot wrote:
> > On Thu Mar 26, 2026 at 1:30 PM JST, Alexandre Courbot wrote:
> >> On Wed Mar 25, 2026 at 12:15 AM JST, Gary Guo wrote:
> >>> On Tue Mar 24, 2026 at 2:44 PM GMT, Alexandre Courbot wrote:
> >>>> On Tue Mar 24, 2026 at 1:44 AM JST, Gary Guo wrote:
> >>>>> On Mon Mar 23, 2026 at 5:40 AM GMT, Alexandre Courbot wrote:
> >>>>>> `driver_read_area` and `driver_write_area` are internal methods that
> >>>>>> return slices containing the area of the command queue buffer that the
> >>>>>> driver has exclusive read or write access, respectively.
> >>>>>>
> >>>>>> While their returned value is correct and safe to use, internally they
> >>>>>> temporarily create a reference to the whole command-buffer slice,
> >>>>>> including GSP-owned regions. These regions can change without notice,
> >>>>>> and thus creating a slice to them is undefined behavior.
> >>>>>>
> >>>>>> Fix this by replacing the slice logic with pointer arithmetic and
> >>>>>> creating slices to valid regions only. It adds unsafe code, but should
> >>>>>> be mostly replaced by `IoView` and `IoSlice` once they land.
> >>>>>>
> >>>>>> Fixes: 75f6b1de8133 ("gpu: nova-core: gsp: Add GSP command queue bindings and handling")
> >>>>>> Reported-by: Danilo Krummrich <dakr@xxxxxxxxxx>
> >>>>>> Closes: https://lore.kernel.org/all/DH47AVPEKN06.3BERUSJIB4M1R@xxxxxxxxxx/
> >>>>>> Signed-off-by: Alexandre Courbot <acourbot@xxxxxxxxxx>
> >>>>>> ---
> >>>>>> I didn't apply Eliot's Reviewed-by because the code has changed
> >>>>>> drastically. The logic should remain identical though.
> >>>>>> ---
> >>>>>> Changes in v2:
> >>>>>> - Use `u32_as_usize` consistently.
> >>>>>> - Reduce the number of `unsafe` blocks by computing the end offset of
> >>>>>> the returned slices and creating them at the end, in one step.
> >>>>>> - Take advantage of the fact that both slices have the same start index
> >>>>>> regardless of the branch chosen.
> >>>>>> - Improve safety comments.
> >>>>>> - Link to v1: https://patch.msgid.link/20260319-cmdq-ub-fix-v1-1-0f9f6e8f3ce3@xxxxxxxxxx
> >>>>>
> >>>>> Here's the diff that fixes the issue using I/O projection
> >>>>> https://lore.kernel.org/rust-for-linux/20260323153807.1360705-1-gary@xxxxxxxxxx/
> >>>>
> >>>> Should we apply or drop this patch meanwhile? I/O projections are still
> >>>> undergoing review, but I'm fine with dropping it if Danilo thinks we can
> >>>> live a bit longer with that UB. It's not like the driver is actively
> >>>> doing anything useful yet anyway.
> >>>
> >>> I want to avoid big changes back and forth. We could use raw pointer projection
> >>> today, which could be fairly easy to convert to I/O projection:
> >>
> >> Thanks for the diff. I have adapted it to work on top of Danilo's
> >> suggestion to compute the end indices first as it works just as well and
> >> is cleaner. I have been running into a link error with this conversion
> >> applied though - let's discuss that on v3.
> >
> > Mmm, I guess this was because the optimizer could not prove that the
> > slices were within the bounds of the command queue as the expressions
> > passed to `ptr::project` were too complex with that version and this
> > makes the `ProjectIndex` check fail. I have better luck when doing
> > something closer to the diff you pasted.
>
> I'm considering switching the projectiong `[]` syntax to become panicking
> instead, given that the slicing use case quite often is indeed hard to prove
> (and also, we already have panicking comments).
>
> One option is to just change `[]` to do that, another option is adding a new
> `[]!` syntax to denote panicking projections. I'm more inclined to just the
> first one to keep consistency with Rust slicing syntax, but the second one is
> okay to me too.
>
> Thoughts?
IMO you should just use [].
Alice