Re: [PATCH v12 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
From: sashiko-bot
Date: Mon Jun 01 2026 - 23:36:08 EST
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Race condition due to incorrect order of writing queue pointers in `send_msg`.
- [High] Silent failure when sending a 4-byte message.
- [High] Unchecked integer addition in `poll_msgq` can cause a kernel panic on untrusted MMIO reads.
--
commit 8a59b49984011320aa631b20a05550242e2baef2
Author: John Hubbard <jhubbard@xxxxxxxxxx>
gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
This commit adds the foundation security processor (FSP) message
infrastructure for the nova-core GPU driver. It implements queue
registers and low-level helpers to communicate with the FSP falcon's EMEM.
> diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
> index 6b057d958115a..0ec1c55213bc8 100644
> --- a/drivers/gpu/nova-core/falcon/fsp.rs
> +++ b/drivers/gpu/nova-core/falcon/fsp.rs
[ ... ]
> @@ -145,4 +143,63 @@ fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
>
> Ok(())
> }
> +
> + /// Poll FSP for incoming data.
> + ///
> + /// Returns the size of available data in bytes, or 0 if no data is available.
> + ///
> + /// The FSP message queue is not circular. Pointers are reset to 0 after each
> + /// message exchange, so `tail >= head` is always true when data is present.
> + #[expect(dead_code)]
> + pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
> + let head = bar.read(regs::NV_PFSP_MSGQ_HEAD).address();
> + let tail = bar.read(regs::NV_PFSP_MSGQ_TAIL).address();
> +
> + if head == tail {
> + return 0;
> + }
> +
> + // TAIL points at last DWORD written, so add 4 to get total size
> + tail.saturating_sub(head) + 4
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Severity: High]
Can this unchecked addition panic if we encounter a PCIe link failure?
If `head` reads normally as 0 but a surprise removal occurs before reading
`tail`, `tail` could read as 0xFFFFFFFF. The saturating subtraction would
leave 0xFFFFFFFF, and the subsequent `+ 4` will overflow, causing a panic in
debug builds where CONFIG_RUST_OVERFLOW_CHECKS=y is set.
Should this use `saturating_add(4)` or another form of checked math to handle
untrusted hardware inputs safely?
> + }
> +
> + /// Writes `packet` to FSP EMEM and updates the queue pointers to notify FSP.
> + ///
> + /// Returns `EINVAL` if `packet` is empty or its length is not 4-byte aligned.
> + #[expect(dead_code)]
> + pub(crate) fn send_msg(&mut self, bar: &Bar0, packet: &[u8]) -> Result {
> + if packet.is_empty() {
> + return Err(EINVAL);
> + }
> +
> + // Write message to EMEM at offset 0 (validates 4-byte alignment)
> + self.write_emem(bar, 0, packet)?;
> +
> + // Update queue pointers. TAIL points at the last DWORD written.
> + let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Severity: High]
Does this create a silent failure if a 4-byte message is sent?
If `packet.len()` is 4, `tail_offset` is calculated as 0. Both NV_PFSP_QUEUE_HEAD
and NV_PFSP_QUEUE_TAIL will then be written as 0. Because the FSP protocol treats
`HEAD == TAIL` as an empty queue, the FSP will ignore the message while the
function still returns `Ok(())`.
Should this enforce a minimum packet length of 8 bytes or return an error
for 4-byte messages?
> + bar.write_reg(regs::NV_PFSP_QUEUE_TAIL::zeroed().with_address(tail_offset));
> + bar.write_reg(regs::NV_PFSP_QUEUE_HEAD::zeroed().with_address(0));
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Severity: High]
Is there a race condition with the FSP hardware due to the order of these
writes?
By writing NV_PFSP_QUEUE_TAIL before NV_PFSP_QUEUE_HEAD, if QUEUE_HEAD was
non-zero from a prior transaction, we momentarily create a state where
`TAIL != HEAD`. The FSP may wake up immediately and read from the old, incorrect
QUEUE_HEAD offset before the driver can write 0 to QUEUE_HEAD.
Should NV_PFSP_QUEUE_HEAD be initialized to 0 before writing the
NV_PFSP_QUEUE_TAIL doorbell?
> +
> + Ok(())
> + }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260602032111.224790-1-jhubbard@xxxxxxxxxx?part=15