Re: [PATCH v12 15/22] gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
From: Eliot Courtney
Date: Tue Jun 02 2026 - 08:22:38 EST
On Tue Jun 2, 2026 at 12:21 PM JST, John Hubbard wrote:
> FSP communication uses a pair of non-circular queues in the FSP
> falcon's EMEM, one for messages from the driver to FSP and one for
> replies, with the driver polling for response data. Add the queue
> registers and the low-level helpers used by the higher-level FSP
> message layer.
>
> Signed-off-by: John Hubbard <jhubbard@xxxxxxxxxx>
> ---
> drivers/gpu/nova-core/falcon/fsp.rs | 61 ++++++++++++++++++++++++++++-
> drivers/gpu/nova-core/regs.rs | 21 ++++++++++
> 2 files changed, 80 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/nova-core/falcon/fsp.rs b/drivers/gpu/nova-core/falcon/fsp.rs
> index 6b057d958115..0ec1c55213bc 100644
> --- a/drivers/gpu/nova-core/falcon/fsp.rs
> +++ b/drivers/gpu/nova-core/falcon/fsp.rs
> @@ -112,7 +112,6 @@ impl Falcon<Fsp> {
> ///
> /// `data` is interpreted as little-endian 32-bit words. Returns `EINVAL`
> /// if `offset` or the `data` length is not 4-byte aligned.
> - #[expect(dead_code)]
> fn write_emem(&mut self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
> if offset % 4 != 0 || data.len() % 4 != 0 {
> return Err(EINVAL);
> @@ -131,7 +130,6 @@ fn write_emem(&mut self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
> ///
> /// `data` is stored as little-endian 32-bit words. Returns `EINVAL` if
> /// `offset` or the `data` length is not 4-byte aligned.
> - #[expect(dead_code)]
> fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
> if offset % 4 != 0 || data.len() % 4 != 0 {
> return Err(EINVAL);
> @@ -145,4 +143,63 @@ fn read_emem(&mut self, bar: &Bar0, offset: u32, data: &mut [u8]) -> Result {
>
> Ok(())
> }
> +
> + /// Poll FSP for incoming data.
> + ///
> + /// Returns the size of available data in bytes, or 0 if no data is available.
> + ///
> + /// The FSP message queue is not circular. Pointers are reset to 0 after each
> + /// message exchange, so `tail >= head` is always true when data is present.
> + #[expect(dead_code)]
> + pub(crate) fn poll_msgq(&self, bar: &Bar0) -> u32 {
> + let head = bar.read(regs::NV_PFSP_MSGQ_HEAD).address();
> + let tail = bar.read(regs::NV_PFSP_MSGQ_TAIL).address();
> +
> + if head == tail {
> + return 0;
> + }
> +
> + // TAIL points at last DWORD written, so add 4 to get total size
> + tail.saturating_sub(head) + 4
> + }
In a later patch, `send_sync_fsp` polls this and then calls `recv_msg`.
Structurally, though, a caller can pass any size to `recv_msg` and read
more than we are supposed to. What about having `recv_msg` do the
polling itself to get the size and return a KVec with the data it read,
instead of leaving that to `send_sync_fsp`? `poll_msgq` could then stay
private, and we can make it public later if we need to.
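To make the suggestion concrete, here is a rough standalone sketch of the
shape I have in mind, not the actual driver code. `MockFsp` and `Error` are
hypothetical stand-ins for `Falcon<Fsp>` plus `Bar0` and for the kernel
error type, and `Vec` stands in for `KVec` so that it builds outside the
kernel:

```rust
// Standalone sketch only. `MockFsp`, `Error`, and the plain fields below are
// hypothetical stand-ins for `Falcon<Fsp>`, the kernel error codes, and the
// NV_PFSP_MSGQ_{HEAD,TAIL} registers. `Vec` stands in for `KVec`.
#[derive(Debug, PartialEq)]
pub enum Error {
    /// No message is pending in the queue.
    Again,
    /// The queue pointers describe an unaligned or out-of-range size.
    Inval,
}

pub struct MockFsp {
    pub emem: Vec<u8>,
    pub msgq_head: u32,
    pub msgq_tail: u32,
}

impl MockFsp {
    /// Size in bytes of the pending message, or 0 if there is none.
    /// Private: only `recv_msg` needs it.
    fn poll_msgq(&self) -> u32 {
        if self.msgq_head == self.msgq_tail {
            return 0;
        }
        // TAIL points at the last DWORD written, so add 4 for the total size.
        self.msgq_tail
            .saturating_sub(self.msgq_head)
            .saturating_add(4)
    }

    /// Polls for the size itself, so a caller cannot request more bytes
    /// than the queue pointers say are available.
    pub fn recv_msg(&mut self) -> Result<Vec<u8>, Error> {
        let size = self.poll_msgq() as usize;
        if size == 0 {
            return Err(Error::Again);
        }
        if size % 4 != 0 || size > self.emem.len() {
            return Err(Error::Inval);
        }

        // Read the response from EMEM at offset 0.
        let data = self.emem[..size].to_vec();

        // Reset the message queue pointers after reading.
        self.msgq_tail = 0;
        self.msgq_head = 0;

        Ok(data)
    }
}
```

With this shape, `send_sync_fsp` would only need to call `recv_msg` in its
wait loop and never handles a size at all.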
> +
> + /// Writes `packet` to FSP EMEM and updates the queue pointers to notify FSP.
> + ///
> + /// Returns `EINVAL` if `packet` is empty or its length is not 4-byte aligned.
> + #[expect(dead_code)]
> + pub(crate) fn send_msg(&mut self, bar: &Bar0, packet: &[u8]) -> Result {
> + if packet.is_empty() {
> + return Err(EINVAL);
> + }
> +
> + // Write message to EMEM at offset 0 (validates 4-byte alignment)
> + self.write_emem(bar, 0, packet)?;
> +
> + // Update queue pointers. TAIL points at the last DWORD written.
> + let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| EINVAL)?;
> + bar.write_reg(regs::NV_PFSP_QUEUE_TAIL::zeroed().with_address(tail_offset));
> + bar.write_reg(regs::NV_PFSP_QUEUE_HEAD::zeroed().with_address(0));
> +
> + Ok(())
> + }
> +
> + /// Reads `size` bytes from FSP EMEM into `buffer` and resets the queue pointers.
> + ///
> + /// `size` comes from `poll_msgq`. Returns `EINVAL` if `size` is 0, exceeds
> + /// `buffer`, or is not 4-byte aligned.
> + #[expect(dead_code)]
> + pub(crate) fn recv_msg(&mut self, bar: &Bar0, buffer: &mut [u8], size: usize) -> Result {
> + if size == 0 || size > buffer.len() {
> + return Err(EINVAL);
> + }
> +
> + // Read response from EMEM at offset 0 (validates 4-byte alignment)
> + self.read_emem(bar, 0, &mut buffer[..size])?;
> +
> + // Reset message queue pointers after reading
> + bar.write_reg(regs::NV_PFSP_MSGQ_TAIL::zeroed().with_address(0));
> + bar.write_reg(regs::NV_PFSP_MSGQ_HEAD::zeroed().with_address(0));
> +
> + Ok(())
> + }
I think we can remove the `size` argument and have the caller pass in
an appropriately sized slice (although this is obviated by my other
comment).
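For reference, dropping `size` would look roughly like the standalone
sketch below. `MockQueue` and `Einval` are hypothetical stand-ins for
`Falcon<Fsp>` plus `Bar0` and for the kernel's `EINVAL`, so this only
illustrates the signature, not the real register accesses:

```rust
// Standalone sketch only. `MockQueue` and `Einval` are hypothetical
// stand-ins for `Falcon<Fsp>` + `Bar0` and for the kernel's `EINVAL`.
#[derive(Debug, PartialEq)]
pub struct Einval;

pub struct MockQueue {
    pub emem: Vec<u8>,
    pub msgq_head: u32,
    pub msgq_tail: u32,
}

impl MockQueue {
    /// Fills all of `buffer` from EMEM offset 0 and resets the queue
    /// pointers. The slice length is the read size, so there is no separate
    /// `size` argument that could disagree with the buffer.
    pub fn recv_msg(&mut self, buffer: &mut [u8]) -> Result<(), Einval> {
        let size = buffer.len();
        if size == 0 || size % 4 != 0 || size > self.emem.len() {
            return Err(Einval);
        }

        // Read the response from EMEM at offset 0.
        buffer.copy_from_slice(&self.emem[..size]);

        // Reset the message queue pointers after reading.
        self.msgq_tail = 0;
        self.msgq_head = 0;

        Ok(())
    }
}
```

The caller would then slice its own buffer, e.g.
`recv_msg(&mut buf[..size])`, which keeps the bounds check on the caller's
side of the call.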