Re: [PATCH] accel/rocket: Add per-task flags and interrupt mask for flexible job handling
From: Heiko Stübner
Date: Tue Feb 17 2026 - 02:01:25 EST
On Monday, 16 February 2026, 19:38:19 Central European Standard Time, Ross Cawston wrote:
> The Rocket NPU supports multiple task types:
> - Convolutional workloads that use CNA, Core, and DPU blocks
> - Standalone post-processing (PPU) tasks such as pooling and element-wise operations
> - Pipelined DPU→PPU workloads
>
> The current driver has several limitations that prevent correct execution of
> non-convolutional workloads and multi-core operation:
>
> - CNA and Core S_POINTER registers are always initialized, re-arming them
> with stale state from previous jobs and corrupting standalone DPU/PPU tasks.
> - Completion is hard-coded to wait only for DPU interrupts, causing PPU-only
> or DPU→PPU pipeline jobs to time out.
> - Ping-pong mode is unconditionally enabled, which is unnecessary for
> single-task jobs.
> - Non-zero cores hang because the vendor-specific "extra bit" (bit 28 × core
> index) in S_POINTER is not set; the BSP sets this via MMIO because userspace
> cannot know which core the scheduler will select.
> - Timeout and IRQ debugging information is minimal.
>
> This patch introduces two new per-task fields to struct rocket_task:
>
> - u32 int_mask: specifies which block completion interrupts signal task done
> (DPU_0|DPU_1 for convolutional/standalone DPU, PPU_0|PPU_1 for PPU tasks).
> Zero defaults to DPU_0|DPU_1 for backward compatibility.
> - u32 flags: currently used for ROCKET_TASK_NO_CNA_CORE to indicate standalone
> DPU/PPU tasks that must not touch CNA/Core state.
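For readers following along, here is a minimal C sketch of how the two per-task fields described above might be interpreted. The bit values and every name prefixed with EX_ or ex_ are hypothetical placeholders chosen for illustration, not the definitions from the patch or the UAPI header. Only the rule that a zero int_mask falls back to DPU_0|DPU_1 and the meaning of the NO_CNA_CORE flag are taken from the description. Treating a task as done when any one masked bit fires is an assumption as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, chosen only for this sketch. */
#define EX_INT_DPU_0        (1u << 8)
#define EX_INT_DPU_1        (1u << 9)
#define EX_INT_PPU_0        (1u << 10)
#define EX_INT_PPU_1        (1u << 11)
#define EX_TASK_NO_CNA_CORE (1u << 0)

/* Hypothetical stand-in for the two new per-task fields. */
struct ex_task {
	uint32_t int_mask;
	uint32_t flags;
};

/* A zero mask falls back to the DPU interrupts (backward compatibility). */
static inline uint32_t ex_effective_int_mask(const struct ex_task *t)
{
	return t->int_mask ? t->int_mask : (EX_INT_DPU_0 | EX_INT_DPU_1);
}

/* Standalone DPU/PPU tasks must leave CNA/Core state untouched. */
static inline bool ex_task_uses_cna_core(const struct ex_task *t)
{
	return !(t->flags & EX_TASK_NO_CNA_CORE);
}

/* Assumption: any one bit of the effective mask signals completion. */
static inline bool ex_task_done(const struct ex_task *t, uint32_t raw_status)
{
	return (raw_status & ex_effective_int_mask(t)) != 0;
}
```

With this reading, a task submitted by older userspace (both fields zero) still completes on a DPU interrupt, while a PPU-only task ignores DPU interrupts and skips CNA/Core setup.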
>
> Additional changes:
> - Only initialize CNA and Core S_POINTER (with the required per-core extra bit)
> when ROCKET_TASK_NO_CNA_CORE is not set.
> - Set the per-core extra bit via MMIO to fix hangs on non-zero cores.
> - Enable ping-pong mode only when the job contains multiple tasks.
> - Mask and clear interrupts according to the task's int_mask.
> - Accept both DPU and PPU completion interrupts in the IRQ handler.
> - Minor error-path fix in GEM object creation (check error after unlocking
> mm_lock).
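The per-core extra bit and the ping-pong condition from the list above can be sketched roughly as follows. This assumes that "bit 28 × core index" means the core index shifted left by 28 bits and OR-ed into the S_POINTER value. The base value 0xe and all EX_/ex_ names are hypothetical placeholders, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical base S_POINTER value, for illustration only. */
#define EX_S_POINTER_BASE   0xeu
/* Assumed position of the per-core "extra bit" field. */
#define EX_CORE_EXTRA_SHIFT 28

/*
 * Value written to the CNA/Core S_POINTER for a given core. Userspace cannot
 * know which core the scheduler picks, so the kernel adds the core part.
 */
static inline uint32_t ex_s_pointer_value(unsigned int core_index)
{
	return EX_S_POINTER_BASE | ((uint32_t)core_index << EX_CORE_EXTRA_SHIFT);
}

/* Ping-pong mode is only worth enabling when a job has several tasks. */
static inline bool ex_use_ping_pong(unsigned int task_count)
{
	return task_count > 1;
}
```

Under these assumptions core 0 sees the unchanged base value, which would explain why only non-zero cores hang without the extra bit.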
>
> These changes, derived from vendor BSP behavior, enable correct execution
> of PPU-only tasks, pipelined workloads, and reliable multi-core operation
> while preserving backward compatibility.
The patch is missing a Signed-off-by line.
Please see
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#developer-s-certificate-of-origin-1-1
Heiko
> ---
> drivers/accel/rocket/rocket_gem.c | 2 +
> drivers/accel/rocket/rocket_job.c | 99 +++++++++++++++++++++++++------
> drivers/accel/rocket/rocket_job.h | 2 +
> include/uapi/drm/rocket_accel.h | 30 ++++++++++
> 4 files changed, 115 insertions(+), 18 deletions(-)