Re: [PATCH v3 2/2] accel: Add Arm Ethos-U NPU driver

From: Rob Herring
Date: Fri Oct 10 2025 - 17:32:49 EST


On Mon, Sep 29, 2025 at 2:22 PM Frank Li <Frank.li@xxxxxxx> wrote:
>
> On Fri, Sep 26, 2025 at 03:00:49PM -0500, Rob Herring (Arm) wrote:
> > Add a driver for Arm Ethos-U65/U85 NPUs. The Ethos-U NPU has a
> > relatively simple interface with single command stream to describe
> > buffers, operation settings, and network operations. It supports up to 8
> > memory regions (though no h/w bounds on a region). The Ethos NPUs
> > are designed to use an SRAM for scratch memory. Region 2 is reserved
> > for SRAM (like the downstream driver stack and compiler). Userspace
> > doesn't need access to the SRAM.
> >

> ...
> > +
> > +static int ethosu_init(struct ethosu_device *ethosudev)
> > +{
> > + int ret;
> > + u32 id, config;
> > +
> > + ret = devm_pm_runtime_enable(ethosudev->base.dev);
> > + if (ret)
> > + return ret;
> > +
> > + ret = pm_runtime_resume_and_get(ethosudev->base.dev);
> > + if (ret)
> > + return ret;
> > +
> > + pm_runtime_set_autosuspend_delay(ethosudev->base.dev, 50);
> > + pm_runtime_use_autosuspend(ethosudev->base.dev);
>
> pm_runtime_use_autosuspend() should be after last register read
> readl_relaxed(ethosudev->regs + NPU_REG_CONFIG);
>
> incase schedule happen between pm_runtime_use_autosuspend(ethosudev->base.dev);
> and readl().

All the call does is enable autosuspend. I don't think it matters
exactly when we enable it. We already did a get preventing autosuspend
until we do a put.

> > + /* If PM is disabled, we need to call ethosu_device_resume() manually. */
> > + if (!IS_ENABLED(CONFIG_PM)) {
> > + ret = ethosu_device_resume(ethosudev->base.dev);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + ethosudev->npu_info.id = id = readl_relaxed(ethosudev->regs + NPU_REG_ID);
> > + ethosudev->npu_info.config = config = readl_relaxed(ethosudev->regs + NPU_REG_CONFIG);


> ...
> > +
> > +/**
> > + * ethosu_gem_create_with_handle() - Create a GEM object and attach it to a handle.
> > + * @file: DRM file.
> > + * @ddev: DRM device.
> > + * @size: Size of the GEM object to allocate.
> > + * @flags: Combination of drm_ethosu_bo_flags flags.
> > + * @handle: Pointer holding the handle pointing to the new GEM object.
> > + *
> > + * Return: Zero on success
> > + */
> > +int ethosu_gem_create_with_handle(struct drm_file *file,
> > + struct drm_device *ddev,
> > + u64 *size, u32 flags, u32 *handle)
> > +{
> > + int ret;
> > + struct drm_gem_dma_object *mem;
> > + struct ethosu_gem_object *bo;
>
> move 'ret' here to keep reverise christmas tree order.

Is that the order DRM likes? It's got to be the dumbest coding standard we have.

> > +
> > + mem = drm_gem_dma_create(ddev, *size);
> > + if (IS_ERR(mem))
> > + return PTR_ERR(mem);
> > +
> > + bo = to_ethosu_bo(&mem->base);
> > + bo->flags = flags;
> > +
> > + /*
> > + * Allocate an id of idr table where the obj is registered
> > + * and handle has the id what user can see.
> > + */
> > + ret = drm_gem_handle_create(file, &mem->base, handle);
> > + if (!ret)
> > + *size = bo->base.base.size;
> > +
> > + /* drop reference from allocate - handle holds it now. */
> > + drm_gem_object_put(&mem->base);
> > +
> > + return ret;
> > +}
> > +
> ...
> > +
> > +static void cmd_state_init(struct cmd_state *st)
> > +{
> > + /* Initialize to all 1s to detect missing setup */
> > + memset(st, 0xff, sizeof(*st));
> > +}
> > +
> > +static u64 cmd_to_addr(u32 *cmd)
> > +{
> > + return ((u64)((cmd[0] & 0xff0000) << 16)) | cmd[1];
>
> will FIELD_PREP helpful?

Like this?:

return ((u64)FIELD_PREP(GENMASK(23, 16), cmd[0]) << 32) | cmd[1];

Questionable to me if that's better...

>
> > +}
> > +
> > +static u64 dma_length(struct ethosu_validated_cmdstream_info *info,
> > + struct dma_state *dma_st, struct dma *dma)
> > +{
> > + s8 mode = dma_st->mode;
> > + u64 len = dma->len;
> > +
> > + if (mode >= 1) {
> > + len += dma->stride[0];
> > + len *= dma_st->size0;
> > + }
> > + if (mode == 2) {
> > + len += dma->stride[1];
> > + len *= dma_st->size1;
> > + }
> > + if (dma->region >= 0)
> > + info->region_size[dma->region] = max(info->region_size[dma->region],
> > + len + dma->offset);
> > +
> > + return len;
> > +}
> > +
> ...
>
> > +
> > +static void ethosu_job_handle_irq(struct ethosu_device *dev)
> > +{
> > + u32 status;
> > +
> > + pm_runtime_mark_last_busy(dev->base.dev);
>
> I think don't need pm_runtime_mark_last_busy() here because
> pm_runtime_put_autosuspend() already call pm_runtime_mark_last_busy().
>
> only mark last busy without pm_runtime_put() can't affect run time pm
> state, still in active state.

Yes, agreed. Copied from rocket, so Tomeu, you may want to look at that.

>
> > +
> > + status = readl_relaxed(dev->regs + NPU_REG_STATUS);
> > +
> > + if (status & (STATUS_BUS_STATUS | STATUS_CMD_PARSE_ERR)) {
> > + dev_err(dev->base.dev, "Error IRQ - %x\n", status);
> > + drm_sched_fault(&dev->sched);
> > + return;
> > + }
> > +
> > + scoped_guard(mutex, &dev->job_lock) {
> > + if (dev->in_flight_job) {
> > + dma_fence_signal(dev->in_flight_job->done_fence);
> > + pm_runtime_put_autosuspend(dev->base.dev);
> > + dev->in_flight_job = NULL;
> > + }
> > + }
> > +}
> > +
> ...
> > +
> > +int ethosu_job_init(struct ethosu_device *dev)
> > +{
> > + struct drm_sched_init_args args = {
> > + .ops = &ethosu_sched_ops,
> > + .num_rqs = DRM_SCHED_PRIORITY_COUNT,
> > + .credit_limit = 1,
> > + .timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
> > + .name = dev_name(dev->base.dev),
> > + .dev = dev->base.dev,
> > + };
> > + int ret;
> > +
> > + spin_lock_init(&dev->fence_lock);
> > + mutex_init(&dev->job_lock);
> > + mutex_init(&dev->sched_lock);
>
> now perfer use dev_mutex_init().

$ git grep dev_mutex_init
(END)

Huh??