Re: [PATCH 1/2] remoteproc: core: Attach rproc asynchronously in rproc_add() path
From: Stephan Gerhold
Date: Tue Apr 14 2026 - 04:13:50 EST
On Tue, Apr 14, 2026 at 11:41:39AM +0800, Jingyi Wang wrote:
>
>
> On 4/10/2026 10:28 PM, Stephan Gerhold wrote:
> > +Cc Bartosz, Dmitry
> >
> > On Thu, Apr 09, 2026 at 01:46:21AM -0700, Jingyi Wang wrote:
> > > For rproc with state RPROC_DETACHED and auto_boot enabled, the attach
> > > callback will be called in the rproc_add()->rproc_trigger_auto_boot()->
> > > rproc_boot() path, the failure in this path will cause the rproc_add()
> > > fail and the resource release, which will cause issue like rproc recovery
> > > or falling back to firmware load fail. Add attach_work for rproc and call
> > > it asynchronously in rproc_add() path like what rproc_start() do.
> > >
> > > Signed-off-by: Jingyi Wang <jingyi.wang@xxxxxxxxxxxxxxxx>
> > > ---
> > > drivers/remoteproc/remoteproc_core.c | 20 ++++++++++++--------
> > > include/linux/remoteproc.h | 1 +
> > > 2 files changed, 13 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > index b087ed21858a..f02db1113fae 100644
> > > --- a/drivers/remoteproc/remoteproc_core.c
> > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > @@ -1673,18 +1673,21 @@ static void rproc_auto_boot_callback(const struct firmware *fw, void *context)
> > > release_firmware(fw);
> > > }
> > > +static void rproc_attach_work(struct work_struct *work)
> > > +{
> > > + struct rproc *rproc = container_of(work, struct rproc, attach_work);
> > > +
> > > + rproc_boot(rproc);
> > > +}
> > > +
> > > static int rproc_trigger_auto_boot(struct rproc *rproc)
> > > {
> > > int ret;
> > > - /*
> > > - * Since the remote processor is in a detached state, it has already
> > > - * been booted by another entity. As such there is no point in waiting
> > > - * for a firmware image to be loaded, we can simply initiate the process
> > > - * of attaching to it immediately.
> > > - */
> > > - if (rproc->state == RPROC_DETACHED)
> > > - return rproc_boot(rproc);
> > > + if (rproc->state == RPROC_DETACHED) {
> > > + schedule_work(&rproc->attach_work);
> > > + return 0;
> > > + }
> >
> > I think the change itself is reasonable to make "auto-attach" behavior
> > consistent with "auto-boot". The commit message is a bit misleading
> > though:
> >
> > - You're really doing two separate functional changes here:
> >
> > (1) Ignore the return value of rproc_boot() during auto-boot attach,
> > to keep the remoteproc registered and available in sysfs even if
> > attaching fails.
> > (2) Run the rproc_boot() in the background using schedule_work().
> > [To improve boot performance? To work around some locking issues?]
> >
> > - The actual issue you are seeing sounds like a use-after-free in the
> > remoteproc core error cleanup path. I think this one is still
> > present, we should really have a call to
> > cancel_work_sync(&rproc->crash_handler) as Dmitry wrote in the
> > previous discussion [1].
> >
> > Thanks,
> > Stephan
> >
> > [1]: https://lore.kernel.org/all/ce24a2sgg4b6wymoxwgl2ve6np2nxn2wuxfqxfpmvqqrpvgouf@xihd6ziqwu4m/
>
> Hi Stephan,
>
> Exactly as you say, what this change do is allowing rproc_attach return false.
> It should be okay to keep this change and describe it more clear in commit msg
> in next version?
>
That's fine for me.
> And the use-after-free issue is what we want to resolve in the patch2
> in this series, I think cancel_work_sync() is a reasonable change
> but it cannot resolve this issue as the worker could be executing when
> we call this(and this is what it behaves when I did local test) and
> the use-after-free issue still exists. Shall we send a separate patch
> for this cancel_work_sync?
>
cancel_work_sync() should wait until the worker execution has finished.
If you call it before freeing the resources (= deleting the remoteproc),
I would expect it should work as expected.
It makes sense to have separate patches for this, one is about fixing
the use-after-free issue, the other is more about the behavior when the
initial auto-boot fails.
Thanks,
Stephan