Re: [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs

From: Jeff Layton
Date: Tue Feb 18 2025 - 19:01:20 EST

Next message: Andrew Morton: "Re: [PATCH][next] mm/mincore: improve performance by adding an unlikely hint"
Previous message: Dionna Amalie Glaze: "Re: [PATCH 0/4] tsm: Unified Measurement Register ABI for TVMs"
In reply to: NeilBrown: " Re: [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, 2025-02-19 at 09:26 +1100, NeilBrown wrote:
> On Wed, 19 Feb 2025, Jeff Layton wrote:
> > While looking at the problem that Li Lingfeng reported [1] around
> > callback queueing failures, I noticed that there were potential
> > scenarios where the callback workqueue jobs could run concurrently with
> > an rpc_task. Since they touch some of the same fields, this is incorrect
> > at best and potentially dangerous.
>
> If the problem is that workqueue jobs might run concurrently with
> rpc_tasks and that we don't want that, could we simply run all the cb
> tasks as "sync" rpc tasks in the workqueue??
>
> i.e. change rpc_call_async() in nfsd4_run_cb_work() to rpc_call_sync ...
> and fix any breakage because I doubt it is really as simple as that.
>
> This would fully serialise all the callbacks. Is that what to goal is
> here, or is the goal more subtle?
>

I think we need to serialize callbacks that use the same struct
nfsd4_callback. Today we can end up with the same callback running
concurrently with some of them:

The callback workqueue jobs only run until the rpc_task has been
submitted. After that point the workqueue job can run again and submit
a second rpc_task. That can end up with multiple RPCs racing to set and
fetch the cb_status, cb_slot, etc. in the nfsd4_callback.

Note that a few of the callbacks have their own mechanism for
serialization (CB_RECALL_ANY, CB_GETATTR, and the callback channel
probes), but the others don't.

Making all of the RPCs synchronous would mean that all callbacks to the
client would be serialized, which would be overkill. We'd be going back
to a single-slot callback channel.

I think serializing on the nfsd4_callback is right. The part I'm not
yet certain about is whether to just ignore attempts to queue up the
callback while one is running, or if we need to ensure that it runs
again after the current callback has finished if that happens.

The latter seems more correct, but I can't quite come up with a
situation where it matters. Maybe a multiple CB_LAYOUTRECALL callback
attempts can race with a layout stateid morphing?

>
>
> >
> > This patchset adds a new mechanism for ensuring that the same
> > nfsd4_callback can't run concurrently with itself, regardless of where
> > it is in its execution. This also gives us a more sure mechanism for
> > handling the places where we need to take and hold a reference on an
> > object while the callback is running.
> >
> > This should also fix the problem that Li Lingfeng reported, since
> > queueing the work from nfsd4_cb_release() should never fail. Note that
> > the patch they sent earlier (fdf5c9413ea) should be dropped from
> > nfsd-testing before this will apply cleanly.
> >
> > [1]: https://lore.kernel.org/linux-nfs/20250218135423.1487309-1-lilingfeng3@xxxxxxxxxx/
> >
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> > Jeff Layton (3):
> > nfsd: prevent callback tasks running concurrently
> > nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY
> > nfsd: move cb_need_restart flag into cb_flags
> >
> > fs/nfsd/nfs4callback.c | 12 ++++++------
> > fs/nfsd/nfs4layouts.c | 7 ++++---
> > fs/nfsd/nfs4proc.c | 2 +-
> > fs/nfsd/nfs4state.c | 26 +++++++++++---------------
> > fs/nfsd/state.h | 13 ++++++++++---
> > fs/nfsd/trace.h | 2 +-
> > 6 files changed, 33 insertions(+), 29 deletions(-)
> > ---
> > base-commit: 4a52e5e49d1b50fcb584e434cced6d0547ddea42
> > change-id: 20250218-nfsd-callback-f723b8498c78
> >
> > Best regards,
> > --
> > Jeff Layton <jlayton@xxxxxxxxxx>
> >
> >
>

--
Jeff Layton <jlayton@xxxxxxxxxx>

Next message: Andrew Morton: "Re: [PATCH][next] mm/mincore: improve performance by adding an unlikely hint"
Previous message: Dionna Amalie Glaze: "Re: [PATCH 0/4] tsm: Unified Measurement Register ABI for TVMs"
In reply to: NeilBrown: " Re: [PATCH 0/3] nfsd: don't allow concurrent queueing of workqueue jobs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]