Re: [PATCH v3] fuse: optional FORGET delivery over io_uring

From: Joanne Koong

Date: Mon Apr 27 2026 - 08:26:06 EST


On Sun, Apr 26, 2026 at 4:48 PM Bernd Schubert <bernd@xxxxxxxxxxx> wrote:
>
> On 4/24/26 22:38, Joanne Koong wrote:
> > On Thu, Apr 23, 2026 at 4:11 AM Li Wang <liwang@xxxxxxxxxx> wrote:
> >>
> >> Deliver FUSE_FORGET through fuse_uring_queue_fuse_req() when the io_uring
> >> is ready and userspace has opted in by setting
> >> FUSE_IO_URING_REGISTER_FORGET_COMMIT in fuse_uring_cmd_req.flags on
> >> FUSE_IO_URING_CMD_REGISTER. Until any REGISTER
> >> carries that bit, FORGET continues to use the legacy
> >> fuse_dev_queue_forget() path even while io_uring is active, so unmodified
> >> userspace (e.g. libfuse that does not issue a completion SQE for FORGET)
> >> does not wedge ring entries.
> >>
> >> Benefits:
> >> - FORGET can share the same commit/fetch loop as other opcodes.
> >> - Reduces split transport for high-volume forgets when the ring is primary.
> >> - Reuses existing per-queue io-uring machinery and noreply/force
> >> request setup.
> >>
> >> Signed-off-by: Li Wang <liwang@xxxxxxxxxx>
> >
> > Hi Li,
> >
> > Thanks for sending this. To be completely honest, I'm not convinced
> > delivering forget over io-uring is worth the added complexity/cost. In
> > the /dev/fuse path we rely on forget batching/amortizing and explicit
> > scheduling/fairness logic so forget processing makes progress and
> > doesn't get drowned out by regular requests; I think we'd likely need
> > something comparable for the io-uring path as well. Additionally,
> > routing it through io-uring makes forget behave more like a "real"
> > request on the ring (it needs per-request state to live until
> > userspace completes and the entry can be recycled) which introduces
> > extra allocation/lifetime management on this path and it requires a
> > uapi change and corresponding libfuse changes.
> >
> > Forgets would consume ring entries but they're tiny one-way
> > notifications and imo I don't think they benefit much from io-uring's
> > main advantages (eg data-path/zero-copy). I worry they could contend
> > with read-write heavy traffic where ring capacity is more valuable.
>
> I think when FORGET starts to disturb writes or reads, there also must
> be some metadata load that causes many of these requests. Going via
> /dev/fuse also includes another two syscalls and cpu task switch to the
> libfuse thread handling them - that is not for free either.

I think the /dev/fuse cost is already amortized by batching forgets
and the io-uring path would have its own per-forget overhead (eg
needing to reply back to the forget request which would be a syscall,
needing fuse_req and fuse_forget_uring_data allocations). I agree
benchmarks would be useful to make a conclusion about performance.
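
As a rough illustration of the batching side (a back-of-the-envelope
sketch, not a measurement): a single FUSE_BATCH_FORGET read on /dev/fuse
can carry as many forget entries as fit in the read buffer after the
headers. The structs below are local mirrors of the wire layout with a
hypothetical sketch_ prefix (real code would include <linux/fuse.h>), and
the helper names and buffer sizes are assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the FUSE wire structs, for illustration only. */
struct sketch_in_header {
	uint32_t len;
	uint32_t opcode;
	uint64_t unique;
	uint64_t nodeid;
	uint32_t uid;
	uint32_t gid;
	uint32_t pid;
	uint16_t total_extlen;
	uint16_t padding;
};

struct sketch_batch_forget_in {
	uint32_t count;
	uint32_t dummy;
};

struct sketch_forget_one {
	uint64_t nodeid;
	uint64_t nlookup;
};

/* Hypothetical helper: upper bound on forgets carried by one read of
 * nbytes, i.e. whatever fits after the request and batch headers. */
static size_t max_forgets_per_read(size_t nbytes)
{
	size_t hdr = sizeof(struct sketch_in_header) +
		     sizeof(struct sketch_batch_forget_in);

	if (nbytes < hdr)
		return 0;
	return (nbytes - hdr) / sizeof(struct sketch_forget_one);
}

/* Hypothetical helper: reads needed to drain nforgets queued forgets,
 * assuming every read is filled to capacity. Returns (size_t)-1 if the
 * buffer cannot hold even one entry. */
static size_t reads_needed(size_t nforgets, size_t nbytes)
{
	size_t per_read = max_forgets_per_read(nbytes);

	if (per_read == 0)
		return (size_t)-1;
	return (nforgets + per_read - 1) / per_read;
}
```

With these assumptions, reads_needed(1000, 4096) is the number of read
syscalls for 1000 queued forgets with a 4 KiB buffer, versus one ring
entry plus one completion per forget if each is delivered individually
over io-uring. Real buffers are usually larger and batches are not always
full, so this is only an upper bound on the amortization.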

>
> However, this currently might disturb reads/writes, if the queue depth
> is limited and memory optimized to carry reads/writes. The missing
> feature here is to have multiple request sizes on the same queue.
>
> Btw, syscall overhead is basically the reason why I wouldn't like to
> have multiple rings per request size, but one ring with entries of
> different sizes. Will try to respond to the other mail later today.

I'll keep an eye out for your response to that email. I think we
disagree on this but maybe we can discuss this at the fuse BoF next
week.

Thanks,
Joanne