Re: [PATCH] xfs: convert alloc_workqueue users to WQ_UNBOUND

From: Michal Hocko

Date: Thu Feb 19 2026 - 04:21:17 EST

On Thu 19-02-26 12:24:38, Dave Chinner wrote:
> On Wed, Feb 18, 2026 at 05:56:09PM +0100, Marco Crivellari wrote:
> > Recently, as part of a workqueue refactor, WQ_PERCPU has been added to
> > alloc_workqueue() users that didn't specify WQ_UNBOUND.
> > The change has been introduced by:
> >
> > 69635d7f4b344 ("fs: WQ_PERCPU added to alloc_workqueue users")
> >
> > These specific workqueues don't use per-cpu data, so change the behavior
> > by removing WQ_PERCPU and adding WQ_UNBOUND.
>
> Your definition for "doesn't need per-cpu workqueues" is sadly
> deficient.

I believe Marco wanted to say that they do not require the strict per-cpu
guarantee of WQ_PERCPU for correctness, i.e. those workers do not
operate on per-cpu data.

> > Even if these workqueues are
> > marked unbound, the workqueue subsystem maintains cache locality by
> > default via affinity scopes.
> >
> > The changes from per-cpu to unbound will help to improve situations where
> > CPU isolation is used, because unbound work can be moved away from
> > isolated CPUs.
>
> If you are running operations through the XFS filesystem on isolated
> CPUs, then you absolutely need some of these per-cpu workqueues
> running on those isolated CPUs too.

The use case is that an isolated workload needs to perform fs operations
at certain stages of its run. It then moves over to a "do not disturb"
mode in which it runs purely in userspace and shouldn't be disrupted by
the kernel. We do observe that those workers trigger at a later time,
after that switch, and disturb the workload when it is not appropriate.

> Also, these workqueues are typically implemented this way to meet
> performance targets, concurrency constraints or algorithm
> requirements. Changes like this need a bunch of XFS metadata
> scalability benchmarks on high end server systems under a variety of
> conditions to at least show there aren't any obvious behavioural
> or performance regressions that result from the change.

This is a fair ask. We do not want to regress non-isolated workloads by
any means. If there is a risk of regression for those, and from your
more detailed explanation it seems there is, then we might need to look
for a different approach. Would an opt-in be an option, i.e. a kernel
command line parameter with which the admin accepts the performance
loss from losing the locality?
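
To make the question a bit more concrete, here is a rough userspace
model of what such an opt-in could look like. It is only a sketch and
completely untested against the kernel: the names sketch_wq_flags() and
sketch_wq_unbound_optin are made up for illustration, the flag values
are placeholders rather than the real ones from
include/linux/workqueue.h, and how the switch would actually be set
(boot parameter or otherwise) is left open.

```c
#include <stdbool.h>

/*
 * Placeholder bit values for this userspace sketch only. The real
 * WQ_UNBOUND and WQ_PERCPU definitions live in
 * include/linux/workqueue.h and are not reproduced here.
 */
#define SKETCH_WQ_UNBOUND	(1u << 0)
#define SKETCH_WQ_PERCPU	(1u << 1)

/* Hypothetical opt-in switch; off by default so nothing changes. */
static bool sketch_wq_unbound_optin;

/*
 * Return the flags a workqueue would be allocated with. Without the
 * opt-in the caller's flags are passed through untouched. With the
 * opt-in, a per-cpu request is turned into an unbound one and all
 * other flags are preserved.
 */
static unsigned int sketch_wq_flags(unsigned int flags)
{
	if (!sketch_wq_unbound_optin)
		return flags;

	if (flags & SKETCH_WQ_PERCPU) {
		flags &= ~SKETCH_WQ_PERCPU;
		flags |= SKETCH_WQ_UNBOUND;
	}
	return flags;
}
```

The point of the default-off switch is that non-isolated setups keep
the current per-cpu behavior and only those who explicitly ask for it
trade the locality for less disturbance on isolated CPUs.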

I am trimming your specific feedback on those WQs from this reply.
Thanks for that! This is very valuable feedback.

Thanks!
--
Michal Hocko
SUSE Labs