Re: workqueues and percpu (was: [PATCH] dm: remake of the veritytarget)

From: Andrew Morton
Date: Thu Mar 08 2012 - 18:30:56 EST


On Thu, 8 Mar 2012 15:15:21 -0800
Tejun Heo <tj@xxxxxxxxxx> wrote:

> > I'm not sure what we can do about it really, apart from blocking unplug
> > until all the target CPU's workqueues have been cleared. And/or refusing
> > to unplug a CPU until all pinned-to-that-cpu kernel threads have been
> > shut down or pinned elsewhere (which is the same thing, only more
> > general).
> >
> > Tejun, is this new behaviour? I do recall that a long time ago we
> > wrestled with unplug-vs-worker-threads and I ended up OK with the
> > result, but I forget what it was. IIRC Rusty was involved.
>
> Unfortunately, yes, this is a new behavior. Before, we could have
> unbound delays during unplug from work items. Now, we have CPU
> affinity assumption breakage.

Ow, didn't know that.

> The behavior change was primarily to
> allow long running work items to use regular workqueues without
> worrying about inducing delay across cpu hotplug operations, which is
> important as it's also used on suspend / hibernation, especially on
> mobile platforms.

Well.. why did we want to support these long-running work items?
They're abusive, aren't they? Where are they?

> During the cmwq conversion, I ended up auditing a lot of (I think I
> went through most of them) workqueue users and IIRC there weren't too
> many which required stable affinity.
>
> > That being said, I don't think it's worth compromising the DM code
> > because of this workqueue wart: lots of other code has the same wart,
> > and we should find a centralised fix for it.
>
> Probably the best way to solve this is introducing pinned attribute to
> workqueues and have them drained automatically on cpu hotplug events.
> It'll require auditing workqueue users but I guess we'll just have to
> do it given that we need to actually distinguish the ones need to be
> pinned.

That will make future use of workqueues more complex and people will
screw it up.

> Or maybe we can use explicit queue_work_on() to distinguish
> the ones which require pinning.
>
> Another approach would be requiring all workqueues to be drained on
> cpu offlining and requiring any work item which may stall to use
> unbound wq. IMHO, picking out the ones which may stall would be much
> less obvious than the ones which require cpu pinning.

I'd be surprised if it's *that* hard to find and fix the long-running
work items. Hopefully most of them are already using
create_freezable_workqueue() or create_singlethread_workqueue().

I wonder if there's some debug code we can put in workqueue.c to detect
when a pinned work item takes "too long".

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/