Re: [RFC PATCH] rcu: move SRCU grace period work to power efficient workqueue

From: Mike Galbraith
Date: Sat Feb 15 2014 - 02:37:28 EST

Next message: Julia Lawall: "[PATCH 1/3] staging: r8712u: delete unnecessary field initialization"
Previous message: Julia Lawall: "[PATCH 2/3] staging: r8188eu: delete unnecessary field initialization"
In reply to: Kevin Hilman: "Re: [RFC PATCH] rcu: move SRCU grace period work to power efficient workqueue"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, 2014-02-14 at 15:24 -0800, Kevin Hilman wrote:
> Tejun Heo <tj@xxxxxxxxxx> writes:
>
> > Hello,
> >
> > On Wed, Feb 12, 2014 at 11:02:41AM -0800, Paul E. McKenney wrote:
> >> +2. Use the /sys/devices/virtual/workqueue/*/cpumask sysfs files
> >> + to force the WQ_SYSFS workqueues to run on the specified set
> >> + of CPUs. The set of WQ_SYSFS workqueues can be displayed using
> >> + "ls /sys/devices/virtual/workqueue".
> >
> > One thing to be careful about is that once published, it becomes part
> > of userland visible interface. Maybe adding some words warning
> > against sprinkling WQ_SYSFS willy-nilly is a good idea?
>
> In the NO_HZ_FULL case, it seems to me we'd always want all unbound
> workqueues to have their affinity set to the housekeeping CPUs.
>
> Is there any reason not to enable WQ_SYSFS whenever WQ_UNBOUND is set so
> the affinity can be controlled? I guess the main reason would be that
> all of these workqueue names would become permanent ABI.
>
> At least for NO_HZ_FULL, maybe this should be automatic. The cpumask of
> unbound workqueues should default to !tick_nohz_full_mask? Any WQ_SYSFS
> workqueues could still be overridden from userspace, but at least the
> default would be sane, and help keep full dynticks CPUs isolated.
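
As a rough illustration of the two ideas quoted above (writing the sysfs
cpumask files, and defaulting unbound workqueues to the CPUs outside the
nohz_full set), here is a minimal Python sketch. The helper names
(parse_cpu_list, housekeeping_mask, set_workqueue_cpumask), the nr_cpus
argument and the root parameter are made up for illustration and are not
from the patch. The nohz_full CPU list is passed in by the caller rather
than read from the kernel. The sketch assumes the cpumask files accept the
usual comma-separated 32-bit hex words, and writing to them needs root.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a CPU list such as "1-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpumask(bits):
    """Format an integer bitmask as comma-separated 32-bit hex words,
    most significant word first (assumed input format of the sysfs file)."""
    words = []
    while True:
        words.append("%08x" % (bits & 0xFFFFFFFF))
        bits >>= 32
        if not bits:
            break
    return ",".join(reversed(words))


def housekeeping_mask(nr_cpus, nohz_full_cpus):
    """Return the mask of CPUs 0..nr_cpus-1 that are NOT in nohz_full_cpus."""
    bits = 0
    for cpu in range(nr_cpus):
        if cpu not in nohz_full_cpus:
            bits |= 1 << cpu
    return format_cpumask(bits)


def set_workqueue_cpumask(mask, root="/sys/devices/virtual/workqueue"):
    """Write mask to every <root>/*/cpumask file and return the paths written.

    Only workqueues created with WQ_SYSFS show up under root, so anything
    else is untouched by this.
    """
    written = []
    for path in sorted(glob.glob(os.path.join(root, "*", "cpumask"))):
        with open(path, "w") as f:
            f.write(mask + "\n")
        written.append(path)
    return written
```

For example, housekeeping_mask(8, parse_cpu_list("1-7")) gives "00000001",
which leaves only CPU 0 for the WQ_SYSFS workers once it is passed to
set_workqueue_cpumask().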

What I'm thinking is that it should be automatic, but not necessarily
based upon the nohz full mask, rather maybe based upon whether sched
domains exist, or perhaps a generic exclusive cpuset property, though
some really don't want anything to do with cpusets.

Why? Because there are jitter-intolerant loads where nohz full isn't all
that useful: you'll be constantly restarting and stopping the tick, and
eating the increased accounting overhead for no gain, because there are
frequently multiple realtime tasks running. For these loads (I have a
user with one such fairly hefty 80 core rt load), dynamically turning
the tick _on_ is currently a better choice than nohz_full. Point being,
control of where unbound workqueues are allowed to run isn't only
desirable for single task HPC loads; other loads exist.

For my particular fairly critical 80 core load, workqueues aren't a real
big hairy deal, because its jitter tolerance isn't _all_ that tight (30
us max is easy enough to meet with room to spare). The load can slice
through workers well enough to meet requirements, but it would certainly
be a win to be able to keep them at bay. (I'm going to measure it; less
jitter is better even if it's only a little bit better. Eventually
somebody will demand what's currently impossible to deliver.)

-Mike
