Re: Is it really safe to use workqueues to drive expedited grace periods?

From: Paul E. McKenney
Date: Mon Feb 13 2017 - 19:16:07 EST


On Sat, Feb 11, 2017 at 11:35:41AM +0900, Tejun Heo wrote:
> Hello, Paul.
>
> On Fri, Feb 10, 2017 at 01:21:58PM -0800, Paul E. McKenney wrote:
> > So RCU's expedited grace periods have been using workqueues for a
> > little while, and things seem to be working. But as usual, I worry...
> > Is this use subject to some sort of deadlock where RCU's workqueue cannot
> > start running until after a grace period completes, but that grace
> > period is the one needing the workqueue? Note that there are ways to
> > set up your kernel so that all RCU grace periods are expedited.
> >
> > Should I be worried? If not, what prevents this from being a problem,
> > especially given that workqueue handlers are allowed to wait for RCU
> > grace periods to complete?
>
> A per-cpu (normal) workqueue's concurrency is regulated automatically
> so that there are at least one worker running for the worker pool on a
> given CPU.
>
> Let's say there are two work items queued on a workqueue. The first
> one is something which will do synchronize_rcu() and the second is the
> expedited grace period work item. When the first one runs
> synchronize_rcu(), it'd block. If there are no other work items
> running at the time, workqueue will dispatch another worker so that
> there's at least one actively running, which in this case will be the
> expedited rcu grace period work item.
>
> The dispatching of a new worker can be delayed by two things - memory
> pressure preventing creation of a new worker and the workqueue hitting
> maximum concurrency limit.
>
> If expedited RCU grace period is something that memory reclaim path
> may depend on, the workqueue that it executes on should have
> WQ_MEM_RECLAIM set, which will guarantee that there's at least one
> worker (across all CPUs) which is ready to serve the work items on
> that workqueue regardless of memory pressure.
>
> The latter, concurrency limit, would only matter if the RCU work items
> use system_wq. system_wq's concurrency limit is very high (512 per
> CPU), but it is theoretically possible to fill all up with work items
> doing synchronize_rcu() with the expedited RCU work item scheduled
> behind it. The system would already be in a very messed up state
> outside the RCU situation tho.

Thank you for the information! So if I am to continue using workqueues
for expedited RCU grace periods, I believe that I need to do the following:

1.  Use alloc_workqueue() to create my own WQ_MEM_RECLAIM
    workqueue.

2.  Rework my workqueue handler to avoid blocking waiting for
    the expedited grace period to complete. I should be able
    to do a small number of timed waits, but if I actually
    wait for the grace period to complete, I might end up
    hogging the reserved items. (Or does my workqueue supply
    them for me? If so, so much the better!)

3.  Concurrency would not be a problem -- there can be no more
    than four work elements in flight across both possible flavors
    of expedited grace periods.
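
To make item 2 a bit more concrete, here is a toy userspace model of
what I have in mind. It is not kernel code, and every name in it
(exp_gp_model, timed_wait_for_gp(), exp_gp_handler(), run_until_done())
is made up for illustration. The handler does a bounded number of
timed waits and, if the grace period still has not ended, hands its
worker back and asks to be requeued instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum handler_result { GP_DONE, GP_REQUEUE };

struct exp_gp_model {
	int polls_until_done;	/* timed waits still needed before the GP ends */
	int requeues;		/* times the handler gave its worker back */
};

/* Stand-in for one timed wait: true once the grace period has ended. */
static bool timed_wait_for_gp(struct exp_gp_model *gp)
{
	if (gp->polls_until_done > 0)
		gp->polls_until_done--;
	return gp->polls_until_done == 0;
}

/* One handler invocation: at most max_polls timed waits, then requeue. */
static enum handler_result exp_gp_handler(struct exp_gp_model *gp,
					  int max_polls)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++)
		if (timed_wait_for_gp(gp))
			return GP_DONE;
	gp->requeues++;		/* give up the worker rather than block */
	return GP_REQUEUE;
}

/* Stand-in for the workqueue rerunning a requeued item. */
static int run_until_done(struct exp_gp_model *gp, int max_polls)
{
	int invocations = 1;

	while (exp_gp_handler(gp, max_polls) == GP_REQUEUE)
		invocations++;
	return invocations;	/* number of handler invocations needed */
}
```

In this model, no single invocation holds a worker for more than
max_polls timed waits, which is the property I would want from the
real handler if the reserved worker turns out to be a shared resource.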

Anything I am missing here?

Thanx, Paul