Re: [PATCH rfc] workqueue: honour cond_resched() more effectively.

From: Trond Myklebust
Date: Mon Nov 09 2020 - 08:50:46 EST


On Mon, 2020-11-09 at 09:00 +0100, Peter Zijlstra wrote:
> On Mon, Nov 09, 2020 at 01:54:59PM +1100, NeilBrown wrote:
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 4418f5cb8324..728870965df1 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1784,7 +1784,12 @@ static inline int test_tsk_need_resched(struct task_struct *tsk)
> >  #ifndef CONFIG_PREEMPTION
> >  extern int _cond_resched(void);
> >  #else
> > -static inline int _cond_resched(void) { return 0; }
> > +static inline int _cond_resched(void)
> > +{
> > +       if (current->flags & PF_WQ_WORKER)
> > +               workqueue_cond_resched();
> > +       return 0;
> > +}
> >  #endif
> >  
> >  #define cond_resched() ({                      \
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 9a2fbf98fd6f..5b2e38567a0c 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -5620,6 +5620,8 @@ SYSCALL_DEFINE0(sched_yield)
> >  #ifndef CONFIG_PREEMPTION
> >  int __sched _cond_resched(void)
> >  {
> > +       if (current->flags & PF_WQ_WORKER)
> > +               workqueue_cond_resched();
> >         if (should_resched(0)) {
> >                 preempt_schedule_common();
> >                 return 1;
>
>
> Much hate for this.. :/ cond_resched() should be a NOP on !PREEMPT and
> you wreck that. Also, you call into that workqueue_cond_resched()
> unconditionally, even when it wouldn't have rescheduled, which seems
> very wrong too.
>
> On top of all that, you're adding an extra load to the function :/
>
> At some point Paul tried to frob cond_resched() for RCU and ran into
> all sorts of performance issues, I'm thinking this will too.
>
>
> Going by your justification for all this:
>
> > I think that once a worker calls cond_resched(), it should be treated
> > as though it was run from a WQ_CPU_INTENSIVE queue, because only
> > cpu-intensive tasks need to call cond_resched().  This would allow
> > other workers to be scheduled.
>
> I'm thinking the real problem is that you're abusing workqueues. Just
> don't stuff so much work into it that this becomes a problem. Or rather,
> if you do, don't lie to it about it.

If we can't use workqueues to call iput_final() on an inode, then what
is the point of having them at all?

Neil's use case is simply a file that has managed to accumulate a
seriously large page cache, and is therefore taking a long time to
complete the call to truncate_inode_pages_final(). Are you saying we
have to allocate a dedicated thread for every case where this happens?

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx