Re: [PATCH] sched: Fix stop_one_cpu_nowait() vs hotplug

From: Peter Zijlstra
Date: Wed Oct 11 2023 - 09:26:47 EST


On Wed, Oct 11, 2023 at 03:24:19AM +0000, Kuyo Chang (張建文) wrote:
> On Tue, 2023-10-10 at 22:04 +0200, Peter Zijlstra wrote:
> >
> > On Tue, Oct 10, 2023 at 04:57:47PM +0200, Peter Zijlstra wrote:
> > > On Tue, Oct 10, 2023 at 02:40:22PM +0000, Kuyo Chang (張建文) wrote:
> >
> > > > It is running good so far (more than a week) on hotplug/set affinity
> > > > stress test. I will keep it testing and report back if it happens
> > > > again.
> > >
> > > OK, I suppose I should look at writing a coherent Changelog for this
> > > then...
> >
> > Something like the below... ?
> >
> Thanks for illustrating the race scenario. It looks good to me.
> But how about RT? Does RT also need these invocations, as below?
>
> ---
> kernel/sched/rt.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index e93b69ef919b..6aaf0a3d6081 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2063,9 +2063,11 @@ static int push_rt_task(struct rq *rq, bool pull)
> */
> push_task = get_push_task(rq);
> if (push_task) {
> + preempt_disable();
> raw_spin_rq_unlock(rq);
> stop_one_cpu_nowait(rq->cpu, push_cpu_stop,
> push_task, &rq->push_work);
> + preempt_enable();
> raw_spin_rq_lock(rq);
> }
>
> @@ -2402,9 +2404,11 @@ static void pull_rt_task(struct rq *this_rq)
> double_unlock_balance(this_rq, src_rq);
>
> if (push_task) {
> + preempt_disable();
> raw_spin_rq_unlock(this_rq);
> stop_one_cpu_nowait(src_rq->cpu, push_cpu_stop,
> push_task, &src_rq->push_work);
> + preempt_enable();
> raw_spin_rq_lock(this_rq);
> }
> }

bah, clearly git-grep didn't work for me last night, I'll go fix up.