Re: [PATCH] sched_ext: Fix potential deadlock in destroy_dsq()

From: Breno Leitao
Date: Fri Jan 17 2025 - 04:46:24 EST


Hello Tejun,

On Thu, Jan 16, 2025 at 04:06:26PM -1000, Tejun Heo wrote:
> On Thu, Jan 16, 2025 at 12:51:01PM +0100, Andrea Righi wrote:
> > When creating and destroying DSQs concurrently, a potential deadlock can
> > occur due to a circular locking dependency between the locks involved in
> > the operations:
> >
> > - create_dsq():
> >
> > rhashtable_bucket --> rq->lock --> dsq->lock
>
> Hmm... this is probably the same thing that Breno tried to fix with
> rhashtable update. Breno, what's the current state of that patch? I saw bug
> reports and fix patch flying by but didn't track them closely.

Right, that looks like exactly the problem I fixed. Here is the current
state of the issue.

The fix is already in linux-next, but not yet in Linus' tree:

e1d3422c95f00 Breno Leitao : rhashtable: Fix potential deadlock by moving schedule_work outside lock

That fix caused a regression[1], and Herbert has a patch for it, which is
not yet committed in linux-next AFAIK.

This is Herbert's fix:

https://lore.kernel.org/all/Z4XWx5X0doetOJni@xxxxxxxxxxxxxxxxxxx/

[1] Link: https://lore.kernel.org/all/Z4DoFYQ3ytB-wS3-@xxxxxxxxxxxxxxxxxxx/

--breno