Re: [PATCH v6 00/20] Proxy Execution: A generalized form of Priority Inheritance v6

From: Qais Yousef
Date: Sat Dec 16 2023 - 22:08:27 EST


Hi John

On 11/06/23 19:34, John Stultz wrote:
> Stabilizing this Proxy Execution series has unfortunately
> continued to be a challenging task. Since the v5 release, I’ve
> been focused on getting the deactivated/sleeping owner enqueuing
> functionality, which I left out of v5, stabilized. I’ve managed
> to rework enough to avoid the crashes previously tripped with the
> locktorture & ww_mutex selftests, so I feel it’s much improved,
> but I do still see some issues (blocked waitqueues and hung task
> watchdogs firing) after stressing the system for many hours in a
> 64 cpu qemu environment (though this issue seems to be introduced
> earlier in the series with proxy-migration/return-migration).
>
> I still haven’t had time to focus on testing and debugging the
> chain migration patches. So I’ve left that patch out of this
> submission for now, but will try to get it included in the next
> revision.
>
> This patch series is actually coarser than what I’ve been
> developing with, as there are a number of small “test” steps to
> help validate behavior I changed, which would then be replaced by
> the real logic afterwards. Including those here would just cause
> more work for reviewers, so I’ve folded them together, but if
> you’re interested you can find the fine-grained tree here:
> https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v6-6.6-fine-grained
> https://github.com/johnstultz-work/linux-dev.git proxy-exec-v6-6.6-fine-grained
>
> As mentioned previously, this Proxy Execution series has a long
> history: First described in a paper[1] by Watkins, Straub,
> Niehaus, then from patches from Peter Zijlstra, extended with
> lots of work by Juri Lelli, Valentin Schneider, and Connor
> O'Brien. (and thank you to Steven Rostedt for providing
> additional details here!)

Thanks a lot for all your effort in pushing this difficult patchset forward!

I am trying to find more time to help with review and hopefully debugging too,
but as it stands, I think to make progress we need to break this patchset into
smaller problems and get them merged in phases, so that at least the review and
the actual work are broken down into smaller, more manageable chunks.

From my very bird's-eye view it seems we have 3 elements:

1. Extend locking infrastructure.
2. Split task context into scheduling and execution.
3. Actual proxy_execution implementation.

It seems to me (and as ever I could be wrong of course) the first 7 patches are
more or less stable? Could you send patch 1 individually and then the next 6
patches, to get the groundwork for extending the locking infrastructure reviewed
and merged first?

After that we can focus on splitting the task context into scheduling and
execution (and maybe introduce the PROXY_EXEC config along with it) but without
actually implementing the inheritance etc. parts? Just generally teaching the
scheduler that these are now 2 separate parts.

Are 1 and 2 dependent on each other, or can they actually be sent as two series
in parallel?

Hopefully this should greatly reduce the work of continuously rebasing these
patches and let us focus on the last part, which is the meaty and most difficult
bit IIUC. I hope we can break that down too, but I have no idea at the moment
how to do that.

Merging in parts will give each part a chance to soak individually in mainline
while the rest is being discussed, which would make handling potential fallout
easier too.

I don't know what the others think, but IMHO if there are stable parts of this
series, I think we should focus on trying to merge those first. I hope you'll be
the one to get this over the finish line, but if for whatever reason yet another
person needs to take over, they'd find something got merged at least :-) I'm
sure reviving these patches every time is no easy task!


Cheers

--
Qais Yousef