Re: [PATCH 8/8] RFC: use a TASK_FIFO kthread for read completion support

From: Christoph Hellwig

Date: Wed Apr 15 2026 - 01:56:29 EST

On Tue, Apr 14, 2026 at 10:23:13AM +0800, Gao Xiang wrote:
> All softirq IO completion already works like this although
> softirq tasks are not strictly called "RT tasks" (i.e. a non-RT
> task issues the IO, and the softirq IO completion will interrupt
> all ongoing tasks).
>
> Basically what we want is to get a non-atomic context instead of
> using the current softirq context for read post-processing and
> switch to the task context immediately as you said, because:
>
> - Our post-processing needs to work in task contexts since
> advanced features like compression deduplication need it;
>
> - Even regardless of our specific requirement needing task
> contexts, using a dedicated task context for read
> post-processing is much better than running in the original
> softirq context:
>
> - Algorithmic work could take extra time (a slow algorithm
> such as LZMA could take milliseconds on low-end devices,
> yet we need a common workflow for all algorithms, including
> fast ones like lz4; verification work is another example),
> and long processing time will interfere with other
> remaining softirq tasks like sound-playback / network
> softirqs;
>
> - If it is then deferred to softirqd, it just makes this
> latency issue _worse_.

Yes, and the same applies to a lot of other things. A very similar
algorithmic issue is the checksum validation, be that T10-PI or
file system native checksums. We don't want to run them from
soft/hardirq context obviously, but we really need them to preempt
other work on the cpu, and avoid scheduling latency. The case
that started this is a bit different - folio invalidation mostly
needs user context to take sleeping locks, but those are usually
uncontended. Again, getting this work done ASAP is important because
readers are synchronously waiting.

>> Tejun asked Sandeep if setting the workqueue thread priority to
>> -19 through sysfs (i.e. making them higher priority than normal
>> kernel threads) had the same effect on latency as using a dedicated
>> per-cpu RT task thread. There was no follow-up.
>
> I think the issue is that people are not already working on the
> same topic:
>
> - Unlike large subsystems like XFS, people don't already work on
> EROFS unless they have new requirements or urgent production
> issues;
>
> - The original latency issue was already considered as "done" in
> 2023, and I'm not sure if Sandeep could have the bandwidth to
> pause his current work and test more setups according to this
> ongoing discussion in 2026.

Which unfortunately might explain the sad state of Android. You
really need dedicated people around to help improve core
infrastructure, instead of adding random Kconfig choices.