[PATCH v3 2/3] Revert "ALSA: firewire-lib: operate for period elapse event in process context"

From: Edmund Raile
Date: Mon Jul 29 2024 - 17:42:41 EST


Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event
in process context") removed the process context workqueue from
amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove
its overhead.

With RME Fireface 800, this lead to a regression since
Kernels 5.14.0, causing an AB/BA deadlock competition for the
substream lock with eventual system freeze under ALSA operation:

thread 0:
* (lock A) acquire substream lock by
snd_pcm_stream_lock_irq() in
snd_pcm_status64()
* (lock B) wait for tasklet to finish by calling
tasklet_unlock_spin_wait() in
tasklet_disable_in_atomic() in
ohci_flush_iso_completions() of ohci.c

thread 1:
* (lock B) enter tasklet
* (lock A) attempt to acquire substream lock,
waiting for it to be released:
snd_pcm_stream_lock_irqsave() in
snd_pcm_period_elapsed() in
update_pcm_pointers() in
process_ctx_payloads() in
process_rx_packets() of amdtp-stream.c

? tasklet_unlock_spin_wait
</NMI>
<TASK>
ohci_flush_iso_completions firewire_ohci
amdtp_domain_stream_pcm_pointer snd_firewire_lib
snd_pcm_update_hw_ptr0 snd_pcm
snd_pcm_status64 snd_pcm

? native_queued_spin_lock_slowpath
</NMI>
<IRQ>
_raw_spin_lock_irqsave
snd_pcm_period_elapsed snd_pcm
process_rx_packets snd_firewire_lib
irq_target_callback snd_firewire_lib
handle_it_packet firewire_ohci
context_tasklet firewire_ohci

Restore the process context work queue to prevent deadlock
AB/BA deadlock competition for ALSA substream lock of
snd_pcm_stream_lock_irq() in snd_pcm_status64()
and snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed().

commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse
event in process context")

Cc: stable@xxxxxxxxxxxxxxx
Fixes: 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context")
Link: https://lore.kernel.org/r/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simnl@jn3eo7pe642q/
Reported-by: edmund.raile <edmund.raile@xxxxxxxxx>
Closes: https://lore.kernel.org/r/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simnl@jn3eo7pe642q/
Signed-off-by: Edmund Raile <edmund.raile@xxxxxxxxxxxxxx>
---
sound/firewire/amdtp-stream.c | 31 +++++++++++++++++--------------
1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c
index 31201d506a21..a07b0452267d 100644
--- a/sound/firewire/amdtp-stream.c
+++ b/sound/firewire/amdtp-stream.c
@@ -615,16 +615,8 @@ static void update_pcm_pointers(struct amdtp_stream *s,
// The program in user process should periodically check the status of intermediate
// buffer associated to PCM substream to process PCM frames in the buffer, instead
// of receiving notification of period elapsed by poll wait.
- if (!pcm->runtime->no_period_wakeup) {
- if (in_softirq()) {
- // In software IRQ context for 1394 OHCI.
- snd_pcm_period_elapsed(pcm);
- } else {
- // In process context of ALSA PCM application under acquired lock of
- // PCM substream.
- snd_pcm_period_elapsed_under_stream_lock(pcm);
- }
- }
+ if (!pcm->runtime->no_period_wakeup)
+ queue_work(system_highpri_wq, &s->period_work);
}
}

@@ -1864,11 +1856,22 @@ unsigned long amdtp_domain_stream_pcm_pointer(struct amdtp_domain *d,
{
struct amdtp_stream *irq_target = d->irq_target;

- // Process isochronous packets queued till recent isochronous cycle to handle PCM frames.
if (irq_target && amdtp_stream_running(irq_target)) {
- // In software IRQ context, the call causes dead-lock to disable the tasklet
- // synchronously.
- if (!in_softirq())
+ // This function is called in software IRQ context of
+ // period_work or process context.
+ //
+ // When the software IRQ context was scheduled by software IRQ
+ // context of IT contexts, queued packets were already handled.
+ // Therefore, no need to flush the queue in buffer furthermore.
+ //
+ // When the process context reach here, some packets will be
+ // already queued in the buffer. These packets should be handled
+ // immediately to keep better granularity of PCM pointer.
+ //
+ // Later, the process context will sometimes schedules software
+ // IRQ context of the period_work. Then, no need to flush the
+ // queue by the same reason as described in the above
+ if (current_work() != &s->period_work)
fw_iso_context_flush_completions(irq_target->context);
}

--
2.45.2