RE: [PATCH v2] tty: n_gsm: avoid call of sleeping functions from atomic context
From: Starke, Daniel
Date: Wed Oct 05 2022 - 06:49:05 EST
> >> spin_lock_irqsave(&gsm->tx_lock, flags) // taken a spinlock on TX data
> >> con_write(...)
> >> do_con_write(...)
> >> console_lock()
> >> might_sleep() // -> bug
> >>
> >> As far as console_lock() might sleep it should not be called with
> >> spinlock held.
> >>
> >> The patch replaces tx_lock spinlock with mutex in order to avoid the
> >> problem.
> >>
> >
> > Do you have any hints why this might be correct?
> >
>
> The thing you've pointed out is actually interesting. Mutex works well in
> gsmld_write() but apparently I've missed the other contexts like in
> gsmld_receive_buf().
This patch breaks packet retransmission. tx_lock, and now tx_mutex,
protects the transmission packet queue. This works fine as long as packets
are transmitted in a context that allows sleeping. However, the handler of
the retransmission timer T2 is called from soft IRQ context and opens an
additional atomic context via control_lock within gsm_control_retransmit().
The call path looks like this:
gsm_control_retransmit()
  spin_lock_irqsave(&gsm->control_lock, flags)
  gsm_control_transmit()
    gsm_data_queue()
      mutex_lock(&gsm->tx_mutex) // -> sleep in atomic context
I found this issue while merging our keep-alive function.
Long story short: replacing the spinlock with a mutex does not solve the
issue. It only shifts it to another function. I suggest splitting the TX
lock into a packet queue lock and an underlying tty write mutex.
I would have implemented the patch if I had the means to verify it.
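To illustrate the suggested split, here is a minimal userspace sketch. It is
an analogy only, not kernel code and not a tested n_gsm patch. All names
(struct tx_state, tx_enqueue(), tx_flush(), TXQ_LEN) are hypothetical. An
atomic_flag busy-wait lock stands in for the packet queue spinlock and a
pthread mutex stands in for the tty write mutex.

```c
/* Userspace analogy only; all names here are hypothetical, not n_gsm API. */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define TXQ_LEN 8

struct tx_state {
	atomic_flag queue_lock;      /* stands in for a spinlock: never sleeps */
	pthread_mutex_t write_mutex; /* stands in for the tty write mutex: may sleep */
	int queue[TXQ_LEN];          /* ring buffer of packet ids */
	size_t head, count;
	int written;                 /* packets handed to the "tty" so far */
};

static void tx_init(struct tx_state *tx)
{
	atomic_flag_clear(&tx->queue_lock);
	pthread_mutex_init(&tx->write_mutex, NULL);
	tx->head = 0;
	tx->count = 0;
	tx->written = 0;
}

static void txq_lock(struct tx_state *tx)
{
	/* Busy-wait until the flag is acquired; never blocks in the scheduler. */
	while (atomic_flag_test_and_set_explicit(&tx->queue_lock,
						 memory_order_acquire))
		;
}

static void txq_unlock(struct tx_state *tx)
{
	atomic_flag_clear_explicit(&tx->queue_lock, memory_order_release);
}

/*
 * Usable from an "atomic" context such as a retransmission timer:
 * only the non-sleeping queue lock is taken.
 * Returns 0 on success, -1 if the queue is full.
 */
static int tx_enqueue(struct tx_state *tx, int pkt)
{
	int ret = -1;

	txq_lock(tx);
	if (tx->count < TXQ_LEN) {
		tx->queue[(tx->head + tx->count) % TXQ_LEN] = pkt;
		tx->count++;
		ret = 0;
	}
	txq_unlock(tx);
	return ret;
}

/*
 * Sleepable context only: the write mutex is held across the whole drain,
 * while the queue lock is held just long enough to dequeue one packet.
 * Returns the number of packets written.
 */
static int tx_flush(struct tx_state *tx)
{
	int sent = 0;

	pthread_mutex_lock(&tx->write_mutex);
	for (;;) {
		int pkt = 0;
		int have = 0;

		txq_lock(tx);
		if (tx->count > 0) {
			pkt = tx->queue[tx->head];
			tx->head = (tx->head + 1) % TXQ_LEN;
			tx->count--;
			have = 1;
		}
		txq_unlock(tx);

		if (!have)
			break;

		/* A real driver would do the possibly sleeping tty write here. */
		(void)pkt;
		tx->written++;
		sent++;
	}
	pthread_mutex_unlock(&tx->write_mutex);
	return sent;
}
```

With such a split, the timer path would only call the equivalent of
tx_enqueue() while holding control_lock, and the sleeping write would be
deferred to a context that may sleep. How that deferral is triggered in the
driver is left open here.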
Best regards,
Daniel Starke