Packet gets stuck in NOLOCK pfifo_fast qdisc
From: Jonas Bonn
Date: Wed Oct 09 2019 - 02:46:27 EST
Hi,
The lockless pfifo_fast qdisc has an issue with packets getting stuck in
the queue. What appears to happen is:
i) Thread 1 holds the 'seqlock' on the qdisc and dequeues packets.
ii) Thread 1 dequeues the last packet in the queue.
iii) Thread 1 iterates through the qdisc->dequeue function again and
determines that the queue is empty.
iv) Thread 2 queues up a packet. Since 'seqlock' is busy, it just
assumes the packet will be dequeued by whoever is holding the lock.
v) Thread 1 releases 'seqlock'.
After v), nobody will check if there are packets in the queue until a
new packet is enqueued. Thereby, the packet enqueued by Thread 2 may be
delayed indefinitely.
What, I think, should probably happen is that Thread 1 should check that
the queue is empty again after releasing 'seqlock'. I poked at this,
but it wasn't obvious to me how to go about this given the way the
layering works here. Roughly:
qdisc_run_end() {
...
spin_unlock(seqlock);
if (!qdisc_is_empty(qdisc))
qdisc_run();
...
}
Calling qdisc_run() from qdisc_run_end() doesn't feel right!
There's a qdisc->empty property (and qdisc_is_empty() relies on it) but
it's not particularly useful in this case since there's a race in
setting this property which makes it not quite reliable.
Hope someone can shine some light on how to proceed here.
/Jonas