Packet gets stuck in NOLOCK pfifo_fast qdisc

From: Jonas Bonn
Date: Wed Oct 09 2019 - 02:46:27 EST


Hi,

The lockless pfifo_fast qdisc has an issue with packets getting stuck in the queue. What appears to happen is:

i) Thread 1 holds the 'seqlock' on the qdisc and dequeues packets.
ii) Thread 1 dequeues the last packet in the queue.
iii) Thread 1 iterates through the qdisc->dequeue function again and determines that the queue is empty.

iv) Thread 2 queues up a packet. Since 'seqlock' is busy, it just assumes the packet will be dequeued by whoever is holding the lock.

v) Thread 1 releases 'seqlock'.

After v), nobody will check if there are packets in the queue until a new packet is enqueued. Thereby, the packet enqueued by Thread 2 may be delayed indefinitely.

What, I think, should probably happen is that Thread 1 should check that the queue is empty again after releasing 'seqlock'. I poked at this, but it wasn't obvious to me how to go about this given the way the layering works here. Roughly:

qdisc_run_end() {
...
spin_unlock(seqlock);
if (!qdisc_is_empty(qdisc))
qdisc_run();
...
}

Calling qdisc_run() from qdisc_run_end() doesn't feel right!

There's a qdisc->empty property (and qdisc_is_empty() relies on it) but it's not particularly useful in this case since there's a race in setting this property which makes it not quite reliable.

Hope someone can shine some light on how to proceed here.

/Jonas