Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

From: Mike Galbraith
Date: Tue Apr 04 2017 - 23:21:18 EST


On Tue, 2017-04-04 at 15:39 -0700, Cong Wang wrote:

> Thanks for the report! Looks like a quick solution here is to replace
> this yield() with cond_resched(); it is harder to really wait for
> all qdiscs to transmit all packets.

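No, cond_resched() won't help. It only reschedules when need_resched
is set, so a realtime caller that outranks whoever is keeping the qdisc
busy keeps spinning just as it does with yield(). The waiter has to
actually sleep. What I did is below, but I suspect net wizards will do
something better.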

---
 net/sched/sch_generic.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -16,6 +16,7 @@
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/sched.h>
+#include <linux/swait.h>
 #include <linux/string.h>
 #include <linux/errno.h>
 #include <linux/netdevice.h>
@@ -901,6 +902,7 @@ static bool some_qdisc_is_busy(struct ne
  */
 void dev_deactivate_many(struct list_head *head)
 {
+	DECLARE_SWAIT_QUEUE_HEAD_ONSTACK(swait);
 	struct net_device *dev;
 	bool sync_needed = false;
 
@@ -924,8 +926,7 @@ void dev_deactivate_many(struct list_hea
 
 	/* Wait for outstanding qdisc_run calls. */
 	list_for_each_entry(dev, head, close_list)
-		while (some_qdisc_is_busy(dev))
-			yield();
+		swait_event_timeout(swait, !some_qdisc_is_busy(dev), 1);
 }
 
 void dev_deactivate(struct net_device *dev)
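
For anyone following along without a kernel tree handy, the
sleep-and-recheck idea can be sketched in userspace. This is only an
analogy under stated assumptions, not the patch itself: queue_busy,
drain_worker(), wait_until_idle() and run_demo() are made-up names,
and nanosleep() stands in for the one-jiffy timeout. The loop here
keeps rechecking until the queue is idle, whereas a single
swait_event_timeout() call returns once its timeout expires even if
the condition is still false.

```c
/*
 * Userspace analogy only: sleep between checks instead of yield()ing.
 * All names below are hypothetical, none of them are kernel symbols.
 */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Stand-in for some_qdisc_is_busy(): true while work is outstanding. */
static atomic_bool queue_busy = true;

/* Plays the task that finishes the outstanding work after about 5ms. */
static void *drain_worker(void *unused)
{
	struct timespec work = { .tv_sec = 0, .tv_nsec = 5 * 1000 * 1000 };

	(void)unused;
	nanosleep(&work, NULL);
	atomic_store(&queue_busy, false);
	return NULL;
}

/*
 * Recheck the condition after each 1ms sleep. Sleeping takes the
 * caller off the runqueue, so the worker can run regardless of the
 * caller's priority. Returns how many times the caller slept.
 */
static int wait_until_idle(void)
{
	struct timespec tick = { .tv_sec = 0, .tv_nsec = 1000 * 1000 };
	int sleeps = 0;

	while (atomic_load(&queue_busy)) {
		nanosleep(&tick, NULL);
		sleeps++;
	}
	return sleeps;
}

/* Start the worker, wait for it to finish, report the sleep count. */
static int run_demo(void)
{
	pthread_t tid;
	int rc, sleeps;

	atomic_store(&queue_busy, true);
	rc = pthread_create(&tid, NULL, drain_worker, NULL);
	assert(rc == 0);
	(void)rc;
	sleeps = wait_until_idle();
	pthread_join(tid, NULL);
	return sleeps;
}
```

Swapping the nanosleep() in wait_until_idle() for sched_yield() turns
this back into the original pattern: with a SCHED_FIFO waiter that
outranks the worker and both pinned to one CPU, the waiter would spin
and the worker would never get to clear the flag.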