Re: [PATCH v2 net-next] net: link_watch: prevent starvation when processing linkwatch wq
From: Yunsheng Lin
Date: Sun Jun 02 2019 - 21:23:58 EST
On 2019/5/31 17:54, Salil Mehta wrote:
>> From: netdev-owner@xxxxxxxxxxxxxxx On Behalf Of Yunsheng Lin
>> Sent: Friday, May 31, 2019 10:01 AM
>> To: davem@xxxxxxxxxxxxx
>> Cc: hkallweit1@xxxxxxxxx; f.fainelli@xxxxxxxxx;
>> stephen@xxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
>> linux-kernel@xxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>
>> Subject: [PATCH v2 net-next] net: link_watch: prevent starvation when
>> processing linkwatch wq
>>
>> When a user has configured a large number of virtual netdevs, such
>> as 4K vlans, a carrier on/off operation on the real netdev will
>> also cause its virtual netdevs' link state to be processed in
>> linkwatch. Currently, the processing is done in a work queue,
>> which may cause cpu and rtnl lock starvation problems.
>>
>> This patch releases the cpu and rtnl lock when the link watch worker
>> has processed a fixed number of netdevs' link watch events.
>>
>> Currently __linkwatch_run_queue is called with rtnl lock, so
>> enfore it with ASSERT_RTNL();
>
>
> Typo enfore --> enforce ?
My mistake, thanks.
>
>
>
>> Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
>> ---
>> V2: use cond_resched and rtnl_unlock after processing a fixed
>> number of events
>> ---
>> net/core/link_watch.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/net/core/link_watch.c b/net/core/link_watch.c
>> index 7f51efb..07eebfb 100644
>> --- a/net/core/link_watch.c
>> +++ b/net/core/link_watch.c
>> @@ -168,9 +168,18 @@ static void linkwatch_do_dev(struct net_device *dev)
>>
>>  static void __linkwatch_run_queue(int urgent_only)
>>  {
>> +#define MAX_DO_DEV_PER_LOOP	100
>> +
>> +	int do_dev = MAX_DO_DEV_PER_LOOP;
>>  	struct net_device *dev;
>>  	LIST_HEAD(wrk);
>>
>> +	ASSERT_RTNL();
>> +
>> +	/* Give urgent case more budget */
>> +	if (urgent_only)
>> +		do_dev += MAX_DO_DEV_PER_LOOP;
>> +
>>  	/*
>>  	 * Limit the number of linkwatch events to one
>>  	 * per second so that a runaway driver does not
>> @@ -200,6 +209,14 @@ static void __linkwatch_run_queue(int urgent_only)
>>  		}
>>  		spin_unlock_irq(&lweventlist_lock);
>>  		linkwatch_do_dev(dev);
>> +
>
>
> A comment like the one below would help explain the reason for the code.
>
> 	/* This function is called with rtnl_lock held. If a large number
> 	 * of events are on the watch list, processing them could
> 	 * monopolize rtnl_lock and starve other modules that want to
> 	 * acquire it. Hence, a co-operative scheme like the one below
> 	 * helps mitigate the problem. It also tries to be fair CPU-wise
> 	 * via conditional rescheduling.
> 	 */
Yes, thanks for the helpful comment.
>
>
>> +		if (--do_dev < 0) {
>> +			rtnl_unlock();
>> +			cond_resched();
>> +			do_dev = MAX_DO_DEV_PER_LOOP;
>> +			rtnl_lock();
>> +		}
>> +
>>  		spin_lock_irq(&lweventlist_lock);
>>  	}
>
> .
>
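
For anyone following the thread, the budget scheme in the hunk above can
be sketched in user space roughly as follows. This is only an illustration
and not kernel code: a pthread mutex (big_lock) stands in for the rtnl
lock, sched_yield() stands in for cond_resched(), a plain counter stands
in for linkwatch_do_dev(), and the names big_lock, run_queue, run_stats
and MAX_ITEMS_PER_LOOP are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for MAX_DO_DEV_PER_LOOP in the patch. */
#define MAX_ITEMS_PER_LOOP 100

/* Hypothetical stand-in for the rtnl lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static int big_lock_held;	/* written only by the current lock holder */

struct run_stats {
	size_t processed;	/* items handled */
	size_t yields;		/* times the lock was dropped and retaken */
};

static void big_lock_acquire(void)
{
	pthread_mutex_lock(&big_lock);
	big_lock_held = 1;
}

static void big_lock_release(void)
{
	big_lock_held = 0;
	pthread_mutex_unlock(&big_lock);
}

/* Caller must hold big_lock; the function returns with it held again. */
static struct run_stats run_queue(size_t nr_items, int urgent_only)
{
	int budget = MAX_ITEMS_PER_LOOP;
	struct run_stats st = {0, 0};

	assert(big_lock_held);		/* plays the role of ASSERT_RTNL() */

	/* Give the urgent case more budget, as in the patch. */
	if (urgent_only)
		budget += MAX_ITEMS_PER_LOOP;

	for (size_t i = 0; i < nr_items; i++) {
		st.processed++;		/* stands in for linkwatch_do_dev(dev) */

		if (--budget < 0) {
			/* Let other lock waiters and other tasks run. */
			big_lock_release();
			sched_yield();
			budget = MAX_ITEMS_PER_LOOP;
			big_lock_acquire();
			st.yields++;
		}
	}
	return st;
}
```

In this sketch the lock is dropped after every 101 items rather than every
100, because the check is `--budget < 0` with a starting value of 100. With
4096 items, the 4K vlan case from the commit message, that works out to 40
lock drops for a normal run and 39 for an urgent one.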