Re: [RFC PATCH v1 2/2] virtio_net: Don't disable napi on low memory.
From: Mike Waychison
Date: Wed Jan 04 2012 - 21:47:19 EST
On Wed, Jan 4, 2012 at 4:31 PM, Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:
> On Wed, 04 Jan 2012 14:52:38 -0800, Mike Waychison <mikew@xxxxxxxxxx> wrote:
>> Currently, the virtio_net driver deals with low memory memory by kicking
>> off delayed work in process context to try and refill the rx queues.
>> This process-context filling is synchronized against the receive
>> bottom-half by serially:
>>
>> - disabling NAPI polling on the device,
>> - allocating buffers,
>> - enqueueing the buffers,
>> - re-enabling napi.
>>
>> Disabling NAPI just to synchronize virtio_add_buf calls is a bit
>> overkill, and there isn't any reason we shouldn't be able to continue
>> processing received packets as they come in. In the simple case, this
>> does not involve allocating any memory, and in fact allows memory to be
>> released as the guest system receives and processes packets.
>
> Adding a spinlock to the fastpath for a weird case in the slow path is
> bad too.
>
> I dislike several things about this patch:
> 1) You seem to leak memory if you allocate a batch then don't add it all.
> 2) You have a module parameter, which I guarantee noone else will ever use.
> 3) You've made three changes at once: allocating outside the lock,
> batching, and replacing the napi lock with a spinlock.
> 4) You use the skb data for the linked list; use the skb head's list.
>
> Instead, here's how I think it should be done:
>
> 1) Split alloc and add, but make alloc return the skb.
> 2) Make try_fill_recv() only called in the receive path, keep it
> basically the same (no batching).
> 3) Add a new try_fill_recv_batch() for the workqueue and init paths.
> This should try to allocate max - num, though in practice the
> first alloc would probably kick off reclaim and take forever,
> then by the time it's done, the problem is solved. Since it's
> reading max and num outside the lock, write this loop carefully!
> 4) try_fill_recv_batch() should then stop napi and add as many as it
> can, then restart it, then free any remainder.
This sounds reasonable to me. I'll see what I can muster together this week.
Thanks for taking a look :)
>
> Cheers,
> Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/