Re: [PATCH v4 2/2] io_uring: Add support for napi_busy_poll
From: Olivier Langlois
Date: Tue Mar 01 2022 - 15:06:56 EST
On Wed, 2022-03-02 at 02:31 +0800, Hao Xu wrote:
>
> > + ne = kmalloc(sizeof(*ne), GFP_NOWAIT);
> > + if (!ne)
> > + goto out;
>
> IMHO, we need to handle -ENOMEM here, I cut off the error handling
> when I did the quick coding. Sorry for misleading.
If you are correct, I would be shocked about this.
I went back to my 'Linux Device Drivers' book and nowhere is it
mentioned that kmalloc() can return anything other than a pointer.
There is no mention at all of the return value in any of these:
man page:
https://www.kernel.org/doc/htmldocs/kernel-api/API-kmalloc.html
API doc:
https://www.kernel.org/doc/html/latest/core-api/mm-api.html?highlight=kmalloc#c.kmalloc
header file:
https://elixir.bootlin.com/linux/latest/source/include/linux/slab.h#L522
I browsed through the kmalloc code. There are a lot of paths to cover,
but from a preliminary reading it seems that kmalloc() only returns a
valid pointer or NULL...
/**
 * kmem_cache_alloc - Allocate an object
 * @cachep: The cache to allocate from.
 * @flags: See kmalloc().
 *
 * Allocate an object from this cache. The flags are only relevant
 * if the cache has no available objects.
 *
 * Return: pointer to the new object or %NULL in case of error
 */
/**
 * __do_kmalloc - allocate memory
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 * @caller: function caller for debug tracking of the caller
 *
 * Return: pointer to the allocated memory or %NULL in case of error
 */
I'll need someone else to confirm the possible kmalloc() return
values, perhaps with an example.
I am a bit skeptical that anything special needs to be done here...
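To make the distinction concrete, here is a minimal userspace sketch of the
two return conventions being discussed: "valid pointer or NULL" versus
"valid pointer or an errno encoded in the pointer". The helpers err_ptr(),
is_err() and ptr_err() are stand-ins written for this example that mimic
the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(); alloc_or_null() and
alloc_or_err() are hypothetical names, and malloc() stands in for the
allocator. It is an illustration only, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins mimicking ERR_PTR()/IS_ERR()/PTR_ERR(). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline int is_err(const void *ptr)
{
	/* Error pointers live in the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Convention 1: valid pointer or NULL; a plain NULL check is complete. */
static void *alloc_or_null(size_t size, int simulate_failure)
{
	return simulate_failure ? NULL : malloc(size);
}

/* Convention 2: valid pointer or encoded errno; callers must use is_err(). */
static void *alloc_or_err(size_t size, int simulate_failure)
{
	void *p = simulate_failure ? NULL : malloc(size);

	return p ? p : err_ptr(-ENOMEM);
}
```

With the first convention, `if (!ne) goto out;` covers every failure. Only
an allocator following the second convention would need an is_err()-style
check.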
Or perhaps you are suggesting that io_add_napi() return an error code
when the allocation fails,
as done here:
https://elixir.bootlin.com/linux/latest/source/arch/alpha/kernel/core_marvel.c#L867
If that is what you suggest, what would this info do for the caller?
IMHO, it wouldn't help in any way...
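As a rough illustration of that point, here is a userspace sketch in which
the tracking helper does return -ENOMEM, but the caller has nothing useful
to do with it: the request is armed either way and the id is simply not
tracked. All names (tracker_sketch, track_napi_id, arm_poll_sketch, the
fail_alloc knob) are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct napi_entry_sketch {
	unsigned int napi_id;
	struct napi_entry_sketch *next;
};

struct tracker_sketch {
	struct napi_entry_sketch *head;
	int fail_alloc;		/* test knob: simulate allocation failure */
};

/* Track an id once; returns 0 on success or -ENOMEM if allocation fails. */
static int track_napi_id(struct tracker_sketch *t, unsigned int napi_id)
{
	struct napi_entry_sketch *ne;

	for (ne = t->head; ne; ne = ne->next)
		if (ne->napi_id == napi_id)
			return 0;	/* already tracked */

	ne = t->fail_alloc ? NULL : malloc(sizeof(*ne));
	if (!ne)
		return -ENOMEM;

	ne->napi_id = napi_id;
	ne->next = t->head;
	t->head = ne;
	return 0;
}

/* Caller: tracking is best effort, so the error is deliberately ignored. */
static int arm_poll_sketch(struct tracker_sketch *t, unsigned int napi_id)
{
	(void)track_napi_id(t, napi_id);
	return 1;	/* the request is armed regardless */
}

static size_t tracked_count(const struct tracker_sketch *t)
{
	const struct napi_entry_sketch *ne;
	size_t n = 0;

	for (ne = t->head; ne; ne = ne->next)
		n++;
	return n;
}
```

In this sketch the only effect of a failed allocation is that busy polling
is skipped for that id, which is the same outcome as the `goto out` path.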
>
> >
> > @@ -7519,7 +7633,11 @@ static int __io_sq_thread(struct io_ring_ctx
> > *ctx, bool cap_entries)
> > !(ctx->flags & IORING_SETUP_R_DISABLED))
> > ret = io_submit_sqes(ctx, to_submit);
> > mutex_unlock(&ctx->uring_lock);
> > -
> > +#ifdef CONFIG_NET_RX_BUSY_POLL
> > + if (!list_empty(&ctx->napi_list) &&
> > + io_napi_busy_loop(&ctx->napi_list))
>
> I'm afraid we may need lock for sqpoll too, since io_add_napi() could
> be in iowq context.
>
> I'll take a look at the lock stuff of this patch tomorrow, too late
> now in my timezone.
Ok, please do. I'm not a big user of io workers. I may have overlooked
this possibility.
If that is the case, I think that this would be very easy to fix by
holding the spinlock while __io_sq_thread() is using napi_list.
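The shape of that fix could look like the following userspace sketch,
where the side that adds ids and the side that walks them take the same
lock. It uses a C11 atomic_flag as a toy spinlock and a fixed array
instead of a list. ctx_sketch and every function name here are
hypothetical, and the sketch says nothing about whether holding a lock
across the real polling call is acceptable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_IDS 8

struct ctx_sketch {
	atomic_flag lock;		/* toy spinlock */
	unsigned int ids[MAX_IDS];
	size_t count;
};

static void ctx_init(struct ctx_sketch *c)
{
	atomic_flag_clear(&c->lock);
	c->count = 0;
}

static void lock_sketch(struct ctx_sketch *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void unlock_sketch(struct ctx_sketch *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

static bool trylock_sketch(struct ctx_sketch *c)
{
	return !atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire);
}

/* Adder side (e.g. a worker context): modify the set under the lock. */
static bool add_id_locked(struct ctx_sketch *c, unsigned int id)
{
	bool added = false;

	lock_sketch(c);
	if (c->count < MAX_IDS) {
		c->ids[c->count++] = id;
		added = true;
	}
	unlock_sketch(c);
	return added;
}

/* Poller side: walk the set under the same lock; returns entries visited. */
static size_t poll_ids_locked(struct ctx_sketch *c)
{
	size_t visited = 0;

	lock_sketch(c);
	for (size_t i = 0; i < c->count; i++)
		visited++;	/* a real poller would busy poll ids[i] here */
	unlock_sketch(c);
	return visited;
}
```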
>
> How about:
>
> if (list is singular) {
>         do something;
>         return;
> }
>
> while (!io_busy_loop_end() && io_napi_busy_loop())
>         ;
>
Is there a concern with the current code?
What would be the benefit of your suggestion over the current code?
To me, it seems that if io_blocking_napi_busy_loop() is called, a
reasonable expectation would be that some busy looping is done.
Otherwise, the function could return without doing anything, which
would, IMHO, be misleading.
By definition, napi_busy_loop() is not blocking and if you want the
device to stay in busy poll mode, you need to call it once in a while
or else, after a certain time, the device will revert to its
interrupt mode.
IOW, io_blocking_napi_busy_loop() follows the same logic used by
napi_busy_loop(), which does not call loop_end() before having
performed one loop iteration.
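The difference between the two loop shapes can be shown with a small
userspace sketch. busy_loop_poll_first() polls once before consulting the
end condition, while busy_loop_check_first() checks first and may return
without polling at all. The struct and function names are hypothetical,
and incrementing a counter stands in for one busy poll iteration.

```c
#include <assert.h>
#include <stdbool.h>

struct loop_state_sketch {
	int polls;	/* busy poll iterations performed so far */
	int max_polls;	/* end condition: stop once this many polls are done */
	bool end_now;	/* end condition already true on entry */
};

static bool loop_end_sketch(const struct loop_state_sketch *s)
{
	return s->end_now || s->polls >= s->max_polls;
}

/* Poll at least once, then test the end condition. */
static void busy_loop_poll_first(struct loop_state_sketch *s)
{
	do {
		s->polls++;
	} while (!loop_end_sketch(s));
}

/* Test the end condition first; may return without polling at all. */
static void busy_loop_check_first(struct loop_state_sketch *s)
{
	while (!loop_end_sketch(s))
		s->polls++;
}
```

When the end condition is already true on entry, the first variant still
performs one poll and the second performs none. Otherwise both perform the
same number of iterations.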
> Btw, start_time seems not used in singular branch.
I know. This is why it is conditionally initialized.
Greetings,