Re: [RFC PATCH v2 1/7] block: Extand commit_rqs() to do batch processing

From: Sagi Grimberg
Date: Fri May 08 2020 - 18:19:53 EST


Hey Ming,

Would it make sense to elevate this flag to a request_queue flag
(QUEUE_FLAG_ALWAYS_COMMIT)?

request queue flag usually is writable, however this case just needs
one read-only flag, so I think it may be better to make it as
tagset/hctx flag.

I actually intended it to be writable.

I'm thinking of a possibility that an I/O scheduler may be used
to activate this functionality rather than having the driver set
it necessarily...

Could you explain a bit why I/O scheduler should activate this
functionality?

Sure, I've recently seen some academic work showing the benefits
of batching in tcp/ip based block drivers. The problem with the
approaches taken is that I/O scheduling is exercised deep down in the
driver, which is not the direction I'd like to go if we are want
to adopt some of the batching concepts.

I spent some (limited) time thinking about this, and it seems to
me that there is an opportunity to implement this as a dedicated
I/O scheduler, and tie it to driver specific LLD stack optimizations
(net-stack for example) relying on the commit_rq/bd->last hints.

When scanning the scheduler code, I noticed exactly the phenomenon that
this patchset is attempting to solve and Christoph referred me to it.
Now I'm thinking if we can extend this batching optimization for both
use-cases.

batching submission may be good for some drivers, and currently
we only do it in limited way. One reason is that there is extra
cost for full batching submission, such as this patch requires
one extra .commit_rqs() for each dispatch, and lock is often needed
in this callback.

That is not necessarily the case at all.

IMO it can be a win for some slow driver or device, but may cause
a little performance drop for fast driver/device especially in workload
of not-batching submission.

You're mostly correct. This is exactly why an I/O scheduler may be
applicable here IMO. Mostly because I/O schedulers tend to optimize for
something specific and always present tradeoffs. Users need to
understand what they are optimizing for.

Hence I'd say this functionality can definitely be available to an I/O
scheduler should one exist.