Re: [PATCH] iosched: Add i10 I/O Scheduler

From: Ming Lei
Date: Mon Nov 16 2020 - 03:41:40 EST


On Fri, Nov 13, 2020 at 01:36:16PM -0800, Sagi Grimberg wrote:
>
> > > But if you think this has a better home, I'm assuming that the guys
> > > will be open to that.
> >
> > Also see the reply from Ming. It's a balancing act - don't want to add
> > extra overhead to the core, but also don't want to carry an extra
> > scheduler if the main change is really just variable dispatch batching.
> > And since we already have a notion of that, seems worthwhile to explore
> > that venue.
>
> I agree,
>
> The main difference is that this balancing is not driven from device
> resource pressure, but rather from an assumption of device specific
> optimization (and also with a specific optimization target), hence a
> scheduler a user would need to opt-in seemed like a good compromise.
>
> But maybe Ming has some good ideas on a different way to add it..

Not yet. :-(

This is very good work, showing that IO performance improves with
batching.

One big question that is still not clear to me is how NVMe-TCP
performance (throughput, according to the 'Introduction' of the
paper [1]) improves so much when IO batching is applied. Is it
because the network stack transports large chunks of data more
efficiently? Or because context switch overhead is reduced, given
that 'ringing the doorbell' implies workqueue scheduling, according
to '2.4 Delayed Doorbells' of [1]? Or both? Or something else? Do we
have data on how much improvement comes from each factor?
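
BTW, blk-mq already has a notion of dispatch batching at the driver
boundary: the driver sees bd->last set on the final request of a
batch, and the optional .commit_rqs callback covers batches that end
early. Below is a rough sketch (hypothetical sketch_* names, not the
actual nvme-tcp code) of deferring the doorbell, here a workqueue
kick, until the batch ends:

	#include <linux/blk-mq.h>
	#include <linux/workqueue.h>

	struct sketch_queue {
		struct work_struct io_work;	/* pushes staged requests to the wire */
	};

	static void sketch_kick(struct sketch_queue *q)
	{
		/* one wakeup (and likely one context switch) per doorbell ring */
		queue_work(system_wq, &q->io_work);
	}

	static blk_status_t sketch_queue_rq(struct blk_mq_hw_ctx *hctx,
					    const struct blk_mq_queue_data *bd)
	{
		struct sketch_queue *q = hctx->driver_data;

		/* ... stage bd->rq on an internal send list here ... */

		/* ring the doorbell only when blk-mq says the batch is complete */
		if (bd->last)
			sketch_kick(q);
		return BLK_STS_OK;
	}

	/* called when a batch ends without bd->last being seen (e.g. queue busy) */
	static void sketch_commit_rqs(struct blk_mq_hw_ctx *hctx)
	{
		sketch_kick(hctx->driver_data);
	}

	static const struct blk_mq_ops sketch_mq_ops = {
		.queue_rq	= sketch_queue_rq,
		.commit_rqs	= sketch_commit_rqs,
	};

With that pattern, N requests cost a single wakeup instead of N,
which is why I suspect part of the gain is reduced scheduling
overhead; the batched send may separately help the network stack,
hence the question about how much each factor contributes.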

Another question: the 'Introduction' of [1] mentions that i10 is
aimed at 'throughput-bound applications', and that 'at low loads,
latencies may be high (within 1.7× of NVMe-over-RDMA latency over
storage devices)'. So is the i10 scheduler primarily meant for
throughput-bound applications? If so, I'd suggest adding that to the
commit log to help people review; then we can leave IO latency
sensitive usages (such as iopoll) out of consideration.

[1] https://www.usenix.org/conference/nsdi20/presentation/hwang

Thanks,
Ming