So for devices with a write cache, you will completely drain the device
before waking anybody waiting to issue new requests. Isn't that too strict?
In particular, may_queue() will allow new writers to issue new writes once
we drop below the limit, so it can happen that some processes get
effectively starved waiting in may_queue()?
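(For reference, the admission check I have in mind is something like the
following - a standalone userspace model with made-up names, not the
actual patch:)

#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the inflight write accounting. */
struct rq_wait {
    atomic_int inflight;    /* writes currently issued to the device */
};

/*
 * Try to account one more inflight write, succeeding only if the
 * result stays at or below 'limit'.  A writer that finds the count
 * below the limit gets in immediately; everyone else has to sleep
 * until woken.
 */
static bool may_queue(struct rq_wait *rqw, int limit)
{
    int cur = atomic_load(&rqw->inflight);

    do {
        if (cur >= limit)
            return false;   /* over the limit: caller must sleep */
    } while (!atomic_compare_exchange_weak(&rqw->inflight, &cur, cur + 1));

    return true;
}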
It is strict, and perhaps too strict. In testing, it's the only method
that has proven to keep writeback-caching devices in check. It will
round-robin the writers, if we have more than one, which isn't
necessarily a bad thing. Each will get to do a burst of up to 'depth'
writes, then wait for its next turn.
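Schematically, the completion side of the strict variant amounts to
something like this (a standalone sketch with made-up names, not the
patch itself):

#include <stdatomic.h>
#include <stdbool.h>

struct rq_wait {
    atomic_int inflight;
};

/* Stub standing in for waking tasks sleeping in may_queue(). */
static void wake_waiters(struct rq_wait *rqw) { (void)rqw; }

/*
 * For a write-cache device, sleepers are only woken once the very last
 * write completes, i.e. the device is fully drained before anyone gets
 * to queue again.  Other devices wake sleepers on every completion.
 */
static void wb_done(struct rq_wait *rqw, bool write_cache)
{
    int left = atomic_fetch_sub(&rqw->inflight, 1) - 1;

    if (write_cache) {
        if (left == 0)          /* strict: wake only on full drain */
            wake_waiters(rqw);
    } else {
        wake_waiters(rqw);      /* relaxed: wake on every completion */
    }
}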
Well, I'm more concerned about a situation where one writer does a
bursty write and blocks sleeping in may_queue(). Another writer
produces a steady flow of write requests, so the write queue never
completely drains, but that writer also never blocks in may_queue():
it starts queueing again once the write queue has somewhat drained,
and it never submits many requests in parallel. In such a case the
first writer would get starved, AFAIU.
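To make that concrete, here is a toy single-threaded model of the
interleaving (pure illustration - the limit, the wake condition and all
names are made up):

#include <stdbool.h>
#include <stdio.h>

#define LIMIT 4

static int inflight;        /* writes currently on the device */
static bool bursty_asleep;  /* bursty writer blocked in may_queue() */

/* may_queue() model: admit if below the limit, otherwise sleep. */
static bool may_queue(void)
{
    if (inflight < LIMIT) {
        inflight++;
        return true;
    }
    return false;
}

/* Completion model: wake sleepers only when the device fully drains. */
static void complete_one(void)
{
    if (--inflight == 0)
        bursty_asleep = false;  /* full drain: wake the bursty writer */
}

int main(void)
{
    /* The bursty writer fills the queue, then blocks. */
    while (may_queue())
        ;
    bursty_asleep = true;

    /* The steady writer: one completion, one new write, forever. */
    for (int step = 0; step < 8; step++) {
        complete_one();     /* inflight: 4 -> 3, no full drain */
        may_queue();        /* steady writer slips in: 3 -> 4   */
        printf("step %d: inflight=%d bursty_asleep=%d\n",
               step, inflight, bursty_asleep);
    }
    /* inflight never reaches 0, so the bursty writer never wakes. */
    return 0;
}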
Also, I'm not sure why such logic is needed for devices with a
writeback cache. Sure, the disk is fast to accept writes, but if that
causes long read latencies, we should scale down the writeback limits
so that we eventually end up submitting only one write request at a
time anyway - effectively the same thing as limit=0 - won't we?
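(In sketch form, the scaling I mean - hypothetical names, not the
actual scaling code:)

#include <stdbool.h>

/* Hypothetical stand-in for whatever check decides that reads are
 * missing their latency target. */
static bool reads_over_target(void)
{
    return true;    /* pretend reads are always suffering */
}

/*
 * While reads miss their latency target, keep halving the allowed
 * write depth.  At depth 1 we submit a single write at a time, which
 * is effectively the same end state as the strict limit=0 drain.
 */
static void maybe_scale_down(unsigned int *write_depth)
{
    if (reads_over_target() && *write_depth > 1)
        *write_depth >>= 1;
}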