On Wed, Sep 18, 2019 at 11:21:04AM +0800, Lin Feng wrote:
Adding a new tunable is not the right solution. The right way is
to make Linux auto-tune itself to avoid the problem. For example,
bdi_writeback contains an estimated write bandwidth (calculated by the
memory management layer). Given that, we should be able to make an
estimate for how long to wait for the queues to drain.
Yes, I had considered that; auto-tuning is definitely the smarter, "AI" way.
But across all kinds of production environments, hybrid storage solutions
are also common today: the bdi devices backing a server's dirty pages can range
from high-end SSDs to low-end SATA disks. So we would have to come up with a
*formula (AI core)* based on the amount of dirty pages and the bdis' write
bandwidth. That AI core depends on whether the estimated write bandwidth is
sane and, if the bdi is a rotational disk, on whether the dirty pages to be
written back are sequential or random. It is likely to give an insane number
and hurt people who don't want that, whereas considering only SSDs is
relatively simple.
So IMHO it's not sane to brute-force guessing logic into the memory writeback
code and pray that we invent a formula that caters to everyone's needs.
Adding a sysctl entry may be the right choice: it helps people who need it and
doesn't hurt people who don't want it.
You're making this sound far harder than it is. All the writeback code
needs to know is "How long should I sleep for in order for the queues
to drain a substantial amount". Since you know the bandwidth and how
many pages you've queued up, it's a simple calculation.
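
Something along these lines is what I have in mind. This is only a rough
userspace sketch, not existing kernel code: the helper name, the
drain_percent parameter and the max_sleep_ms cap are all made up for
illustration, and it assumes the bandwidth estimate is available in pages
per second.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not an existing kernel function):
 * estimate how long to sleep, in milliseconds, so that roughly
 * `drain_percent` of the queued pages can be written back at the
 * estimated bandwidth. The result is capped at `max_sleep_ms`.
 */
static uint64_t estimate_drain_sleep_ms(uint64_t queued_pages,
                                        uint64_t bw_pages_per_sec,
                                        unsigned int drain_percent,
                                        uint64_t max_sleep_ms)
{
    uint64_t pages_to_drain, sleep_ms;

    /* Treat anything above 100% as "drain everything". */
    if (drain_percent > 100)
        drain_percent = 100;

    /* No usable bandwidth estimate yet: fall back to the cap. */
    if (bw_pages_per_sec == 0)
        return max_sleep_ms;

    /* time = pages / (pages per second), scaled to milliseconds */
    pages_to_drain = queued_pages * drain_percent / 100;
    sleep_ms = pages_to_drain * 1000 / bw_pages_per_sec;

    return sleep_ms < max_sleep_ms ? sleep_ms : max_sleep_ms;
}
```

With 10000 pages queued, an estimated 25000 pages/s and a target of
draining half the queue, this comes out to 200 ms. The cap is there so
that a bogus (too low) bandwidth estimate cannot turn into an unbounded
sleep, which I think covers the "not sane number" worry above.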