Re: Write throughput impaired by touching dirty_ratio
From: Vlastimil Babka
Date: Wed Jun 24 2015 - 04:28:06 EST
[add some CC's]
On 06/19/2015 05:16 PM, Mark Hills wrote:
> I noticed that any change to vm.dirty_ratio causes write throughput to
> plummet -- to around 5Mbyte/sec.
>
> <system bootup, kernel 4.0.5>
>
> # dd if=/dev/zero of=/path/to/file bs=1M
>
> # sysctl vm.dirty_ratio
> vm.dirty_ratio = 20
> <all ok; writes at ~150Mbyte/sec>
>
> # sysctl vm.dirty_ratio=20
> <all continues to be ok>
>
> # sysctl vm.dirty_ratio=21
> <writes drop to ~5Mbyte/sec>
>
> # sysctl vm.dirty_ratio=20
> <writes continue to be slow at ~5Mbyte/sec>
>
> The test shows that returning to the previous value does not restore the old
> behaviour. I return the system to a usable state with a reboot.
>
> Reads continue to be fast and are not affected.
>
> A quick look at the code suggests differing behaviour from
> writeback_set_ratelimit on startup. And that some of the calculations (eg.
> global_dirty_limit) are badly behaved once the system has booted.
Hmm, so the only thing that dirty_ratio_handler() changes, apart from
vm_dirty_ratio itself, is ratelimit_pages, through writeback_set_ratelimit(). So
I assume the problem is with ratelimit_pages. The calculation uses
num_online_cpus(), which I think would differ between the initial system state
(where we are called by page_writeback_init()) and later, when all CPUs are
online. But I don't see the CPU onlining code updating the limit (unlike memory
hotplug, which does that), so that's suspicious.
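
To make the CPU-count dependence concrete, here is a rough userspace model of
the calculation. It assumes, from my reading of mm/page-writeback.c, that
writeback_set_ratelimit() divides the dirty threshold by 32 times the number of
online CPUs and applies a floor of 16 pages. The Python function name and the
sample threshold are made up for illustration; this is a sketch, not the
kernel code.

```python
def ratelimit_pages(dirty_thresh_pages, online_cpus):
    """Model of the ratelimit: dirty threshold split across CPUs, floor of 16.

    Assumed formula: dirty_thresh / (num_online_cpus() * 32), minimum 16.
    """
    pages = dirty_thresh_pages // (online_cpus * 32)
    return max(pages, 16)


# Hypothetical dirty threshold in pages, chosen only for illustration.
thresh = 400_000

# Same threshold, evaluated early in boot (1 CPU) and later (8 CPUs).
print("1 CPU :", ratelimit_pages(thresh, 1))
print("8 CPUs:", ratelimit_pages(thresh, 8))
```

If this model is right, a value computed with one CPU online would be about
eight times larger than one recomputed later with eight CPUs online.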
Another suspicious thing is that global_dirty_limits() looks at the current
process's flags. It seems odd to me that the process calling the sysctl would
determine a value global to the system.
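
A small model of what I mean, again only a sketch. It assumes, as I read the
code, that global_dirty_limits() raises both thresholds by a quarter when the
calling task has PF_LESS_THROTTLE set or is a realtime task, and it ignores the
vm_dirty_bytes and dirty_background_bytes variants. The function name, the
"boosted" parameter and the numbers are hypothetical.

```python
def dirty_limits(dirtyable_pages, dirty_ratio, background_ratio, boosted):
    """Return (background, dirty) thresholds in pages.

    boosted stands in for "current has PF_LESS_THROTTLE or is a realtime
    task"; this is an assumption about the kernel logic, not a copy of it.
    """
    dirty = dirty_ratio * dirtyable_pages // 100
    background = background_ratio * dirtyable_pages // 100
    if background >= dirty:
        background = dirty // 2
    if boosted:
        # Both limits get a 25% bonus for the calling task.
        background += background // 4
        dirty += dirty // 4
    return background, dirty


# Hypothetical machine with 1,000,000 dirtyable pages.
print("ordinary caller:", dirty_limits(1_000_000, 20, 10, boosted=False))
print("boosted caller :", dirty_limits(1_000_000, 20, 10, boosted=True))
```

So the threshold that ends up stored globally could depend on which kind of
task happened to write the sysctl.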
If you are brave enough (and have the kernel configured properly and with
debuginfo), you can verify how the value of the ratelimit_pages variable changes
on the live system, using the crash tool. Just start it, and if everything works,
you can inspect the live system. It's a bit complicated since there are two static
variables called "ratelimit_pages" in the kernel, so we can't print them easily
(or I don't know how). First we have to get the variable address:
crash> sym ratelimit_pages
ffffffff81e67200 (d) ratelimit_pages
ffffffff81ef4638 (d) ratelimit_pages
One will be absurdly high (probably less on your 32-bit system), so it's not the one we want:
crash> rd -d ffffffff81ef4638 1
ffffffff81ef4638: 4294967328768
The second will have a smaller value:
(my system after boot with dirty ratio = 20)
crash> rd -d ffffffff81e67200 1
ffffffff81e67200: 1577
(after changing to 21)
crash> rd -d ffffffff81e67200 1
ffffffff81e67200: 1570
(after changing back to 20)
crash> rd -d ffffffff81e67200 1
ffffffff81e67200: 1496
So yes, it does differ, but not drastically. A difference between 1 and 8 online
CPUs would look different, I think. So my theory above is questionable. But
you might try what it looks like on your system...
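
To put numbers on "not drastically", here is a quick calculation on the three
values I read with crash. The comparison at the end assumes that
ratelimit_pages is inversely proportional to the online CPU count, which is my
reading of the code rather than something I measured.

```python
# ratelimit_pages as read with crash on my system (values from above).
observed = {"boot, ratio 20": 1577, "changed to 21": 1570, "back to 20": 1496}


def percent_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


baseline = observed["boot, ratio 20"]
for label, value in observed.items():
    print(f"{label}: {value} ({percent_change(baseline, value):+.2f}%)")

# Assumption: going from 1 to 8 online CPUs shrinks the value to 1/8.
print(f"1 -> 8 CPUs: {percent_change(8, 1):+.2f}%")
```

The observed changes stay within a few percent, far from what a change in the
CPU count alone would produce.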
>
> The system is an HP xw6600, running i686 kernel. This happens whether
> internal SATA HDD, SSD or external USB drive is used. I first saw this on
> kernel 4.0.4, and 4.0.5 is also affected.
So what was the last version where you did change the dirty ratio and it worked
fine?
>
> It would surprise me if I'm the only person who was setting dirty_ratio.
>
> Have others seen this behaviour? Thanks
>