On Mon, May 29, 2023 at 4:50 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
Hi,
On 2023/05/29 15:57, Xiao Ni wrote:
On Mon, May 29, 2023 at 11:18 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
Hi,
On 2023/05/29 11:10, Xiao Ni wrote:
On Mon, May 29, 2023 at 10:20 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
Hi,
On 2023/05/29 10:08, Xiao Ni wrote:
Hi Kuai
Your test limits the available memory. But in most situations,
customers would not set this. Could this change introduce a
performance regression in other situations?
Note that this limitation is only there to trigger writeback as soon as
possible in the test, and I'm 100% sure that real workloads can trigger
dirty page writeback asynchronously while continuing to produce new
dirty pages.
Hi
I'm confused here. If we want to trigger write back quickly, we need
to set these two values to smaller numbers, rather than 0 and 60.
Right?
60 is not required, I'll remove this setting.
0 just means write back if there are any dirty pages.
Hi Kuai
Does 0 mean disabling write back? I tried to find the doc that
describes the meaning when setting dirty_background_ratio to 0, but I
didn't find it.
https://www.kernel.org/doc/html/next/admin-guide/sysctl/vm.html
doesn't describe this, but it says something like this:
Note:
dirty_background_bytes is the counterpart of dirty_background_ratio. Only
one of them may be specified at a time. When one sysctl is written it is
immediately taken into account to evaluate the dirty memory limits and the
other appears as 0 when read.
Maybe you can set dirty_background_ratio to 1 if you want to
trigger write back ASAP.
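
To see what the two knobs actually read back after one of them is
written, a small reader along these lines might help. This is only a
sketch: the helper names and the dir parameter are made up for
illustration, and dir is assumed to be /proc/sys/vm on a real system.

```c
#include <assert.h>
#include <stdio.h>

/* Read one unsigned integer from a sysctl-style file.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
static int read_ulong_file(const char *path, unsigned long *out)
{
	FILE *f;
	int ok;

	assert(path && out);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Hypothetical helper: returns 1 if dirty_background_ratio and
 * dirty_background_bytes under dir both read 0, 0 if at least one is
 * non-zero, and -1 if either file cannot be read. */
static int background_knobs_both_zero(const char *dir)
{
	char path[512];
	unsigned long ratio, bytes;

	snprintf(path, sizeof(path), "%s/dirty_background_ratio", dir);
	if (read_ulong_file(path, &ratio))
		return -1;
	snprintf(path, sizeof(path), "%s/dirty_background_bytes", dir);
	if (read_ulong_file(path, &bytes))
		return -1;
	return ratio == 0 && bytes == 0;
}
```

Calling background_knobs_both_zero("/proc/sys/vm") after each echo
would show whether writing one knob left both at 0.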
The purpose here is to trigger write back ASAP. I'm not an expert here,
but based on the test results, 0 obviously doesn't mean write back is
disabled.
Setting dirty_background_bytes to a value also sets
dirty_background_ratio to 0, which means dirty_background_ratio is
disabled. However, changing dirty_background_ratio from its default
value to 0 ends up with both dirty_background_ratio and
dirty_background_bytes being 0, and based on the following related
code, I think 0 just means write back if there are any dirty pages.
domain_dirty_limits:
    bg_bytes = dirty_background_bytes -> 0
    bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100 -> 0

    if (bg_bytes)
        bg_thresh = DIV_ROUND_UP(bg_bytes, PAGE_SIZE);
    else
        bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE; -> 0

    dtc->bg_thresh = bg_thresh; -> 0

balance_dirty_pages:
    nr_reclaimable = global_node_page_state(NR_FILE_DIRTY);

    if (!laptop_mode && nr_reclaimable > gdtc->bg_thresh &&
        !writeback_in_progress(wb))
        wb_start_background_writeback(wb); -> writeback ASAP
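
The arithmetic above can be replayed in user space. The following is
only a rough model of the quoted lines, not the kernel code itself: the
function names are invented, available_pages stands in for
available_memory, laptop_mode is assumed to be off, and any other
adjustment the real functions make is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up integer division, the same idea as DIV_ROUND_UP. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/* Model of the background threshold computation quoted from
 * domain_dirty_limits. Returns the threshold in pages. */
static uint64_t model_bg_thresh(uint64_t bg_bytes, uint64_t bg_ratio_pct,
				uint64_t available_pages, uint64_t page_size)
{
	uint64_t bg_ratio;

	assert(page_size > 0);
	/* The percentage is scaled by the page size, as in the excerpt. */
	bg_ratio = (bg_ratio_pct * page_size) / 100;

	if (bg_bytes)
		return div_round_up(bg_bytes, page_size);
	return (bg_ratio * available_pages) / page_size;
}

/* Model of the check quoted from balance_dirty_pages
 * (laptop_mode assumed off). Returns 1 if writeback would start. */
static int model_starts_background_writeback(uint64_t nr_reclaimable,
					     uint64_t bg_thresh,
					     int writeback_in_progress)
{
	return nr_reclaimable > bg_thresh && !writeback_in_progress;
}
```

With both knobs at 0, model_bg_thresh(0, 0, 1000000, 4096) returns 0, so
a single dirty page already satisfies nr_reclaimable > bg_thresh. With
the ratio at 1 and the same inputs, the threshold is 9765 pages.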
Thanks,
Kuai
Hi Kuai
I'm not an expert on this either. Thanks for all your patches; I
can learn more from them too. But I still have some questions.
I did a test in my environment something like this:
modprobe brd rd_nr=4 rd_size=10485760
mdadm -CR /dev/md0 -l10 -n4 /dev/ram[0123] --assume-clean
echo 0 > /proc/sys/vm/dirty_background_ratio
fio -filename=/dev/md0 -ioengine=libaio -rw=write -thread -bs=1k-8k \
    -numjobs=1 -iodepth=128 --runtime=10 -name=xxx
This causes OOM and the system hangs.
modprobe brd rd_nr=4 rd_size=10485760
mdadm -CR /dev/md0 -l10 -n4 /dev/ram[0123] --assume-clean
echo 1 > /proc/sys/vm/dirty_background_ratio    (this is the only difference)
fio -filename=/dev/md0 -ioengine=libaio -rw=write -thread -bs=1k-8k \
    -numjobs=1 -iodepth=128 --runtime=10 -name=xxx
This finishes successfully. Setting dirty_background_ratio to 1 here
means it flushes ASAP.
So your setting seems to work the opposite way from what you intended:
dirty memory can't be flushed in time, so memory is used up very
quickly and the system hangs. The reason I'm looking at the test is to
ask whether we really need this change, because in the real world most
customers don't disable write back. Anyway, it depends on Song's
decision, and thanks for your patches again. I'll review V3 and try to
do some performance tests.
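
One way to tell the two cases apart might be to sample the Dirty and
Writeback fields of /proc/meminfo while fio runs. Below is a minimal
sketch of a parser for that; the function name is made up, and it
expects the caller to have read the meminfo text into a buffer already.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look for "Key:   <n> kB" at the start of a line in meminfo-style
 * text. Returns 0 and stores the number in *kb on success, -1 if the
 * key is not found or has no number after it. */
static int meminfo_field_kb(const char *text, const char *key,
			    unsigned long *kb)
{
	size_t klen;
	const char *line = text;

	assert(text && key && kb);
	klen = strlen(key);
	while (line && *line) {
		/* Require the colon so "Writeback" does not match
		 * "WritebackTmp". */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':')
			return sscanf(line + klen + 1, "%lu", kb) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

If Writeback stays near 0 while Dirty keeps growing with the ratio at
0, that would suggest background write back is not starting; if
Writeback rises right away, it would support the reading that 0 means
"flush as soon as anything is dirty".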
Best Regards
Xiao