Re:Re: [BUG?] bcachefs: keep writing to device when there is no high-level I/O activity.

From: David Wang
Date: Thu Aug 29 2024 - 23:08:35 EST


Hi,
At 2024-08-28 00:17:12, "Kent Overstreet" <kent.overstreet@xxxxxxxxx> wrote:
>On Tue, Aug 27, 2024 at 05:49:33PM GMT, David Wang wrote:
>> Hi,
>>
>> I was using two partitions on the same NVMe device to compare filesystem performance,
>> and I consistently observed a strange behavior:
>>
>> After a 10-minute fio test with bcachefs on one partition, performance degrades
>> significantly for other filesystems on the other partition (same device).
>>
>> ext4 150M/s --> 143M/s
>> xfs 150M/s --> 134M/s
>> btrfs 127M/s --> 108M/s
>>
>> Several rounds of tests show the same pattern: bcachefs seems to occupy some device resources
>> even when there is no high-level I/O.
>
>This is a known issue, it should be either journal reclaim or
>rebalance.
>
>(We could use some better stats to see exactly which it is)
>


I attached kprobes to bch2_submit_wbio_replicas and then to bch2_btree_node_write, and confirmed that
the background writes came from bch2_journal_reclaim_thread.
(Then, while skimming the code in __bch2_journal_reclaim, I noticed the trace_and_count stats there.)



>The algorithm for how we do background work needs to change; I've
>written up a new one but I'm a ways off from having time to implement it
>
>https://evilpiepirate.org/git/bcachefs.git/commit/?h=bcachefs-garbage&id=47a4b574fb420aa824aad222436f4c294daf66ae
>
>Could be a fun one for someone new to take on.
>
>>

A fun and scary one...
For the issue in this thread,
I think *idle* should be defined device-wide:
when bcachefs is idle while another filesystem on the same block device is busy, those background threads should be throttled to some degree.
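
To make the idea concrete, here is a rough user-space sketch in Python. It is
purely illustrative and is not bcachefs code: IoSnapshot, foreign_sectors and
background_budget are made-up names, and the cumulative sector counters are
assumed to be sampled periodically from something like the per-device and
per-partition stat files in sysfs. The point is only that the background
budget shrinks as I/O from *other* users of the same device grows.

```python
from dataclasses import dataclass


@dataclass
class IoSnapshot:
    """Cumulative sectors transferred (read + written) at one sampling instant."""
    device_sectors: int  # whole block device (assumed source: sysfs stat file)
    own_sectors: int     # the partition holding this filesystem


def foreign_sectors(prev: IoSnapshot, cur: IoSnapshot) -> int:
    """Sectors moved on the device during the interval by everyone else."""
    device_delta = cur.device_sectors - prev.device_sectors
    own_delta = cur.own_sectors - prev.own_sectors
    # Clamp at zero in case the two counters were sampled slightly apart.
    return max(device_delta - own_delta, 0)


def background_budget(prev: IoSnapshot, cur: IoSnapshot,
                      full_budget: int, min_budget: int,
                      busy_threshold: int) -> int:
    """Sectors of background work to allow in the next interval.

    Hypothetical policy: full budget when nobody else uses the device,
    min_budget once foreign I/O reaches busy_threshold, linear in between.
    """
    foreign = foreign_sectors(prev, cur)
    if foreign >= busy_threshold:
        return min_budget
    # Integer arithmetic keeps the result exact and easy to reason about.
    return full_budget - (full_budget - min_budget) * foreign // busy_threshold


if __name__ == "__main__":
    before = IoSnapshot(device_sectors=1000, own_sectors=400)
    after = IoSnapshot(device_sectors=2500, own_sectors=900)
    print(background_budget(before, after, full_budget=1000,
                            min_budget=100, busy_threshold=2000))
```

The thresholds and the linear ramp are arbitrary choices for the sketch; a real
policy inside the kernel would need a proper source for the device-wide numbers.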


Thanks
David