Re: [PATCH v6 3/3] blk-cgroup: Optimize blkcg_rstat_flush()

From: Waiman Long
Date: Sun Jun 05 2022 - 22:03:37 EST

Next message: Matthew Wilcox: "Re: [PATCH v2 1/3] mm/shmem: check return value of shmem_init_inodecache"
Previous message: Peng Fan (OSS): "[PATCH V3 7/7] arm64: dts: imx93: add mediamix blk ctrl node"
In reply to: Ming Lei: "Re: [PATCH v6 3/3] blk-cgroup: Optimize blkcg_rstat_flush()"
Next in thread: Ming Lei: "Re: [PATCH v6 3/3] blk-cgroup: Optimize blkcg_rstat_flush()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 6/5/22 21:39, Ming Lei wrote:

On Sun, Jun 05, 2022 at 07:15:27PM -0400, Waiman Long wrote:

On 6/3/22 23:58, Ming Lei wrote:

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index abec50f31fe6..8c4f204dbf5b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -622,7 +622,7 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
{
unsigned int x;
- cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
+ cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id(), NULL);
x = __this_cpu_add_return(stats_updates, abs(val));
if (x > MEMCG_CHARGE_BATCH) {

I think the rstat set of functions are doing that already. So flush will
only call CPUs that have called cgroup_rstat_updated() before. However, one

Yeah, I guess the detail is in cgroup_rstat_cpu_pop_updated(), but the
percpu lock(raw_spin_lock_irqsave) is still required, and cgroup_rstat_cpu_pop_updated()
is still called even through there isn't any update on this CPU.

Yes, I think we may need to add a bitmask of what controllers have updates in cgroup_rstat_cpu structure.

deficiency that I am aware of is that there is no bitmap of which controller
have update. The problem that I saw in cgroup v2 is that in a cgroup with
both memory controller and block controller enabled, a
cgroup_rstat_updated() call from memory cgroup later causes the rstat
function to call into block cgroup flush method even though there is no
update in the block controller. This is an area that needs improvement.

Your code does allow the block controller to be aware of that and avoid
further action, but I think it has to be done in the rstat code to be
applicable to all controllers instead of just specific to block controller.

I guess it can be done by adding one percpu variable to 'struct cgroup'.

There is another problem that this approach. Suppose the system have 20
block devices and one of them has an IO operation. Now the flush method
still needs to iterate all the 20 blkg's to do an update. The block
controller is kind of special that the number of per-cgroup IO stats depends
on the number of block devices present. Other controllers just have one set
of stats per cgroup.

Yeah, and this one is really blkio specific issue, and your patch does
cover this one. Maybe you can add one callback to
cgroup_rstat_updated(), so the "blkg_iostat_set" instance is added into
percpu list under percpu lock of cgroup_rstat_cpu_lock, then the lockless
list isn't needed.

The rstat API is generic. It may not be a good idea to put controller specific information into it. Yes, cgroup_rstat_cpu_lock is taken at the read side (flush). It may not taken on the write side (update). So it may not be easy to rely on this lock for synchronization between the read and write side.

Cheers,
Longman

Next message: Matthew Wilcox: "Re: [PATCH v2 1/3] mm/shmem: check return value of shmem_init_inodecache"
Previous message: Peng Fan (OSS): "[PATCH V3 7/7] arm64: dts: imx93: add mediamix blk ctrl node"
In reply to: Ming Lei: "Re: [PATCH v6 3/3] blk-cgroup: Optimize blkcg_rstat_flush()"
Next in thread: Ming Lei: "Re: [PATCH v6 3/3] blk-cgroup: Optimize blkcg_rstat_flush()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]