Re: [PATCH v1 1/7] sched/isolation: Add infrastructure to adjust affinity for dynamic CPU isolation

From: Thomas Gleixner
Date: Fri May 17 2024 - 17:39:43 EST


On Thu, May 16 2024 at 22:04, Costa Shulyupin wrote:
> Introduce infrastructure function housekeeping_update() to change
> housekeeping_cpumask during runtime and adjust affinities of depended
> subsystems.
>
> Affinity adjustments of subsystems follow in subsequent patches.
>
> Parent patch:
> "sched/isolation: Exclude dynamically isolated CPUs from housekeeping masks"
> https://lore.kernel.org/lkml/20240229021414.508972-2-longman@xxxxxxxxxx/
>
> Test example for cgroup2:
>
> cd /sys/fs/cgroup/
> echo +cpuset > cgroup.subtree_control
> mkdir test
> echo isolated > test/cpuset.cpus.partition
> echo $isolate > test/cpuset.cpus

This changelog is not telling me anything. Please see
Documentation/process/ for what changelogs should contain.

> +/*
> + * housekeeping_update - change housekeeping.cpumasks[type] and propagate the
> + * change.
> + *
> + * Assuming cpuset_mutex is held in sched_partition_write or
> + * cpuset_write_resmask.

Locking cannot be assumed. lockdep_assert_held() is there to document
and enforce such requirements.
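
To illustrate the difference between a comment and an enforced
requirement, here is a minimal userspace sketch. It is not kernel code:
every model_* name is made up for this example, and the single 'held'
flag only approximates what lockdep tracks per task.

```c
#include <assert.h>

/* Userspace stand-ins; none of these names are kernel APIs. */
struct model_mutex { int held; };

static int model_violations;	/* counts failed checks instead of WARN */

static void model_lock(struct model_mutex *m)   { m->held = 1; }
static void model_unlock(struct model_mutex *m) { m->held = 0; }

/*
 * Plays the role of lockdep_assert_held(): complain when the lock is
 * not held. Real lockdep checks that the current task holds it; this
 * model only looks at one flag.
 */
static void model_assert_held(const struct model_mutex *m)
{
	if (!m->held)
		model_violations++;
}

static struct model_mutex model_cpuset_mutex;
static unsigned long model_mask;

/* The requirement is checked on entry instead of stated in a comment. */
static void model_housekeeping_update(unsigned long update)
{
	model_assert_held(&model_cpuset_mutex);
	model_mask = update;
}
```

A caller which forgets to take the lock is caught at the call instead
of being silently trusted.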

> + */
> +static int housekeeping_update(enum hk_type type, cpumask_var_t update)

Please use 'struct cpumask *update' as it makes it clear what this is
about. cpumask_var_t is a hack to make onstack and embedded cpumask and
their allocated counterparts possible without #ifdeffery in the code.

But any function which is not related to alloc/free of cpumask_var_t
should simply use 'struct cpumask *' as argument type.
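
A rough userspace model of the two shapes cpumask_var_t can take shows
why only the alloc/free side needs to know about it. The model_* helpers
and the MODEL_CPUMASK_OFFSTACK switch are invented for this sketch; they
only mimic the idea and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only: one word of bits standing in for struct cpumask. */
struct cpumask { unsigned long bits[1]; };

#ifdef MODEL_CPUMASK_OFFSTACK
/* Models the allocated variant: the variable is a pointer. */
typedef struct cpumask *cpumask_var_t;
static int model_zalloc_cpumask_var(cpumask_var_t *mask)
{
	*mask = calloc(1, sizeof(**mask));
	return *mask != NULL;
}
static void model_free_cpumask_var(cpumask_var_t mask) { free(mask); }
#else
/* Models the on-stack variant: the variable is a one-element array. */
typedef struct cpumask cpumask_var_t[1];
static int model_zalloc_cpumask_var(cpumask_var_t *mask)
{
	memset(*mask, 0, sizeof(*mask));
	return 1;
}
static void model_free_cpumask_var(cpumask_var_t mask) { (void)mask; }
#endif

/* Non-alloc helpers take 'struct cpumask *' and work with either typedef. */
static void model_set_cpu(unsigned int cpu, struct cpumask *dst)
{
	dst->bits[0] |= 1UL << cpu;
}

static int model_test_cpu(unsigned int cpu, const struct cpumask *src)
{
	return (src->bits[0] >> cpu) & 1;
}

static int demo(void)
{
	cpumask_var_t mask;
	int ret;

	if (!model_zalloc_cpumask_var(&mask))
		return -1;
	model_set_cpu(3, mask);	/* array decays or pointer passes as is */
	ret = model_test_cpu(3, mask) && !model_test_cpu(4, mask);
	model_free_cpumask_var(mask);
	return ret;
}
```

The prototypes of model_set_cpu() and model_test_cpu() read the same
whichever typedef is selected, which is the point of using the plain
pointer type in function arguments.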

> + housekeeping.flags |= BIT(type);

The existing code uses WRITE_ONCE() probably for a reason. Why is that
no longer required here?
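
If the marked store does turn out to be needed, the flag update could
keep it along these lines. This is a userspace sketch with approximated
macros and a made-up model_hk structure, not the kernel implementation.
Whether readers of the flags need a matching READ_ONCE() is an
assumption made here.

```c
#include <assert.h>

/* Userspace approximations of the kernel macros, for illustration only. */
#define BIT(nr)			(1UL << (nr))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the housekeeping structure. */
struct model_housekeeping { unsigned long flags; };
static struct model_housekeeping model_hk;

/*
 * Set one flag bit with a marked store. This does not make the
 * read-modify-write atomic; writers still have to be serialized by
 * a lock.
 */
static void model_set_flag(unsigned int type)
{
	WRITE_ONCE(model_hk.flags, model_hk.flags | BIT(type));
}

/* Lockless reader using the matching marked load. */
static int model_flag_enabled(unsigned int type)
{
	return !!(READ_ONCE(model_hk.flags) & BIT(type));
}
```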

> static int __init housekeeping_setup(char *str, unsigned long flags)
> {
> cpumask_var_t non_housekeeping_mask, housekeeping_staging;
> @@ -314,9 +347,12 @@ int housekeeping_exlude_isolcpus(const struct cpumask *isolcpus, unsigned long f
> /*
> * Reset housekeeping to bootup default
> */
> - for_each_set_bit(type, &housekeeping_boot.flags, HK_TYPE_MAX)
> - cpumask_copy(housekeeping.cpumasks[type],
> - housekeeping_boot.cpumasks[type]);
> + for_each_set_bit(type, &housekeeping_boot.flags, HK_TYPE_MAX) {
> + int err = housekeeping_update(type, housekeeping_boot.cpumasks[type]);
> +
> + if (err)
> + return err;
> + }
>
> WRITE_ONCE(housekeeping.flags, housekeeping_boot.flags);
> if (!housekeeping_boot.flags &&
> @@ -344,9 +380,11 @@ int housekeeping_exlude_isolcpus(const struct cpumask *isolcpus, unsigned long f
> cpumask_andnot(tmp_mask, src_mask, isolcpus);
> if (!cpumask_intersects(tmp_mask, cpu_online_mask))
> return -EINVAL; /* Invalid isolated CPUs */
> - cpumask_copy(housekeeping.cpumasks[type], tmp_mask);
> + int err = housekeeping_update(type, tmp_mask);
> +
> + if (err)
> + return err;

Do we really need two places to define 'int err' or might it be possible
to have one instance defined at function scope?

Thanks,

tglx