Re: [PATCH v4 3/4] swap: apply new pw_queue_on() interface

From: Sebastian Andrzej Siewior

Date: Wed May 20 2026 - 11:49:14 EST


On 2026-05-18 22:27:49 [-0300], Leonardo Bras wrote:

after digesting the slub patch,

> @@ -882,38 +879,38 @@ static inline void __lru_add_drain_all(bool force_all_cpus)
> * If the paired barrier is done at any later step, e.g. after the
> * loop, CPU #x will just exit at (C) and miss flushing out all of its
> * added pages.
> */
> WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1);
> smp_mb();
>
> cpumask_clear(&has_mm_work);
> cpumask_clear(&has_bh_work);
> for_each_online_cpu(cpu) {
> - struct work_struct *mm_work = &per_cpu(lru_add_drain_work, cpu);
> + struct pw_struct *mm_pw = &per_cpu(lru_add_drain_pw, cpu);
> struct work_struct *bh_work = &per_cpu(bh_add_drain_work, cpu);
>
> if (cpu_needs_mm_drain(cpu)) {
> - INIT_WORK(mm_work, lru_add_drain_per_cpu);
> - queue_work_on(cpu, mm_percpu_wq, mm_work);
> + INIT_PW(mm_pw, lru_add_drain_per_cpu, cpu);
> + pw_queue_on(cpu, mm_percpu_wq, mm_pw);
> __cpumask_set_cpu(cpu, &has_mm_work);
> }
>
> if (cpu_needs_bh_drain(cpu)) {
> INIT_WORK(bh_work, bh_add_drain_per_cpu);
> queue_work_on(cpu, mm_percpu_wq, bh_work);
> __cpumask_set_cpu(cpu, &has_bh_work);
> }
> }
>
> for_each_cpu(cpu, &has_mm_work)
> - flush_work(&per_cpu(lru_add_drain_work, cpu));
> + pw_flush(&per_cpu(lru_add_drain_pw, cpu));
>
> for_each_cpu(cpu, &has_bh_work)
> flush_work(&per_cpu(bh_add_drain_work, cpu));

Why do we have two iterations here? Is it just a proof of concept that
is not complete yet? I am curious why it is okay/needed to "remove" the
one workqueue but not the other. Maybe the other does not bother as much
as the other does.

But essentially we can't use a spin_lock_t here because due to the
hotpath nature of the code it will kill performance. So instead we do it
anyway but behind a switch so that only those suffer from this that do
not want to suffer from workqueue interruption on a NOHZ full system,
right?

I thought that this improved since commit
ff042f4a9b050 ("mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu")

Did it get worse or was it not entirely gone?

> done:
> mutex_unlock(&lock);
> }
>
> void lru_add_drain_all(void)
> {

Sebastian