Re: [PATCH 04/10] memcg: disable local interrupts inlock_page_cgroup()

From: Minchan Kim
Date: Tue Oct 05 2010 - 12:03:46 EST


On Sun, Oct 03, 2010 at 11:57:59PM -0700, Greg Thelen wrote:
> If pages are being migrated from a memcg, then updates to that
> memcg's page statistics are protected by grabbing a bit spin lock
> using lock_page_cgroup(). In an upcoming commit memcg dirty page
> accounting will be updating memcg page accounting (specifically:
> num writeback pages) from softirq. Avoid a deadlocking nested
> spin lock attempt by disabling interrupts on the local processor
> when grabbing the page_cgroup bit_spin_lock in lock_page_cgroup().
> This avoids the following deadlock:
> statistic
> CPU 0 CPU 1
> inc_file_mapped
> rcu_read_lock
> start move
> synchronize_rcu
> lock_page_cgroup
> softirq
> test_clear_page_writeback
> mem_cgroup_dec_page_stat(NR_WRITEBACK)
> rcu_read_lock
> lock_page_cgroup /* deadlock */
> unlock_page_cgroup
> rcu_read_unlock
> unlock_page_cgroup
> rcu_read_unlock
>
> By disabling interrupts in lock_page_cgroup, nested calls
> are avoided. The softirq would be delayed until after inc_file_mapped
> enables interrupts when calling unlock_page_cgroup().
>
> The normal, fast path, of memcg page stat updates typically
> does not need to call lock_page_cgroup(), so this change does
> not affect the performance of the common case page accounting.
>
> Signed-off-by: Andrea Righi <arighi@xxxxxxxxxxx>
> Signed-off-by: Greg Thelen <gthelen@xxxxxxxxxx>
> ---
> include/linux/page_cgroup.h | 8 +++++-
> mm/memcontrol.c | 51 +++++++++++++++++++++++++-----------------
> 2 files changed, 36 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
> index b59c298..872f6b1 100644
> --- a/include/linux/page_cgroup.h
> +++ b/include/linux/page_cgroup.h
> @@ -117,14 +117,18 @@ static inline enum zone_type page_cgroup_zid(struct page_cgroup *pc)
> return page_zonenum(pc->page);
> }
>
> -static inline void lock_page_cgroup(struct page_cgroup *pc)
> +static inline void lock_page_cgroup(struct page_cgroup *pc,
> + unsigned long *flags)
> {
> + local_irq_save(*flags);
> bit_spin_lock(PCG_LOCK, &pc->flags);
> }

Hmm. Let me ask questions.

1. Why do you add new irq disable region in general function?
I think __do_fault is a one of fast path.

Could you disable softirq using _local_bh_disable_ not in general function
but in your context?
How do you expect that how many users need irq lock to update page state?
If they don't need to disalbe irq?

We can pass some argument which present to need irq lock or not.
But it seems to make code very ugly.

2. So could you solve the problem in your design?
I mean you could update page state out of softirq?
(I didn't look at the your patches all. Sorry if I am missing something)

3. Normally, we have updated page state without disable irq.
Why does memcg need it?

I hope we don't add irq disable region as far as possbile.

--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/