Re: [PATCH] memcg: remove unneeded preempt_disable

From: Andrew Morton
Date: Thu Aug 18 2011 - 17:41:20 EST


(cc linux-arch)

On Wed, 17 Aug 2011 23:50:53 -0700
Greg Thelen <gthelen@xxxxxxxxxx> wrote:

> Both mem_cgroup_charge_statistics() and mem_cgroup_move_account() were
> unnecessarily disabling preemption when adjusting per-cpu counters:
> preempt_disable()
> __this_cpu_xxx()
> __this_cpu_yyy()
> preempt_enable()
>
> This change does not disable preemption and thus CPU switch is possible
> within these routines. This does not cause a problem because the total
> of all cpu counters is summed when reporting stats. Now both
> mem_cgroup_charge_statistics() and mem_cgroup_move_account() look like:
> this_cpu_xxx()
> this_cpu_yyy()
>
> ...
>
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -664,24 +664,20 @@ static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem,
> static void mem_cgroup_charge_statistics(struct mem_cgroup *mem,
> bool file, int nr_pages)
> {
> - preempt_disable();
> -
> if (file)
> - __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
> + this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_CACHE], nr_pages);
> else
> - __this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
> + this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_RSS], nr_pages);
>
> /* pagein of a big page is an event. So, ignore page size */
> if (nr_pages > 0)
> - __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
> + this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGIN]);
> else {
> - __this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
> + this_cpu_inc(mem->stat->events[MEM_CGROUP_EVENTS_PGPGOUT]);
> nr_pages = -nr_pages; /* for event */
> }
>
> - __this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages);
> -
> - preempt_enable();
> + this_cpu_add(mem->stat->events[MEM_CGROUP_EVENTS_COUNT], nr_pages);
> }

On non-x86 architectures this_cpu_add() internally does
preempt_disable() and preempt_enable(). So the patch is a small
optimisation for x86 and a larger deoptimisation for non-x86.

I think I'll apply it, as the call frequency is low (correct?) and the
problem will correct itself as other architectures implement their
atomic this_cpu_foo() operations.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/