Re: regression caused by cgroups optimization in 3.17-rc2

From: Dave Hansen
Date: Tue Sep 02 2014 - 16:57:28 EST

Next message: Paul E. McKenney: "Re: rcu: Remove rcu_dynticks * parameters when they are always this_cpu_ptr(&rcu_dynticks)"
Previous message: Thomas Gleixner: "Re: [PATCH v3 3/4] irq: Allow multiple clients to register for irq affinity notification"
In reply to: Dave Hansen: "Re: regression caused by cgroups optimization in 3.17-rc2"
Next in thread: Michal Hocko: "Re: regression caused by cgroups optimization in 3.17-rc2"

I, of course, forgot to include the most important detail. This appears
to be pretty run-of-the-mill spinlock contention in the resource counter
code. Nearly 80% of the CPU time is spent in _raw_spin_lock, apparently
spinning on res_counter->lock in both the charge and uncharge paths.
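
To make the contention point concrete, here is a minimal userspace sketch of a hierarchical, lock-protected counter. It is only loosely modeled on the idea of a resource counter and is not the kernel's res_counter code: every name (sketch_counter, sketch_charge, sketch_uncharge, the lock_acquisitions field) is hypothetical, and a C11 atomic_flag spin loop stands in for the real spinlock. The point it illustrates is that each charge or uncharge takes the lock at every level of the hierarchy, so all threads charging to the same group serialize on the same locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock-protected resource counter. */
struct sketch_counter {
    atomic_flag lock;                /* stands in for res_counter->lock */
    unsigned long usage;
    unsigned long limit;
    unsigned long lock_acquisitions; /* instrumentation for this sketch only */
    struct sketch_counter *parent;   /* NULL at the top of the hierarchy */
};

static void sketch_counter_init(struct sketch_counter *c, unsigned long limit,
                                struct sketch_counter *parent)
{
    atomic_flag_clear(&c->lock);
    c->usage = 0;
    c->limit = limit;
    c->lock_acquisitions = 0;
    c->parent = parent;
}

static void sketch_lock(struct sketch_counter *c)
{
    /* Busy-wait until the flag is ours; this is where CPU time burns. */
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;
    c->lock_acquisitions++;
}

static void sketch_unlock(struct sketch_counter *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Charge nr units at every level. Returns 0 on success, or -1 if some
 * level would exceed its limit (levels already charged are rolled back). */
static int sketch_charge(struct sketch_counter *c, unsigned long nr)
{
    struct sketch_counter *it, *undo;

    for (it = c; it != NULL; it = it->parent) {
        int over;

        sketch_lock(it);
        over = it->usage + nr > it->limit;
        if (!over)
            it->usage += nr;
        sketch_unlock(it);
        if (over)
            goto rollback;
    }
    return 0;

rollback:
    for (undo = c; undo != it; undo = undo->parent) {
        sketch_lock(undo);
        undo->usage -= nr;
        sketch_unlock(undo);
    }
    return -1;
}

/* Uncharge nr units at every level, again taking each lock in turn. */
static void sketch_uncharge(struct sketch_counter *c, unsigned long nr)
{
    for (; c != NULL; c = c->parent) {
        sketch_lock(c);
        assert(c->usage >= nr);
        c->usage -= nr;
        sketch_unlock(c);
    }
}
```

With one page charged per fault, every fault in every thread goes through sketch_lock() on the shared counters, which is the shape of the problem in the profile: the work under the lock is tiny, but the lock itself is the bottleneck.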

It already does _some_ batching here on the free side, but that
apparently breaks down after ~40 threads.
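
As a rough illustration of what batching on the free side buys, the sketch below compares uncharging pages one at a time with accumulating a run of pages that belong to the same counter and uncharging them in a single locked operation. This is a simplified userspace model, not the kernel's uncharge_list/uncharge_batch implementation; the types and functions (batch_counter, sketch_page, uncharge_pages_batched, and so on) are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical single-level counter; lock_acquisitions is instrumentation. */
struct batch_counter {
    atomic_flag lock;
    unsigned long usage;
    unsigned long lock_acquisitions;
};

/* Hypothetical page descriptor: records which counter the page is charged to. */
struct sketch_page {
    struct batch_counter *owner;
};

static void batch_counter_init(struct batch_counter *c, unsigned long usage)
{
    atomic_flag_clear(&c->lock);
    c->usage = usage;
    c->lock_acquisitions = 0;
}

/* Uncharge nr units under the lock: one lock round trip per call. */
static void batch_counter_uncharge(struct batch_counter *c, unsigned long nr)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;
    c->lock_acquisitions++;
    assert(c->usage >= nr);
    c->usage -= nr;
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Unbatched: n pages cost n lock acquisitions. */
static void uncharge_pages_one_by_one(struct sketch_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        batch_counter_uncharge(pages[i].owner, 1);
}

/* Batched: count consecutive pages with the same owner and flush the
 * pending count only when the owner changes or the list ends. */
static void uncharge_pages_batched(struct sketch_page *pages, size_t n)
{
    struct batch_counter *cur = NULL;
    unsigned long pending = 0;

    for (size_t i = 0; i < n; i++) {
        if (pages[i].owner != cur) {
            if (pending)
                batch_counter_uncharge(cur, pending);
            cur = pages[i].owner;
            pending = 0;
        }
        pending++;
    }
    if (pending)
        batch_counter_uncharge(cur, pending);
}
```

Batching cuts the number of lock acquisitions per thread, but the lock is still shared by every thread that frees pages charged to the same counter, so with enough threads the remaining acquisitions can still contend.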

This isn't surprising: the patch in question removed an optimization
that skipped the charging, and now we're seeing the overhead of that
charging.

Here's the first entry from perf top:

    80.18%    80.18%  [kernel]  [k] _raw_spin_lock
                  |
                  --- _raw_spin_lock
                     |
                     |--66.59%-- res_counter_uncharge_until
                     |          res_counter_uncharge
                     |          uncharge_batch
                     |          uncharge_list
                     |          mem_cgroup_uncharge_list
                     |          release_pages
                     |          free_pages_and_swap_cache
                     |          tlb_flush_mmu_free
                     |          |
                     |          |--90.12%-- unmap_single_vma
                     |          |          unmap_vmas
                     |          |          unmap_region
                     |          |          do_munmap
                     |          |          vm_munmap
                     |          |          sys_munmap
                     |          |          system_call_fastpath
                     |          |          __GI___munmap
                     |          |
                     |           --9.88%-- tlb_flush_mmu
                     |                     tlb_finish_mmu
                     |                     unmap_region
                     |                     do_munmap
                     |                     vm_munmap
                     |                     sys_munmap
                     |                     system_call_fastpath
                     |                     __GI___munmap
                     |
                     |--46.13%-- __res_counter_charge
                     |          res_counter_charge
                     |          try_charge
                     |          mem_cgroup_try_charge
                     |          |
                     |          |--99.89%-- do_cow_fault
                     |          |          handle_mm_fault
                     |          |          __do_page_fault
                     |          |          do_page_fault
                     |          |          page_fault
                     |          |          testcase
                     |           --0.11%-- [...]
                     |
                     |--1.14%-- do_cow_fault
                     |          handle_mm_fault
                     |          __do_page_fault
                     |          do_page_fault
                     |          page_fault
                     |          testcase
                      --8217937613.29%-- [...]

