Re: [BUGFIX][PATCH 0/4] Fixes for memcg with THP

From: KAMEZAWA Hiroyuki
Date: Sun Jan 30 2011 - 19:02:26 EST


On Sat, 29 Jan 2011 18:17:56 +0530
Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:

> On Fri, Jan 28, 2011 at 8:52 AM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> >
> > On recent -mm, when I run make -j 8 under a 200M memcg limit, as follows:
> > ==
> > # mount -t cgroup none /cgroup/memory -o memory
> > # mkdir /cgroup/memory/A
> > # echo 200M > /cgroup/memory/A/memory.limit_in_bytes
> > # echo $$ > /cgroup/memory/A/tasks
> > # make -j 8 kernel
> > ==
> >
> > I see hangs in khugepaged. That's because memcg's memory reclaim
> > routine does not handle HUGE_PAGE requests properly, and khugepaged
> > does not know about memcg.
> >
> > This patch set fixes the above hang. Patches 1-3 seem obvious and
> > follow the same concept as the patches in RHEL.
>
> Do you have any backtraces? Are they in the specific patches?
>

Jan 18 10:28:29 rhel6-test kernel: [56245.286007] INFO: rcu_sched_state detected stall on CPU 0 (t=60000 jiffies)
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] sending NMI to all CPUs:
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] NMI backtrace for cpu 0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] CPU 0

Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8102a04e>] arch_trigger_all_cpu_backtrace+0x5e/0xa0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff810bca09>] __rcu_pending+0x169/0x3b0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8108a250>] ? tick_sched_timer+0x0/0xc0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff810bccbc>] rcu_check_callbacks+0x6c/0x120
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff810689a8>] update_process_times+0x48/0x90
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8108a2b6>] tick_sched_timer+0x66/0xc0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8107ede0>] __run_hrtimer+0x90/0x1e0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff81032db9>] ? kvm_clock_get_cycles+0x9/0x10
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8107f1be>] hrtimer_interrupt+0xde/0x240
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8155268b>] smp_apic_timer_interrupt+0x6b/0x9b
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8100c9d3>] apic_timer_interrupt+0x13/0x20
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] <EOI>
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff810a726a>] ? res_counter_charge+0xda/0x100
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff81145459>] __mem_cgroup_try_charge+0x199/0x5d0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff811463b5>] mem_cgroup_newpage_charge+0x45/0x50
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8113dbd4>] khugepaged+0x924/0x1430
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8107af00>] ? autoremove_wake_function+0x0/0x40
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8113d2b0>] ? khugepaged+0x0/0x1430
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8107a8b6>] kthread+0x96/0xa0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8107a820>] ? kthread+0x0/0xa0
Jan 18 10:28:29 rhel6-test kernel: [56245.286007] [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10
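
The trace shows khugepaged on CPU 0 stuck under __mem_cgroup_try_charge().
As a rough illustration only, here is a user-space toy model (not kernel
code) of how a charge/reclaim retry loop could fail to terminate for a
huge-page request. It assumes, and this is an assumption rather than
something stated in this mail, that the retry decision only checks for
one page of free margin while the request is HPAGE_NR pages. The function
name, its parameters, the value 512 (2 MiB huge page, 4 KiB base page) and
the max_loops cap standing in for a hang are all hypothetical.

```python
# Toy model only: every name here is hypothetical, nothing is kernel API.
HPAGE_NR = 512  # assumption: 2 MiB huge page / 4 KiB base page


def charge_with_retry(limit, usage, nr_pages, margin_after_reclaim,
                      check_full_request, max_loops=1000):
    """Model a charge/reclaim retry loop; all sizes are in pages.

    Returns (outcome, loops), where outcome is "charged", "gave_up",
    or "stuck" (max_loops reached, standing in for a hang).
    """
    for loops in range(1, max_loops + 1):
        # Try to charge the whole request against the limit.
        if usage + nr_pages <= limit:
            return ("charged", loops)
        # Model reclaim under pressure: some pages are freed, but competing
        # tasks charge most of them again, so only margin_after_reclaim
        # pages remain free afterwards.
        usage = limit - margin_after_reclaim
        # Retry decision: compare the free margin either with the full
        # request or with a single page.
        needed = nr_pages if check_full_request else 1
        if limit - usage < needed:
            return ("gave_up", loops)
    return ("stuck", max_loops)
```

With a 200M limit (51200 pages of 4 KiB) that is fully used and 32 pages
left free after each reclaim pass, the single-page check keeps retrying a
512-page request until the cap is hit, while checking the margin against
the full request size gives up after one pass and lets the caller back off.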

Thanks,
-Kame
