[PATCH 4.11 105/114] cgroup: fix spurious warnings on cgroup_is_dead() from cgroup_sk_alloc()

From: Greg Kroah-Hartman
Date: Thu May 18 2017 - 06:53:44 EST

Next message: Greg Kroah-Hartman: "[PATCH 4.11 109/114] libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify"
Previous message: Greg Kroah-Hartman: "[PATCH 4.11 085/114] dax: fix PMD data corruption when fault races with write"
In reply to: Greg Kroah-Hartman: "[PATCH 4.11 085/114] dax: fix PMD data corruption when fault races with write"
Next in thread: Greg Kroah-Hartman: "[PATCH 4.11 109/114] libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

4.11-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tejun Heo <tj@xxxxxxxxxx>

commit a590b90d472f2c176c140576ee3ab44df7f67839 upstream.

cgroup_get() expected to be called only on live cgroups and triggers
warning on a dead cgroup; however, cgroup_sk_alloc() may be called
while cloning a socket which is left in an empty and removed cgroup
and thus may legitimately duplicate its reference on a dead cgroup.
This currently triggers the following warning spuriously.

WARNING: CPU: 14 PID: 0 at kernel/cgroup.c:490 cgroup_get+0x55/0x60
...
[<ffffffff8107e123>] __warn+0xd3/0xf0
[<ffffffff8107e20e>] warn_slowpath_null+0x1e/0x20
[<ffffffff810ff465>] cgroup_get+0x55/0x60
[<ffffffff81106061>] cgroup_sk_alloc+0x51/0xe0
[<ffffffff81761beb>] sk_clone_lock+0x2db/0x390
[<ffffffff817cce06>] inet_csk_clone_lock+0x16/0xc0
[<ffffffff817e8173>] tcp_create_openreq_child+0x23/0x4b0
[<ffffffff818601a1>] tcp_v6_syn_recv_sock+0x91/0x670
[<ffffffff817e8b16>] tcp_check_req+0x3a6/0x4e0
[<ffffffff81861ba3>] tcp_v6_rcv+0x693/0xa00
[<ffffffff81837429>] ip6_input_finish+0x59/0x3e0
[<ffffffff81837cb2>] ip6_input+0x32/0xb0
[<ffffffff81837387>] ip6_rcv_finish+0x57/0xa0
[<ffffffff81837ac8>] ipv6_rcv+0x318/0x4d0
[<ffffffff817778c7>] __netif_receive_skb_core+0x2d7/0x9a0
[<ffffffff81777fa6>] __netif_receive_skb+0x16/0x70
[<ffffffff81778023>] netif_receive_skb_internal+0x23/0x80
[<ffffffff817787d8>] napi_gro_frags+0x208/0x270
[<ffffffff8168a9ec>] mlx4_en_process_rx_cq+0x74c/0xf40
[<ffffffff8168b270>] mlx4_en_poll_rx_cq+0x30/0x90
[<ffffffff81778b30>] net_rx_action+0x210/0x350
[<ffffffff8188c426>] __do_softirq+0x106/0x2c7
[<ffffffff81082bad>] irq_exit+0x9d/0xa0 [<ffffffff8188c0e4>] do_IRQ+0x54/0xd0
[<ffffffff8188a63f>] common_interrupt+0x7f/0x7f <EOI>
[<ffffffff8173d7e7>] cpuidle_enter+0x17/0x20
[<ffffffff810bdfd9>] cpu_startup_entry+0x2a9/0x2f0
[<ffffffff8103edd1>] start_secondary+0xf1/0x100

This patch renames the existing cgroup_get() with the dead cgroup
warning to cgroup_get_live() after cgroup_kn_lock_live() and
introduces the new cgroup_get() which doesn't check whether the cgroup
is live or dead.

All existing cgroup_get() users except for cgroup_sk_alloc() are
converted to use cgroup_get_live().

Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Reported-by: Chris Mason <clm@xxxxxx>
Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
kernel/cgroup/cgroup.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -438,6 +438,11 @@ out_unlock:

static void cgroup_get(struct cgroup *cgrp)
{
+ css_get(&cgrp->self);
+}
+
+static void cgroup_get_live(struct cgroup *cgrp)
+{
WARN_ON_ONCE(cgroup_is_dead(cgrp));
css_get(&cgrp->self);
}
@@ -932,7 +937,7 @@ static void link_css_set(struct list_hea
list_add_tail(&link->cgrp_link, &cset->cgrp_links);

if (cgroup_parent(cgrp))
- cgroup_get(cgrp);
+ cgroup_get_live(cgrp);
}

/**
@@ -1802,7 +1807,7 @@ static struct dentry *cgroup_mount(struc
return ERR_PTR(-EINVAL);
}
cgrp_dfl_visible = true;
- cgroup_get(&cgrp_dfl_root.cgrp);
+ cgroup_get_live(&cgrp_dfl_root.cgrp);

dentry = cgroup_do_mount(&cgroup2_fs_type, flags, &cgrp_dfl_root,
CGROUP2_SUPER_MAGIC, ns);
@@ -2576,7 +2581,7 @@ restart:
if (!css || !percpu_ref_is_dying(&css->refcnt))
continue;

- cgroup_get(dsct);
+ cgroup_get_live(dsct);
prepare_to_wait(&dsct->offline_waitq, &wait,
TASK_UNINTERRUPTIBLE);

@@ -3947,7 +3952,7 @@ static void init_and_link_css(struct cgr
{
lockdep_assert_held(&cgroup_mutex);

- cgroup_get(cgrp);
+ cgroup_get_live(cgrp);

memset(css, 0, sizeof(*css));
css->cgroup = cgrp;
@@ -4123,7 +4128,7 @@ static struct cgroup *cgroup_create(stru
/* allocation complete, commit to creation */
list_add_tail_rcu(&cgrp->self.sibling, &cgroup_parent(cgrp)->self.children);
atomic_inc(&root->nr_cgrps);
- cgroup_get(parent);
+ cgroup_get_live(parent);

/*
* @cgrp is now fully operational. If something fails after this
@@ -4947,7 +4952,7 @@ struct cgroup *cgroup_get_from_path(cons
if (kn) {
if (kernfs_type(kn) == KERNFS_DIR) {
cgrp = kn->priv;
- cgroup_get(cgrp);
+ cgroup_get_live(cgrp);
} else {
cgrp = ERR_PTR(-ENOTDIR);
}
@@ -5027,6 +5032,11 @@ void cgroup_sk_alloc(struct sock_cgroup_

/* Socket clone path */
if (skcd->val) {
+ /*
+ * We might be cloning a socket which is left in an empty
+ * cgroup and the cgroup might have already been rmdir'd.
+ * Don't use cgroup_get_live().
+ */
cgroup_get(sock_cgroup_ptr(skcd));
return;
}

Next message: Greg Kroah-Hartman: "[PATCH 4.11 109/114] libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify"
Previous message: Greg Kroah-Hartman: "[PATCH 4.11 085/114] dax: fix PMD data corruption when fault races with write"
In reply to: Greg Kroah-Hartman: "[PATCH 4.11 085/114] dax: fix PMD data corruption when fault races with write"
Next in thread: Greg Kroah-Hartman: "[PATCH 4.11 109/114] libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]