Re: [BUG] kernel panic after bpf program removed.

From: Wangnan (F)
Date: Fri May 15 2015 - 05:21:16 EST




On 2015/5/15 13:37, Alexei Starovoitov wrote:
On 5/14/15 8:54 PM, Wangnan (F) wrote:
Hi Alexei Starovoitov and others,

I triggered a kernel panic when developing my 'perf bpf' facility. The
call stack is listed at the bottom of this mail.

I attached two bpf programs on 'kmem_cache_free%return' and
'__alloc_pages_nodemask'. The programs are very simple.
The panic is raised after closing the bpf program and the perf event
file. It looks like the panic is caused by a race between closing the
perf event fd and closing the bpf program fd. I'm unable to reproduce
this problem with similar operations.

The following is the exact instruction that caused the panic.

thanks for the report.
Looks like pointer 'prog == 0x6c0' is passed into bpf_prog_put,
which means that event->tp_event was freed and memory reused before
free_event_rcu() was called.

I think it's not perf_event_fd racing with prog_fd, but rather
with kprobe freeing:
__free_event()
  event->destroy(event)
    perf_trace_destroy
      perf_trace_event_unreg
        which is dropping event->tp_event->perf_refcount
that allows kprobe freeing to proceed in:
unregister_kprobe_event
  trace_remove_event_call
    probe_remove_event_call
and eventually tp_event to get freed.

I think calling perf_event_free_bpf_prog()
from __free_event() instead of free_event_rcu() will fix the race,
but please double check my analysis.
Also please send me a reproducer script. I'd like to see it crashing
first before the fix and not crashing afterwards.
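
To make the ordering argument above concrete, here is a small userspace
C model of the two teardown orders. It is only a sketch, not kernel code:
every model_* name is hypothetical, and it assumes tp_event is freed the
moment perf_refcount reaches zero, which is a simplification of the real
kprobe removal path.

```c
#include <assert.h>  /* only needed for the checks in the usage note */

/* Hypothetical userspace stand-ins; these are NOT the kernel's structures. */
struct model_prog {
    int refcnt;
};

struct model_tp_event {
    int perf_refcount;        /* while > 0, kprobe removal is blocked */
    int freed;                /* set once the model "frees" the object */
    struct model_prog *prog;  /* attached program, if any */
};

struct model_event {
    struct model_tp_event *tp_event;
};

/* Stands in for event->destroy(): drops perf_refcount. In this model the
 * kprobe side frees tp_event as soon as the count reaches zero. */
static void model_destroy(struct model_event *ev)
{
    if (--ev->tp_event->perf_refcount == 0)
        ev->tp_event->freed = 1;
}

/* Stands in for perf_event_free_bpf_prog(): it reads ev->tp_event->prog,
 * so it is only safe while tp_event is alive. Returns -1 where the real
 * kernel would touch freed memory, 0 otherwise. */
static int model_free_bpf_prog(struct model_event *ev)
{
    if (ev->tp_event->freed)
        return -1;
    if (ev->tp_event->prog) {
        ev->tp_event->prog->refcnt--;
        ev->tp_event->prog = 0;
    }
    return 0;
}

/* prog_first == 0: release the program after destroy (the deferred,
 * free_event_rcu()-style order). prog_first != 0: release it before
 * destroy (the proposed __free_event() order). */
static int model_teardown(int prog_first)
{
    struct model_prog prog = { .refcnt = 1 };
    struct model_tp_event tp = { .perf_refcount = 1, .freed = 0, .prog = &prog };
    struct model_event ev = { .tp_event = &tp };
    int ret;

    if (prog_first) {
        ret = model_free_bpf_prog(&ev);
        model_destroy(&ev);
    } else {
        model_destroy(&ev);
        ret = model_free_bpf_prog(&ev);
    }
    return ret;
}
```

In this model, model_teardown(0) returns -1 (the program is released after
tp_event is gone) and model_teardown(1) returns 0, which can be checked with
assert(). Whether the real kernel behaves the same way still needs the
reproducer.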


I triggered the problem with my 'perf bpf' patch series, and reproduced it once.

The bpf program is attached.

What I do is to use

# perf bpf record --object /root/sample_bpf_program.o -- sleep 4

to start recording, then press Ctrl-C after about 3 seconds, before sleep finishes.

The second call trace is identical to the previous one.

My environment is qemu with v4.1-rc3 kernel.

Thank you.

-------------------------------------------------
#include <uapi/linux/bpf.h>
#include <linux/version.h>
#include <uapi/linux/ptrace.h>

#define SEC(NAME) __attribute__((section(NAME), used))

static int (*bpf_map_delete_elem)(void *map, void *key) =
	(void *) BPF_FUNC_map_delete_elem;
static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
	(void *) BPF_FUNC_trace_printk;

struct bpf_map_def {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
};

struct pair {
	u64 val;
	u64 ip;
};

struct bpf_map_def SEC("maps") my_map = {
	.type = BPF_MAP_TYPE_HASH,
	.key_size = sizeof(long),
	.value_size = sizeof(struct pair),
	.max_entries = 1000000,
};

struct bpf_map_def SEC("maps") my_map2 = {
	.type = BPF_MAP_TYPE_HASH,
	.key_size = sizeof(long),
	.value_size = sizeof(struct pair),
	.max_entries = 1000000,
};

SEC("cache_free=kmem_cache_free%return")
int bpf_prog1(struct pt_regs *ctx)
{
	long ptr = ctx->r14;

	bpf_map_delete_elem(&my_map2, &ptr);
	return 0;
}

SEC("mybpfprog=__alloc_pages_nodemask")
int bpf_prog_my(struct pt_regs *ctx)
{
	char fmt[] = "Haha\n";
	long ptr = ctx->r14;

	bpf_trace_printk(fmt, sizeof(fmt));
	bpf_map_delete_elem(&my_map, &ptr);
	return 0;
}

char _license[] SEC("license") = "GPL";
u32 _version SEC("version") = LINUX_VERSION_CODE;
