Re: unregister_netdevice: waiting for DEV to become free (2)

From: David Ahern
Date: Mon Apr 29 2019 - 14:43:22 EST


On 4/29/19 12:34 PM, David Ahern wrote:
> On 4/27/19 10:22 PM, Tetsuo Handa wrote:
>> On 2019/04/28 8:52, Eric Dumazet wrote:
>>> On 4/27/19 3:33 PM, Tetsuo Handa wrote:
>>>>
>>>> I'm waiting for davem to explain why it is safe to move the dst entry
>>>> from "a device to unregister" to "a loopback device in that namespace".
>>>> I'm also waiting for an explanation of how the dst entry which was moved
>>>> to "a loopback device in that namespace" is released (i.e. what the
>>>> expected shutdown sequence is).
>>>
>>> The most probable explanation is that we make sure the loopback device
>>> is the last one to be dismantled at netns deletion,
>>> and this would obviously happen after all dst have been released.
>>>
>>
>> rt_flush_dev() becomes a no-op if "dev" == "a loopback device in that
>> namespace". And according to debug printk(), rt_flush_dev() is called
>> on "a loopback device in that namespace" itself.
>>
>> If "a loopback device in that namespace" is the last "one" (== "a network
>> device in that namespace" ?), which shutdown sequence should have called
>> dev_put("a loopback device in that namespace") before unregistration of
>> "a loopback device in that namespace" starts?
>>
>> Since I'm not a netdev person, I would appreciate it if you could explain
>> that shutdown sequence using a flow chart.
>>
>
> The attached patch adds a tracepoint to notifier_call_chain. If you have
> KALLSYMS enabled it will show the order of the function handlers:
>
> perf record -e notifier:* -a -g &
>
> ip netns del <NAME>
> <wait a few seconds>
>
> fg
> <ctrl-c on perf-record>
>
> perf script
>

With the header file this time.
From de8bfae0606d748908a70a435fee9d9ce57b13ea Mon Sep 17 00:00:00 2001
From: David Ahern <dsahern@xxxxxxxxx>
Date: Mon, 29 Apr 2019 11:38:49 -0700
Subject: [PATCH] notifier: add tracepoint to notifier_call_chain

Signed-off-by: David Ahern <dsahern@xxxxxxxxx>
---
 include/trace/events/notifier.h | 49 +++++++++++++++++++++++++++++++++++++++++
 kernel/notifier.c               |  3 +++
 2 files changed, 52 insertions(+)
 create mode 100644 include/trace/events/notifier.h

diff --git a/include/trace/events/notifier.h b/include/trace/events/notifier.h
new file mode 100644
index 000000000000..7c531a1135cb
--- /dev/null
+++ b/include/trace/events/notifier.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM notifier
+
+#if !defined(_TRACE_NOTIFIER_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_NOTIFIER_H
+
+#include <linux/notifier.h>
+#include <linux/kallsyms.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(notifier_call_chain,
+
+ TP_PROTO(struct notifier_block *nb, unsigned long val),
+
+ TP_ARGS(nb, val),
+
+ TP_STRUCT__entry(
+ __field( u64, val )
+ __field( u64, fcn )
+ __dynamic_array(char, fcnstr, KSYM_SYMBOL_LEN)
+ ),
+
+ TP_fast_assign(
+ void *p = nb->notifier_call;
+ char sym[KSYM_SYMBOL_LEN];
+
+ __entry->val = val;
+ __entry->fcn = (u64) p;
+
+ p = dereference_symbol_descriptor(p);
+#ifdef CONFIG_KALLSYMS
+ sprint_symbol_no_offset(sym, (unsigned long) p);
+ /* avoid a bogus warning:
+ * "the address of sym will always evaluate as true"
+ * by using &sym[0]
+ */
+ __assign_str(fcnstr, &sym[0]);
+#else
+ __get_str(fcnstr)[0] = '\0';
+#endif
+ ),
+
+ TP_printk("val %llu fcn %llx name %s", __entry->val, __entry->fcn, __get_str(fcnstr))
+);
+#endif /* _TRACE_NOTIFIER_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 6196af8a8223..9b65a9c56fd7 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -5,6 +5,8 @@
#include <linux/rcupdate.h>
#include <linux/vmalloc.h>
#include <linux/reboot.h>
+#define CREATE_TRACE_POINTS
+#include <trace/events/notifier.h>

/*
* Notifier list for kernel code which wants to be called
@@ -90,6 +92,7 @@ static int notifier_call_chain(struct notifier_block **nl,
continue;
}
#endif
+ trace_notifier_call_chain(nb, val);
ret = nb->notifier_call(nb, val, v);

if (nr_calls)
--
2.11.0
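
For anyone who wants to see what the tracepoint reports without building a
kernel, here is a rough userspace sketch of the same idea: walk a
priority-ordered notifier chain and record each handler just before it is
called. It is not kernel code. The struct is a simplified stand-in for
struct notifier_block, the "name" field replaces the kallsyms lookup, and
chain_register(), call_chain(), sample_event(), demo(), the handler names
and EVENT_UNREGISTER are all made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NOTIFY_DONE		0x0000
#define NOTIFY_OK		0x0001
#define NOTIFY_STOP_MASK	0x8000
#define EVENT_UNREGISTER	6UL	/* arbitrary event value for this sketch */

#define TRACE_LOG_MAX	16
#define NAME_LEN	64

struct notifier_block;
typedef int (*notifier_fn_t)(struct notifier_block *nb, unsigned long val, void *v);

/* Simplified stand-in for the kernel structure (assumption, not the real layout). */
struct notifier_block {
	notifier_fn_t notifier_call;
	struct notifier_block *next;
	int priority;
	const char *name;	/* replaces the kallsyms symbol lookup */
};

/* In-memory "trace buffer": one handler name per call, in call order. */
static char trace_log[TRACE_LOG_MAX][NAME_LEN];
static int trace_len;

/* Userspace analogue of the tracepoint: copy the handler name into the log. */
static void trace_notifier_call_chain(struct notifier_block *nb, unsigned long val)
{
	if (trace_len < TRACE_LOG_MAX) {
		strncpy(trace_log[trace_len], nb->name, NAME_LEN - 1);
		trace_log[trace_len][NAME_LEN - 1] = '\0';
		trace_len++;
	}
	printf("val %lu name %s\n", val, nb->name);
}

/* Insert in descending priority order; equal priorities keep registration order. */
static void chain_register(struct notifier_block **nl, struct notifier_block *n)
{
	assert(n && n->notifier_call);
	while (*nl && (*nl)->priority >= n->priority)
		nl = &(*nl)->next;
	n->next = *nl;
	*nl = n;
}

/* Walk the chain, tracing each handler immediately before calling it. */
static int call_chain(struct notifier_block **nl, unsigned long val, void *v)
{
	int ret = NOTIFY_DONE;
	struct notifier_block *nb = *nl;

	while (nb) {
		struct notifier_block *next_nb = nb->next;

		trace_notifier_call_chain(nb, val);
		ret = nb->notifier_call(nb, val, v);
		if (ret & NOTIFY_STOP_MASK)
			break;
		nb = next_nb;
	}
	return ret;
}

/* Hypothetical handler: does nothing and lets the chain continue. */
static int sample_event(struct notifier_block *nb, unsigned long val, void *v)
{
	(void)nb; (void)val; (void)v;
	return NOTIFY_OK;
}

/* Register two handlers out of priority order, then fire one event. */
static int demo(void)
{
	static struct notifier_block route_nb = {
		.notifier_call = sample_event, .priority = 0, .name = "route_handler",
	};
	static struct notifier_block addr_nb = {
		.notifier_call = sample_event, .priority = 10, .name = "addr_handler",
	};
	struct notifier_block *chain = NULL;

	trace_len = 0;
	chain_register(&chain, &route_nb);
	chain_register(&chain, &addr_nb);
	return call_chain(&chain, EVENT_UNREGISTER, NULL);
}
```

Calling demo() from a small main() prints one line per handler, with the
higher-priority handler first. The order of those lines is the same kind of
information that perf script gives for the real chain.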