Re: [PATCH] Revert "net: linkwatch: add check for netdevice being present to linkwatch_do_dev"

From: David Miller
Date: Fri Sep 11 2020 - 20:44:22 EST


From: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
Date: Fri, 11 Sep 2020 08:32:55 +0200

> Hi David,
>
> On Thu, Sep 10, 2020 at 9:20 PM David Miller <davem@xxxxxxxxxxxxx> wrote:
>> From: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
>> Date: Tue, 1 Sep 2020 17:02:37 +0200
>>
>> > This reverts commit 124eee3f6955f7aa19b9e6ff5c9b6d37cb3d1e2c.
>> >
>> > Inami-san reported that this commit breaks bridge support in a Xen
>> > environment, and that reverting it fixes this.
>> >
>> > During system resume, bridge ports are no longer enabled, as that relies
>> > on the receipt of the NETDEV_CHANGE notification. This notification is
>> > not sent, as netdev_state_change() is no longer called.
>> >
>> > Note that the condition this commit intended to fix never existed
>> > upstream, as the patch triggering it and referenced in the commit was
>> > never applied upstream. Hence I can confirm s2ram on r8a73a4/ape6evm
>> > and sh73a0/kzm9g works fine before/after this revert.
>> >
>> > Reported-by: Gaku Inami <gaku.inami.xh@xxxxxxxxxxx>
>> > Signed-off-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
>>
>> Maybe you cannot reproduce it, but the problem is there and it still
>> looks very real to me.
>>
>> netdev_state_change() does two things:
>>
>> 1) Emit the NETDEV_CHANGE notification
>>
>> 2) Emit an rtmsg_ifinfo() netlink message, which in turn tries to access
>> the device statistics via ->ndo_get_stats*().
>>
>> It is absolutely wrong to do #2 when netif_device_present() is false.
>>
>> So I cannot apply this patch as-is, sorry.
>
> Thanks a lot for looking into this!
>
> But doing #1 is still safe? That is the part that calls into the bridge
> code. So would moving the netif_device_present() check from
> linkwatch_do_dev() to netdev_state_change(), to prevent doing #2, be
> acceptable?

I have a better question. Why is a software device like the bridge,
that wants to effectively exist and still receive netdev event
notifications, marking itself as not present?

That seems like the real bug here.
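
For illustration only, here is a rough userspace model of the two steps
listed above, with the presence check guarding only step #2 as the quoted
question proposes. The struct and all mock_* names are made up for this
sketch and are not kernel APIs. In the kernel, step #1 is the NETDEV_CHANGE
notifier call and step #2 is rtmsg_ifinfo(); here each is reduced to a
counter so the behaviour can be checked.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device; not kernel code. */
struct mock_netdev {
	const char *name;
	bool up;                /* models dev->flags & IFF_UP */
	bool present;           /* models netif_device_present(dev) */
	int notifications_sent; /* times step #1 ran */
	int stats_reads;        /* times step #2 ran */
};

/* Step #1: models emitting the NETDEV_CHANGE notification. */
static void mock_notify_change(struct mock_netdev *dev)
{
	dev->notifications_sent++;
}

/* Step #2: models rtmsg_ifinfo(), which ends up reading device stats. */
static void mock_rtmsg_ifinfo(struct mock_netdev *dev)
{
	dev->stats_reads++;
}

/*
 * Assumed variant of netdev_state_change(): listeners such as the bridge
 * are always notified, but the stats-reading netlink message is skipped
 * while the device is marked not present.
 */
static void mock_netdev_state_change(struct mock_netdev *dev)
{
	if (!dev->up)
		return;

	mock_notify_change(dev);

	if (dev->present)
		mock_rtmsg_ifinfo(dev);
}
```

With a device that is up but not present, one call bumps
notifications_sent and leaves stats_reads at zero. This only models the
ordering question; it does not answer why a software device would mark
itself not present in the first place.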