Re: Re: [PATCH linux next] net:dsa:fix the dsa_ptr null pointer dereference

From: PeilinHe
Date: Tue Dec 24 2024 - 03:06:29 EST


>Thank you for the patch.
>
>There are many process problems with it however.
>
>The most glaring one is that you are examining a crash from kernel 5.4
>but patching linux-next, without having apparently also tested linux-next.
>It appears that you just made a static analysis which may result in
>incorrect conclusions. When submitting patches upstream you always have
>to test on the latest version and understand afterwards what is missing
>and needs to be backported in the particular stable version you are using.
>
>In particular here, dsa_switch_shutdown() now has this:
>
>	dsa_switch_for_each_user_port(dp, ds) {
>		conduit = dsa_port_to_conduit(dp);
>		user_dev = dp->user;
>
>		netif_device_detach(user_dev);
>		~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>		netdev_upper_dev_unlink(conduit, user_dev);
>	}
>
>After netif_device_detach() is called, my expectation is that
>ethnl_ops_begin() sees that netif_device_present() is false, so it
>returns -ENODEV and does not proceed further to call into the device's
>ethtool ops. So that eliminates the premise for the crash.
>
>Secondly, linux-next is not a kernel tree that accepts patches, it is
>just for integration. For netdev, we have net.git for bug fixes and
>net-next.git for new features. You have to target your patch to net.git
>by using "[PATCH net v1]".
>
>If the problem does not exist in net.git but exists in stable kernels,
>you have to identify which patches are missing, adapt them if necessary,
>and then send them to stable@xxxxxxxxxxxxxxx, with netdev and the other
>maintainers also CCed, and with a subject prefix along the lines of
>"[PATCH stable 5.4]". Generally, backporting patches manually to stable
>is rarely needed, so if that needs to happen, please use the space under
>the "---" marker (this is discarded when applying the patch in git) to
>explain to maintainers why (what conflicted, if it simply appears to
>have been missed, etc).
>
>There are other things to be aware of in Documentation/process/, I just
>summarized to you what I considered most relevant here.

Thank you for your feedback.
First, I apologize for an error in my previous commit description:
the kernel that panicked was actually 5.15, not 5.4.
Second, you are right that, once netif_device_detach() has been called
in dsa_switch_shutdown(), the netif_device_present() check does prevent
the NULL pointer dereference of dsa_ptr.
Finally, the issue still persists in the 5.15 stable release.
Following your suggestion, I will submit the relevant patch to the
5.15 stable branch.
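
For illustration, the guard discussed above can be modelled in userspace C.
This is only a minimal sketch, not kernel code: every model_* name and the
struct layout are hypothetical stand-ins for netif_device_detach(),
netif_device_present() and ethnl_ops_begin(), and the dsa_ptr handling is
simplified to a single device.

```c
/* Userspace model only. The model_* names are hypothetical stand-ins for
 * the kernel helpers mentioned in this thread, not the real implementations. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_netdev {
	bool present;        /* stands in for the device "present" state */
	const int *dsa_ptr;  /* simplified stand-in for dsa_ptr */
};

/* Models netif_device_detach(): mark the device as no longer present. */
void model_netif_device_detach(struct model_netdev *dev)
{
	dev->present = false;
}

/* Models netif_device_present(). */
bool model_netif_device_present(const struct model_netdev *dev)
{
	return dev->present;
}

/* Models the check described for ethnl_ops_begin(): refuse absent devices. */
int model_ethnl_ops_begin(const struct model_netdev *dev)
{
	if (!model_netif_device_present(dev))
		return -ENODEV;
	return 0;
}

/* Models shutdown: detach first, then clear the pointer (assumed ordering). */
void model_switch_shutdown(struct model_netdev *dev)
{
	model_netif_device_detach(dev);
	dev->dsa_ptr = NULL;
}

/* Hypothetical ethtool-style request that dereferences dsa_ptr. */
int model_ethtool_request(const struct model_netdev *dev)
{
	int ret;

	assert(dev != NULL);

	ret = model_ethnl_ops_begin(dev);
	if (ret)
		return ret;          /* bail out before touching dsa_ptr */

	return *dev->dsa_ptr;    /* only reached while the device is present */
}
```

In this model, a request issued after model_switch_shutdown() returns
-ENODEV before dsa_ptr is ever dereferenced. Without the detach step, the
same request would reach the dereference with dsa_ptr already NULL, which
is the situation the missing backport is meant to close.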