Re: [PATCH v2 net-next] net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()

From: Nicholas Johnson
Date: Mon Nov 25 2019 - 04:10:13 EST


On Mon, Nov 25, 2019 at 11:25:50AM +0300, Alexander Lobakin wrote:
> Alexander Lobakin wrote 25.11.2019 10:54:
> > Nicholas Johnson wrote 25.11.2019 10:29:
> > > Hi,
> > >
> > > On Wed, Oct 16, 2019 at 10:31:31AM +0300, Alexander Lobakin wrote:
> > > > David Miller wrote 16.10.2019 04:16:
> > > > > From: Alexander Lobakin <alobakin@xxxxxxxx>
> > > > > Date: Mon, 14 Oct 2019 11:00:33 +0300
> > > > >
> > > > > > Commit 323ebb61e32b4 ("net: use listified RX for handling GRO_NORMAL
> > > > > > skbs") made use of listified skb processing for the users of
> > > > > > napi_gro_frags().
> > > > > > The same technique can be used in a way more common napi_gro_receive()
> > > > > > to speed up non-merged (GRO_NORMAL) skbs for a wide range of drivers
> > > > > > including gro_cells and mac80211 users.
> > > > > > This slightly changes the return value in cases where skb is being
> > > > > > dropped by the core stack, but it seems to have no impact on related
> > > > > > drivers' functionality.
> > > > > > > gro_normal_batch is left untouched as it's very individual for every
> > > > > > > single system configuration and can be tuned manually to achieve
> > > > > > > optimal performance.
> > > > > >
> > > > > > Signed-off-by: Alexander Lobakin <alobakin@xxxxxxxx>
> > > > > > Acked-by: Edward Cree <ecree@xxxxxxxxxxxxxx>
> > > > >
> > > > > Applied, thank you.
> > > >
> > > > David, Edward, Eric, Ilias,
> > > > thank you for your time.
> > > >
> > > > Regards,
> > >
> > > I am very sorry to be the bearer of bad news. It appears that this
> > > commit is causing a regression in Linux 5.4.0-rc8-next-20191122,
> > > preventing me from connecting to Wi-Fi networks. I have a Dell XPS 9370
> > > (Intel Core i7-8650U) with Intel Wireless 8265 [8086:24fd].
> >
> > Hi!
> >
> > It's a bit strange as this commit doesn't directly affect the packet
> > flow. I don't have any iwlwifi hardware at the moment, so let's see if
> > anyone else will be able to reproduce this (for now, it is the first
> > report in the ~6 weeks since the patch was applied to net-next).
> > Anyway, I'll investigate iwlwifi's Rx processing -- maybe I could find
> > something driver-specific that might produce this.

Just in case, I double-checked by reapplying the patch to confirm that it
is the problem. The problem reappeared, so I am sure.

Here's what I will do. I know somebody with the same Dell XPS 9370,
except theirs has the Intel Core i7-8550U and Killer Wi-Fi. Mine is the
"business" model, which was harder to obtain. I have been doing bisects
on a USB-C SSD because I do not have enough space on the internal NVMe
drive. I will ask to borrow their laptop, and boot off the drive as I
have been doing with my laptop. If the problem does not appear on their
laptop, then there is a good chance that the problem is specific to
iwlwifi.

> >
> > Thank you for the report.
> >
> > > I did a bisect, and this commit was named the culprit. I then applied
> > > the reverse patch on another clone of Linux next-20191122, and it
> > > started working.
> > >
> > > 6570bc79c0dfff0f228b7afd2de720fb4e84d61d
> > > net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()
> > >
> > > You can see more at the bug report I filed at [0].
> > >
> > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=205647
> > >
> > > I called on others at [0] to try to reproduce this - you should not pull
> > > a patch because of a single reporter - as I could be wrong.
> > >
> > > Please let me know if you want me to give more debugging information or
> > > test any potential fixes. I am happy to help to fix this. :)
>
> And you can also set /proc/sys/net/core/gro_normal_batch to 1 and see
> if anything changes. With this value the GRO stack behaves just as it
> did without the patch.

The default value of /proc/sys/net/core/gro_normal_batch was 8.

Setting it to 1 allowed the laptop to connect to the Wi-Fi network.

Setting it back to 8 did not kill the connection.

But when I disconnected and tried to reconnect, it did not reconnect.
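
For anyone repeating the steps above from a test program rather than a
shell, here is a minimal sketch of reading and writing the tunable. Only
the /proc path comes from this thread; the helper names read_int_file()
and write_int_file() are made up for illustration, and writing to the
real path needs root.

```c
#include <assert.h>
#include <stdio.h>

/* Path as quoted in the thread; the helpers below are illustrative only. */
#define GRO_NORMAL_BATCH_PATH "/proc/sys/net/core/gro_normal_batch"

/* Read one integer from a procfs-style file. Returns 0 on success, -1 on error. */
int read_int_file(const char *path, int *out)
{
	FILE *f;
	int ok;

	assert(path != NULL && out != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Write one integer followed by a newline. Returns 0 on success, -1 on error. */
int write_int_file(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = (fprintf(f, "%d\n", value) > 0);
	if (fclose(f) != 0)	/* a rejected write may only show up at close */
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling write_int_file(GRO_NORMAL_BATCH_PATH, 1) and then
read_int_file(GRO_NORMAL_BATCH_PATH, &v) would correspond to the
"set it to 1" step; passing 8 restores the default I observed.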

Hence, it appears that the problem only affects the initial handshake
when associating with a network, and not normal packet flow.
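
To make the difference between a batch of 1 and a batch of 8 concrete,
here is a toy userspace model of the batching idea from the commit
message. It is not kernel code: struct rx_model and the rx_model_*()
functions are hypothetical names, and rx_model_flush() merely stands in
for whatever point the stack hands the pending list up. I am not claiming
this is what happens to the handshake frames; it only shows that a short
burst can sit in the list when the threshold is above 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-NAPI pending list; not a kernel structure. */
struct rx_model {
	int batch;	/* plays the role of gro_normal_batch */
	int pending;	/* packets queued but not yet passed up the stack */
	int delivered;	/* packets passed up the stack so far */
};

void rx_model_init(struct rx_model *m, int batch)
{
	assert(m != NULL && batch >= 1);
	m->batch = batch;
	m->pending = 0;
	m->delivered = 0;
}

/* Hand every pending packet up in one go (the "listified" delivery). */
void rx_model_flush(struct rx_model *m)
{
	assert(m != NULL);
	m->delivered += m->pending;
	m->pending = 0;
}

/* Queue one non-merged packet; flush once the batch threshold is reached. */
void rx_model_receive(struct rx_model *m)
{
	assert(m != NULL);
	m->pending++;
	if (m->pending >= m->batch)
		rx_model_flush(m);
}
```

With batch set to 1, every rx_model_receive() delivers immediately, which
matches the remark that a value of 1 behaves like the unpatched stack.
With batch set to 8, three packets stay pending until five more arrive or
something calls rx_model_flush().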

>
> > > Kind regards,
> > > Nicholas Johnson
> >
> > Regards,
>
> Regards,

Regards,
Nicholas