Re: Linux 4.4.34

From: Eric Dumazet
Date: Tue Nov 22 2016 - 12:26:49 EST


On Tue, Nov 22, 2016 at 9:22 AM, Andre Noll <maan@xxxxxxxxxxxxxxxx> wrote:
> On Tue, Nov 22, 18:06, Greg KH wrote
>> On Tue, Nov 22, 2016 at 05:59:12PM +0100, Andre Noll wrote:
>> > On Mon, Nov 21, 10:28, Greg KH wrote
>> > > I'm announcing the release of the 4.4.34 kernel.
>> > >
>> > > All users of the 4.4 kernel series must upgrade.
>> >
>> > This update broke PXE boot on our 4-way AMD boxes. The kernel panics in
>> > eth_type_trans(), presumably during kernel-level IP autoconfiguration,
>> > see [1]. Bisection points me at 5c67f947 (net: __skb_flow_dissect()
>> > must cap its return value). And indeed, reverting this commit fixes
>> > the problem for me.
>> >
>> > Investigation showed that the real problem is not the change in the
>> > above commit per se (i.e., capping ->thoff) but the fact that in the
>> > success case, where we jump to the "out_good" label, ->thoff is now
>> > set *after* ->n_proto and ->ip_proto. I fail to see how order matters
>> > here, but it clearly does, since the crash is 100% reproducible,
>> > and is fixed by the commit below (on top of v4.4.34).
>> >
>> > Please consider applying something like the patch below for mainline
>> > and -stable.
>>
>> If this issue is also the same for Linus's tree, we should cc: netdev so
>> that the patch can get into there, right?
>
> Right, but I haven't tested PXE boot on any kernel newer than 4.4.x
> so far. All I can say for sure is that the problematic commit is
> also in Linus' tree (called 34fad54c there).
>
> Do you want me to check if mainline is also affected?
>

Mainline is affected: we had a report of someone using IGB and hitting the bug.

We now have a hint based on your patch, thanks!
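
On the quoted remark that the order of the ->thoff and ->n_proto writes
should not matter but clearly does: one generic way this can happen in C
is when two fields are reached through offsets that resolve to the same
address, so the later store clobbers the earlier one. The sketch below
illustrates only that general effect. All names in it are hypothetical,
none are taken from the kernel's flow dissector, and nothing in this
thread establishes that this is the actual cause.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container size; not a kernel structure. */
enum { DEMO_BUF_LEN = 8 };

/* Store a 16-bit value at a byte offset (memcpy avoids aliasing issues). */
static void demo_put_u16(unsigned char *buf, size_t off, uint16_t v)
{
    memcpy(buf + off, &v, sizeof(v));
}

/* Load a 16-bit value from a byte offset. */
static uint16_t demo_get_u16(const unsigned char *buf, size_t off)
{
    uint16_t v;

    memcpy(&v, buf + off, sizeof(v));
    return v;
}

/*
 * Write n_proto first and thoff second, each at a caller-supplied offset,
 * then return n_proto as read back from the container.
 * If both offsets are equal, the thoff store overwrites n_proto.
 */
uint16_t demo_n_proto_after_fill(size_t control_off, size_t basic_off,
                                 uint16_t n_proto, uint16_t thoff)
{
    unsigned char buf[DEMO_BUF_LEN] = { 0 };

    assert(control_off + sizeof(uint16_t) <= DEMO_BUF_LEN);
    assert(basic_off + sizeof(uint16_t) <= DEMO_BUF_LEN);

    demo_put_u16(buf, basic_off, n_proto);   /* earlier store */
    demo_put_u16(buf, control_off, thoff);   /* later store   */

    return demo_get_u16(buf, basic_off);
}
```

With distinct offsets, e.g. demo_n_proto_after_fill(0, 4, 0x0800, 34),
the value read back is still 0x0800. With both offsets equal to 0 the
same call returns 34, because the later store landed on the same bytes.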