Re: [PATCH 4.19 006/103] net: phy: phy driver features are mandatory

From: Holger HoffstÃtte
Date: Tue Jan 29 2019 - 11:57:48 EST


On 1/29/19 5:33 PM, Greg Kroah-Hartman wrote:
On Tue, Jan 29, 2019 at 04:05:16PM +0000, Camelia Alexandra Groza wrote:
-----Original Message-----
From: Holger HoffstÃtte <holger@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, January 29, 2019 17:50
To: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>; linux-
kernel@xxxxxxxxxxxxxxx
Cc: stable@xxxxxxxxxxxxxxx; Scott Wood <oss@xxxxxxxxxxxx>; Camelia
Alexandra Groza <camelia.groza@xxxxxxx>; David S. Miller
<davem@xxxxxxxxxxxxx>; Heiner Kallweit <hkallweit1@xxxxxxxxx>
Subject: Re: [PATCH 4.19 006/103] net: phy: phy driver features are
mandatory

On 1/29/19 12:34 PM, Greg Kroah-Hartman wrote:
4.19-stable review patch. If anyone has any objections, please let me
know.

------------------

From: Camelia Groza <camelia.groza@xxxxxxx>

[ Upstream commit 3e64cf7a435ed0500e3adaa8aada2272d3ae8abc ]

Since phy driver features became a link_mode bitmap, phy drivers that
don't have a list of features configured will cause the kernel to crash
when probed.

Prevent the phy driver from registering if the features field is missing.

Fixes: 719655a14971 ("net: phy: Replace phy driver features u32 with
link_mode bitmap")
Reported-by: Scott Wood <oss@xxxxxxxxxxxx>
Signed-off-by: Camelia Groza <camelia.groza@xxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
drivers/net/phy/phy_device.c | 5 +++++
include/linux/phy.h | 4 ++--
2 files changed, 7 insertions(+), 2 deletions(-)

--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1917,6 +1917,11 @@ int phy_driver_register(struct phy_drive
{
int retval;

+ if (WARN_ON(!new_driver->features)) {
+ pr_err("%s: Driver features are missing\n", new_driver-
name);
+ return -EINVAL;
+ }
+
new_driver->mdiodrv.flags |= MDIO_DEVICE_IS_PHY;
new_driver->mdiodrv.driver.name = new_driver->name;
new_driver->mdiodrv.driver.bus = &mdio_bus_type;
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -491,8 +491,8 @@ struct phy_device {
* only works for PHYs with IDs which match this field
* name: The friendly name of this PHY type
* phy_id_mask: Defines the important bits of the phy_id
- * features: A list of features (speed, duplex, etc) supported
- * by this PHY
+ * features: A mandatory list of features (speed, duplex, etc)
+ * supported by this PHY
* flags: A bitfield defining certain other features this PHY
* supports (like interrupts)
*


This patch leads to the following dump during boot and no working
network:

Jan 29 13:11:50 ragnarok kernel: RIP: 0010:phy_driver_register+0x5f/0x80
[libphy]
Jan 29 13:11:50 ragnarok kernel: Code: dd 06 a0 48 c7 47 40 30 e4 06 a0 48 89
77 10 c7 47 24 02 00 00 00 e8 10 5c 44 e1 89 c5 85 c0 0f 85 aa 1d 00 00 89 e8 5b
5d c3 <0f> 0b 48 c7 c7 f0 20 07 a0 48 89 c6 bd ea ff ff ff e8 c9 4c 02 e1
Jan 29 13:11:50 ragnarok kernel: RSP: 0018:ffffc90000df3c78 EFLAGS:
00010246
Jan 29 13:11:50 ragnarok kernel: RAX: ffffffffa0071c8d RBX: 0000000000000000
RCX: 0000000000000000
Jan 29 13:11:50 ragnarok kernel: RDX: 0000000000000000 RSI: ffffffffa0076900
RDI: ffffffffa0076040
Jan 29 13:11:50 ragnarok kernel: RBP: ffffffffa007a000 R08: 0000000000000044
R09: 0000000000000228
Jan 29 13:11:50 ragnarok kernel: R10: ffff888603a0d6f8 R11:
000000000000003c R12: ffff888601252d00
Jan 29 13:11:50 ragnarok kernel: R13: 0000000000000002 R14:
ffffc90000df3e98 R15: ffffffffa0076900
Jan 29 13:11:50 ragnarok kernel: FS: 00007feea9ced800(0000)
GS:ffff888607240000(0000) knlGS:0000000000000000
Jan 29 13:11:50 ragnarok kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Jan 29 13:11:50 ragnarok kernel: CR2: 00007feea9cdfc42 CR3:
00000005fec64001 CR4: 00000000000606e0
Jan 29 13:11:50 ragnarok kernel: Call Trace:
Jan 29 13:11:50 ragnarok kernel: ? 0xffffffffa007a000
Jan 29 13:11:50 ragnarok kernel: phy_init+0x24/0x58 [libphy]
Jan 29 13:11:50 ragnarok kernel: do_one_initcall+0x43/0x1af
Jan 29 13:11:50 ragnarok kernel: ? _cond_resched+0x15/0x30
Jan 29 13:11:50 ragnarok kernel: ? kmem_cache_alloc_trace+0x154/0x1b0
Jan 29 13:11:50 ragnarok kernel: do_init_module+0x5a/0x210
Jan 29 13:11:50 ragnarok kernel: load_module+0x203a/0x2500
Jan 29 13:11:50 ragnarok kernel: ? vfs_read+0x10d/0x130
Jan 29 13:11:50 ragnarok kernel: ? __se_sys_finit_module+0xb1/0xd0
Jan 29 13:11:50 ragnarok kernel: __se_sys_finit_module+0xb1/0xd0
Jan 29 13:11:50 ragnarok kernel: do_syscall_64+0x3e/0xe0
Jan 29 13:11:50 ragnarok kernel:
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jan 29 13:11:50 ragnarok kernel: RIP: 0033:0x7feeaa2417c9
Jan 29 13:11:50 ragnarok kernel: Code: 00 00 00 75 05 48 83 c4 18 c3 e8 82 9c 01
00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05
<48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 87 c6 0c 00 f7 d8 64 89 01 48
Jan 29 13:11:50 ragnarok kernel: RSP: 002b:00007fff6fa6ea08 EFLAGS:
00000246 ORIG_RAX: 0000000000000139
Jan 29 13:11:50 ragnarok kernel: RAX: ffffffffffffffda RBX: 000055c6eac06db0
RCX: 00007feeaa2417c9
Jan 29 13:11:50 ragnarok kernel: RDX: 0000000000000000 RSI:
00007feeaa4198fd RDI: 0000000000000006
Jan 29 13:11:50 ragnarok kernel: RBP: 00007feeaa4198fd R08:
0000000000000000 R09: 000055c6eac03000
Jan 29 13:11:50 ragnarok kernel: R10: 0000000000000006 R11:
0000000000000246 R12: 0000000000000000
Jan 29 13:11:50 ragnarok kernel: R13: 000055c6eabfc1d0 R14:
0000000000020000 R15: 000055c6eac06db0
Jan 29 13:11:50 ragnarok kernel: ---[ end trace 18e05a0afeb140b2 ]---
Jan 29 13:11:50 ragnarok kernel: libphy: Generic 10G PHY: Driver features are
missing

Apparently the r8169 driver for my NIC does something wrong wrt. this
fix and the original commit (719655a14971), but regardless - reverting this
patch leads to a booting & working 4.19.19:

Jan 29 13:29:16 ragnarok kernel: libphy: r8169: probed
Jan 29 13:29:19 ragnarok kernel: RTL8211E Gigabit Ethernet r8169-400:00:
attached PHY driver [RTL8211E Gigabit Ethernet] (mii_bus:phy_addr=r8169-
400:00, irq=IGNORE)

I've cc'ed Heiner Kallweit who might know why phylib with this fix and r8169
don't like each other.

cheers,
Holger

Hi

This points to an issue with the Generic 10G PHY driver. The following patch seems to be missing from the 4.19 stable tree and should be merged:
f802912 net: phy: genphy_10g_driver: Avoid NULL pointer dereference

But that patch says it:
Fixes: 719655a14971 ("net: phy: Replace phy driver features u32 with link_mode bitmap")

Which is not in 4.19.y, it showed up in 4.20.

Nope, as you found 719655a14971 was backported:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.19.y&id=719655a149715f26fc4de904fe0aa83068bd5b9e

Ah, wait, I think that there was a patch backported to the 4.19.y queue
that should not have, specifically:
3e64cf7a435e ("net: phy: phy driver features are mandatory")
which said it too fixes:
719655a14971 ("net: phy: Replace phy driver features u32 with link_mode bitmap")


Holger, if you revert 3e64cf7a435e, does that solve the problem for you
here?

Yes, as per $SUBJECT if I patch --reverse 3e64cf7a435e it works again,
seem my original log above.

Fwiw just test-booted 5.0-rc4 and it works too.

cheers,
Holger