Re: [PATCH V8 net-next 05/11] net: hibmcge: Implement some .ndo functions
From: Jijie Shao
Date: Mon Sep 09 2024 - 10:30:59 EST
on 2024/9/9 20:19, Andrew Lunn wrote:
On Mon, Sep 09, 2024 at 12:04:53PM +0800, Jijie Shao wrote:
on 2024/9/9 11:05, Kalesh Anakkur Purayil wrote:
On Mon, Sep 9, 2024 at 8:11 AM Jijie Shao <shaojijie@xxxxxxxxxx> wrote:
+}
+
+static int hbg_net_open(struct net_device *netdev)
+{
+ struct hbg_priv *priv = netdev_priv(netdev);
+
+ if (test_and_set_bit(HBG_NIC_STATE_OPEN, &priv->state))
+ return 0;
[Kalesh] Is there a possibility that dev_open() can be invoked twice?
We want stop NIC when chang_mtu 、self_test or FLR.
So, driver will directly invoke hbg_net_stop() not dev_open() if need.
Therefore, driver must ensure that hbg_net_open() or hbg_net_stop() can not be invoked twice.
Generally, we don't want defensive programming. You seem to suggest
hbg_net_open and hbg_net_stop are called in pairs. If this is not
true, you have a bug? Rather than paper over the bug with a return,
let bad things happen so the bug is obvious.
No, HBG_NIC_STATE_OPEN is not intended to ensure that hbg_net_open() and
hbg_net_stop() are mutually exclusive.
Actually, when the driver do reset or self-test(ethtool -t or ethtool --reset or FLR).
We hope that no other data is transmitted or received at this time.
Therefore, the driver directly uses hbg_net_stop() to stop the NIC.
In this case, IFF_UP may be set. Therefore, dev_close() can be invoked.
As a result, hbg_net_stop is invoked twice.
In my opinion, driver is not suitable for directly using dev_open(), dev_close(),
or modifying dev->state. Therefore, HBG_NIC_STATE_OPEN is added to
ensure that hbg_net_stop() is not invoked twice.
if I remove HBG_NIC_STATE_OPEN, I may get log(10s sleep time is added during reset):
#ifconfig enp131s0f1 up
#echo 1 > /sys/bus/pci/devices/0000\:83\:00.1/reset & sleep 2; ifconfig enp131s0f1 down
[ 213.332855] hibmcge 0000:83:00.1: FLR prepare
[ 213.337905] [STUB][hbg_net_stop 81] HBG_NIC_STATE_OPEN already clear
[ 213.345064] hibmcge 0000:83:00.1 enp131s0f1: Link is Down
[ 213.351408] [STUB][hbg_reset_prepare 126] reset sleep 10s after hbg_net_stop
[ 215.359812] [STUB][hbg_net_stop 81] HBG_NIC_STATE_OPEN already clear
[ 223.991959] hibmcge 0000:83:00.1: rebuild success
[ 223.998656] ------------[ cut here ]------------
[ 223.998987] hibmcge 0000:83:00.1: reset done
[ 224.003950] called from state HALTED
[ 224.008898] hibmcge 0000:83:00.1: FLR done
[ 224.008926] WARNING: CPU: 7 PID: 4381 at drivers/net/phy/phy.c:1330 phy_stop+0x118/0x160
[ 224.022412] Modules linked in: hibmcge(OE) nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw iptable_security ip_set nfnetlink rfkill ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sunrpc nls_cp437 vfat fat ipmi_ssif hns_roce_hw_v2 sg ib_uverbs ib_core acpi_ipmi hisi_uncore_ddrc_pmu hisi_uncore_l3c_pmu hisi_uncore_hha_pmu ipmi_si hisi_uncore_pmu ipmi_devintf ipmi_msghandler sch_fq_codel fuse ext4 mbcache jbd2 sd_mod realtek t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 hclge hisi_sas_v3_hw crct10dif_ce hisi_sas_main uas ghash_ce libsas sha2_ce ahci libahci scsi_transport_sas sha256_arm64 sha1_ce sbsa_gwdt usb_storage hns3 nfit libata hnae3 libnvdimm i2c_designware_platform i2c_designware_core dm_mirror dm_region_hash
[ 224.022504] dm_log dm_mod aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
[ 224.022514] CPU: 7 PID: 4381 Comm: ifconfig Kdump: loaded Tainted: G OE 6.4.0+ #1
[ 224.022517] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, BIOS 1.93 10/13/2022
[ 224.022519] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 224.022521] pc : phy_stop+0x118/0x160
[ 224.022524] lr : phy_stop+0x118/0x160
[ 224.022527] sp : ffff80003375ba80
[ 224.022528] x29: ffff80003375ba80 x28: ffff20200a7f1a00 x27: 0000000000000000
[ 224.022533] x26: 0000000000000001 x25: 0000000000000000 x24: ffff20200c1d2260
[ 224.022537] x23: ffff20200c1d2070 x22: ffff20200c1d2000 x21: ffffb9747e21e514
[ 224.022541] x20: ffff20200c1d2980 x19: ffff002092d1d800 x18: 0000000000000020
[ 224.022544] x17: 0000000000000000 x16: ffffb9747e068f30 x15: ffffffffffffffff
[ 224.022548] x14: 0000000000000000 x13: 4445544c41482065 x12: 74617473206d6f72
[ 224.022552] x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffb9747d978500
[ 224.022556] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 000000000005fff4
[ 224.022560] x5 : 00000000002bffa8 x4 : 0000000000000000 x3 : 0000000000000000
[ 224.022564] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0021099c4800
[ 224.022568] Call trace:
[ 224.022571] phy_stop+0x118/0x160
[ 224.022574] hbg_phy_stop+0x20/0x38 [hibmcge]
[ 224.022583] hbg_net_stop+0x74/0xd0 [hibmcge]
[ 224.022590] __dev_close_many+0xbc/0x170
[ 224.022595] __dev_change_flags+0x120/0x300
[ 224.022600] dev_change_flags+0x2c/0x80
[ 224.022602] devinet_ioctl+0x63c/0x700
[ 224.022609] inet_ioctl+0x1e4/0x200
[ 224.022611] sock_do_ioctl+0x50/0x108
[ 224.022616] sock_ioctl+0x120/0x388
[ 224.022618] __arm64_sys_ioctl+0xb0/0x100
[ 224.022623] invoke_syscall+0x50/0x128
[ 224.022627] el0_svc_common.constprop.0+0x158/0x188
[ 224.022629] do_el0_svc+0x34/0x50
[ 224.022631] el0_svc+0x28/0xe0
[ 224.022637] el0t_64_sync_handler+0xb8/0xc0
[ 224.022640] el0t_64_sync+0x188/0x190
[ 224.022643] ---[ end trace 0000000000000000 ]---
If it is suitable to call dev_close() directly in driver,
I think HBG_NIC_STATE_OPEN can be removed.
Thanks
Jijie Shao