Re: [PATCH net v2] bnxt_en: Fix NULL pointer crash in bnxt_ptp_enable during error cleanup
From: Pavan Chebbi
Date: Mon Jan 05 2026 - 10:53:06 EST
On Mon, Jan 5, 2026 at 6:59 PM Russell King (Oracle)
<linux@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, Jan 05, 2026 at 04:00:16AM -0800, Breno Leitao wrote:
> > When bnxt_init_one() fails during initialization (e.g.,
> > bnxt_init_int_mode returns -ENODEV), the error path calls
> > bnxt_free_hwrm_resources() which destroys the DMA pool and sets
> > bp->hwrm_dma_pool to NULL. Subsequently, bnxt_ptp_clear() is called,
> > which invokes ptp_clock_unregister().
> >
> > Since commit a60fc3294a37 ("ptp: rework ptp_clock_unregister() to
> > disable events"), ptp_clock_unregister() now calls
> > ptp_disable_all_events(), which in turn invokes the driver's .enable()
> > callback (bnxt_ptp_enable()) to disable PTP events before completing the
> > unregistration.
> >
> > bnxt_ptp_enable() attempts to send HWRM commands via bnxt_ptp_cfg_pin()
> > and bnxt_ptp_cfg_event(), both of which call hwrm_req_init(). This
> > function tries to allocate from bp->hwrm_dma_pool, causing a NULL
> > pointer dereference:
>
> This has revealed a latent bug in this driver. All the time that the
> PTP clock is registered, userspace can interact with it, and thus
> bnxt_ptp_enable() can be called. ptp_clock_unregister() unpublishes
> that interface.
>
> ptp_clock_unregister() must always be called _before_ tearing down any
> resources that the PTP clock implementation may use.
>
> From what you describe, it sounds like this patch fixes that.
>
> Looking at the driver, however, it looks very suspicious.
>
> __bnxt_hwrm_ptp_qcfg() seems to be the place where PTP is setup and
> initialised (and ptp_clock_register() called in bnxt_ptp_init()).
>
> First, it looks like bnxt_ptp_init() will tear down an existing PTP
> clock via bnxt_ptp_free() before then re-registering it. That seems
> odd.
This is to handle the firmware capabilities changes post an update,
like you guessed.
>
> Second, __bnxt_hwrm_ptp_qcfg() calls bnxt_ptp_clear() if
> bp->hwrm_spec_code < 0x10801 || !BNXT_CHIP_P5_PLUS(bp) is true or
> hwrm_req_init() fails. Is it really possible that we have the PTP
> clock registered when PTP isn't supported?
Right, this check may not make much sense because we call
__bnxt_hwrm_ptp_qcfg() only after we know PTP is supported.
Michael may tell better but I think we could improve by removing that check.
>
> Third, same concern but with __bnxt_hwrm_func_qcaps().
This case is different? __bnxt_hwrm_func_qcaps() is always called post
fw reset, where the capability could have changed.
>
> My guess is that this has something to do with firmware, and maybe
> upgrading it at runtime - so if the firmware gets upgraded to a
> version that doesn't support PTP, the driver removes PTP. However,
> can PTP be used while firmware is being upgraded, and what happens
> if, e.g. bnxt_ptp_enable() were called mid-upgrade? Would that be
> safe?
Hm.. you have a point. This is a consequence of commit a60fc3294a37.
We never had this situation before.
As it stands now, from what I know, bnxt_ptp_enable()'s fw commands
will be no-ops.
But yes, to be correct, I think we should have a fw PTP capability
check inside bnxt_ptp_enable().
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature