RE: [PATCH iwl-next v11] ice: add support for unmanaged DPLL on E830 NIC

From: Kubalewski, Arkadiusz

Date: Fri May 08 2026 - 07:28:36 EST

>From: Keller, Jacob E <jacob.e.keller@xxxxxxxxx>
>Sent: Tuesday, May 5, 2026 12:31 AM
>
>On 2/17/2026 7:58 AM, Arkadiusz Kubalewski wrote:
>> Hardware variants of E830 may support an unmanaged DPLL where the
>> configuration is hardcoded within the hardware and firmware, meaning
>> users cannot modify settings. However, users are able to check the DPLL
>> lock status and obtain configuration information through the Linux DPLL
>> and devlink health subsystem.
>>
>> Availability of 'loss of lock' health status code determines if such
>> support is available, if true, register single DPLL device with 1 input
>> and 1 output and provide hardcoded/read only properties of a pin and
>> DPLL device. User is only allowed to check DPLL device status and
>> receive
>> notifications on DPLL lock status change.
>>
>> When present, the DPLL device locks to an external signal provided
>> through the PCIe/OCP pin. The expected input signal is 1PPS
>> (1 Pulse Per Second) embedded on a 10MHz reference clock.
>> The DPLL produces output:
>> - for MAC (Media Access Control) & PHY (Physical Layer) clocks,
>> - 1PPS for synchronization of onboard PHC (Precision Hardware Clock)
>> timer.
>>
>> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@xxxxxxxxx>
>> Reviewed-by: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
>> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@xxxxxxxxx>
>> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@xxxxxxxxx>
>> ---
>> v11:
>> - rebase and fix conflicts
>
>
>Jakub rightfully pointed out the mistaken inclusion of HAVE_DPLL_ESYNC.
>

True, fixed in v12.

>> diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c
>> b/drivers/net/ethernet/intel/ice/ice_dpll.c
>> index 73a4e28ed75f..64518202dfff 100644
>> --- a/drivers/net/ethernet/intel/ice/ice_dpll.c
>> +++ b/drivers/net/ethernet/intel/ice/ice_dpll.c
>> @@ -2588,6 +2601,21 @@ static const struct dpll_pin_ops
>> ice_dpll_output_ops = {
>> .esync_get = ice_dpll_output_esync_get,
>> };
>>
>> +static const struct dpll_pin_ops ice_dpll_input_unmanaged_ops = {
>> + .frequency_get = ice_dpll_input_frequency_get,
>> + .direction_get = ice_dpll_input_direction,
>> + .state_on_dpll_get = ice_dpll_input_state_get,
>> +#if defined(HAVE_DPLL_ESYNC)
>> + .esync_get = ice_dpll_input_esync_get,
>> +#endif /* HAVE_DPLL_ESYNC */
>> +};
>> +
>Please fix this, and remove any other remaining kcompat cruft as well.
>
>Also could you please review the following Sashiko reports and make sure
>there is no issues?
>

Sure thing.

>The netdev sashiko.dev bot says this:
>
>> commit 7aae4a54d3ef18d8694bc535b6e5959839c9adfd
>> Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@xxxxxxxxx>
>> ice: add support for unmanaged DPLL on E830 NIC
>> Hardware variants of E830 may support an unmanaged DPLL where the
>> configuration is hardcoded. This patch allows users to check the DPLL
>> lock status and obtain configuration information through the Linux DPLL
>> and devlink health subsystem.
>>> diff --git a/drivers/net/ethernet/intel/ice/devlink/health.c
>>>b/drivers/net/ethernet/intel/ice/devlink/health.c
>>> --- a/drivers/net/ethernet/intel/ice/devlink/health.c
>>> +++ b/drivers/net/ethernet/intel/ice/devlink/health.c
>> [ ... ]
>>> @@ -242,6 +244,8 @@ void ice_process_health_status_event(struct ice_pf
>>>*pf, struct ice_rq_event_info
>>> pf->health_reporters.fw_status = *health_info;
>>> devlink_health_report(pf->health_reporters.fw,
>>> "FW syndrome reported", NULL);
>>> + if (status_code ==
>>> ICE_AQC_HEALTH_STATUS_INFO_LOSS_OF_LOCK)
>>> + ice_dpll_lock_state_set_unmanaged(pf,
>> health_info, true);
>> Is it possible for this health event to arrive when the DPLL subsystem
>> is not
>> fully initialized or already torn down?
>> Looking at ice_dpll_lock_state_set_unmanaged(), it unconditionally
>> acquires
>> pf->dplls.lock:
>> void ice_dpll_lock_state_set_unmanaged(...)
>> {
>> ...
>> mutex_lock(&pf->dplls.lock);
>> ...
>> }
>> If the hardware doesn't support DPLL, ice_dpll_init() is never called,
>> leaving
>> the mutex uninitialized. Additionally, if ice_dpll_init_unmanaged()
>> fails,
>> it destroys the mutex and exits.
>> During driver unload, ice_dpll_deinit() destroys the mutex and frees the
>> DPLL structures, but the service task processing these admin queue
>> events
>> might not be stopped until later in the teardown sequence.
>> Could this lead to a kernel panic from locking an uninitialized or
>> destroyed
>> mutex, or a use-after-free when passing the d->dpll pointer to
>> dpll_device_change_ntf()?
>> Should there be a check ensuring DPLL is active (e.g. by checking
>> test_bit(ICE_FLAG_DPLL, pf->flags)) before handling the event?
>
>
>My own execution using claude had this to say:
>

Sure, fix in v12.
Except last finding.

>> Now I have everything. Let me compose the report.
>> commit d9b481c1d5d8be9dbbca36f40be64db91eb12caa
>> Author: Arkadiusz Kubalewski <arkadiusz.kubalewski@xxxxxxxxx>
>> ice: add support for unmanaged DPLL on E830 NIC
>> This patch adds support for an unmanaged DPLL on E830 NIC variants,
>> where the DPLL configuration is hardcoded in hardware/firmware. It
>> registers a single PPS DPLL device with one input and one output,
>> reporting lock status via health status events and the DPLL subsystem.
>> A few things stood out during review.
>>> diff --git a/drivers/net/ethernet/intel/ice/devlink/health.c
>>> b/drivers/net/ethernet/intel/ice/devlink/health.c
>>> index 8e9a8a8178d4..31e6c5107c97 100644
>>> --- a/drivers/net/ethernet/intel/ice/devlink/health.c
>>> +++ b/drivers/net/ethernet/intel/ice/devlink/health.c
>> [ ... ]
>>> @@ -242,6 +244,8 @@ void ice_process_health_status_event(struct ice_pf
>>> *pf, struct ice_rq_event_info
>>> pf->health_reporters.fw_status = *health_info;
>>> devlink_health_report(pf->health_reporters.fw,
>>> "FW syndrome reported", NULL);
>>>+ if (status_code ==
>>>ICE_AQC_HEALTH_STATUS_INFO_LOSS_OF_LOCK)
>>>+ ice_dpll_lock_state_set_unmanaged(pf,
>>>health_info, true);
>>> break;
>> This calls ice_dpll_lock_state_set_unmanaged() for every PF that
>> receives a 0x601 health event, with no check for pf->dplls.unmanaged
>> or test_bit(ICE_FLAG_DPLL, pf->flags).
>> The only guard inside ice_dpll_lock_state_set_unmanaged() is
>> ice_pf_src_tmr_owned(), which checks PTP timer ownership -- not
>> whether the DPLL subsystem was initialized.
>> Consider a device where the firmware supports health code 0x601 but
>> the ZL80640 CGU is not in the netlist. In ice_init_features():
>>>+ err = ice_is_health_status_code_supported(&pf->hw, code,
>>>+ &pf->dplls.unmanaged);
>>>+ if (err || !ice_is_unmanaged_cgu_in_netlist(&pf->hw))
>>>+ pf->dplls.unmanaged = false;
>> This sets pf->dplls.unmanaged = false, so ice_dpll_init() is never
>> called and pf->dplls.lock is never initialized via mutex_init().
>> But the firmware can still send the health event, and then:
>>>+void ice_dpll_lock_state_set_unmanaged(struct ice_pf *pf,
>>>+ const struct ice_aqc_health_status_elem
>>>*buff,
>>>+ bool notify)
>>>+{
>> [ ... ]
>>>+ mutex_lock(&pf->dplls.lock);
>> Can this mutex_lock() on an uninitialized mutex cause undefined
>> behavior? With CONFIG_DEBUG_MUTEXES this would trigger a BUG().
>> Also, on managed DPLL devices (e.g. E810 with CGU), the DPLL worker
>> thread maintains pf->dplls.pps.dpll_state via ice_dpll_update_state().
>> Would this unconditional call overwrite the managed state machine?
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c
>>>b/drivers/net/ethernet/intel/ice/ice_dpll.c
>>> index 3eea83b0cdc9..00704dfeb3b2 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_dpll.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_dpll.c
>> [ ... ]
>>>+static const struct dpll_pin_ops ice_dpll_input_unmanaged_ops = {
>>>+ .frequency_get = ice_dpll_input_frequency_get,
>>>+ .direction_get = ice_dpll_input_direction,
>>>+ .state_on_dpll_get = ice_dpll_input_state_get,
>>>+#if defined(HAVE_DPLL_ESYNC)
>>>+ .esync_get = ice_dpll_input_esync_get,
>>>+#endif /* HAVE_DPLL_ESYNC */
>>>+};
>> HAVE_DPLL_ESYNC does not appear to be defined anywhere in the upstream
>> kernel tree -- it looks like an out-of-tree compatibility macro. In
>> upstream builds this means .esync_get is always NULL for unmanaged
>> input pins.
>> The managed ice_dpll_input_ops sets .esync_get unconditionally, and
>> ice_dpll_input_esync_get() already has explicit unmanaged-specific
>> logic added by this patch. The init code also unconditionally sets
>> ICE_AQC_GET_CGU_IN_CFG_STATUS_ESYNC_CAP and ICE_DPLL_IN_ESYNC_ENABLED
>> on unmanaged input pins.
>> The documentation example at the end of the patch shows
>> esync-frequency and esync-pulse in the pin dump, but those fields
>> will never appear in an upstream build since the callback is compiled
>> out. Should the ifdef be removed so this matches ice_dpll_input_ops?
>> [ ... ]
>>> @@ -3153,12 +3181,15 @@ ice_dpll_init_direct_pins(struct ice_pf *pf,
>>> bool cgu,
>>> int ret;
>>>
>>> ret = ice_dpll_get_pins(pf, pins, start_idx, count, pf-
>>> dplls.clock_id);
>>>- if (ret)
>>>+ if (!cgu || ret)
>>> return ret;
>>>- if (cgu) {
>>>+
>>>+ if (first) {
>>> ret = ice_dpll_register_pins(first, pins, ops, count);
>>> if (ret)
>>> goto release_pins;
>>>+ }
>>>+ if (second) {
>>> ret = ice_dpll_register_pins(second, pins, ops, count);
>>> if (ret)
>>> goto unregister_first;
>> ice_dpll_init_direct_pins() was updated with NULL guards for first
>> and second. However, ice_dpll_deinit_direct_pins() was not:
>> ice_dpll_deinit_direct_pins() {
>> if (cgu) {
>> ice_dpll_unregister_pins(first, pins, ops, count);
>> ice_dpll_unregister_pins(second, pins, ops, count);
>> }
>> ...
>> }
>> For the unmanaged case, ice_dpll_init_unmanaged() only creates the
>> PPS dpll -- eec.dpll stays NULL. If output pin init fails inside
>> ice_dpll_init_pins(pf, true), the error path falls through to
>> deinit_inputs which calls:
>> ice_dpll_deinit_direct_pins(pf, cgu=true, ...,
>> pf->dplls.pps.dpll,
>> pf->dplls.eec.dpll);
>> ^^^^
>> NULL
>> This passes NULL as second to ice_dpll_unregister_pins(), which
>> calls dpll_pin_unregister(NULL, ...), dereferencing the NULL dpll
>> pointer. Would it make sense to add the same NULL guards to
>> ice_dpll_deinit_direct_pins() that were added to the init path?
>> [ ... ]
>>> @@ -3923,6 +4001,15 @@ ice_dpll_init_info_direct_pins(struct ice_pf
>>> *pf,
>>> pins[i].prop.board_label = ice_cgu_get_pin_name(hw, i, input);
>>> pins[i].prop.type = ice_cgu_get_pin_type(hw, i, input);
>>> if (input) {
>>>+ if (pf->dplls.unmanaged) {
>>>+ pins[i].freq = ice_dpll_pin_freq_info(hw, i,
>>>+ input);
>>>+ pins[i].state[0] = DPLL_PIN_STATE_CONNECTED;
>>>+ pins[i].status =
>>>+ ICE_AQC_GET_CGU_IN_CFG_STATUS_ESYNC_CAP;
>>>+ pins[i].flags[0] = ICE_DPLL_IN_ESYNC_ENABLED;
>>>+ continue;
>>>+ }
>> [ ... ]
>>> } else {
>>>+ if (pf->dplls.unmanaged) {
>>>+ pins[i].freq = ice_dpll_pin_freq_info(hw, i,
>>>+ input);
>>>+ pins[i].state[0] = DPLL_PIN_STATE_CONNECTED;
>>>+ continue;
>>>+ }
>> [ ... ]
>>> pins[i].prop.freq_supported =
>>> ice_cgu_get_pin_freq_supp(hw, i, input, &freq_supp_num);
>>> pins[i].prop.freq_supported_num = freq_supp_num;
>>> pins[i].pf = pf;
>>> }
>> Both unmanaged branches use continue, which skips the assignments of
>> pins[i].prop.freq_supported, pins[i].prop.freq_supported_num, and
>> pins[i].pf at the bottom of the loop.
>> The missing freq_supported / freq_supported_num means the DPLL netlink
>> code cannot report supported frequency ranges to userspace for
>> unmanaged pins, even though the data is available from
>> ice_cgu_get_pin_freq_supp().
>> The missing pins[i].pf = pf assignment leaves the pointer NULL (from
>> kcalloc zeroing). Current unmanaged callbacks use d->pf from
>> dpll_priv so there is no immediate crash, but if any callback that
>> dereferences p->pf were added to the unmanaged ops in the future it
>> would hit a NULL pointer dereference. Was the continue intended to
>> skip all of these assignments?
>

The last one is intended behavior, added a comment in v12.

Thank you!
Arkadiusz