Re: [PATCH net-next v2] net: ethernet: ti: am65-cpsw/cpts: Fix CPTS release action

From: Roger Quadros
Date: Tue Jan 17 2023 - 04:29:56 EST


Siddharth,

On 17/01/2023 07:00, Siddharth Vadapalli wrote:
> Roger, Leon,
>
> On 16/01/23 21:31, Roger Quadros wrote:
>> Hi Siddharth,
>>
>> On 16/01/2023 09:43, Siddharth Vadapalli wrote:
>>>
>>>
>>> On 16/01/23 13:00, Leon Romanovsky wrote:
>>>> On Mon, Jan 16, 2023 at 10:15:17AM +0530, Siddharth Vadapalli wrote:
>>>>> The am65_cpts_release() function is registered as a devm_action in the
>>>>> am65_cpts_create() function in am65-cpts driver. When the am65-cpsw driver
>>>>> invokes am65_cpts_create(), am65_cpts_release() is added in the set of devm
>>>>> actions associated with the am65-cpsw driver's device.
>>>>>
>>>>> In the event of probe failure or probe deferral, the platform_drv_probe()
>>>>> function invokes dev_pm_domain_detach() which powers off the CPSW and the
>>>>> CPSW's CPTS hardware, both of which share the same power domain. Since the
>>>>> am65_cpts_disable() function invoked by the am65_cpts_release() function
>>>>> attempts to reset the CPTS hardware by writing to its registers, the CPTS
>>>>> hardware is assumed to be powered on at this point. However, the hardware
>>>>> is powered off before the devm actions are executed.
>>>>>
>>>>> Fix this by getting rid of the devm action for am65_cpts_release() and
>>>>> invoking it directly on the cleanup and exit paths.
>>>>>
>>>>> Fixes: f6bd59526ca5 ("net: ethernet: ti: introduce am654 common platform time sync driver")
>>>>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@xxxxxx>
>>>>> Reviewed-by: Roger Quadros <rogerq@xxxxxxxxxx>
>>>>> ---
>>>>> Changes from v1:
>>>>> 1. Fix the build issue when "CONFIG_TI_K3_AM65_CPTS" is not set. This
>>>>> error was reported by kernel test robot <lkp@xxxxxxxxx> at:
>>>>> https://lore.kernel.org/r/202301142105.lt733Lt3-lkp@xxxxxxxxx/
>>>>> 2. Collect Reviewed-by tag from Roger Quadros.
>>>>>
>>>>> v1:
>>>>> https://lore.kernel.org/r/20230113104816.132815-1-s-vadapalli@xxxxxx/
>>>>>
>>>>> drivers/net/ethernet/ti/am65-cpsw-nuss.c | 8 ++++++++
>>>>> drivers/net/ethernet/ti/am65-cpts.c | 15 +++++----------
>>>>> drivers/net/ethernet/ti/am65-cpts.h | 5 +++++
>>>>> 3 files changed, 18 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
>>>>> index 5cac98284184..00f25d8a026b 100644
>>>>> --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
>>>>> +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
>>>>> @@ -1913,6 +1913,12 @@ static int am65_cpsw_am654_get_efuse_macid(struct device_node *of_node,
>>>>> return 0;
>>>>> }
>>>>>
>>>>> +static void am65_cpsw_cpts_cleanup(struct am65_cpsw_common *common)
>>>>> +{
>>>>> + if (IS_ENABLED(CONFIG_TI_K3_AM65_CPTS) && common->cpts)
>>>>
>>>> Why do you have IS_ENABLED(CONFIG_TI_K3_AM65_CPTS), if
>>>> am65_cpts_release() defined as empty when CONFIG_TI_K3_AM65_CPTS not set?
>>>>
>>>> How is it possible to have common->cpts == NULL?
>>>
>>> Thank you for reviewing the patch. I realize now that checking
>>> CONFIG_TI_K3_AM65_CPTS is unnecessary.
>>>
>>> common->cpts remains NULL in the following cases:
>
> I realized that the cases I mentioned are not explained clearly. Therefore, I
> will mention the cases again, along with the section of code they correspond to,
> in order to make it clear.
>
> Case-1: am65_cpsw_init_cpts() returns 0 since CONFIG_TI_K3_AM65_CPTS is not
> enabled. This corresponds to the following section within am65_cpsw_init_cpts():
>
> if (!IS_ENABLED(CONFIG_TI_K3_AM65_CPTS))
> return 0;
>
> In this case, common->cpts remains NULL, but it is not a problem even if the
> am65_cpsw_nuss_probe() fails later, since the am65_cpts_release() function is
> NOP. Thus, this case is not an issue.
>
> Case-2: am65_cpsw_init_cpts() returns -ENOENT since the cpts node is not present
> in the device tree. This corresponds to the following section within
> am65_cpsw_init_cpts():
>
> node = of_get_child_by_name(dev->of_node, "cpts");
> if (!node) {
> dev_err(dev, "%s cpts not found\n", __func__);
> return -ENOENT;
> }
>
> In this case as well, common->cpts remains NULL, but it is not a problem because
> the probe fails and the execution jumps to "err_of_clear", which doesn't invoke
> am65_cpsw_cpts_cleanup(). Therefore, common->cpts being NULL is not a problem.
>
> Case-3 and Case-4 are described later in this mail.
>
>>> 1. am65_cpsw_init_cpts() returns 0 since CONFIG_TI_K3_AM65_CPTS is not enabled.
>>> 2. am65_cpsw_init_cpts() returns -ENOENT since the cpts node is not defined.
>>> 3. The call to am65_cpts_create() fails within the am65_cpsw_init_cpts()
>>> function with a return value of 0 when cpts is disabled.
>>
>> In this case common->cpts is not NULL and is set to error pointer.
>> Probe will continue normally.
>> Is it OK to call any of the cpts APIs with invalid handle?
>> Also am65_cpts_release() will be called with invalid handle.
>
> Yes Roger, thank you for pointing it out. When I wrote "cpts is disabled", I had
> meant that the following section is executed within the am65_cpsw_init_cpts()
> function:
>
> Case-3:
>
> cpts = am65_cpts_create(dev, reg_base, node);
> if (IS_ERR(cpts)) {
> int ret = PTR_ERR(cpts);
>
> of_node_put(node);
> if (ret == -EOPNOTSUPP) {
> dev_info(dev, "cpts disabled\n");
> return 0;
> }
>
> ......
> }
>
> Leon,
>
> In the above code, when the section corresponding to:
> dev_info(dev, "cpts disabled\n");
>
> is executed, CONFIG_TI_K3_AM65_CPTS is enabled. Therefore, the
> am65_cpts_release() is not NOP. If the probe fails after the call to
> am65_cpsw_init_cpts(), then the am65_cpsw_cpts_cleanup() function will be called
> in the cleanup path of probe, which needs to check for common->cpts not being
> NULL. This is because common->cpts is NULL after returning 0 from the
> am65_cpsw_init_cpts() function at the
> dev_info(dev, "cpts disabled\n");
>
> section. Thus, I believe that in this case, am65_cpts_release() shouldn't be
> invoked from the am65_cpsw_cpts_cleanup() function, since it would have already
> been invoked from am65_cpts_create()'s cleanup path. This can be ensured by
> checking whether common->cpts is NULL or not, before invoking
> am65_cpts_release() within am65_cpsw_cpts_cleanup().
>

Yes, I agree.

>>
>>> 4. The call to am65_cpts_create() within the am65_cpsw_init_cpts() function
>>> fails with an error.
>>
>> In this case common->cpts is not NULL and will invoke am65_cpts_release() with
>> invalid handle.
>
> Case-4: The call to am65_cpts_create() within the am65_cpsw_init_cpts() function
> fails with an error. This corresponds to the following section within
> am65_cpsw_init_cpts():
>
> cpts = am65_cpts_create(dev, reg_base, node);
> if (IS_ERR(cpts)) {
> ......
> dev_err(dev, "cpts create err %d\n", ret);
> return ret;
> }
>
>
> Roger,
>
> If the call to am65_cpts_create() fails with an error other than -EOPNOTSUPP,
> which corresponds to Case-4, the call to am65_cpts_release() would have been
> invoked within the am65_cpts_create()'s cleanup path itself if necessary. Also,
> when the error is not -EOPNOTSUPP, the am65_cpsw_init_cpts() function returns an
> error, due to which the execution jumps to "err_of_clear" in
> am65_cpsw_nuss_probe(). Therefore, am65_cpsw_cpts_cleanup() is not invoked in
> this case, due to which common->cpts being NULL is not a problem.

Correct.

>
>
> Roger, Leon, please review my comments and let me know. I think that Case-3
> demands checking whether common->cpts is NULL or not, within the
> am65_cpsw_cpts_cleanup() function.

Do you really need a separate am65_cpsw_cpts_cleanup() or can just add
the NULL check in am65_cpts_release()?

cheers,
-roger