Re: [PATCH v10 04/22] IB/hns: Add RoCE engine reset function
From: oulijun
Date: Mon Jun 27 2016 - 04:32:21 EST
Hi, Leon
å 2016/6/27 16:01, Leon Romanovsky åé:
> On Sat, Jun 25, 2016 at 06:25:37PM +0800, Wei Hu (Xavier) wrote:
>>
>>
>> On 2016/6/24 22:59, Leon Romanovsky wrote:
>>> On Thu, Jun 16, 2016 at 10:35:12PM +0800, Lijun Ou wrote:
>>>> This patch mainly added reset flow of RoCE engine in RoCE
>>>> driver. It is necessary when RoCE is loaded and removed.
>>>>
>>>> Signed-off-by: Wei Hu <xavier.huwei@xxxxxxxxxx>
>>>> Signed-off-by: Nenglong Zhao <zhaonenglong@xxxxxxxxxxxxx>
>>>> Signed-off-by: Lijun Ou <oulijun@xxxxxxxxxx>
>>>> ---
>
> ...
>
>>>> +
>>>> +#define SLEEP_TIME_INTERVAL 20
>>>> +
>>>> +extern int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool enable);
>>> Why did you add this extern?
>>> You already exported this function.
>>> drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:EXPORT_SYMBOL(hns_dsaf_roce_reset);
>> Hi, Leon
>>
>> The function named hns_dsaf_roce_reset is defined in hns_dsaf_main.c
>> It exists in hns_dsaf.ko(ethernet driver)
>>
>> RoCE driver will call this function.
>>
>> Your suggestion is that delete "extern" as below:
>> In /drivers/infiniband/hw/hns/hns_roce_hw_v1.h:
>>
>> int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool
>> enable);
>>
>> Right? or other soultion?
>
> You placed it in header file.
> Please move it to your hns_roce_hw_v1.c file.
>
You suggest to do as follows, right?
in hns_roce_hw_v1.c
int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool enable);
and delete the keyword extern
Bcause reserve the extern in hns_roce_hw_v1.c, the checkpatch is not pass.
>>
>>
>> Regards
>> Wei Hu
>>>> +
>>>> +#endif
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
>>>> index 8924ce3..d5ccce2 100644
>>>> --- a/drivers/infiniband/hw/hns/hns_roce_main.c
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_main.c
>>>> @@ -71,7 +71,9 @@ static int hns_roce_get_cfg(struct hns_roce_dev *hr_dev)
>>>> struct platform_device *pdev = NULL;
>>>> struct resource *res;
>>>> - if (!of_device_is_compatible(np, "hisilicon,hns-roce-v1")) {
>>>> + if (of_device_is_compatible(np, "hisilicon,hns-roce-v1")) {
>>>> + hr_dev->hw = &hns_roce_hw_v1;
>>>> + } else {
>>>> dev_err(dev, "device no compatible!\n");
>>>> return -EINVAL;
>>>> }
>>>> @@ -118,6 +120,11 @@ static int hns_roce_get_cfg(struct hns_roce_dev *hr_dev)
>>>> return 0;
>>>> }
>>>> +static int hns_roce_engine_reset(struct hns_roce_dev *hr_dev, bool enable)
>>>> +{
>>>> + return hr_dev->hw->reset(hr_dev, enable);
>>>> +}
>>>> +
>>>> /**
>>>> * hns_roce_probe - RoCE driver entrance
>>>> * @pdev: pointer to platform device
>>>> @@ -156,6 +163,12 @@ static int hns_roce_probe(struct platform_device *pdev)
>>>> goto error_failed_get_cfg;
>>>> }
>>>> + ret = hns_roce_engine_reset(hr_dev, true);
>>> Do you have better solution to sense device without doing full reset of
>>> your hardware?
>> Hi, Leon
>>
>> In this place, we need reset RoCEE engine to ensure that RoCE engine can
>> work correctly.
>> Hip06 Soc only support full reset RoCEE engine.
>>
>> Regards
>> Wei Hu
>>
>>>
>>>> + if (ret) {
>>>> + dev_err(dev, "Reset roce engine failed!\n");
>>>> + goto error_failed_get_cfg;
>>>> + }
>>>> +
>>>> error_failed_get_cfg:
>>>> ib_dealloc_device(&hr_dev->ib_dev);
>>>> @@ -170,6 +183,8 @@ static int hns_roce_remove(struct platform_device *pdev)
>>>> {
>>>> struct hns_roce_dev *hr_dev = platform_get_drvdata(pdev);
>>>> + (void)hns_roce_engine_reset(hr_dev, false);
>>> Any reason to do explicit casting?
>> Hi, Leon
>>
>> /**
>> * hns_roce_engine_reset - reset roce
>> * @hr_dev: roce device struct pointer
>> * @enable: true -- drop reset, false -- reset
>> * return 0 - success , negative --fail
>> */
>> static int hns_roce_engine_reset(struct hns_roce_dev *hr_dev, bool enable);
>>
>> hns_roce_engine_reset->hns_roce_v1_reset->hns_dsaf_roce_reset
>>
>> The err branch of hns_roce_engine_reset as belowï
>> int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool enable)
>> {
>> <...>
>> if (!is_of_node(dsaf_fwnode)) {
>> pr_err("hisi_dsaf: Only support DT node!\n");
>> return -EINVAL;
>> }
>>
>> pdev = of_find_device_by_node(to_of_node(dsaf_fwnode));
>> dsaf_dev = dev_get_drvdata(&pdev->dev);
>> if (AE_IS_VER1(dsaf_dev->dsaf_ver)) {
>> dev_err(dsaf_dev->dev, "%s v1 chip do not support roce!\n",
>> dsaf_dev->ae_dev.name);
>> return -ENODEV;
>> }
>> <...>
>> }
>>
>> When the cpu is processing hns_roce_engine_reset(hr_dev, false),
>> hns_roce_engine_reset(hr_dev, true) have been alomost processed
>> sucessfully.
>> From the err branch of hns_roce_engine_reset, we found at this time
>> hns_roce_engine_reset(hr_dev, true) could not return err.
>>
>> In hns_roce_remove function, we call hns_roce_engine_reset(hr_dev, false),
>> and doesn't need to judge the return value.
>
> Do you see any compilation warning for this part of code?
>
> struct hns_roce_dev *hr_dev = platform_get_drvdata(pdev);
> + hns_roce_engine_reset(hr_dev, false);
>
> instead of
>
> struct hns_roce_dev *hr_dev = platform_get_drvdata(pdev);
> + (void)hns_roce_engine_reset(hr_dev, false);
>
No warning.
However, the result of PClint checking is error, because the hns_roce_engine_reset have return value.
thanks
Lijun Ou