Re: [PATCH v10 04/22] IB/hns: Add RoCE engine reset function
From: Leon Romanovsky
Date: Mon Jun 27 2016 - 04:01:52 EST
On Sat, Jun 25, 2016 at 06:25:37PM +0800, Wei Hu (Xavier) wrote:
>
>
> On 2016/6/24 22:59, Leon Romanovsky wrote:
> >On Thu, Jun 16, 2016 at 10:35:12PM +0800, Lijun Ou wrote:
> >>This patch mainly added reset flow of RoCE engine in RoCE
> >>driver. It is necessary when RoCE is loaded and removed.
> >>
> >>Signed-off-by: Wei Hu <xavier.huwei@xxxxxxxxxx>
> >>Signed-off-by: Nenglong Zhao <zhaonenglong@xxxxxxxxxxxxx>
> >>Signed-off-by: Lijun Ou <oulijun@xxxxxxxxxx>
> >>---
...
> >>+
> >>+#define SLEEP_TIME_INTERVAL 20
> >>+
> >>+extern int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool enable);
> >Why did you add this extern?
> >You already exported this function.
> >drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:EXPORT_SYMBOL(hns_dsaf_roce_reset);
> Hi, Leon
>
> The function named hns_dsaf_roce_reset is defined in hns_dsaf_main.c
> It exists in hns_dsaf.ko(ethernet driver)
>
> RoCE driver will call this function.
>
> Your suggestion is that delete "extern" as below:
> In /drivers/infiniband/hw/hns/hns_roce_hw_v1.h:
>
> int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool
> enable);
>
> Right? or other soultion?
You placed it in header file.
Please move it to your hns_roce_hw_v1.c file.
>
>
> Regards
> Wei Hu
> >>+
> >>+#endif
> >>diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
> >>index 8924ce3..d5ccce2 100644
> >>--- a/drivers/infiniband/hw/hns/hns_roce_main.c
> >>+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
> >>@@ -71,7 +71,9 @@ static int hns_roce_get_cfg(struct hns_roce_dev *hr_dev)
> >> struct platform_device *pdev = NULL;
> >> struct resource *res;
> >>- if (!of_device_is_compatible(np, "hisilicon,hns-roce-v1")) {
> >>+ if (of_device_is_compatible(np, "hisilicon,hns-roce-v1")) {
> >>+ hr_dev->hw = &hns_roce_hw_v1;
> >>+ } else {
> >> dev_err(dev, "device no compatible!\n");
> >> return -EINVAL;
> >> }
> >>@@ -118,6 +120,11 @@ static int hns_roce_get_cfg(struct hns_roce_dev *hr_dev)
> >> return 0;
> >> }
> >>+static int hns_roce_engine_reset(struct hns_roce_dev *hr_dev, bool enable)
> >>+{
> >>+ return hr_dev->hw->reset(hr_dev, enable);
> >>+}
> >>+
> >> /**
> >> * hns_roce_probe - RoCE driver entrance
> >> * @pdev: pointer to platform device
> >>@@ -156,6 +163,12 @@ static int hns_roce_probe(struct platform_device *pdev)
> >> goto error_failed_get_cfg;
> >> }
> >>+ ret = hns_roce_engine_reset(hr_dev, true);
> >Do you have better solution to sense device without doing full reset of
> >your hardware?
> Hi, Leon
>
> In this place, we need reset RoCEE engine to ensure that RoCE engine can
> work correctly.
> Hip06 Soc only support full reset RoCEE engine.
>
> Regards
> Wei Hu
>
> >
> >>+ if (ret) {
> >>+ dev_err(dev, "Reset roce engine failed!\n");
> >>+ goto error_failed_get_cfg;
> >>+ }
> >>+
> >> error_failed_get_cfg:
> >> ib_dealloc_device(&hr_dev->ib_dev);
> >>@@ -170,6 +183,8 @@ static int hns_roce_remove(struct platform_device *pdev)
> >> {
> >> struct hns_roce_dev *hr_dev = platform_get_drvdata(pdev);
> >>+ (void)hns_roce_engine_reset(hr_dev, false);
> >Any reason to do explicit casting?
> Hi, Leon
>
> /**
> * hns_roce_engine_reset - reset roce
> * @hr_dev: roce device struct pointer
> * @enable: true -- drop reset, false -- reset
> * return 0 - success , negative --fail
> */
> static int hns_roce_engine_reset(struct hns_roce_dev *hr_dev, bool enable);
>
> hns_roce_engine_reset->hns_roce_v1_reset->hns_dsaf_roce_reset
>
> The err branch of hns_roce_engine_reset as belowï
> int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool enable)
> {
> <...>
> if (!is_of_node(dsaf_fwnode)) {
> pr_err("hisi_dsaf: Only support DT node!\n");
> return -EINVAL;
> }
>
> pdev = of_find_device_by_node(to_of_node(dsaf_fwnode));
> dsaf_dev = dev_get_drvdata(&pdev->dev);
> if (AE_IS_VER1(dsaf_dev->dsaf_ver)) {
> dev_err(dsaf_dev->dev, "%s v1 chip do not support roce!\n",
> dsaf_dev->ae_dev.name);
> return -ENODEV;
> }
> <...>
> }
>
> When the cpu is processing hns_roce_engine_reset(hr_dev, false),
> hns_roce_engine_reset(hr_dev, true) have been alomost processed
> sucessfully.
> From the err branch of hns_roce_engine_reset, we found at this time
> hns_roce_engine_reset(hr_dev, true) could not return err.
>
> In hns_roce_remove function, we call hns_roce_engine_reset(hr_dev, false),
> and doesn't need to judge the return value.
Do you see any compilation warning for this part of code?
struct hns_roce_dev *hr_dev = platform_get_drvdata(pdev);
+ hns_roce_engine_reset(hr_dev, false);
instead of
struct hns_roce_dev *hr_dev = platform_get_drvdata(pdev);
+ (void)hns_roce_engine_reset(hr_dev, false);
Attachment:
signature.asc
Description: Digital signature