Re: [PATCH] mlx4_ib: Increase the timeout for CM cache

From: Håkon Bugge
Date: Wed Feb 06 2019 - 13:17:04 EST

> On 6 Feb 2019, at 19:02, jackm <jackm@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, 6 Feb 2019 16:40:14 +0100
> Håkon Bugge <haakon.bugge@xxxxxxxxxx> wrote:
>
>> Jack,
>>
>> A major contributor to the long processing time in the PF driver
>> proxying QP1 packets is:
>>
>> create_pv_resources
>>   -> ib_create_cq(ctx->ib_dev, mlx4_ib_tunnel_comp_handler,
>>                   NULL, ctx, cq_size, 0);
>>
>> That is, comp_vector is zero.
>>
>> Due to commit 6ba1eb776461 ("IB/mlx4: Scatter CQs to different EQs"),
>> a zero comp_vector is intended to let the mlx4_core driver
>> select the least used vector.
>>
>> But, in mlx4_ib_create_cq(), we have:
>>
>> pr_info("eq_table: %p\n", dev->eq_table);
>> if (dev->eq_table) {
>>         vector = dev->eq_table[mlx4_choose_vector(dev->dev,
>>                 vector, ibdev->num_comp_vectors)];
>> }
>>
>> cq->vector = vector;
>>
>> and dev->eq_table is NULL, so all the CQs for the proxy QPs get
>> comp_vector zero.
>>
>> I have to make some reservations, as this analysis is based on uek4;
>> I believe the code is the same upstream, but I need to double-check.
>>
>>
>> Thxs, Håkon
>>
> Hi Hakon and Jason,
> I was ill today (bad cold, took antihistamines all day, which knocked
> me out).
> I'll get to this tomorrow.

No problem Jack. I actually see that our uek4 is different in mlx4_ib_alloc_eqs(), and that may well be the root cause here.

Hence, I have moved the MLs to Bcc and will get back to you tomorrow.


Thxs, Håkon