Re: [Intel-wired-lan] [PATCH iwl-next v2] libie: log more info when virtchnl fails

From: Li Li

Date: Wed Apr 29 2026 - 16:13:38 EST


On Mon, Apr 27, 2026 at 2:06 AM Loktionov, Aleksandr
<aleksandr.loktionov@xxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Intel-wired-lan <intel-wired-lan-bounces@xxxxxxxxxx> On Behalf
> > Of Li Li via Intel-wired-lan
> > Sent: Saturday, April 25, 2026 3:49 AM
> > To: Nguyen, Anthony L <anthony.l.nguyen@xxxxxxxxx>; Kitszel,
> > Przemyslaw <przemyslaw.kitszel@xxxxxxxxx>; David S. Miller
> > <davem@xxxxxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Eric Dumazet
> > <edumazet@xxxxxxxxxx>; intel-wired-lan@xxxxxxxxxxxxxxxx
> > Cc: netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; David
> > Decotigny <decot@xxxxxxxxxx>; Singhai, Anjali
> > <anjali.singhai@xxxxxxxxx>; Samudrala, Sridhar
> > <sridhar.samudrala@xxxxxxxxx>; Brian Vazquez <brianvv@xxxxxxxxxx>; Li
> > Li <boolli@xxxxxxxxxx>; Tantilov, Emil S <emil.s.tantilov@xxxxxxxxx>
> > Subject: [Intel-wired-lan] [PATCH iwl-next v2] libie: log more info
> > when virtchnl fails
> >
> > Virtchnl failures can be hard to debug without logs. Logging the
> > details of virtchnl transactions can be useful for debugging virtchnl-
> > related issues.
> >
> > Tested: Built and booted on a test machine.
> >
> > Signed-off-by: Li Li <boolli@xxxxxxxxxx>
> > ---
> > v2:
> > - Use dev_warn_ratelimited instead of dev_notice_ratelimited based on
> > reviewer feedback.
> >
> > drivers/net/ethernet/intel/libie/controlq.c | 15 +++++++++++++++
> > 1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/libie/controlq.c
> > b/drivers/net/ethernet/intel/libie/controlq.c
> > index ebc05355e39d..bf200fea1e12 100644
> > --- a/drivers/net/ethernet/intel/libie/controlq.c
> > +++ b/drivers/net/ethernet/intel/libie/controlq.c
> > @@ -762,6 +762,16 @@ libie_ctlq_xn_process_recv(struct
> > libie_ctlq_xn_recv_params *params,
> > status = ctlq_msg->chnl_retval ? -EFAULT : 0;
> >
> > xn = &xnm->ring[xn_index];
> > +
> > + if (ctlq_msg->chnl_retval) {
> > + dev_err_ratelimited(
> > + params->ctlq->dev,
> > + "Non-zero virtchnl ret val (msg op: %u, ret val:
> > %u, msg_cookie: %u, data_len: %u); xn op: %u, id: %u, cookie: %u\n",
> > + ctlq_msg->chnl_opcode, ctlq_msg->chnl_retval,
> > + msg_cookie, ctlq_msg->data_len, xn-
> > >virtchnl_opcode,
> > + xn->index, xn->cookie);
> Dangerous! You add xn logging before it's fields validation.

I intentionally added the log before the field validation because I
wanted to surface corrupted or timed out virtchnl replies. If we log
it only after the field validation, we won't have the visibility into
the corrupted / timed-out virtchnl replies.

But so far the bugs we have seen in our internal OOT driver all
appeared after the field validation, so I'm fine with moving the log
to after the validation as well. I will do it in the v3 patch.

>
>
> > + }
> > +
> > if (ctlq_msg->chnl_opcode != xn->virtchnl_opcode ||
> > msg_cookie != xn->cookie)
> > return false;
> > @@ -1011,6 +1021,11 @@ int libie_ctlq_xn_send(struct
> > libie_ctlq_xn_send_params *params)
> > params->recv_mem = xn->recv_mem;
> > break;
> > default:
> > + dev_warn_ratelimited(
> > + params->ctlq->dev,
> > + "Transaction failed (op %u, xn state: %d, id: %u,
> > cookie: %u, size: %zu)\n",
> > + params->chnl_opcode, xn->state, xn->index, xn-
> > >cookie,
> > + xn->recv_mem.iov_len);
> > ret = -EBADMSG;
> > break;
> > }
> > --
> > 2.54.0.rc2.544.gc7ae2d5bb8-goog
>