Re: [PATCH v2 net-next 3/3] octeontx2-af: Add devlink health reporters for NIX
From: Saeed Mahameed
Date: Thu Nov 05 2020 - 00:08:57 EST
On Wed, 2020-11-04 at 17:57 +0530, George Cherian wrote:
> Add health reporters for RVU NPA block.
^^^ NIX ?
Cc: Jiri
Anyway, could you please spare some words on what is NPA and what is
NIX?
Regarding the reporters names, all drivers register well known generic
names such as (fw,hw,rx,tx), I don't know if it is a good idea to use
vendor specific names, if you are reporting for hw/fw units then just
use "hw" or "fw" as the reporter name and append the unit NPA/NIX to
the counter/error names.
> Only reporter dump is supported.
>
> Output:
> # ./devlink health
> pci/0002:01:00.0:
> reporter npa
> state healthy error 0 recover 0
> reporter nix
> state healthy error 0 recover 0
> # ./devlink health dump show pci/0002:01:00.0 reporter nix
> NIX_AF_GENERAL:
> Memory Fault on NIX_AQ_INST_S read: 0
> Memory Fault on NIX_AQ_RES_S write: 0
> AQ Doorbell error: 0
> Rx on unmapped PF_FUNC: 0
> Rx multicast replication error: 0
> Memory fault on NIX_RX_MCE_S read: 0
> Memory fault on multicast WQE read: 0
> Memory fault on mirror WQE read: 0
> Memory fault on mirror pkt write: 0
> Memory fault on multicast pkt write: 0
> NIX_AF_RAS:
> Poisoned data on NIX_AQ_INST_S read: 0
> Poisoned data on NIX_AQ_RES_S write: 0
> Poisoned data on HW context read: 0
> Poisoned data on packet read from mirror buffer: 0
> Poisoned data on packet read from mcast buffer: 0
> Poisoned data on WQE read from mirror buffer: 0
> Poisoned data on WQE read from multicast buffer: 0
> Poisoned data on NIX_RX_MCE_S read: 0
> NIX_AF_RVU:
> Unmap Slot Error: 0
>
Now i am a little bit skeptic here, devlink health reporter
infrastructure was never meant to deal with dump op only, the main
purpose is to diagnose/dump and recover.
especially in your use case where you only report counters, i don't
believe devlink health dump is a proper interface for this.
Many of these counters if not most are data path packet based and maybe
they should belong to ethtool.