Re: [PATCH v3 2/2] PCI/AER: Split the AER stats into multiple sysfs attributes

From: Greg KH
Date: Wed Aug 28 2019 - 05:30:10 EST


On Tue, Aug 27, 2019 at 03:21:45PM -0700, Rajat Jain wrote:
> Split the AER stats into multiple sysfs attributes. Note that
> this changes the ABI of the AER stats, but hopefully, there
> aren't active users that need to change. This is how the AERs
> are being exposed now:
>
> localhost /sys/devices/pci0000:00/0000:00:1c.0/aer_stats # ls -l
> total 0
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit0_RxErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit12_Timeout
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit13_NonFatalErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit14_CorrIntErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit15_HeaderOF
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit6_BadTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit7_BadDLLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit8_Rollover
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit0_Undefined
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit12_TLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit13_FCP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit14_CmpltTO
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit15_CmpltAbrt
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit16_UnxCmplt
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit17_RxOF
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit18_MalfTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit19_ECRC
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit20_UnsupReq
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit21_ACSViol
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit22_UncorrIntErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit23_BlockedTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit24_AtomicOpBlocked
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit25_TLPBlockedErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit26_PoisonTLPBlocked
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit4_DLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit5_SDES
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit0_Undefined
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit12_TLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit13_FCP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit14_CmpltTO
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit15_CmpltAbrt
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit16_UnxCmplt
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit17_RxOF
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit18_MalfTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit19_ECRC
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit20_UnsupReq
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit21_ACSViol
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit22_UncorrIntErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit23_BlockedTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit24_AtomicOpBlocked
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit25_TLPBlockedErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit26_PoisonTLPBlocked
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit4_DLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit5_SDES
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_device_err_cor
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_device_err_fatal
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_device_err_nonfatal
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_rootport_err_cor
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_rootport_err_fatal
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_rootport_err_nonfatal
> localhost /sys/devices/pci0000:00/0000:00:1c.0/aer_stats #
>
> Each file has a single counter value. A single file containing all
> stats was frowned upon, as discussed here:
> https://lkml.org/lkml/2019/6/28/220
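
For illustration only: a minimal sketch of how userspace might consume
one-value-per-file attributes like the ones listed above. It assumes the
aer_stats/ directory layout proposed in this patch, which is not a settled
ABI. The function name read_aer_stats is hypothetical, and the code only
relies on each file holding a single decimal counter.

```python
from pathlib import Path


def read_aer_stats(stats_dir):
    """Return {attribute_name: count} for each one-value file in stats_dir.

    Assumes every attribute file contains a single decimal integer, as the
    patch description states. Anything else is skipped rather than guessed at.
    """
    root = Path(stats_dir)
    if not root.is_dir():
        # Directory absent (e.g. kernel without this layout): nothing to report.
        return {}

    counters = {}
    for entry in sorted(root.iterdir()):
        if not entry.is_file():
            continue
        try:
            counters[entry.name] = int(entry.read_text().strip())
        except (OSError, ValueError):
            # Unreadable file, or contents that are not a single integer.
            continue
    return counters


# Hypothetical usage, with the device path taken from the listing above:
#   stats = read_aer_stats("/sys/devices/pci0000:00/0000:00:1c.0/aer_stats")
#   nonzero = {name: n for name, n in stats.items() if n}
```

With one value per file, the reader needs no format-specific parsing, which
is the usual argument for splitting a multi-line sysfs file this way.
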
>
> Signed-off-by: Rajat Jain <rajatja@xxxxxxxxxx>
> ---
> v3: indent the sysfs attribute names in documentation.
> v2: Also change the Documentation
>
> .../testing/sysfs-bus-pci-devices-aer_stats | 160 ++++++++---------
> drivers/pci/pcie/aer.c | 166 +++++++++++++-----
> 2 files changed, 191 insertions(+), 135 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats b/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> index 3c9a8c4a25eb..8cd93acddf76 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> +++ b/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> @@ -9,89 +9,72 @@ errors may be "seen" / reported by the link partner and not the
> problematic endpoint itself (which may report all counters as 0 as it never
> saw any problems).
>
> -What: /sys/bus/pci/devices/<dev>/aer_dev_correctable
> -Date: July 2018
> -KernelVersion: 4.19.0
> +What: Following files in /sys/bus/pci/devices/<dev>/aer_stats/
> + correctable_bit0_RxErr
> + correctable_bit12_Timeout
> + correctable_bit13_NonFatalErr
> + correctable_bit14_CorrIntErr
> + correctable_bit15_HeaderOF
> + correctable_bit6_BadTLP
> + correctable_bit7_BadDLLP
> + correctable_bit8_Rollover
> + fatal_bit0_Undefined
> + fatal_bit12_TLP
> + fatal_bit13_FCP
> + fatal_bit14_CmpltTO
> + fatal_bit15_CmpltAbrt
> + fatal_bit16_UnxCmplt
> + fatal_bit17_RxOF
> + fatal_bit18_MalfTLP
> + fatal_bit19_ECRC
> + fatal_bit20_UnsupReq
> + fatal_bit21_ACSViol
> + fatal_bit22_UncorrIntErr
> + fatal_bit23_BlockedTLP
> + fatal_bit24_AtomicOpBlocked
> + fatal_bit25_TLPBlockedErr
> + fatal_bit26_PoisonTLPBlocked
> + fatal_bit4_DLP
> + fatal_bit5_SDES
> + nonfatal_bit0_Undefined
> + nonfatal_bit12_TLP
> + nonfatal_bit13_FCP
> + nonfatal_bit14_CmpltTO
> + nonfatal_bit15_CmpltAbrt
> + nonfatal_bit16_UnxCmplt
> + nonfatal_bit17_RxOF
> + nonfatal_bit18_MalfTLP
> + nonfatal_bit19_ECRC
> + nonfatal_bit20_UnsupReq
> + nonfatal_bit21_ACSViol
> + nonfatal_bit22_UncorrIntErr
> + nonfatal_bit23_BlockedTLP
> + nonfatal_bit24_AtomicOpBlocked
> + nonfatal_bit25_TLPBlockedErr
> + nonfatal_bit26_PoisonTLPBlocked
> + nonfatal_bit4_DLP
> + nonfatal_bit5_SDES

{sigh}

Does this look good to you? The original tags here are carefully
aligned, and you are not doing that at all.

Yes, this is a trivial complaint, but please, these should be easy to
read and understand, and you aren't making them that way here...


> +Date: Aug 2019
> +KernelVersion: 5.3.0

I do not think this will hit 5.3, right?

thanks,

greg k-h