Re: [PATCH 07/11] cxl/mem: Trace Memory Module Event Record

From: Ira Weiny
Date: Wed Nov 30 2022 - 04:30:30 EST


On Thu, Nov 17, 2022 at 11:22:35AM +0000, Jonathan Cameron wrote:
> On Wed, 16 Nov 2022 17:23:58 -0800
> Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
>
> > On Wed, Nov 16, 2022 at 03:35:28PM +0000, Jonathan Cameron wrote:
> > > On Thu, 10 Nov 2022 10:57:54 -0800
> > > ira.weiny@xxxxxxxxx wrote:
> > >
> > > > From: Ira Weiny <ira.weiny@xxxxxxxxx>
> > > >
> > > > CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
> > > >
> > > > Determine if the event read is memory module record and if so trace the
> > > > record.
> > > >
> > > > Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> > > >
> > > Noticed that we have a mixture of fully capitalized and not for flags.
> > > With that either explained or tidied up:
> > >
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > >
> > > > +/*
> > > > + * Device Health Information - DHI
> > > > + *
> > > > + * CXL res 3.0 section 8.2.9.8.3.1; Table 8-100
> > > > + */
> > > > +#define CXL_DHI_HS_MAINTENANCE_NEEDED BIT(0)
> > > > +#define CXL_DHI_HS_PERFORMANCE_DEGRADED BIT(1)
> > > > +#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED BIT(2)
> > > > +#define show_health_status_flags(flags) __print_flags(flags, "|", \
> > > > + { CXL_DHI_HS_MAINTENANCE_NEEDED, "Maintenance Needed" }, \
> > > > + { CXL_DHI_HS_PERFORMANCE_DEGRADED, "Performance Degraded" }, \
> > > > + { CXL_DHI_HS_HW_REPLACEMENT_NEEDED, "Replacement Needed" } \
> > >
> > > Why are we sometime using capitals for flags (e.g patch 5) and not other times?
> >
> > Not sure what you mean. Do you mean this from patch 5?
> Nope
>
> +#define CXL_DPA_VOLATILE BIT(0)
> +#define CXL_DPA_NOT_REPAIRABLE BIT(1)
> +#define show_dpa_flags(flags) __print_flags(flags, "|", \
> + { CXL_DPA_VOLATILE, "VOLATILE" }, \
> + { CXL_DPA_NOT_REPAIRABLE, "NOT_REPAIRABLE" } \
> +)
> +
>
> Where they are all capitals. I thought that was maybe a flags vs other fields
> thing but it doesn't seem to be.

I've not made all flags capital on this and other patches.

Ira

>
>
> >
> > ...
> > { CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT, "Uncorrectable Event" }, \
> > { CXL_GMER_EVT_DESC_THRESHOLD_EVENT, "Threshold event" }, \
> > { CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW, "Poison List Overflow" } \
> > ...
> >
> > Threshold event was a mistake. This is the capitalization the spec uses.
> >
> > Bit[0]: Uncorrectable Event: When set, indicates the reported event is
> > ^^^^^^^^^^^^^^^^^^^
> > uncorrectable by the device. When cleared, indicates the reported
> > event was corrected by the device.
> >
> > Bit[1]: Threshold Event: When set, the event is the result of a
> > ^^^^^^^^^^^^^^^
> > threshold on the device having been reached. When cleared, the event
> > is not the result of a threshold limit.
> >
> > Bit[2]: Poison List Overflow Event: When set, the Poison List has
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^
> > overflowed, and this event is not in the Poison List. When cleared, the
> > Poison List has not overflowed.
> >
> >
> > I'll update this 'Event' in patch 5. Probably need to add 'Event' to the
> > Poison List...
> >
> > Ira
>