Re: [PATCH 13/13] rasdaemon: ras-mc-ctl: Update logging of CXL memory module data to align with CXL spec rev 3.1

From: Jonathan Cameron
Date: Thu Nov 21 2024 - 10:39:46 EST


On Wed, 20 Nov 2024 09:59:23 +0000
<shiju.jose@xxxxxxxxxx> wrote:

> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>
> In CXL spec 3.1 section 8.2.9.2.1.3 Table 8-47, the Memory Module Event
> Record has been updated with the following new fields, along with new
> info for the Device Event Type and Device Health Information fields.
> 1. Validity Flags
> 2. Component Identifier
> 3. Device Event Sub-Type
>
> This update modifies ras-mc-ctl to parse and log CXL memory module event
> data stored in the RAS SQLite database table, reflecting the
> specification changes introduced in revision 3.1.
>
> Example output:
>
> ./util/ras-mc-ctl --errors
> ...
> CXL memory module events:
> 1 2024-11-20 00:22:33 +0000 error: memdev=mem0, host=0000:0f:00.0, serial=0x3, \
> log=Fatal, hdr_uuid=fe927475-dd59-4339-a586-79bab113b774, hdr_flags=0x1, , \
> hdr_handle=0x1, hdr_related_handle=0x0, hdr_timestamp=1970-01-01 00:04:38 +0000, \
> hdr_length=128, hdr_maint_op_class=0, hdr_maint_op_sub_class=1, \
> event_type: Temperature Change, event_sub_type: Unsupported Config Data, \
> health_status: 'MAINTENANCE_NEEDED' , 'REPLACEMENT_NEEDED' , \
> media_status: All Data Loss in Event of Power Loss, life_used=8, \
> dirty_shutdown_cnt=33, cor_vol_err_cnt=25, cor_per_err_cnt=45, \
> device_temp=3, add_status=3 \
> component_id:02 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \
> pldm_entity_id:00 00 00 00 00 00 pldm_resource_id:fc d2 7e 2f
> ...
>
> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>

Feels like there is a lot of duplication in here, but you aren't
really making it any worse and maybe it is hard to reduce it.
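
One common way to cut down this kind of duplication is to drive the
decoding from lookup tables, so each bitmask field shares one helper.
The sketch below is only an illustration in Python, not the patch's
Perl code. The names HEALTH_STATUS_BITS, decode_bits and format_field
are hypothetical, and the bit positions are assumptions made for the
example rather than values taken from the CXL spec or ras-mc-ctl.

```python
# Hypothetical bit-to-name table. The bit positions are illustrative
# assumptions, not taken from the CXL spec or from ras-mc-ctl.
HEALTH_STATUS_BITS = {
    0: "MAINTENANCE_NEEDED",
    1: "REPLACEMENT_NEEDED",
}


def decode_bits(value, table):
    """Return the names of all bits set in value, in ascending bit order."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


def format_field(label, value, table):
    """Format one bitmask field in the quoted style: label: 'A' , 'B'."""
    names = decode_bits(value, table)
    return f"{label}: " + " , ".join(f"'{name}'" for name in names)


if __name__ == "__main__":
    print(format_field("health_status", 0x3, HEALTH_STATUS_BITS))
```

With this shape, adding another bitmask field would only need a new
table, not another copy of the decode logic.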

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>