Re: [PATCH v18 04/19] EDAC: Add memory repair control feature

From: Mauro Carvalho Chehab
Date: Tue Jan 14 2025 - 08:47:30 EST


On Mon, 6 Jan 2025 12:10:00 +0000
<shiju.jose@xxxxxxxxxx> wrote:

> +What: /sys/bus/edac/devices/<dev-name>/mem_repairX/repair_function
> +Date: Jan 2025
> +KernelVersion: 6.14
> +Contact: linux-edac@xxxxxxxxxxxxxxx
> +Description:
> + (RO) Memory repair function type. For eg. post package repair,
> + memory sparing etc.
> + EDAC_SOFT_PPR - Soft post package repair
> + EDAC_HARD_PPR - Hard post package repair
> + EDAC_CACHELINE_MEM_SPARING - Cacheline memory sparing
> + EDAC_ROW_MEM_SPARING - Row memory sparing
> + EDAC_BANK_MEM_SPARING - Bank memory sparing
> + EDAC_RANK_MEM_SPARING - Rank memory sparing
> + All other values are reserved.
> +
> +What: /sys/bus/edac/devices/<dev-name>/mem_repairX/persist_mode
> +Date: Jan 2025
> +KernelVersion: 6.14
> +Contact: linux-edac@xxxxxxxxxxxxxxx
> +Description:
> + (RW) Read/Write the current persist repair mode set for a
> + repair function. Persist repair modes supported in the
> + device, based on the memory repair function is temporary
> + or permanent and is lost with a power cycle.
> + EDAC_MEM_REPAIR_SOFT - Soft repair function (temporary repair).
> + EDAC_MEM_REPAIR_HARD - Hard memory repair function (permanent repair).
> + All other values are reserved.
> +

After re-reading some things, I suspect that the above can be simplified
a little bit by folding soft/hard PPR into a single element at
/repair_function, and making it clearer that persist_mode is valid only
for PPR (I think this is the case, right?), e.g. something like:

What: /sys/bus/edac/devices/<dev-name>/mem_repairX/repair_function
...
Description:
(RO) Memory repair function type, e.g. post package repair,
memory sparing, etc. Valid values are:

- ppr - post package repair.
Please define its mode via
/sys/bus/edac/devices/<dev-name>/mem_repairX/ppr_persist_mode
- cacheline-sparing - Cacheline memory sparing
- row-sparing - Row memory sparing
- bank-sparing - Bank memory sparing
- rank-sparing - Rank memory sparing
- All other values are reserved.

and define persist_mode in a different way:

What: /sys/bus/edac/devices/<dev-name>/mem_repairX/ppr_persist_mode
...
Description:
(RW) Read/Write the current persist mode set for a post package
repair (PPR) function. The persist mode selects whether the
repair is temporary (lost with a power cycle) or permanent,
depending on what the device supports. Valid values are:

- repair-soft - Soft PPR function (temporary repair).
- repair-hard - Hard PPR function (permanent repair).
- All other values are reserved.
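
To illustrate how userspace might consume the two attributes as proposed above, here is a minimal sketch. It assumes the string values and the ppr_persist_mode name suggested in this mail (none of which is a merged ABI), and the helper names read_repair_function() and set_ppr_persist_mode() are made up for the example:

```python
from pathlib import Path

# Value strings as proposed in this mail; hypothetical, not a merged ABI.
REPAIR_FUNCTIONS = {
    "ppr",
    "cacheline-sparing",
    "row-sparing",
    "bank-sparing",
    "rank-sparing",
}
PPR_PERSIST_MODES = {"repair-soft", "repair-hard"}


def read_repair_function(instance_dir):
    """Return the repair function of one mem_repairX directory."""
    value = (Path(instance_dir) / "repair_function").read_text().strip()
    if value not in REPAIR_FUNCTIONS:
        # "All other values are reserved."
        raise ValueError(f"reserved repair_function value: {value!r}")
    return value


def set_ppr_persist_mode(instance_dir, mode):
    """Select soft or hard PPR; refuse it for the sparing functions."""
    if mode not in PPR_PERSIST_MODES:
        raise ValueError(f"reserved ppr_persist_mode value: {mode!r}")
    if read_repair_function(instance_dir) != "ppr":
        raise ValueError("ppr_persist_mode only applies to ppr")
    # Write the mode followed by a newline, as `echo` would.
    (Path(instance_dir) / "ppr_persist_mode").write_text(mode + "\n")
```

With that split, a tool only has to touch ppr_persist_mode when repair_function reads back "ppr"; the sparing functions never need it.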

Thanks,
Mauro