[PATCH v2 0/6] Address Translation support for MI200 and MI300 models

From: Muralidhara M K
Date: Wed Nov 29 2023 - 02:35:59 EST


From: Muralidhara M K <muralidhara.mk@xxxxxxx>

This patchset adds MI200 heterogeneous address translation support and MI300A
address translation support, plus a few fixups to the HBM3 memory address maps to
convert the on-die (MCA decoded) address to a normalized address.

The patchset depends on Yazen's "AMD Address Translation Library" patches:
https://lore.kernel.org/r/20231005173526.42831-1-yazen.ghannam@xxxxxxx

The patchset does the following:

Patch 1:
MI200 heterogeneous address translation support.

Patch 2:
MI300 heterogeneous address translation support.

Patch 3:
Convert HBM3 MCA Decoded address to Normalized address.

Patch 4:
Add a lookup table to get the correct CS instance ID for HBM3.

Patch 5:
Convert the physical CS ID to the logical CS ID using a static lookup
table.

Patch 6:
Identify the system physical addresses of all 8 columns in each HBM3 row and
retire all of them when an error is injected, to avoid future errors.
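
For readers unfamiliar with the approach in patches 5 and 6, the following is a
minimal userspace sketch of the two ideas: a static physical-to-logical CS ID
table, and expanding one address into the 8 column addresses of its row. It is
not the code from the patches. The table contents, the column bit position
(COL_SHIFT) and the function names are hypothetical placeholders chosen for
illustration; the real values are defined in the patches themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CS_IDS	8	/* hypothetical table size */
#define NUM_COLS	8	/* 8 columns per HBM3 row, per the cover letter */

/* Assumption: the column index occupies 3 contiguous address bits. */
#define COL_SHIFT	5
#define COL_MASK	(UINT64_C(0x7) << COL_SHIFT)

/* Placeholder mapping, NOT the real MI300 table. */
static const uint8_t phys_to_log_cs[NUM_CS_IDS] = { 0, 4, 1, 5, 2, 6, 3, 7 };

/* Translate a physical CS ID to a logical one; returns -1 if out of range. */
static int cs_phys_to_logical(unsigned int phys_id, uint8_t *log_id)
{
	assert(log_id != NULL);

	if (phys_id >= NUM_CS_IDS)
		return -1;

	*log_id = phys_to_log_cs[phys_id];
	return 0;
}

/*
 * Fill out[] with the addresses of every column in the row that contains
 * addr: clear the column bits, then set each column index in turn.
 */
static void row_column_addrs(uint64_t addr, uint64_t out[NUM_COLS])
{
	uint64_t base = addr & ~COL_MASK;
	unsigned int col;

	assert(out != NULL);

	for (col = 0; col < NUM_COLS; col++)
		out[col] = base | ((uint64_t)col << COL_SHIFT);
}
```

In this sketch a caller would pass each entry of out[] on to whatever page
retirement mechanism is in use; that step is intentionally left out here.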

Muralidhara M K (6):
RAS: Add Address Translation support for MI200
RAS: Add Address Translation support for MI300
RAS: Add MCA Error address conversion for UMC
RAS: Add static lookup table to get CS physical ID
RAS: Add fixed Physical to logical CS ID mapping table
RAS: EDAC/amd64: Retire all system physical address from HBM3 row

drivers/edac/amd64_edac.c | 3 +
drivers/ras/amd/atl/core.c | 5 +-
drivers/ras/amd/atl/dehash.c | 149 ++++++++++++++++
drivers/ras/amd/atl/denormalize.c | 110 +++++++++++-
drivers/ras/amd/atl/internal.h | 27 ++-
drivers/ras/amd/atl/map.c | 158 ++++++++++++++---
drivers/ras/amd/atl/reg_fields.h | 34 ++++
drivers/ras/amd/atl/system.c | 4 +
drivers/ras/amd/atl/umc.c | 284 +++++++++++++++++++++++++++++-
include/linux/amd-atl.h | 2 +
10 files changed, 747 insertions(+), 29 deletions(-)

--
2.25.1