[PATCH v4 0/5] CXL Poison List Retrieval & Tracing

From: alison . schofield
Date: Thu Dec 15 2022 - 16:18:08 EST


From: Alison Schofield <alison.schofield@xxxxxxxxx>

Changes in v4:
- Rebase on cxl/preview
- Squash 2 mock patches into 1 mock patch
- Apply Jonathan's Reviewed-by tags on Patches 1,2,4,5
- Don't return an error on failure to read volatile range poison (Jonathan)
- Use strong types in trace event arguments supplying dev_names (Dan)
- Pass the media-error record structure to trace event (Steve, Ira)
- Re-order Patches 1 & 2 to make the change above work
- Use a poison state struct to hold buffer, lock (and max_mer) (Dan)
- Allocate the poison list payload buffer once (Dan)
- Request poison length in multiples of 64 bytes per CXL Spec
- Test for enabled when storing Identify command's max_mer
- Use get_unaligned_le24() on poison max_mer (Jonathan)
- Use decimal values for size (rsvd[20]) (Dan)
- cxl_test: mock with a valid DPA address
- s/includes/'consists of' (Jonathan)
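
Two of the items above (the get_unaligned_le24() read of max_mer and the
64-byte length granularity) can be illustrated in plain userspace C. This
is only a sketch, not code from the patches: example_get_le24() is a
hypothetical stand-in for the kernel helper, and example_len_to_units()
and EXAMPLE_POISON_LEN_UNIT are names invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed granularity of a poison length, in bytes (name is illustrative). */
#define EXAMPLE_POISON_LEN_UNIT 64u

/*
 * Hypothetical userspace stand-in for the kernel's get_unaligned_le24():
 * build a 24-bit little-endian value one byte at a time, so the source
 * pointer needs no particular alignment.
 */
static uint32_t example_get_le24(const uint8_t *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/*
 * Convert a byte count into whole 64-byte units, rounding up.
 * Written as divide-then-adjust so it cannot overflow near UINT64_MAX.
 */
static uint64_t example_len_to_units(uint64_t bytes)
{
	return bytes / EXAMPLE_POISON_LEN_UNIT +
	       (bytes % EXAMPLE_POISON_LEN_UNIT != 0);
}
```

Reading the field byte-wise avoids any assumption about how the 3-byte
field is aligned inside the payload.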

Link to v3:
https://lore.kernel.org/linux-cxl/cover.1668115235.git.alison.schofield@xxxxxxxxx/

Add support for retrieving device poison lists and emitting the returned
media-error records as kernel trace events.

The handling of the poison list is guided by the CXL 3.0 Specification
Section 8.2.9.8.4.1. [1]

Example, triggered by memdev:
$ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
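
The same trigger can be issued from a program instead of the shell. The
following is a minimal sketch, assuming only that the attribute accepts
the string "1" as in the echo above; example_trigger_poison_list() is a
hypothetical helper, not part of this series.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "1" to a sysfs-style attribute file, e.g.
 * /sys/bus/cxl/devices/mem3/trigger_poison_list.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int example_trigger_poison_list(const char *attr_path)
{
	assert(attr_path != NULL);

	FILE *f = fopen(attr_path, "w");
	if (!f)
		return -1;

	int ok = fputs("1\n", f) >= 0;

	/* A write error may only surface when the buffer is flushed on close. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```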

Example, triggered by region:
$ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
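
For scripted consumers, the numeric fields of a cxl_poison line can be
pulled out with a small parser. This is a sketch that assumes the
key=value layout shown in the examples above stays as printed;
example_trace_field_u64() is a hypothetical helper, not an interface
provided by this series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: locate "key=" in a trace line and parse the value
 * that follows as an unsigned integer (base auto-detected, so 0x40 works).
 * Returns 0 on success, -1 if the key is absent or its value is empty or
 * not numeric.
 */
static int example_trace_field_u64(const char *line, const char *key,
				   uint64_t *out)
{
	assert(line != NULL && key != NULL && out != NULL);

	size_t klen = strlen(key);
	const char *p = line;

	while ((p = strstr(p, key)) != NULL) {
		/* Accept only a whole field name: "dpa=", not "xdpa=" or "dpa_x=". */
		int at_start = (p == line) || (p[-1] == ' ');

		if (at_start && p[klen] == '=') {
			const char *val = p + klen + 1;
			char *end;

			/* An empty field such as "flags= " has no value to parse. */
			if (*val == ' ' || *val == '\0')
				return -1;

			unsigned long long v = strtoull(val, &end, 0);
			if (end == val)
				return -1;

			*out = v;
			return 0;
		}
		p += klen;
	}
	return -1;
}
```

Applied to the memdev example line, asking for "dpa" and "length" gives
0x0 and 0x40, while the empty "region" and "flags" fields are reported as
absent.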

[1]: https://www.computeexpresslink.org/download-the-specification

Alison Schofield (5):
  cxl/mbox: Add GET_POISON_LIST mailbox command
  cxl/trace: Add TRACE support for CXL media-error records
  cxl/memdev: Add trigger_poison_list sysfs attribute
  cxl/region: Add trigger_poison_list sysfs attribute
  tools/testing/cxl: Mock support for Get Poison List

 Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
 drivers/cxl/core/mbox.c                 | 79 +++++++++++++++++++++++
 drivers/cxl/core/memdev.c               | 45 ++++++++++++++
 drivers/cxl/core/region.c               | 33 ++++++++++
 drivers/cxl/core/trace.h                | 83 +++++++++++++++++++++++++
 drivers/cxl/cxlmem.h                    | 69 +++++++++++++++++++-
 drivers/cxl/pci.c                       |  4 ++
 tools/testing/cxl/test/mem.c            | 42 +++++++++++++
 8 files changed, 382 insertions(+), 1 deletion(-)


base-commit: a6591693d912a1cb88cc5a6d91a7b583481d3a84
--
2.37.3