Re: [PATCH v3 0/4] Add managed SOFT RESERVE resource handling

From: Bowman, Terry
Date: Mon Apr 14 2025 - 10:42:14 EST


Hi Zhijian,

We reproduced the failures for the cases you describe below. The fix will
be included in v4, which I am working on now.

Regards,
Terry



On 4/7/2025 2:31 AM, Zhijian Li (Fujitsu) wrote:
> Hi Terry,
>
> If I understand correctly, this patch set has only considered the situation where the
> soft reserved area and the region are exactly the same, as in pattern 1.
>
> However, I believe we also need to consider situations where these two are not equal,
> which are outlined in pattern 2 and 3 below. Let me explain them:
>
> ===========================================
> Pattern 1:
> - region0 is created during OS boot because the HDM decoder is already programmed
> - After the OS has booted, region0 can be destroyed and then re-created
> ┌────────────────────┐
> │ CFMW │
> └────────────────────┘
> ┌────────────────────┐
> │ reserved0 │
> └────────────────────┘
> ┌────────────────────┐
> │ mem0 │
> └────────────────────┘
> ┌────────────────────┐
> │ region0 │
> └────────────────────┘
>
>
> Pattern 2:
> The HDM decoder is not in a committed state, so during the kernel boot process,
> region0 will not be created automatically. In this case, the soft reserved area will
> not be removed from the iomem tree. After the OS starts,
> users cannot create a region (cxl create-region) either, because the new
> region would intersect with the soft reserved area that is still present.
>
> ┌────────────────────┐
> │ CFMW │
> └────────────────────┘
> ┌────────────────────┐
> │ reserved0 │
> └────────────────────┘
> ┌────────────────────┐
> │ mem0* │
> └────────────────────┘
> ┌────────────────────┐
> │ N/A │ region0
> └────────────────────┘
> *HDM decoder in mem0 is not committed.
>
>
> Pattern 3:
> Region0 is a child of the soft reserved area. In this case, the soft reserved area will
> not be removed from the iomem tree, so region0 cannot be re-created after it is destroyed.
> ┌────────────────────┐
> │ CFMW │
> └────────────────────┘
> ┌────────────────────┐
> │ reserved │
> └────────────────────┘
> ┌────────────────────┐
> │ mem0 | mem1* │
> └────────────────────┘
> ┌────────────────────┐
> │region0 | N/A │ region1
> └────────────────────┘
> *HDM decoder in mem1 is not committed.
>
>
> Thanks
> Zhijian
>
>
>
> On 04/04/2025 02:33, Terry Bowman wrote:
>> Add the ability to manage SOFT RESERVE iomem resources prior to them being
>> added to the iomem resource tree. This allows drivers, such as CXL, to
>> remove any pieces of the SOFT RESERVE resource that intersect with created
>> CXL regions.
>>
>> The current approach of leaving the SOFT RESERVE resources as is can cause
>> failures during hotplug of devices, such as CXL, because the resource is
>> not available for reuse after teardown of the device.
>>
>> The approach is to add SOFT RESERVE resources to a separate tree during
>> boot. This allows any drivers to update the SOFT RESERVE resources before
>> they are merged into the iomem resource tree. In addition a notifier chain
>> is added so that drivers can be notified when these SOFT RESERVE resources
>> are added to the iomem resource tree.
>>
>> The CXL driver is modified to use a worker thread that waits for the CXL
>> PCI and CXL mem drivers to be loaded and for their probe routine to
>> complete. Then the driver walks through any created CXL regions to trim any
>> intersections with SOFT RESERVE resources in the iomem tree.
>>
>> The dax driver uses the new soft reserve notifier chain so it can consume
>> any remaining SOFT RESERVES once they're added to the iomem tree.
>>
>> v3 updates:
>> - Remove srmem resource tree from kernel/resource.c, this is no longer
>> needed in the current implementation. All SOFT RESERVE resources are now
>> put on the iomem resource tree.
>> - Remove the no longer needed SOFT_RESERVED_MANAGED kernel config option.
>> - Add the 'nid' parameter back to hmem_register_resource().
>> - Remove the no longer used soft reserve notification chain (introduced
>> in v2). The dax driver is now notified of SOFT RESERVED resources by
>> the CXL driver.
>>
>> v2 updates:
>> - Add config option SOFT_RESERVE_MANAGED to control use of the
>> separate srmem resource tree at boot.
>> - Only add SOFT RESERVE resources to the soft reserve tree during
>> boot, they go to the iomem resource tree after boot.
>> - Remove the resource trimming code in the previous patch to re-use
>> the existing code in kernel/resource.c
>> - Add functionality for the cxl acpi driver to wait for the cxl PCI
>> and mem drivers to load.
>>
>> Nathan Fontenot (4):
>> kernel/resource: Provide mem region release for SOFT RESERVES
>> cxl: Update Soft Reserved resources upon region creation
>> dax/hmem: Save the dax hmem platform device pointer
>> cxl/dax: Delay consumption of SOFT RESERVE resources
>>
>> drivers/cxl/Kconfig | 4 ---
>> drivers/cxl/acpi.c | 28 +++++++++++++++++++
>> drivers/cxl/core/Makefile | 2 +-
>> drivers/cxl/core/region.c | 34 ++++++++++++++++++++++-
>> drivers/cxl/core/suspend.c | 41 ++++++++++++++++++++++++++++
>> drivers/cxl/cxl.h | 3 +++
>> drivers/cxl/cxlmem.h | 9 -------
>> drivers/cxl/cxlpci.h | 1 +
>> drivers/cxl/pci.c | 2 ++
>> drivers/dax/hmem/device.c | 47 ++++++++++++++++----------------
>> drivers/dax/hmem/hmem.c | 10 ++++---
>> include/linux/dax.h | 11 +++++---
>> include/linux/ioport.h | 3 +++
>> include/linux/pm.h | 7 -----
>> kernel/resource.c | 55 +++++++++++++++++++++++++++++++++++---
>> 15 files changed, 202 insertions(+), 55 deletions(-)
>>
>>
>> base-commit: aae0594a7053c60b82621136257c8b648c67b512
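
The trimming discussed in this thread can be pictured as plain interval
arithmetic. The following is a minimal sketch in C, not code from the patch
set: struct range_sketch and trim_soft_reserve() are hypothetical names used
only for illustration, and the sketch assumes inclusive address ranges and a
single region per soft reserve.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range; not the kernel's struct resource. */
struct range_sketch {
	uint64_t start;
	uint64_t end; /* inclusive */
};

/*
 * Remove the part of the soft reserve 'sr' that overlaps 'region'.
 * Writes the leftover pieces (at most two) to 'out' and returns how
 * many pieces were written.
 */
static size_t trim_soft_reserve(struct range_sketch sr,
				struct range_sketch region,
				struct range_sketch out[2])
{
	size_t n = 0;

	/* No overlap: the soft reserve is left untouched. */
	if (region.end < sr.start || region.start > sr.end) {
		out[n++] = sr;
		return n;
	}

	/* Keep whatever lies below the region. */
	if (region.start > sr.start)
		out[n++] = (struct range_sketch){ sr.start, region.start - 1 };

	/* Keep whatever lies above the region. */
	if (region.end < sr.end)
		out[n++] = (struct range_sketch){ region.end + 1, sr.end };

	return n;
}
```

With a soft reserve of [0x1000, 0x1fff], an identical region (pattern 1)
leaves no pieces behind, while a region covering only [0x1000, 0x17ff]
(pattern 3) leaves [0x1800, 0x1fff]. In pattern 2 there is no region to trim
against, so under this sketch the soft reserve would stay intact.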