Re: [PATCH RFC v2 00/18] DCD: Add support for Dynamic Capacity Devices (DCD)

From: Ira Weiny
Date: Mon Sep 11 2023 - 22:21:36 EST


Fan Ni wrote:
> On Mon, Aug 28, 2023 at 10:20:51PM -0700, Ira Weiny wrote:

Sorry for the delay. I've been walking through the responses and only
just saw this one.

>
> Hi Ira,
>
> I tried to test the patch series with the qemu dcd patches, however, I
> hit some issues, and would like to check the following with you.
>
> 1. After we create a region for DC before any extents are added, a dax
> device will show under /dev. Is that what we want?

Yes, see the patch

cxl/region: Add Dynamic Capacity CXL region support

"Special case DC capable CXL regions to create a 0 sized seed DAX
device until others can be created on dynamic space later."

The seed device is required, but it is created with a size of 0. It can
be resized once extents are added.

> If I remember it
> correctly, the dax device used to show up after a dc extent is added.
>
>
> 2. add/release extent does not work correctly for me. The code path is
> not called, and I made the following changes to make it pass.

:-(

This is the problem with relying on cxl_test... I only realized this after
seeing Jorgen's email regarding the interrupt configuration code. I've
added that code back in. I'm not sure where it got lost along the way, but
it was completely missing from this RFC v2. Sorry about that.

> ---
> drivers/cxl/cxl.h | 3 ++-
> drivers/cxl/cxlmem.h | 1 +
> drivers/cxl/pci.c | 7 +++++++
> 3 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 2c73a30980b6..0d132c1739ce 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -168,7 +168,8 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> #define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO | \
> CXLDEV_EVENT_STATUS_WARN | \
> CXLDEV_EVENT_STATUS_FAIL | \
> - CXLDEV_EVENT_STATUS_FATAL)
> + CXLDEV_EVENT_STATUS_FATAL| \
> + CXLDEV_EVENT_STATUS_DCD)
>
> /* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> #define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 8ca81fd067c2..ae9dcb291c75 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -235,6 +235,7 @@ struct cxl_event_interrupt_policy {
> u8 warn_settings;
> u8 failure_settings;
> u8 fatal_settings;
> + u8 dyncap_settings;
> } __packed;
>
> /**
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 10c1a583113c..e30fe0304514 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -686,6 +686,7 @@ static int cxl_event_config_msgnums(struct cxl_memdev_state *mds,
> .warn_settings = CXL_INT_MSI_MSIX,
> .failure_settings = CXL_INT_MSI_MSIX,
> .fatal_settings = CXL_INT_MSI_MSIX,
> + .dyncap_settings = CXL_INT_MSI_MSIX,
> };
>
> mbox_cmd = (struct cxl_mbox_cmd) {
> @@ -739,6 +740,12 @@ static int cxl_event_irqsetup(struct cxl_memdev_state *mds)
> return rc;
> }
>
> + rc = cxl_event_req_irq(cxlds, policy.dyncap_settings);
> + if (rc) {
> + dev_err(cxlds->dev, "Failed to get interrupt for event dyncap log\n");
> + return rc;
> + }
> +
> return 0;
> }
>
> --
>
> 3. With changes made in 2, the code for add/release dc extent can be called,
> however, the system behaviour seems different from before. Previously, after a
> dc extent is added, it will show up with lsmem command and listed as offline.
> Now, nothing is showing. Is it expected? What should we do to make it usable
> as system ram?

Yes, this is expected. The previous behavior was not correct. It should be
possible to create DAX devices anywhere in the region, either within a
single extent or across extents. Dave Jiang mentioned to me internally that
it might help to add some ASCII art documentation showing how this works.
Generally, the dax region's available size increases when extents are
added, and new dax devices can then be created to use that space.
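
As a rough illustration of that accounting, here is a toy userspace model.
It is only a sketch, not the kernel implementation: every toy_* name is
hypothetical, sizes are plain byte counts, and it tracks only the total
capacity of the extents, not where each device is placed within them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DEVS 8

/* Toy model of a DC-backed dax region (hypothetical, not kernel code). */
struct toy_dax_region {
	uint64_t extent_bytes;           /* total capacity of added extents */
	uint64_t dev_size[TOY_MAX_DEVS]; /* current size of each dax device */
	size_t nr_devs;                  /* number of dax devices present */
};

/* Region creation: one zero-sized seed device and no capacity yet. */
void toy_region_init(struct toy_dax_region *r)
{
	assert(r != NULL);
	r->extent_bytes = 0;
	r->nr_devs = 1;
	for (size_t i = 0; i < TOY_MAX_DEVS; i++)
		r->dev_size[i] = 0;
}

/* Space not yet claimed by any dax device. */
uint64_t toy_available_size(const struct toy_dax_region *r)
{
	uint64_t used = 0;

	for (size_t i = 0; i < r->nr_devs; i++)
		used += r->dev_size[i];
	return r->extent_bytes - used;
}

/* Adding an extent only grows the region's available size. */
void toy_add_extent(struct toy_dax_region *r, uint64_t len)
{
	r->extent_bytes += len;
}

/*
 * Resize a device. The new size may exceed any single extent as long as
 * it fits in the space available to this device.
 * Returns 0 on success, -1 if the device is unknown or space is short.
 */
int toy_resize_dev(struct toy_dax_region *r, size_t dev, uint64_t new_size)
{
	uint64_t limit;

	if (dev >= r->nr_devs)
		return -1;
	limit = toy_available_size(r) + r->dev_size[dev];
	if (new_size > limit)
		return -1;
	r->dev_size[dev] = new_size;
	return 0;
}
```

In this model the seed device cannot grow until toy_add_extent() has been
called. After two 128-byte extents are added, the seed device can be
resized to 192 bytes, which spans both extents.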

See dcd-test.sh in the ndctl branch linked below for the commands to
create a dax device in the new architecture.

https://github.com/weiny2/ndctl/tree/dcd-region2

Hope this helps.

>
> Please let me know if I miss something or did something wrong. Thanks.

You did not. I had assumed the new dax code would make the new dax device
operation clear, but some new documentation is in order.

Ira