Re: [PATCH RFC v2 06/15] vfio/nvgrace-egm: Introduce egm class and register char device numbers

From: Alex Williamson

Date: Wed Mar 04 2026 - 14:02:12 EST


On Mon, 23 Feb 2026 15:55:05 +0000
<ankita@xxxxxxxxxx> wrote:

> From: Ankit Agrawal <ankita@xxxxxxxxxx>
>
> The EGM regions are exposed to the userspace as char devices. A unique
> char device with a different minor number is assigned to EGM region
> belonging to a different Grace socket.
>
> Add a new egm class and register a range of char device numbers for
> the same.
>
> Setting MAX_EGM_NODES as 4 as the 4-socket is the largest configuration
> on Grace based systems.

Should this be a Kconfig option or a driver module parameter, or is 4
expected to be a long-term limit?

The use of "nodes" here is a bit confusing too, since the KVM Forum
slides show each GB "node" is composed of 2 sockets. Should this be
something like MAX_NUM_EGM?

> Suggested-by: Aniket Agashe <aniketa@xxxxxxxxxx>
> Signed-off-by: Ankit Agrawal <ankita@xxxxxxxxxx>
> ---
> drivers/vfio/pci/nvgrace-gpu/egm.c | 36 ++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
> index 999808807019..6bab4d94cb99 100644
> --- a/drivers/vfio/pci/nvgrace-gpu/egm.c
> +++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
> @@ -4,14 +4,50 @@
> */
>
> #include <linux/vfio_pci_core.h>
> +#include <linux/nvgrace-egm.h>
> +
> +#define MAX_EGM_NODES 4
> +
> +static dev_t dev;
> +static struct class *class;
> +
> +static char *egm_devnode(const struct device *device, umode_t *mode)
> +{
> + if (mode)
> + *mode = 0600;
> +
> + return NULL;
> +}
>
> static int __init nvgrace_egm_init(void)
> {
> + int ret;
> +
> + /*
> + * Each EGM region on a system is represented with a unique
> + * char device with a different minor number. Allow a range
> + * of char device creation.
> + */
> + ret = alloc_chrdev_region(&dev, 0, MAX_EGM_NODES,
> + NVGRACE_EGM_DEV_NAME);

This reserves a range of 4 minor numbers, 0-3, but then in 8/
we use the PXM number as the minor value, which according to 13/ seems
to result in egm4 and egm5 chardevs. So we're stomping on minor values
outside what we've reserved. Thanks,

Alex
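
To make the range concern above concrete, here is a minimal userspace
sketch (not kernel code) of the bounds check implied by the reservation.
The helper name egm_minor_in_range() is hypothetical and does not appear
in the series; the minor values 4 and 5 are taken from the egm4/egm5
chardevs mentioned above, and a first minor of 0 matches the
alloc_chrdev_region() call in the quoted patch.

```c
#include <stdbool.h>

/* Count passed to alloc_chrdev_region() in the quoted patch. */
#define MAX_EGM_NODES 4

/*
 * Hypothetical helper, for illustration only: returns true when
 * 'minor' falls inside the reserved range
 * [first_minor, first_minor + count).
 */
static inline bool egm_minor_in_range(unsigned int first_minor,
				      unsigned int count,
				      unsigned int minor)
{
	return minor >= first_minor && minor - first_minor < count;
}
```

With first_minor 0 and count MAX_EGM_NODES, minors 0-3 pass this check
while 4 and 5 do not, which is the mismatch described above if the PXM
number is used directly as the minor.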

> + if (ret < 0)
> + return ret;
> +
> + class = class_create(NVGRACE_EGM_DEV_NAME);
> + if (IS_ERR(class)) {
> + unregister_chrdev_region(dev, MAX_EGM_NODES);
> + return PTR_ERR(class);
> + }
> +
> + class->devnode = egm_devnode;
> +
> return 0;
> }
>
> static void __exit nvgrace_egm_cleanup(void)
> {
> + class_destroy(class);
> + unregister_chrdev_region(dev, MAX_EGM_NODES);
> }
>
> module_init(nvgrace_egm_init);