Re: [PATCH v2] mm/hmm/test: simplify hmm test code: use miscdevice instead of char dev

From: Mika Penttilä
Date: Thu Mar 17 2022 - 02:59:08 EST




On 17.3.2022 7.47, Mika Penttilä wrote:


On 15.3.2022 20.39, Jason Gunthorpe wrote:
On Tue, Mar 15, 2022 at 05:22:15AM +0200, Mika Penttilä wrote:
Hi Jason and thanks for your comments..

On 14.3.2022 20.24, Jason Gunthorpe wrote:
On Fri, Mar 11, 2022 at 05:30:50AM +0200, mpenttil@xxxxxxxxxx wrote:
From: Mika Penttilä <mpenttil@xxxxxxxxxx>

HMM selftests use an in-kernel pseudo device to emulate device private
memory. The pseudo device registers a major device range for two pseudo
device instances. User space has a script that reads /proc/devices in
order to find the assigned major number, and sends that to mknod(1),
once for each node.

This duplicates a fair amount of boilerplate that misc device can do
instead.

Change this to use misc device, which makes the device node names appear
for us. This also enables udev-like processing if desired.

This is borderline the wrong way to use misc devices, they should
never be embedded into other structs like this. It works out here
because they are eventually only placed in a static array, but still
it is a generally bad pattern to see.

Could you elaborate on this one? We have many in-tree usages of the same
pattern, like:

The kernel is full of bugs

drivers/video/fbdev/pxa3xx-gcu.c

ie this is broken because it allocates like this:

         priv = devm_kzalloc(dev, sizeof(struct pxa3xx_gcu_priv), GFP_KERNEL);
         if (!priv)
                 return -ENOMEM;

And free's via devm:


static int pxa3xx_gcu_remove(struct platform_device *pdev)
{
         struct pxa3xx_gcu_priv *priv = platform_get_drvdata(pdev);

         misc_deregister(&priv->misc_dev);
         return 0;
}

But this will UAF if it races fops open with misc_desregister.


Yes this driver is broken because platform_device and miscdevice have unrelated lifetimes.


Proper use of cdevs with proper struct devices prevent this bug.

You mention "placed in a static array", are you seeing a potential lifetime
issue or what? Many of the examples above embed miscdevice in a dynamically
allocated object also.

The file object's private_data holds a pointer to the miscdevice, and
fops_get() pins the module. So freeing the objects miscdevice is embedded in
at module_exit time should be fine. But, as you said, in this case the
miscdevices are statically allocated, so that shouldn't be an issue
either.

Correct, it is OK here because the module refcounts prevent the
miscdevice memory from being freed, the above cases with dynamic
allocations do not have that protection and are wrong.

This is why I don't care for the pattern of putting misc devices
inside other structs, it suggests this is perhaps generally safe but
it is not.

I think using cdev_add ends up in the same results in device_* api
sense.

Nope, everything works right once you use cdev_device_add on a
properly registered struct device.

miscdevice acting like a mux at a higher abstraction level simplifies the
code.

It does avoid the extra struct device, but at the cost of broken
memory lifetime

No, misc_register() ends up calling device_create_with_groups() so there is struct device involved. cdev_device_add() would make the explicit struct_device as a parent of the cdev kobj ensuring that struct_device (and maybe structure containing it) is not free before the cdev. But the lifetime of the objects here are controlled by the module lifetime. Note, there is also cdev involved with miscdevice protecting misc.ko and our fops protecting our module unload.

I don't mind using the cdev APIs per se, just would like to do for the right reasons. Using cdev apis might be just overkill for many usages, and that's where miscdevice is useful. It miscdevice is broken in some subtle ways I think it should be better documented, or better yet, fixed.



Jason



Thanks,
Mika

With the cdev approach, the patch (kernel parts) is not too bad either :

Thanks,
Mika

---

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 767538089a62..566e7142f33f 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -29,11 +29,17 @@

#include "test_hmm_uapi.h"

-#define DMIRROR_NDEVICES 2
#define DMIRROR_RANGE_FAULT_TIMEOUT 1000
#define DEVMEM_CHUNK_SIZE (256 * 1024 * 1024U)
#define DEVMEM_CHUNKS_RESERVE 16

+static const char *dmirror_device_names[] = {
+ "hmm_dmirror0",
+ "hmm_dmirror1"
+};
+
+#define DMIRROR_NDEVICES ARRAY_SIZE(dmirror_device_names)
+
static const struct dev_pagemap_ops dmirror_devmem_ops;
static const struct mmu_interval_notifier_ops dmirror_min_ops;
static dev_t dmirror_dev;
@@ -74,7 +80,7 @@ struct dmirror {
* ZONE_DEVICE pages for migration and simulating device memory.
*/
struct dmirror_chunk {
- struct dev_pagemap pagemap;
+ struct dev_pagemap pagemap;
struct dmirror_device *mdevice;
};

@@ -82,8 +88,9 @@ struct dmirror_chunk {
* Per device data.
*/
struct dmirror_device {
- struct cdev cdevice;
- struct hmm_devmem *devmem;
+ struct cdev cdevice;
+ struct device device;
+ struct hmm_devmem *devmem;

unsigned int devmem_capacity;
unsigned int devmem_count;
@@ -132,7 +139,7 @@ static int dmirror_fops_open(struct inode *inode, struct file *filp)
xa_init(&dmirror->pt);

ret = mmu_interval_notifier_insert(&dmirror->notifier, current->mm,
- 0, ULONG_MAX & PAGE_MASK, &dmirror_min_ops);
+ 0, ULONG_MAX & PAGE_MASK, &dmirror_min_ops);
if (ret) {
kfree(dmirror);
return ret;
@@ -1225,7 +1232,11 @@ static int dmirror_device_init(struct dmirror_device *mdevice, int id)

cdev_init(&mdevice->cdevice, &dmirror_fops);
mdevice->cdevice.owner = THIS_MODULE;
- ret = cdev_add(&mdevice->cdevice, dev, 1);
+ device_initialize(&mdevice->device);
+ dev_set_name(&mdevice->device, "%s", dmirror_device_names[id]);
+ mdevice->device.devt = dev;
+
+ ret = cdev_device_add(&mdevice->cdevice, &mdevice->device);
if (ret)
return ret;