Re: INFO: REPRODUCED: memory leak in gpio device in 6.2-rc6

From: Bartosz Golaszewski
Date: Wed Feb 22 2023 - 05:53:51 EST


On Tue, Feb 21, 2023 at 4:41 PM Andy Shevchenko
<andriy.shevchenko@xxxxxxxxx> wrote:
>
> On Tue, Feb 21, 2023 at 02:52:38PM +0100, Mirsad Goran Todorovac wrote:
> > On 20. 02. 2023. 14:43, Andy Shevchenko wrote:
> > > On Mon, Feb 20, 2023 at 02:10:00PM +0100, Mirsad Todorovac wrote:
> > > > On 2/16/23 15:16, Bartosz Golaszewski wrote:
>
> ...
>
> > > > As Mr. McKenney once said, a bunch of monkeys with keyboard could
> > > > have done it in a considerable number of trials and errors ;-)
> > > >
> > > > But here I have something that could potentially leak as well. I could not devise a
> > > > reproducer due to the leak being lightly triggered only in extreme memory contention.
> > > >
> > > > See it for yourself:
> > > >
> > > > drivers/gpio/gpio-sim.c:
> > > > 301 static int gpio_sim_setup_sysfs(struct gpio_sim_chip *chip)
> > > > 302 {
> > > > 303 struct device_attribute *val_dev_attr, *pull_dev_attr;
> > > > 304 struct gpio_sim_attribute *val_attr, *pull_attr;
> > > > 305 unsigned int num_lines = chip->gc.ngpio;
> > > > 306 struct device *dev = chip->gc.parent;
> > > > 307 struct attribute_group *attr_group;
> > > > 308 struct attribute **attrs;
> > > > 309 int i, ret;
> > > > 310
> > > > 311 chip->attr_groups = devm_kcalloc(dev, sizeof(*chip->attr_groups),
> > > > 312 num_lines + 1, GFP_KERNEL);
> > > > 313 if (!chip->attr_groups)
> > > > 314 return -ENOMEM;
> > > > 315
> > > > 316 for (i = 0; i < num_lines; i++) {
> > > > 317 attr_group = devm_kzalloc(dev, sizeof(*attr_group), GFP_KERNEL);
> > > > 318 attrs = devm_kcalloc(dev, GPIO_SIM_NUM_ATTRS, sizeof(*attrs),
> > > > 319 GFP_KERNEL);
> > > > 320 val_attr = devm_kzalloc(dev, sizeof(*val_attr), GFP_KERNEL);
> > > > 321 pull_attr = devm_kzalloc(dev, sizeof(*pull_attr), GFP_KERNEL);
> > > > 322 if (!attr_group || !attrs || !val_attr || !pull_attr)
> > > > 323 return -ENOMEM;
> > > > 324
> > > > 325 attr_group->name = devm_kasprintf(dev, GFP_KERNEL,
> > > > 326 "sim_gpio%u", i);
> > > > 327 if (!attr_group->name)
> > > > 328 return -ENOMEM;
> > > >
> > > > Apparently, if the memory allocation only partially succeeds, in the theoretical case
> > > > that the system is close to its kernel memory exhaustion, `return -ENOMEM` would not
> > > > free the partially succeeded allocs, would it?
> > > >
> > > > To explain it better, I tried a version that is not yet full doing "all or nothing"
> > > > memory allocation for the gpio-sim driver, because I am not that familiar with the
> > > > driver internals.
> > >
> > > devm_*() mean that the resource allocation is made in a managed manner, so when
> > > it's done, it will be freed automatically.
> >
> > Didn't see that one coming ... :-/ "buzzing though the bush ..."
> >
> > > The question is: is the lifetime of the attr_groups should be lesser or the
> > > same as chip->gc.parent? Maybe it's incorrect to call devm_*() in the first place?
> >
> > Bona fide said, I hope that automatic deallocation does things in the right order.
> > I've realised that devm_kzalloc() calls devm_kmalloc() that registers allocations on
> > a per driver list. But I am not sure how chip->gc was allocated?
> >
> > Here is said it is allocated in drivers/gpio/gpio-sim.c:386 in gpio_sim_add_bank(),
> > as a part of
> >
> > struct gpio_sim_chip *chip;
> > struct gpio_chip *gc;
> >
> > gc = &chip->gc;
> >
> > and gc->parent is set to
> >
> > gc->parent = dev;
> >
> > in line 420, which appears called before gpio_sim_setup_sysfs() and the lines above.
> >
> > If I understood well, automatic deallocation on unloading the driver goes
> > in the reverse order, so lifetime of chip appears to be longer than attr_groups,
> > but I am really not that good at this ...
>
> So, the device is instantiated by platform_device_register_full().
>
> It should gone with the platform_device_unregister().
>
> In case of CONFIG_DEBUG_KOBJECT_RELEASE=y the ->release() can be called
> asynchronously.
>
> So, there are following questions:
> - is the put_device() is actually called?
> - is the above mentioned option is set to Y?
> - if it's in Y, does kmemleak take it into account?
> - if no, do you get anything new in `dmesg` when enable it?
>
> > > Or maybe the chip->gc.parent should be changed to something else (actual GPIO
> > > device, but then it's unclear how to provide the attributes in non-racy way
> > Really, dunno. I have to repeat that my learning curve cannot adapt so quickly.
> >
> > I merely gave the report of KMEMLEAK, otherwise I am not a Linux kernel
> > device expert nor would be appropriate to try the craft not earned ;-)
>
> --
> With Best Regards,
> Andy Shevchenko
>
>

Mirsad,

I think you fear that the memory allocated for sysfs attributes could
be accessed after the driver is detached from the simulated GPIO
device? This is not possible as sysfs handles that gracefully (by
removing all sysfs attributes with driver_sysfs_remove()) before
freeing devres resources. You can test that yourself by instantiating
a gpio-sim device, opening and holding a file descriptor to one of the
sysfs attributes, disabling the device and then trying to read from
said fd - it will return -ENODEV.

Let me know if you actually mean something else?

Bart