Re: [PATCH 037/190] Revert "RDMA/core: Fix several reference count leaks."

From: Jason Gunthorpe
Date: Wed May 05 2021 - 16:14:47 EST


On Mon, May 03, 2021 at 03:30:51PM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 29, 2021 at 03:38:41PM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Apr 28, 2021 at 10:00:44AM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 28, 2021 at 02:23:40PM +0200, Greg Kroah-Hartman wrote:
> > >
> > > > > We've talked about this specifically before:
> > > > >
> > > > > http://lore.kernel.org/r/20210331170720.GY2710221@xxxxxxxx
> > > > >
> > > > > I still don't understand what you mean by "udev sees it properly", as
> > > > > above, all the tests I thought of look OK.
> > > >
> > > > Can you query the udev database to see the attribute values?
> > >
> > > It appears so unless I misunderstand your ask:
> > >
> > > $ udevadm info -a /sys/class/infiniband/ibp0s9
> > > ATTR{ports/1/cm_rx_duplicates/dreq}=="0"
> >
> > That works? Nice, I didn't think it did.
> >
> > But what about the uevent that fired for "1", aren't there attributes
> > assigned to it that udev ignores?
>
> I'm not completely familiar with uevents, but:
>
> $ find /sys/class/infiniband/ibp0s9/ -name "uevent"
> /sys/class/infiniband/ibp0s9/uevent
>
> $ udevadm monitor & modprobe mlx5_ib
> KERNEL[169.337295] add /bus/auxiliary/drivers/mlx5_ib.multiport (drivers)
> UDEV [169.354621] add /bus/auxiliary/drivers/mlx5_ib.multiport (drivers)
> KERNEL[169.393088] add /devices/pci0000:00/0000:00:09.0/infiniband_verbs/uverbs0 (infiniband_verbs)
> KERNEL[169.393516] add /devices/pci0000:00/0000:00:09.0/infiniband_mad/umad0 (infiniband_mad)
> KERNEL[169.394040] add /devices/pci0000:00/0000:00:09.0/infiniband_mad/issm0 (infiniband_mad)
> UDEV [169.395189] add /devices/pci0000:00/0000:00:09.0/infiniband_verbs/uverbs0 (infiniband_verbs)
> UDEV [169.397812] add /devices/pci0000:00/0000:00:09.0/infiniband_mad/issm0 (infiniband_mad)
> KERNEL[169.407727] add /devices/pci0000:00/0000:00:09.0/net/ib0 (net)
> KERNEL[169.407851] add /devices/pci0000:00/0000:00:09.0/net/ib0/queues/rx-0 (queues)
> KERNEL[169.408113] add /devices/pci0000:00/0000:00:09.0/net/ib0/queues/tx-0 (queues)
> KERNEL[169.409059] add /devices/pci0000:00/0000:00:09.0/infiniband/mlx5_0 (infiniband)
> KERNEL[169.411483] bind /devices/pci0000:00/0000:00:09.0/mlx5_core.rdma.0 (auxiliary)
> KERNEL[169.411836] add /bus/auxiliary/drivers/mlx5_ib.rdma (drivers)
> KERNEL[169.411973] add /module/mlx5_ib (module)
> UDEV [169.420570] bind /devices/pci0000:00/0000:00:09.0/mlx5_core.rdma.0 (auxiliary)
> UDEV [169.421365] add /bus/auxiliary/drivers/mlx5_ib.rdma (drivers)
> UDEV [169.447853] add /module/mlx5_ib (module)
> KERNEL[169.482293] move /devices/pci0000:00/0000:00:09.0/infiniband/ibp0s9 (infiniband)
> UDEV [169.486395] add /devices/pci0000:00/0000:00:09.0/infiniband/mlx5_0 (infiniband)
> UDEV [169.495193] move /devices/pci0000:00/0000:00:09.0/infiniband/ibp0s9 (infiniband)
> UDEV [169.698592] add /devices/pci0000:00/0000:00:09.0/net/ib0 (net)
> UDEV [169.700436] add /devices/pci0000:00/0000:00:09.0/net/ib0/queues/rx-0 (queues)
> UDEV [169.700712] add /devices/pci0000:00/0000:00:09.0/net/ib0/queues/tx-0 (queues)
> UDEV [170.042132] add /devices/pci0000:00/0000:00:09.0/infiniband_mad/umad0 (infiniband_mad)
>
> I don't see any uevents related to the nested attributes. Same on
> removal.

With some debugging, the uevent situation looks like this.

When '1' is created as a kobj, the code does call:

  kobject_uevent(&port->kobj, KOBJ_ADD);

However DEBUG_KOBJECT reveals:

kobject: '1' (00000000d2367083): kobject_uevent_env: filter function caused the event to drop!

Which happens because:
  top_kobj == mlx5 (i.e. the struct device)
  top_kobj->kset->uevent_ops == device_uevent_ops
  get_ktype(kobj "1") == &port_type

Thus calling:
  dev_uevent_filter(mlx5 kset, kobj "1") == 0

As get_ktype(kobj "1") != &device_ktype

Which I read to mean these nested attributes under a struct device
won't generate a uevent.
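
To make the chain of reasoning above concrete, here is a rough user-space
model of the filter path. It is only a sketch, not the kernel source: the
structs are cut down to the fields that matter here, the has_bus_or_class
flag is a made-up stand-in for the dev->bus / dev->class checks, and
uevent_passes_filter() is a hypothetical helper that mimics only the
kset walk and filter call in kobject_uevent_env().

```c
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only the fields used below. */
struct kobj_type { const char *name; };
struct kset;
struct kobject {
    const char *name;
    struct kobject *parent;
    struct kset *kset;
    const struct kobj_type *ktype;
    bool has_bus_or_class; /* model-only: stands in for dev->bus || dev->class */
};
struct kset_uevent_ops {
    int (*filter)(struct kset *kset, struct kobject *kobj);
};
struct kset { const struct kset_uevent_ops *uevent_ops; };

const struct kobj_type device_ktype = { "device" };
const struct kobj_type port_type = { "port" };

/* Model of the device filter: only kobjects of device_ktype can pass. */
int dev_uevent_filter(struct kset *kset, struct kobject *kobj)
{
    (void)kset;
    if (kobj->ktype == &device_ktype)
        return kobj->has_bus_or_class ? 1 : 0;
    return 0; /* any other ktype, e.g. port_type, is dropped */
}

const struct kset_uevent_ops device_uevent_ops = { dev_uevent_filter };

/* Hypothetical helper: true if the event would get past the filter. */
bool uevent_passes_filter(struct kobject *kobj)
{
    struct kobject *top_kobj = kobj;

    /* Walk up to the first ancestor that belongs to a kset. */
    while (!top_kobj->kset && top_kobj->parent)
        top_kobj = top_kobj->parent;
    if (!top_kobj->kset)
        return false; /* no kset found: nothing is sent */

    const struct kset_uevent_ops *ops = top_kobj->kset->uevent_ops;

    /* The filter belongs to the top kset but is asked about the original kobj. */
    if (ops && ops->filter && !ops->filter(top_kobj->kset, kobj))
        return false; /* "filter function caused the event to drop!" */
    return true;
}
```

In this model, a "1" kobject with ktype port_type, parented (via "ports")
under a device kobject whose kset uses device_uevent_ops, is dropped, while
the same call on the device kobject itself passes.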

The uevent for the struct device is suppressed until all the child
kobjects are created, which explains how udev sees the child kobjs
without seeing extra uevents that would confuse it.

Jason