Re: [PATCH] soundwire: intel_auxdevice: Don't disable IRQs before removing children

From: Charles Keepax
Date: Fri Dec 20 2024 - 13:20:15 EST


On Thu, Dec 19, 2024 at 10:27:53AM +0000, Charles Keepax wrote:
> On Wed, Dec 18, 2024 at 04:40:22PM -0500, Pierre-Louis Bossart wrote:
> > Having looked at the code in more detail, I think there are other issues,
> > see e.g. this part of the code called from sdw_bus_master_delete().
> >
> > static int sdw_delete_slave(struct device *dev, void *data)
> > {
> > 	struct sdw_slave *slave = dev_to_sdw_dev(dev);
> > 	struct sdw_bus *bus = slave->bus;
> >
> > 	pm_runtime_disable(dev);
> >
> > 	sdw_slave_debugfs_exit(slave);
> >
> > 	mutex_lock(&bus->bus_lock);
> >
> > 	if (slave->dev_num) { /* clear dev_num if assigned */
> > 		clear_bit(slave->dev_num, bus->assigned);
> > 		if (bus->ops && bus->ops->put_device_num)
> > 			bus->ops->put_device_num(bus, slave);
> > 	}
> >
> > So at this point an interaction with the device is no longer possible, even
> > if the Cadence interrupts are kept active, since there's no valid device
> > number to use...
> >
> > 	list_del_init(&slave->node);
> > 	mutex_unlock(&bus->bus_lock);
> >
> > ... but this is where the .remove() will take place.
> >
> > 	device_unregister(dev);
> > 	return 0;
> > }
> >
> > What am I missing?
>
> Hmm... yes that is a good spot, I will investigate that further.
> Certainly I do see these operations happening in the wrong order
> but it doesn't seem to cause the register transactions to fail.
> Most likely it is best to reverse these two, it makes sense
> to not clear the device number until we are finished with the
> device, but would be good to understand what is going on first.

Ok, so nothing on the read/write path checks the assigned bitmap.
Clearing the bit marks the device number as free, but that doesn't
actually stop anything from communicating with the device, since the
dev_num is still the one the hardware has. The assigned bitmap is
only checked when handling slave status events.

So I think your point is valid and this code should also be changed,
although the only way it can currently go wrong is if a new device
registered on the bus at exactly this moment and was assigned the
dev_num of the old device.
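
To make that window concrete, here is a rough userspace sketch, not
the kernel code. Every toy_* name and TOY_MAX_DEVICES are made up for
illustration, and the model assumes a newly allocated device simply
takes over the address in hardware, which is a simplification.

```c
#include <assert.h>

#define TOY_MAX_DEVICES 11	/* assumed range of usable device numbers */

/* Hypothetical stand-ins for the bus and peripheral structures. */
struct toy_bus {
	unsigned long assigned;			/* bookkeeping bitmap */
	int hw_owner[TOY_MAX_DEVICES + 1];	/* who answers at each dev_num */
};

struct toy_slave {
	int id;
	int dev_num;
};

/* Hand out the lowest free device number, as enumeration might. */
static int toy_get_device_num(struct toy_bus *bus, struct toy_slave *slave)
{
	for (int n = 1; n <= TOY_MAX_DEVICES; n++) {
		if (!(bus->assigned & (1UL << n))) {
			bus->assigned |= 1UL << n;
			bus->hw_owner[n] = slave->id;
			slave->dev_num = n;
			return n;
		}
	}
	return -1;
}

/* Mirrors the clear_bit() step: only the bookkeeping changes. */
static void toy_release_device_num(struct toy_bus *bus, struct toy_slave *slave)
{
	assert(slave->dev_num >= 1 && slave->dev_num <= TOY_MAX_DEVICES);
	bus->assigned &= ~(1UL << slave->dev_num);
}

/*
 * Read/write path: addresses the peripheral by dev_num and never
 * consults bus->assigned. Returns the id of whoever would respond.
 */
static int toy_transfer(const struct toy_bus *bus, const struct toy_slave *slave)
{
	return bus->hw_owner[slave->dev_num];
}
```

In this model, a transfer issued by the old device's driver after
toy_release_device_num() still reaches the old device. It only goes
to the wrong place once a second toy_get_device_num() call reuses the
freed number, which corresponds to the race described above.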

It will likely be the new year before I get the time to think
through the details.

Thanks,
Charles