Re: [PATCH v4 1/9] driver core: Don't let a device probe until it's ready
From: Doug Anderson
Date: Mon Apr 06 2026 - 10:50:50 EST
Hi,
On Sun, Apr 5, 2026 at 11:32 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> > + * blocked those attempts. Now that all of the above initialization has
> > + * happened, unblock probe. If probe happens through another thread
> > + * after this point but before bus_probe_device() runs then it's fine.
> > + * bus_probe_device() -> device_initial_probe() -> __device_attach()
> > + * will notice (under device_lock) that the device is already bound.
> > + */
> > + dev_set_ready_to_probe(dev);
>
> I think this lacks some ordering properties that we should be allowed
> to rely on. In this case, the 'ready_to_probe' flag being set should
> imply that all of the data structures are observable by another CPU.
>
> Unfortunately, this doesn't seem to be the case, see below.
I agree. I think Danilo was proposing fixing this by just doing:

  device_lock(dev);
  dev_set_ready_to_probe(dev);
  device_unlock(dev);

While that's a bit of overkill, I think it also works. Do folks
have a preference for what they'd like to see in v5?
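
For illustration only, here is a rough userspace sketch of why the
lock-based variant would give the ordering, using a pthread mutex in
place of device_lock(). The locked_* names are made up for the example,
and it assumes the reader also checks the flag while holding the same
lock, which is an assumption about the readers and not something the
patch guarantees:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct device; not kernel code. */
struct locked_device {
	pthread_mutex_t lock;	/* plays the role of device_lock(dev) */
	int driver_data;	/* state initialized before the flag is set */
	bool ready_to_probe;
};

/* Writer: initialize state, then set the flag under the lock. */
static void locked_publish(struct locked_device *dev, int data)
{
	dev->driver_data = data;

	pthread_mutex_lock(&dev->lock);
	dev->ready_to_probe = true;
	pthread_mutex_unlock(&dev->lock);	/* unlock orders the earlier stores */
}

/* Reader: only touches driver_data if it saw the flag under the lock. */
static bool locked_try_consume(struct locked_device *dev, int *out)
{
	bool ready;

	pthread_mutex_lock(&dev->lock);
	ready = dev->ready_to_probe;
	if (ready)
		*out = dev->driver_data;
	pthread_mutex_unlock(&dev->lock);

	return ready;
}
```

If the reader sees the flag set, its lock acquisition came after the
writer's unlock, so the earlier store to driver_data is visible too.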
> > @@ -675,8 +691,34 @@ struct device {
> > #ifdef CONFIG_IOMMU_DMA
> > bool dma_iommu:1;
> > #endif
> > +
> > + DECLARE_BITMAP(flags, DEV_FLAG_COUNT);
> > };
> >
> > +#define __create_dev_flag_accessors(accessor_name, flag_name) \
> > +static inline bool dev_##accessor_name(const struct device *dev) \
> > +{ \
> > + return test_bit(flag_name, dev->flags); \
> > +} \
> > +static inline void dev_set_##accessor_name(struct device *dev) \
> > +{ \
> > + set_bit(flag_name, dev->flags); \
>
> Atomic operations that are not RMW or that do not return a value are
> unordered (see Documentation/atomic_bitops.txt). This implies that
> observing the flag being set from another CPU does not guarantee that
> the previous stores in program order are observed.
>
> For that guarantee to hold, you'd need to have an
> smp_mb__before_atomic() just before set_bit(), giving it release
> semantics. This is equally valid for the test, clear and assign
> variants.
>
> I doubt this issue is visible on a busy system (which would be the
> case at boot time), but I thought I'd mention it anyway.
Are you suggesting I add smp memory barriers directly in all the
accessors? ...or just that clients of these functions should use
memory barriers as appropriate?

In other words, would I do:

  smp_mb__before_atomic();
  dev_set_ready_to_probe(dev);

...or add the barrier into all of the accessors?

My thought was to not add the barrier into the accessors since at
least one of the accessors talks about being run from a hot path
(dma_reset_need_sync()). ...but I just want to make sure.
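
To make the "barrier at the call site" option concrete, here is a rough
userspace sketch using C11 atomics, not kernel primitives. The demo_*
names are made up for the example. The release fence is only a weaker
analog of smp_mb__before_atomic() (which is a full barrier in the
kernel), and the sketch assumes the reader pairs it with an acquire
fence after testing the flag:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag index and device struct; not kernel code. */
enum { DEMO_FLAG_READY_TO_PROBE = 0 };

struct demo_device {
	int driver_data;	/* state initialized before the flag is set */
	atomic_ulong flags;
};

/* Unordered accessors, analogous to plain set_bit() / test_bit(). */
static inline void demo_set_ready_to_probe(struct demo_device *dev)
{
	atomic_fetch_or_explicit(&dev->flags,
				 1UL << DEMO_FLAG_READY_TO_PROBE,
				 memory_order_relaxed);
}

static inline bool demo_ready_to_probe(struct demo_device *dev)
{
	unsigned long v = atomic_load_explicit(&dev->flags,
					       memory_order_relaxed);

	return v & (1UL << DEMO_FLAG_READY_TO_PROBE);
}

/* Writer: the barrier lives at the call site, not in the accessor. */
static void demo_publish(struct demo_device *dev, int data)
{
	dev->driver_data = data;
	atomic_thread_fence(memory_order_release);
	demo_set_ready_to_probe(dev);
}

/* Reader: an acquire fence after seeing the flag pairs with the above. */
static bool demo_try_consume(struct demo_device *dev, int *out)
{
	if (!demo_ready_to_probe(dev))
		return false;

	atomic_thread_fence(memory_order_acquire);
	*out = dev->driver_data;
	return true;
}
```

With this split the accessors stay cheap for hot-path users, and only
the callers that publish or consume other data through a flag pay for
the ordering.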
-Doug