Re: [PATCH v4 1/9] driver core: Don't let a device probe until it's ready

From: Danilo Krummrich

Date: Mon Apr 06 2026 - 11:00:21 EST


On Mon Apr 6, 2026 at 4:41 PM CEST, Doug Anderson wrote:
> Hi,
>
> On Sun, Apr 5, 2026 at 11:32 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
>>
>> > + * blocked those attempts. Now that all of the above initialization has
>> > + * happened, unblock probe. If probe happens through another thread
>> > + * after this point but before bus_probe_device() runs then it's fine.
>> > + * bus_probe_device() -> device_initial_probe() -> __device_attach()
>> > + * will notice (under device_lock) that the device is already bound.
>> > + */
>> > + dev_set_ready_to_probe(dev);
>>
>> I think this lacks some ordering properties that we should be allowed
>> to rely on. In this case, the 'ready_to_probe' flag being set should
>> imply that all of the data structures are observable by another CPU.
>>
>> Unfortunately, this doesn't seem to be the case, see below.
>
> I agree. I think Danilo was proposing fixing this by just doing:
>
> device_lock(dev);
> dev_set_ready_to_probe(dev);
> device_unlock(dev);
>
> While that's a bit of overkill, I think it also works. Do folks
> have a preference for what they'd like to see in v5?

Except for the rare case where device_add() races with driver_attach(), which is
exactly the race that should be fixed by this, the device lock will be
uncontended in device_add(), so I don't consider this overkill.

>> > @@ -675,8 +691,34 @@ struct device {
>> > #ifdef CONFIG_IOMMU_DMA
>> > bool dma_iommu:1;
>> > #endif
>> > +
>> > + DECLARE_BITMAP(flags, DEV_FLAG_COUNT);
>> > };
>> >
>> > +#define __create_dev_flag_accessors(accessor_name, flag_name) \
>> > +static inline bool dev_##accessor_name(const struct device *dev) \
>> > +{ \
>> > + return test_bit(flag_name, dev->flags); \
>> > +} \
>> > +static inline void dev_set_##accessor_name(struct device *dev) \
>> > +{ \
>> > + set_bit(flag_name, dev->flags); \
>>
>> Atomic operations that are not RMW or that do not return a value are
>> unordered (see Documentation/atomic_bitops.txt). This implies that
>> observing the flag being set from another CPU does not guarantee that
>> the previous stores in program order are observed.
>>
>> For that guarantee to hold, you'd need to have an
>> smp_mb__before_atomic() just before set_bit(), giving it release
>> semantics. This is equally valid for the test, clear and assign
>> variants.
>>
>> I doubt this issue is visible on a busy system (which would be the
>> case at boot time), but I thought I'd mention it anyway.
>
> Are you suggesting I add smp memory barriers directly in all the
> accessors? ...or just that clients of these functions should use
> memory barriers as appropriate?
>
> In other words, would I do:
>
> smp_mb__before_atomic();
> dev_set_ready_to_probe(dev);
>
> ...or add the barrier into all of the accessors?

I think this would be a bit overkill; all (other) fields are either already
protected by a lock, or are not prone to reordering races otherwise.

> My thought was to not add the barrier into the accessors since at
> least one of the accessors talks about being run from a hot path
> (dma_reset_need_sync()). ...but I just want to make sure.
>
> -Doug