Re: [RESEND] drivercore: deferral race condition fix

From: Peter Ujfalusi
Date: Tue Apr 08 2014 - 09:16:46 EST


On 04/08/2014 03:47 PM, Grant Likely wrote:
> On Tue, Apr 8, 2014 at 3:27 AM, Grant Likely <grant.likely@xxxxxxxxxxxx> wrote:
>> On Thu, 3 Apr 2014 10:40:59 +0100, Mark Brown <broonie@xxxxxxxxxx> wrote:
>>> On Thu, Apr 03, 2014 at 10:12:07AM +0300, Peter Ujfalusi wrote:
>>>> When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
>>>> where all modules are loaded but some drivers are still stuck in the deferred
>>>> list, and an external event is needed to kick the deferred queue to probe
>>>> these drivers.
>>>
>>> Acked-by: Mark Brown <broonie@xxxxxxxxxx>
>>
>> It's a pretty crude solution though. The problem is any "in-flight"
>> probes that are going to defer will not get added to the active list.
>> Rerunning the entire active list is a bit much (but it does have the
>> advantage of still being conceptually simple). I think we can do better.
>>
>> Instead of running the entire list, we could add a check to
>> driver_deferred_probe_add() that adds the device to the active list
>> instead of pending list on the condition that another driver probe
>> completed while the deferred probe was in-flight.
>>
>> I'm playing with a solution now. I'll email a proposal shortly.
>
> Thinking out loud now...
>
> The race can occur whenever a probe in another thread completes
> successfully while the current probe is in-flight. If that has
> happened, then the defer condition may be resolved and the driver
> should be scheduled for retry immediately. If the core code can check
> for that condition, then we can add the driver directly to the active
> list and kick the workqueue.
>
> The problem is that we don't currently have an easy way to test if a
> probe has completed in another thread. This patch handles it with a
> single flag that gets set whenever a probe completes while another
> probe is executing. I was worried that this approach would be racy,
> but after running through the scenarios I can't find a situation where
> it wouldn't get added. The only concern I have remaining with this
> approach is that it will trigger unnecessary retries, but even that
> isn't really a problem because the pending list will have been moved
> to the active list *anyway*. It isn't even a retry of the whole list
> that's happening because most likely the only device on the pending
> list will be the one that completed with -EPROBE_DEFER.

This code is only going to add one retry, and it is going to do that only under
the condition you have described:
the last driver which finishes its probe ends up with -EPROBE_DEFER, and while
its probe was in-flight another driver loaded successfully.
In all other cases this will not trigger a retry:
If you load only one driver, which ends up deferring,
If the driver which is deferring is not the last driver to load, since we will
have other drivers to be loaded and they will kick the list anyway.
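
To make the cases above concrete, here is a minimal user-space sketch of the
flag idea. It is not the patch itself and not the real drivers/base/dd.c code:
every name in it (model_probe_begin(), model_probe_end(), success_during_flight,
retry_kicks) is hypothetical, and the interleaving of the two "threads" is
simply spelled out as a sequence of calls. Only the value of EPROBE_DEFER
matches the kernel. It also does not model a successful probe moving the
pending list to the active list.

```c
#include <stdbool.h>

#define EPROBE_DEFER 517 /* same numeric value the kernel uses */

enum model_queue { QUEUE_NONE, QUEUE_PENDING, QUEUE_ACTIVE };

/* Hypothetical stand-in for a device in this model. */
struct model_device {
	const char *name;
	enum model_queue queue;
};

/* Hypothetical state, all of it an assumption of this sketch. */
static int probes_in_flight;       /* probes currently executing */
static bool success_during_flight; /* a probe succeeded while another ran */
static int retry_kicks;            /* times a defer kicked the workqueue */

/* A probe starts executing. */
static void model_probe_begin(void)
{
	probes_in_flight++;
}

/* A probe finishes with ret (0 on success, -EPROBE_DEFER on deferral). */
static void model_probe_end(struct model_device *dev, int ret)
{
	probes_in_flight--;

	if (ret == 0) {
		dev->queue = QUEUE_NONE;
		/* Other probes are still running: their defer may be stale. */
		if (probes_in_flight > 0)
			success_during_flight = true;
	} else if (ret == -EPROBE_DEFER) {
		if (success_during_flight) {
			/* Dependency may have appeared meanwhile: retry now. */
			dev->queue = QUEUE_ACTIVE;
			retry_kicks++;
		} else {
			/* Nothing changed: wait on the pending list. */
			dev->queue = QUEUE_PENDING;
		}
	}

	/* Nobody is in flight any more, so the flag has done its job. */
	if (probes_in_flight == 0)
		success_during_flight = false;
}
```

With this model, the racy sequence (consumer begins, provider begins, provider
ends with 0, consumer ends with -EPROBE_DEFER) puts the consumer on the active
list and kicks the queue once, while a lone driver that defers goes to the
pending list with no extra retry.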

> So, I actually think this is the right approach now. I'll reply to the
> patch itself and make some comments on the code.

Thanks, I'll check the comments.

--
Péter