Re: [PATCH v1] firmware_class: encapsulate firmware loading status

From: Daniel Wagner
Date: Thu Aug 18 2016 - 21:35:39 EST


On 18.08.2016 18:30, Luis R. Rodriguez wrote:
> On Wed, Aug 17, 2016 at 08:47:24AM +0200, Daniel Wagner wrote:
>> On 08/10/2016 08:52 PM, Luis R. Rodriguez wrote:
>> The current 'state machine' uses three variables to handle the state
>> and the transitions.
>>
>> struct completion {
>> 	unsigned int done;
>> 	wait_queue_head_t wait;
>> };
>>
>> struct firmware_buf {
>> 	...
>> 	struct completion completion;
>> 	unsigned long status;
>> 	...
>> };
>>
>> Obviously, the variable 'status' holds the state. 'wait' and 'done'
>> handle the synchronization. 'done' remembers how many waiters will
>> be woken at most. complete_all() sets it to UINT_MAX/2. That should be
>> enough in most cases.
>
> Thanks, this helps and makes sense. How many data structures
> in comparison does the new swait require ? Is it smaller ? If
> so that is a nice simplification indeed, however we should make
> sure we have no compromises then.

Yes, we save one 'unsigned int': the 'done' member of struct
completion. For an earlier version of this patch I checked the size
changes. While we save a little in the data section, the code section
grew slightly; IIRC by around 60 bytes. I will do another
measurement.

>> So any future wait_for_completion() call will not block.
>
> This I don't get, do you mean that if we have already UMAX/2
> waiters on a completion and another one comes in, it will not
> wait at all ?

Sorry, I think I just confused you here with an implementation detail.
Whenever wait_for_completion() is woken, it checks whether done > 0 and,
if so, decrements the counter. complete() increments the counter and then
wakes a waiter. Basically it is comparable to a semaphore's put and get
operations. complete_all() just sets done to an almost infinite value :)
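
To make the counter behaviour concrete, here is a toy, single-threaded
model of it. This is only a sketch for illustration, not the kernel's
completion code: the toy_* names are made up, there is no locking and no
real sleeping, and toy_complete_all() follows the description above
(it sets done to UINT_MAX/2).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy model of the 'done' counter only; hypothetical, not kernel code. */
struct toy_completion {
	unsigned int done;	/* how many waiters may still proceed */
};

/* Model of complete(): allow exactly one more waiter through. */
static void toy_complete(struct toy_completion *c)
{
	c->done++;
}

/* Model of complete_all() as described in this mail: "almost infinite". */
static void toy_complete_all(struct toy_completion *c)
{
	c->done = UINT_MAX / 2;
}

/*
 * Model of the check a waiter performs. Returns true if the waiter
 * may continue (and consumes one count), false if it would have to
 * sleep until someone completes.
 */
static bool toy_try_wait(struct toy_completion *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}
```

With done == 0 a waiter would sleep. After one toy_complete() exactly one
toy_try_wait() succeeds. After toy_complete_all() every later waiter
passes straight through, until the (practically unreachable) count is
used up.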

> Is this documented well ? Either way clarifying exactly what is
> done here would be of huge help understanding the striking
> differences between a switch to the new API.

There is Documentation/scheduler/completion.txt, but the small detail
about UINT_MAX/2 is not mentioned there. I don't think it was considered
a real problem: before you run into trouble waking 2 billion threads,
you will see other scaling issues first.

Note this has nothing to do with wait vs swait.

>> The patch just drops the 'done' completely because it is not
>> necessary. We have a waiter queue for all those pending waiters
>
> So there is no limit to waiters with the new API ?

Correct, the limit is gone, though I don't expect that there will ever be
so many firmware user helpers waiting that we hit the UINT_MAX/2 limit.

>> and
>> as soon as the final state is reached we just wake them up. The future
>> waiters will never be queued because we just check for the state
>> first.
>
> I do not follow what this means, I take it here we are talking about
> possible race conditions between a wait and some work about to be
> done?

Let me reword that. I was not really concerned about race conditions
here. I was just trying to point out that we only check for the condition.

Either we have reached FW_STATUS_{DONE|ABORTED} and just continue, or we
put the thread to sleep and wait for the wake call. Because we check for
a single condition (status == FW_STATUS_{DONE|ABORTED}) in
swait_event_interruptible_timeout(), we don't need any additional
synchronization. Come to think of it, that is why the mutex can be
removed.
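
Roughly, the "check the state first, queue only if needed" logic looks
like the following toy model. It is a single-threaded sketch under
stated assumptions, not the patch itself: all toy_* and TOY_* names are
hypothetical, 'queued' is a plain counter standing in for the wait
queue, and nothing actually sleeps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the firmware states discussed above. */
enum toy_fw_status {
	TOY_FW_STATUS_LOADING,
	TOY_FW_STATUS_DONE,
	TOY_FW_STATUS_ABORTED,
};

struct toy_fw_state {
	enum toy_fw_status status;
	unsigned int queued;	/* models the number of sleeping waiters */
};

/* The single condition every waiter checks. */
static bool toy_is_final(const struct toy_fw_state *s)
{
	return s->status == TOY_FW_STATUS_DONE ||
	       s->status == TOY_FW_STATUS_ABORTED;
}

/*
 * Returns true if the caller can continue right away. Otherwise the
 * caller is counted as queued and false is returned.
 */
static bool toy_wait(struct toy_fw_state *s)
{
	if (toy_is_final(s))
		return true;	/* late waiters are never queued */
	s->queued++;
	return false;
}

/* Enter a final state and "wake" everyone; returns how many were woken. */
static unsigned int toy_set_final(struct toy_fw_state *s,
				  enum toy_fw_status status)
{
	unsigned int woken = s->queued;

	s->status = status;
	s->queued = 0;
	return woken;
}
```

Waiters that arrive while loading is in progress end up queued and are
all released by toy_set_final(). A waiter that arrives afterwards sees
the final state and returns immediately, so no separate counter like
'done' is needed to remember that the wake-up already happened.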

>> wait vs swait: The main difference between the two APIs is the
>> implementation. So it is pretty simple to switch from one to the
>> other. So why swait, I hear you asking. The swait implementation is
>> pretty simple, at the price that you can't do all the stuff that
>> wait offers. As long as you don't need the extra features of wait, just
>> go with swait.
>
> OK so wait offers more features and it's a kitchen sink of stuff,
> we only require a simple wait and swait is better and more
> lightweight.

Yes, that summarizes it pretty well.

> The above number of waiters is still something I'd like
> a bit clarification on.

As I understand the firmware loader user helper API, there is only
one waiter.

>> While the above points are nice side effects, the real reason is the
>> cleanup of the code and getting rid of the mutex operations.
>
> This indeed is huge and this can better be reflected in the commit log.
> In fact I wonder if it's possible to do the switch without the change
> to swait, and do the conversion to swait as a secondary step.

I am not sure about it, because 'status' and the completion operations
need to be synchronized. I'll give it a try; I just haven't had time yet.
It is not about wait vs. swait, it's about completion vs. s/wait.

>> I can try to split the patch into two steps. Let's see how this
>> works out. But I wouldn't mind if we go with this version :)
>
> I understand -- however I have to ask, as if it's possible it makes
> things easier to review and splits up two logical changes. This
> would in turn be easier to debug if there are issues.

Sure, I completely understand. BTW, I just updated the patch and avoided
moving loading_timeout. It no longer contains any hard-to-read
sections.

>>> o once you have only a conversion from old wait to new swait you can
>>> inspect the delta and try to write SmPL grammar to see if you can
>>> generalize the change, so grammar can do the change for other
>>> use cases. Of course, you'd need first to look for the IRQ context,
>>> and I wonder if that's possible. If there are however generic
>>> benefits of swait over old wait when complete_all() is used (is
>>> live patching one?) then this will be very handy.
>>
>> From my attempts to figure out the execution context with SmPL I
>> fear that is rather hard to achieve because you need to create a
>> call graph and track the state.
>
> OK..

I know you have a far better understanding. We need to discuss this over
a beer :)

cheers,
daniel