Re: [RFC] firmware load: defer request_firmware during early boot and resume

From: Ming Lei
Date: Sat Jul 21 2012 - 19:24:13 EST


On Sun, Jul 22, 2012 at 4:38 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I agree that this is a problem. At the same time, early boot has some
> of the exact same problems as resume has, and I do wish that people
> would ask themselves: "why do I try to load the firmware at early boot
> time"?
>
> There is really only *one* real reason to load firmware at device
> probe time, and that's because the device is needed for the boot. But
> in that case, deferral is wrong, isn't it?

Linus, sorry for forgetting to reply to all. :-(

Maybe not. I know of some USB devices that cannot be used at all until
firmware has been downloaded into them. For example, the isight camera
only looks like a UVC camera device after its firmware has been
downloaded. Some USB BT devices, such as ath3k, behave the same way.

For this kind of device, deferral should be correct, IMO.

>
> And if the device isn't needed for boot, then why is it loading so
> early? For network devices, for example (and this is a *common*
> issue), firmware should be loaded not at device init, but at device
> *open* time, exactly because we don't want to load it too early when
> it might not even be available yet.

Yes, I agree: in theory, we should load firmware just before the device
is used. That may also help with caching firmware before suspend; see
below.
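
To make the "load at open, not at probe" idea concrete, here is a
minimal userspace sketch. It is only a model, not kernel code: struct
demo_dev, demo_probe(), demo_open() and demo_load_firmware() are
hypothetical names, and demo_load_firmware() merely stands in for
request_firmware() plus the download into the device.

```c
#include <stdbool.h>

/* Hypothetical device model for illustration; not a kernel structure. */
struct demo_dev {
	const char *fw_name;	/* firmware image the device will need */
	bool fw_loaded;		/* set once firmware has been downloaded */
	int load_count;		/* how many times firmware was fetched */
	int open_count;		/* how many times the device was opened */
};

/* Stand-in for request_firmware() plus the download to the device. */
static int demo_load_firmware(struct demo_dev *dev)
{
	dev->load_count++;
	dev->fw_loaded = true;
	return 0;
}

/* probe: only record what will be needed; do not touch firmware. */
static void demo_probe(struct demo_dev *dev, const char *fw_name)
{
	dev->fw_name = fw_name;
	dev->fw_loaded = false;
	dev->load_count = 0;
	dev->open_count = 0;
}

/* open: load firmware on first use only. */
static int demo_open(struct demo_dev *dev)
{
	if (!dev->fw_loaded) {
		int ret = demo_load_firmware(dev);

		if (ret)
			return ret;
	}
	dev->open_count++;
	return 0;
}
```

In this model a device that is probed but never opened never triggers
a firmware load, and repeated opens reuse the first load.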

But can firmware be downloaded into a device more than once? That is
still an open question, and only the vendors can answer it. A device is
powered on only once, so it is plausible that it supports only a single
firmware download. Also, if the firmware is itself based on a Linux
kernel and applications, it is not easy to update that system online.

Also, some devices may have no 'open' interface and are accessed only
through sysfs or other in-kernel interfaces.

>
> So I would prefer if people basically just understood that "if you're
> trying to load firmware at module init time, you are almost certainly
> doing something wrong".
>
> Delaying the firmware load as much as possible (and here "delaying"
> does *not* mean your kind of "deferred" load, but explicitly doing it
> only when really needed) allows things like "boot the system, copy the
> new firmware from a USB stick in single-user mode, then bring up
> networking". It also simply avoids the whole module load ordering
> issue.
>
> So I really think you are looking (again) too much at working around
> the symptoms, rather than fixing the deeper issue.

OK, I will take time to look at more request_firmware uses in drivers/
and study the related problems above.

>
>> It is a good idea to let the driver defer request explicitly, but still need
>> some changes in generic code to support it.
>
> That's fine. I am not arguing against making core driver core changes,
> I'm just arguing against making them so that you facilitate bad
> behavior and work around the symptoms of bad choices.
>
> In fact, I'd actually want to argue for *bigger* core device layer
> changes to make it easier to do the right thing. Right now, one of the
> reasons why driver writers load the firmware at init time is that it's
> often _easiest_ for them to do it there, even if it's the wrong point
> to do it. And that is partly because I think the device layer doesn't
> help enough in making it really convenient to do later.
>
>> In my opinion, we should cache firmware data for all hotplug
>> devices or devices which may experience power loss automatically
>> in kernel during suspend-resume cycle because all such devices may be
>> disconnected and connected again during suspend-resume cycle.
>
> Yes. *THAT* is absolutely the kind of change I'd love to see. The core

OK, I'd like to volunteer to improve firmware loading by caching
firmware across the suspend/resume cycle.

> device layer doesn't really make it easy to handle firmware sanely
> over suspend/resume, which is kind of sad. Why does every driver have
> to have its own "let me remember my firmware over the suspend/resume
> event" and have extra code in suspend/resume, when it's really a
> pretty generic situation: if the device has firmware, wouldn't it be
> really nice if the core driver layer just knew about that and kept
> track of it?
>
>> Looks it is not difficult to cache firmware data by kernel, for example, just
>> call the
>>
>> cache_firmware(fw_name)
>>
>> for each device which need firmware before suspending,
>> then call the below to uncache firmware after resume:
>>
>> uncache_firmware(fw_name)
>
> Exactly. But we should make it automatic, and we should only do it if

Yes, I mean that caching and uncaching firmware should be done
automatically before suspend and after resume, and that it will be
implemented inside the driver core.
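
As a rough illustration of the cache/uncache pairing, here is a small
userspace model of a name-keyed cache with reference counts. It is a
sketch under assumptions, not the proposed kernel implementation: the
model_ prefix, the fixed-size table and the return values are all
invented for this example, and no firmware data is actually stored.

```c
#include <stddef.h>
#include <string.h>

#define FW_CACHE_SLOTS	8
#define FW_NAME_MAX	64

/* One cached firmware image, keyed by name (payload omitted). */
struct fw_cache_entry {
	char name[FW_NAME_MAX];
	int refs;		/* 0 means the slot is free */
};

static struct fw_cache_entry fw_cache[FW_CACHE_SLOTS];

/* Return the live entry for @name, or NULL if it is not cached. */
static struct fw_cache_entry *fw_cache_find(const char *name)
{
	for (size_t i = 0; i < FW_CACHE_SLOTS; i++)
		if (fw_cache[i].refs > 0 &&
		    strcmp(fw_cache[i].name, name) == 0)
			return &fw_cache[i];
	return NULL;
}

/* Before suspend: take a reference on @name. 0 on success, -1 on failure. */
static int model_cache_firmware(const char *name)
{
	struct fw_cache_entry *e = fw_cache_find(name);

	if (e) {
		e->refs++;
		return 0;
	}
	if (strlen(name) >= FW_NAME_MAX)
		return -1;
	for (size_t i = 0; i < FW_CACHE_SLOTS; i++) {
		if (fw_cache[i].refs == 0) {
			strcpy(fw_cache[i].name, name);
			fw_cache[i].refs = 1;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* After resume: drop a reference; the slot frees itself at zero. */
static int model_uncache_firmware(const char *name)
{
	struct fw_cache_entry *e = fw_cache_find(name);

	if (!e)
		return -1;
	e->refs--;
	return 0;
}
```

With reference counts, two devices that share one firmware image can
each cache and uncache it without dropping it under the other.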

> the device is actually *active*. If nobody is using the device over

As you said above, if a device becomes active only after its firmware
has been downloaded, we can decide whether a device is active by
checking whether firmware has been downloaded into it.

But I know that some drivers need to be fixed to delay requesting and
downloading firmware until the device is actively used.
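
A minimal sketch of that "active means firmware was downloaded" rule,
again as a userspace model with hypothetical names (struct pm_dev,
pm_suspend_cache()); it only marks which devices would have their
firmware cached before suspend.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not a kernel structure. */
struct pm_dev {
	const char *fw_name;
	bool fw_downloaded;	/* firmware is in the device: treat as active */
	bool fw_cached;		/* firmware kept in memory across suspend */
};

/*
 * Before suspend: cache firmware only for devices that have had
 * firmware downloaded. Returns the number of devices cached.
 */
static int pm_suspend_cache(struct pm_dev *devs, size_t n)
{
	int cached = 0;

	for (size_t i = 0; i < n; i++) {
		devs[i].fw_cached = devs[i].fw_downloaded;
		if (devs[i].fw_cached)
			cached++;
	}
	return cached;
}
```

The rule is only as good as the drivers: a driver that downloads
firmware at probe() makes every device look active in this model.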

> the suspend-resume event, the firmware shouldn't be loaded in the
> first place, and resume obviously shouldn't need to re-load it.
> Wouldn't it be nice if something like the PCI layer (or the USB layer)
> just knew to do the right thing for the device on its own?
>
> I would also suggest that the firmware caching have some internal
> timeout, so that for the (fairly common) case where a suspend/resume
> event might look like a unplug/replug event, the caching would
> actually still remember the firmware despite the fact that it looked
> (for a short while) like the device went away.
>
> So *this* is where I think we could improve on the generic code. Make
> it really easy for devices to do the right thing. Make sure that
> firmware caches work, even if it looks like devices disappeared
> momentarily. Maybe add a few callbacks from generic code to say "you
> can load your firmware now, because the system is up".

IMO, if the firmware cache is tied to the device or driver lifetime,
the problem becomes a bit complicated: firmware data may need to live
longer than the driver or device.

So I will start by just caching firmware before suspend and uncaching
it after resume (or some time after resume) for actively used devices,
without tying the cache to the device/driver lifetime at first. That
should be the simplest and most reliable approach.
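
To sketch what "uncache some time after resume" could look like, here
is a userspace model in which an uncache request only starts a grace
period, so a device that briefly looks unplugged and replugged still
finds its firmware. The struct, the function names and the integer
time values are assumptions made for this example; a real
implementation would use kernel timers or delayed work instead of
passing "now" around.

```c
#include <stdbool.h>

/* One cached image whose lifetime is independent of any device. */
struct timed_fw {
	bool cached;		/* image is currently held in memory */
	bool expiring;		/* uncache requested, grace period running */
	long expires_at;	/* valid only while expiring is true */
};

/* Before suspend: keep the image and cancel any pending expiry. */
static void timed_fw_cache(struct timed_fw *fw)
{
	fw->cached = true;
	fw->expiring = false;
}

/* After resume: do not drop the image yet, just start the grace period. */
static void timed_fw_uncache_later(struct timed_fw *fw, long now, long grace)
{
	if (fw->cached) {
		fw->expiring = true;
		fw->expires_at = now + grace;
	}
}

/* Drop the image once the grace period has elapsed. */
static void timed_fw_tick(struct timed_fw *fw, long now)
{
	if (fw->cached && fw->expiring && now >= fw->expires_at) {
		fw->cached = false;
		fw->expiring = false;
	}
}

/* A re-probed device asks whether its firmware is still cached. */
static bool timed_fw_lookup(struct timed_fw *fw, long now)
{
	timed_fw_tick(fw, now);
	return fw->cached;
}
```

Because the entry does not point at a device or driver, it survives a
disconnect and reconnect during resume for as long as the grace period
lasts.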


> So I really think the rules should always be:
>
> - firmware should NEVER be loaded at module init time, because it's
> the wrong time to do it - the device may never be needed at all.

OK, but some of the devices described above may need to load firmware
at probe().

> Slowing down the init sequence is just stupid.

Deferring request_firmware doesn't slow down the init sequence.

>
> - If firmware is needed for resume, it should be loaded by the
> suspend logic and cached in memory.

Agreed. Some special devices (isight) may choose to defer loading
firmware themselves.


Thanks,