Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()
From: Luis R. Rodriguez
Date: Thu Sep 15 2016 - 11:43:55 EST
On Mon, Sep 12, 2016 at 02:09:10PM +0200, Daniel Wagner wrote:
> On 09/10/2016 12:10 AM, Luis R. Rodriguez wrote:
> > I've personally come to the determination we can stuff all this crap under
> > CONFIG_FW_LOADER_USER_HELPER due to the firmware cache. Here's why:
> >
> > As Daniel has it, the usermodehelper_read_lock_wait() case this unfolds into a:
> >
> > schedule_timeout(0) when the usermodehelper_disabled is true, since after
> > the freezer was introduced this can be any of these values:
> >
> > enum umh_disable_depth {
> > UMH_ENABLED = 0,
> > UMH_FREEZING,
> > UMH_DISABLED,
> > };
> >
> > It means we schedule_timeout(0) anytime usermodehelper_disabled is either
> > UMH_FREEZING or UMH_DISABLED. Contrary to usermodehelper_read_trylock() it
> > will also *not* call try_to_freeze().
> >
> > Given how we setup timer for this:
> >
> > expire = timeout + jiffies;
> >
> > setup_timer_on_stack(&timer, process_timeout, (unsigned long)current);
> > __mod_timer(&timer, expire, false);
> > schedule();
> > del_singleshot_timer_sync(&timer);
> >
> > We will naturally *always* timeout on schedule_timeout(0), so this return 0
> > always.
>
> The smallest possible timeout is 1 * HZ because:
>
> static inline long firmware_loading_timeout(void)
> {
> return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
> }
Sorry, I confused your setting:
+#define fw_status_wait_timeout(fw_st, long) 0
for replacing the abvove firmware_loading_timeout() to be 0 when
!CONFIG_FW_LOADER_USER_HELPER
> > As Daniel has it when CONFIG_FW_LOADER_USER_HELPER is enabled we do have a
> > timeout and that just ensures we schedule_timeout(timeout) and as such can run
> > in the loop for usermodehelper_read_trylock() trying to see if the
> > usermodehelper_disabled has finally flipped to UMH_ENABLED, after which we'd
> > grant entry, say you can pass Go, and collect $200.
> >
> > Since we use schedule_timeout(0) for !CONFIG_FW_LOADER_USER_HELPER it means
> > we *always* break out of the loop immediately, and return 0. If we are
> > in !CONFIG_FW_LOADER_USER_HELPER and the usermodehelper_disabled is set
> > to UMH_ENABLED we *immediately* bail and return 0 as well.
>
> The mainline version will use 60s timeout for the usermode helper
> for !CONFIG_FW_LOADER_USER_HELPER. In case of the
> CONFIG_FW_LOADER_USER_HELPER we should be the range of 1 * HZ to
> MAX_JIFFY_OFFSET. I say should because we don't check if
> 'loading_timeout * HZ' overflows.
Indeed.
> > So effectively this code:
> >
> > _request_firmware()
> > {
> > ...
> > timeout = usermodehelper_read_lock_wait(timeout);
> > if (!timeout) {
> > dev_dbg(device, "firmware: %s loading timed out\n",
> > name);
> > ret = -EBUSY;
> > goto out;
> > }
> > ...
> > }
> >
> > Will always returns -EBUSY if timeout is 0.
>
> But timeout is never 0 unless I got that wrong.
No indeed, my mistake.
> > Our current default is 60 even for !CONFIG_FW_LOADER_USER_HELPER kernels, in
> > that case what this code does is only on FW_OPT_NOWAIT calls
> > (request_firmware_nowait()) it will wait for 60 seconds for the freezer to
> > clear.
> >
> > In Daniel's code right now this would change the default timeout to 0,
>
> No, it shouldn't. With this patch 1, the usermode helper code
> is not involved anymore, that's the main difference.
Indeed. I meant to say that if the timeout was 0 it would not be effectively
for using the umh lock for what we though we wanted before. As you clarify
though your changes do not set the default timeout to 0 though.
> > but that
> > would effectively break all request_firmware_nowait() calls. If we want to
> > keep the same functionality we'd have to keep the default timeout set to 60
> > then for !CONFIG_FW_LOADER_USER_HELPER.
> >
> > For the non-FW_OPT_NOWAIT (regular sync request_firmware() calls) cases we have:
> >
> > _request_firmware()
> > {
> > ...
> > ret = usermodehelper_read_trylock();
> > if (WARN_ON(ret)) {
> > dev_err(device, "firmware: %s will not be loaded\n",
> > name);
> > goto out;
> > }
> > ...
> > }
> >
> > This will do the same loop but if we are freezing (UMH_FREEZING,) it *will*
> > take the time to call try_to_freeze(). If we've hit UMH_DISABLED we bail
> > with -EAGAIN. Contrary to the async case this doesn't time out, so it
> > just loops forever and schedules()'s in hopes we eventually change state.
> >
> > So.. this would have to be kept if we want to keep the same behaviour.
>
> With !CONFIG_FW_LOADER_HELPER this series is missing warning when it is
> still too early for direct loading.
Well no, the commit that added the umh lock to firmware_class mentions the
purpose was to warn for users of the firmware API on resume. My point is that
folks should not rely on this lock to assume rootfs is ready, its simply not
true. As I noted in a proposed RFC and followed by criticism -- this requires
more work. For now the SmPL grammar rule should start helping us hunt for users
of these APIs on init/probe.
PS. I'll be gone on vacation for 2 weeks now.
Luis