Re: [PATCH v1] firmware_class: encapsulate firmware loading status

From: Luis R. Rodriguez
Date: Tue Aug 09 2016 - 21:10:19 EST


On Thu, Aug 04, 2016 at 02:27:16PM +0200, Daniel Wagner wrote:
> From: Daniel Wagner <daniel.wagner@xxxxxxxxxxxx>
>
> The firmware user helper code tracks the current state of the loading
> process via an member of struct firmware_buf and a completion. Let's
> encapsulate this simple state machine into struct fw_status. The aim is
> to encrease readiblity and reduce the usage of the fw_lock.

Great, emphasis, reduce use of fw_lock, good stuff!

> The fw_lock is not needed to protect the status update anymore. We don't
> do any RMW operations. Instead we just do a write or a read, not both at
> the same time.
>
> [v1: moved fw_status into !CONFIG_FW_LOADER_USER_HELPER section,
> reported by 0day kbuild]
>
> Signed-off-by: Daniel Wagner <daniel.wagner@xxxxxxxxxxxx>
> Cc: Ming Lei <ming.lei@xxxxxxxxxxxxx>
> Cc: Luis R. Rodriguez <mcgrof@xxxxxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> ---
>
> Hi,
>
> In [0] we have a discussion on how the firmware_class API might be
> changed to improve the current handling of firmware loading. This
> patch was part of the orignal RFC which triggered the discussion.
>
> I think it is worth taking this one anyway. Maybe as I suggested it
> could be part of the series from Luis.

I'm happy to queue it in on my end however your changes are a bit orthogonal
as you help optimize us away from the usermode helper, I just compartamentalize
that whole API away into a new one so this can go in separately. In terms of
coordination -- sure order will help to get right so I can queue it in, in
that sense. But we're not yet sure if sysdsata will go in first, and I'm happy
for this to go in first as it does not conflict as its slightly orthogonal.

So order here does not interfere with my series -- lets just review this and
its good lets let it go in.

What you do is strip us further from the user mode helper and that
is a good thing.

My review below.

> It cleans up the code base (okay my opinion)

You do little to sell this. In fact, if this is OK, it does a good
compartamentalization of a completion and a lock and implicates the
wait stuff only onto the usermoder helper, indeed that's a win
worth documenting on the commit log.

> and removes the
> complete_all() call which is problematic for -rt. complete_all() can
> be used in any context including IRQ.

I see. But in this case the code in question should never run in IRQ context?

> That could lead to unbounded
> work in the IRQ context and that is a no go for -rt.

Is the fear of the call to be used in IRQ context or the waiters to
somehow work in IRQ context somehow. The waiters were sleeping.. so
that I think leaves only the call site of the complete_all() to worry
about, but I can't see that happening in IRQ context. Please
correct me if I'm wrong.

> So here the
> attempt to reduce the number of complete_all() calls where possible.

OK so this is the real motivation.

> I have left this argument out in the commit message because I was told '-rt'
> arguments don't count for inclusion.

Sure, but I appreciate this explanation, thanks for that !

Can you provide a set of commits accepted upstream or on linux-next
where such conversion has been done and accepted as well elsewhere
in the kernel ?

I know its just pending patches for review but this has me thinking, is
the use of async functionality in the sysdata patches kosher for RT ?

> cheers,
> daniel
>
> [0] http://www.spinics.net/lists/linux-wireless/msg153005.html
>
> drivers/base/firmware_class.c | 154 ++++++++++++++++++++++++------------------
> 1 file changed, 89 insertions(+), 65 deletions(-)
>
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index 22d1760..33eb372 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -30,6 +30,7 @@
> #include <linux/syscore_ops.h>
> #include <linux/reboot.h>
> #include <linux/security.h>
> +#include <linux/swait.h>
>
> #include <generated/utsrelease.h>
>
> @@ -91,19 +92,83 @@ static inline bool fw_is_builtin_firmware(const struct firmware *fw)
> }
> #endif
>
> +static int loading_timeout = 60; /* In seconds */
> +
> +static inline long firmware_loading_timeout(void)
> +{
> + return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
> +}

Seems like we can wrap the above loading_timeout and firmware_loading_timeout onto
CONFIG_FW_LOADER_USER_HELPER -- or provide a helper that returns some
static nonsense value that works for !CONFIG_FW_LOADER_USER_HELPER.

The move of the code above also makes this change harder to review.

> +
> enum {
> + FW_STATUS_UNKNOWN,
> FW_STATUS_LOADING,
> FW_STATUS_DONE,
> - FW_STATUS_ABORT,
> + FW_STATUS_ABORTED,
> };

Come to think of it, even if CONFIG_FW_LOADER_USER_HELPER is enabled
we should only have a need to use this wait crap if an explicit
caller in the kernel requested to use the usermode helper, and as
my patches show there are only 2 of those cases left in the kernel.

To be clear if CONFIG_FW_LOADER_USER_HELPER_FALLBACK (not many distros left) is
set we're stuck and always have to use this, if you only have
CONFIG_FW_LOADER_USER_HELPER (most distros) then only explicit calls
will need this. I really don't want to be adding further crap for the
the user mode helper but since you are optimizing this I think a
compromise here can be to extend the status further for when an explicit
caller with usermode helper is passed, FW_STATUS_WAIT, then we only
really need to wait CONFIG_FW_LOADER_USER_HELPER_FALLBACK or if
FW_STATUS_WAIT is set.

For all intents and purposes this is 0 times for most distros these days
given we have only 2 drivers using this crap explicitly right now. But
I think this can be a separate atomic change.

> -static int loading_timeout = 60; /* In seconds */
> +#ifdef CONFIG_FW_LOADER_USER_HELPER
> +struct fw_status {
> + unsigned long status;
> + struct swait_queue_head wq;
> +};
>
> -static inline long firmware_loading_timeout(void)
> +static void fw_status_init(struct fw_status *fw_st)
> {
> - return loading_timeout > 0 ? loading_timeout * HZ : MAX_JIFFY_OFFSET;
> + fw_st->status = FW_STATUS_UNKNOWN;
> + init_swait_queue_head(&fw_st->wq);
> +}
> +
> +static inline bool is_fw_status_done(unsigned long status)
> +{
> + return status == FW_STATUS_DONE ||
> + status == FW_STATUS_ABORTED;
> +}
> +
> +static int fw_status_wait_timeout(struct fw_status *fw_st, long timeout)
> +{
> + int err;
> + err = swait_event_interruptible_timeout(fw_st->wq,
> + is_fw_status_done(READ_ONCE(fw_st->status)),
> + timeout);
> + if (err == 0 && fw_st->status == FW_STATUS_ABORTED)
> + return -ENOENT;
> +
> + return err;
> +}
> +
> +static void __fw_status_set(struct fw_status *fw_st,
> + unsigned long status)
> +{
> + WRITE_ONCE(fw_st->status, status);
> +
> + if (status == FW_STATUS_DONE ||
> + status == FW_STATUS_ABORTED)
> + swake_up(&fw_st->wq);
> +}
> +
> +static unsigned long __fw_status_get(struct fw_status *fw_st)
> +{
> + return READ_ONCE(fw_st->status);
> }
>
> +#define fw_status_start(fw_st) \
> + __fw_status_set(fw_st, FW_STATUS_LOADING)
> +#define fw_status_done(fw_st) \
> + __fw_status_set(fw_st, FW_STATUS_DONE)
> +#define fw_status_aborted(fw_st) \
> + __fw_status_set(fw_st, FW_STATUS_ABORTED)
> +#define fw_status_is_loading(fw_st) \
> + (__fw_status_get(fw_st) == FW_STATUS_LOADING)
> +#define fw_status_is_done(fw_st) \
> + (__fw_status_get(fw_st) == FW_STATUS_DONE)
> +#define fw_status_is_aborted(fw_st) \
> + (__fw_status_get(fw_st) == FW_STATUS_ABORTED)
> +#else
> +#define fw_status_wait_timeout(fw_st, timeout) 0
> +#define fw_status_done(fw_st)
> +#define fw_status_is_aborted(fw_st) false
> +#endif /* CONFIG_FW_LOADER_USER_HELPER */
> +
> /* firmware behavior options */
> #define FW_OPT_UEVENT (1U << 0)
> #define FW_OPT_NOWAIT (1U << 1)
> @@ -145,13 +210,12 @@ struct firmware_cache {
> struct firmware_buf {
> struct kref ref;
> struct list_head list;
> - struct completion completion;
> struct firmware_cache *fwc;
> - unsigned long status;
> void *data;
> size_t size;
> size_t allocated_size;
> #ifdef CONFIG_FW_LOADER_USER_HELPER
> + struct fw_status fw_st;
> bool is_paged_buf;
> bool need_uevent;
> struct page **pages;
> @@ -205,8 +269,8 @@ static struct firmware_buf *__allocate_fw_buf(const char *fw_name,
> buf->fwc = fwc;
> buf->data = dbuf;
> buf->allocated_size = size;
> - init_completion(&buf->completion);
> #ifdef CONFIG_FW_LOADER_USER_HELPER
> + fw_status_init(&buf->fw_st);
> INIT_LIST_HEAD(&buf->pending_list);
> #endif
>
> @@ -305,15 +369,6 @@ static const char * const fw_path[] = {
> module_param_string(path, fw_path_para, sizeof(fw_path_para), 0644);
> MODULE_PARM_DESC(path, "customized firmware image search path with a higher priority than default path");
>
> -static void fw_finish_direct_load(struct device *device,
> - struct firmware_buf *buf)
> -{
> - mutex_lock(&fw_lock);
> - set_bit(FW_STATUS_DONE, &buf->status);
> - complete_all(&buf->completion);
> - mutex_unlock(&fw_lock);
> -}
> -
> static int
> fw_get_filesystem_firmware(struct device *device, struct firmware_buf *buf)
> {
> @@ -360,7 +415,7 @@ fw_get_filesystem_firmware(struct device *device, struct firmware_buf *buf)
> }
> dev_dbg(device, "direct-loading %s\n", buf->fw_id);
> buf->size = size;
> - fw_finish_direct_load(device, buf);
> + fw_status_done(&buf->fw_st);
> break;
> }
> __putname(path);
> @@ -478,12 +533,11 @@ static void __fw_load_abort(struct firmware_buf *buf)
> * There is a small window in which user can write to 'loading'
> * between loading done and disappearance of 'loading'
> */
> - if (test_bit(FW_STATUS_DONE, &buf->status))
> + if (fw_status_is_done(&buf->fw_st))
> return;
>
> list_del_init(&buf->pending_list);
> - set_bit(FW_STATUS_ABORT, &buf->status);
> - complete_all(&buf->completion);
> + fw_status_aborted(&buf->fw_st);
> }
>
> static void fw_load_abort(struct firmware_priv *fw_priv)
> @@ -496,9 +550,6 @@ static void fw_load_abort(struct firmware_priv *fw_priv)
> fw_priv->buf = NULL;
> }
>
> -#define is_fw_load_aborted(buf) \
> - test_bit(FW_STATUS_ABORT, &(buf)->status)
> -
> static LIST_HEAD(pending_fw_head);
>
> /* reboot notifier for avoid deadlock with usermode_lock */
> @@ -596,10 +647,8 @@ static ssize_t firmware_loading_show(struct device *dev,
> struct firmware_priv *fw_priv = to_firmware_priv(dev);
> int loading = 0;
>
> - mutex_lock(&fw_lock);
> if (fw_priv->buf)
> - loading = test_bit(FW_STATUS_LOADING, &fw_priv->buf->status);
> - mutex_unlock(&fw_lock);
> + loading = fw_status_is_loading(&fw_priv->buf->fw_st);
>
> return sprintf(buf, "%d\n", loading);
> }
> @@ -653,23 +702,20 @@ static ssize_t firmware_loading_store(struct device *dev,
> switch (loading) {
> case 1:
> /* discarding any previous partial load */
> - if (!test_bit(FW_STATUS_DONE, &fw_buf->status)) {
> + if (!fw_status_is_done(&fw_buf->fw_st)) {
> for (i = 0; i < fw_buf->nr_pages; i++)
> __free_page(fw_buf->pages[i]);
> vfree(fw_buf->pages);
> fw_buf->pages = NULL;
> fw_buf->page_array_size = 0;
> fw_buf->nr_pages = 0;
> - set_bit(FW_STATUS_LOADING, &fw_buf->status);
> + fw_status_start(&fw_buf->fw_st);
> }
> break;
> case 0:
> - if (test_bit(FW_STATUS_LOADING, &fw_buf->status)) {
> + if (fw_status_is_loading(&fw_buf->fw_st)) {
> int rc;
>
> - set_bit(FW_STATUS_DONE, &fw_buf->status);
> - clear_bit(FW_STATUS_LOADING, &fw_buf->status);
> -
> /*
> * Several loading requests may be pending on
> * one same firmware buf, so let all requests
> @@ -691,10 +737,11 @@ static ssize_t firmware_loading_store(struct device *dev,
> */
> list_del_init(&fw_buf->pending_list);
> if (rc) {
> - set_bit(FW_STATUS_ABORT, &fw_buf->status);
> + fw_status_aborted(&fw_buf->fw_st);
> written = rc;
> + } else {
> + fw_status_done(&fw_buf->fw_st);
> }
> - complete_all(&fw_buf->completion);
> break;
> }
> /* fallthrough */
> @@ -702,7 +749,7 @@ static ssize_t firmware_loading_store(struct device *dev,
> dev_err(dev, "%s: unexpected value (%d)\n", __func__, loading);
> /* fallthrough */
> case -1:
> - fw_load_abort(fw_priv);
> + fw_status_aborted(&fw_buf->fw_st);
> break;
> }
> out:
> @@ -755,7 +802,7 @@ static ssize_t firmware_data_read(struct file *filp, struct kobject *kobj,
>
> mutex_lock(&fw_lock);
> buf = fw_priv->buf;
> - if (!buf || test_bit(FW_STATUS_DONE, &buf->status)) {
> + if (!buf || fw_status_is_done(&buf->fw_st)) {
> ret_count = -ENODEV;
> goto out;
> }
> @@ -842,7 +889,7 @@ static ssize_t firmware_data_write(struct file *filp, struct kobject *kobj,
>
> mutex_lock(&fw_lock);
> buf = fw_priv->buf;
> - if (!buf || test_bit(FW_STATUS_DONE, &buf->status)) {
> + if (!buf || fw_status_is_done(&buf->fw_st)) {
> retval = -ENODEV;
> goto out;
> }
> @@ -955,8 +1002,7 @@ static int _request_firmware_load(struct firmware_priv *fw_priv,
> timeout = MAX_JIFFY_OFFSET;
> }
>
> - retval = wait_for_completion_interruptible_timeout(&buf->completion,
> - timeout);
> + retval = fw_status_wait_timeout(&buf->fw_st, timeout);
> if (retval == -ERESTARTSYS || !retval) {
> mutex_lock(&fw_lock);
> fw_load_abort(fw_priv);
> @@ -965,7 +1011,7 @@ static int _request_firmware_load(struct firmware_priv *fw_priv,
> retval = 0;
> }
>
> - if (is_fw_load_aborted(buf))
> + if (fw_status_is_aborted(&buf->fw_st))
> retval = -EAGAIN;
> else if (buf->is_paged_buf && !buf->data)
> retval = -ENOMEM;
> @@ -1015,35 +1061,12 @@ fw_load_from_user_helper(struct firmware *firmware, const char *name,
> return -ENOENT;
> }
>
> -/* No abort during direct loading */
> -#define is_fw_load_aborted(buf) false
> -
> #ifdef CONFIG_PM_SLEEP
> static inline void kill_requests_without_uevent(void) { }
> #endif
>
> #endif /* CONFIG_FW_LOADER_USER_HELPER */
>
> -
> -/* wait until the shared firmware_buf becomes ready (or error) */
> -static int sync_cached_firmware_buf(struct firmware_buf *buf)
> -{
> - int ret = 0;
> -
> - mutex_lock(&fw_lock);
> - while (!test_bit(FW_STATUS_DONE, &buf->status)) {
> - if (is_fw_load_aborted(buf)) {
> - ret = -ENOENT;
> - break;
> - }
> - mutex_unlock(&fw_lock);
> - ret = wait_for_completion_interruptible(&buf->completion);
> - mutex_lock(&fw_lock);
> - }
> - mutex_unlock(&fw_lock);
> - return ret;
> -}
> -
> /* prepare firmware and firmware_buf structs;
> * return 0 if a firmware is already assigned, 1 if need to load one,
> * or a negative error code
> @@ -1077,7 +1100,8 @@ _request_firmware_prepare(struct firmware **firmware_p, const char *name,
> firmware->priv = buf;
>
> if (ret > 0) {
> - ret = sync_cached_firmware_buf(buf);
> + ret = fw_status_wait_timeout(&buf->fw_st,
> + firmware_loading_timeout());
> if (!ret) {
> fw_set_page_data(buf, firmware);
> return 0; /* assigned */
> @@ -1095,7 +1119,7 @@ static int assign_firmware_buf(struct firmware *fw, struct device *device,
> struct firmware_buf *buf = fw->priv;
>
> mutex_lock(&fw_lock);
> - if (!buf->size || is_fw_load_aborted(buf)) {
> + if (!buf->size || fw_status_is_aborted(&buf->fw_st)) {
> mutex_unlock(&fw_lock);
> return -ENOENT;
> }

Other than that this looks nice so far. Can you please run the tests

tools/testing/selftests/firmware/fw_filesystem.sh
tools/testing/selftests/firmware/fw_userhelper.sh

against lib/test_firmware.c

Luis