Re: [PATCHv2 0/3] Enable no_cache flag to driver_data

From: Li, Yi
Date: Tue Jun 06 2017 - 15:34:11 EST

Next message: Stefani Seibold: "Re: [PATCH] external references for device tree overlays"
Previous message: Bjorn Andersson: "Re: [PATCH 05/16] rpmsg: use dev_groups and not dev_attrs for bus_type"
Next in thread: Luis R. Rodriguez: "Re: [PATCHv2 0/3] Enable no_cache flag to driver_data"

Hi Luis,

On 5/24/2017 2:03 PM, Luis R. Rodriguez wrote:

On Sat, May 20, 2017 at 01:46:56AM -0500, yi1.li@xxxxxxxxxxxxxxx wrote:

From: Yi Li <yi1.li@xxxxxxxxxxxxxxx>

Changes in v2:

- Rebase to Luis R. Rodriguez's 20170501-driver-data-try2
branch
- Expose DRIVER_DATA_REQ_NO_CACHE flag to public
driver_data_req_params structure, so upper drivers can ask
driver_data driver to bypass the internal caching mechanism.
This will be used for streaming and for other drivers that maintain
their own caching, like iwlwifi.
- Add self test cases.

Yi Li (3):
firmware_class: move NO_CACHE from private to driver_data_req_params
iwlwifi: use DRIVER_DATA_REQ_NO_CACHE for driver_data
test: add no_cache to driver_data load tester

drivers/base/firmware_class.c | 16 ++++-----
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 ++
include/linux/driver_data.h | 4 +++
lib/test_driver_data.c | 43 +++++++++++++++++++++++--
tools/testing/selftests/firmware/driver_data.sh | 36 +++++++++++++++++++++
5 files changed, 89 insertions(+), 12 deletions(-)

Good stuff, this series is looking good and very easy to read ! Only thing
though -- the cache test is just setting up the cache and ensuring it gets set,
it doesn't really *test* the cache is functional. Can you devise a test which
does ensure the cache is functional ?

We use the cache upon suspend to cache the firmware so that upon resume a
request will use that cache, to avoid the file lookup on disk. Doing a test
with qemu suspend + resume is possible but that requires having access to
the qemu monitor interface and doing something like this to trigger a wakeup:

echo system_wakeup | socat - /dev/pts/7,raw,echo=0,crnl

where /dev/pts/7 is assumed to be the PTY. This is rather complex for kselftest,
so another option is to add a debug-mode debugfs interface for firmware_class where
we can force-enable the cache mechanism without this actually being prompted by a
suspend/hibernate. Then we can just toggle this bit from a testing perspective and
exercise the caching mechanism.

So if you look at fw_pm_notify()

	switch (mode) {
	case PM_HIBERNATION_PREPARE:
	case PM_SUSPEND_PREPARE:
	case PM_RESTORE_PREPARE:
		/*
		 * kill pending fallback requests with a custom fallback
		 * to avoid stalling suspend.
		 */
		kill_pending_fw_fallback_reqs(true);
		device_cache_fw_images();
		disable_firmware();
		break;

I am studying the firmware caching code and have to say it's very complicated. :-( Here are some questions:

1. Since device_cache_fw_images() invokes dev_cache_fw_image() through dpm_for_each_dev(), a debugfs interface that kicks it can only cache firmware for devices that have PM enabled, which does not include the driver_data_test_device. Any suggestions?

2. Looking at dev_cache_fw_image(), devres_for_each_res() walks through the firmware that has been loaded before (recorded through assign_firmware_buf() -> fw_add_devm_name()) and adds each one to the todo list, which eventually populates the fw_names list. So in the test driver, do we need to load the firmware once before calling the kick?
dev_cache_fw_image(struct device *dev, void *data)
{
	LIST_HEAD(todo);
	struct fw_cache_entry *fce;
	struct fw_cache_entry *fce_next;
	struct firmware_cache *fwc = &fw_cache;

	devres_for_each_res(dev, fw_name_devm_release,
			    devm_name_match, &fw_cache,
			    dev_create_fw_entry, &todo);

	list_for_each_entry_safe(fce, fce_next, &todo, list) {
		list_del(&fce->list);

		spin_lock(&fwc->name_lock);
		/* only one cache entry for one firmware */
		if (!__fw_entry_found(fce->name)) {
			list_add(&fce->list, &fwc->fw_names);
		} else {
			free_fw_cache_entry(fce);
	...
}
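
To make question 2 concrete, here is a rough userspace model of the flow as I read it, not the kernel code. All names in it (model_device, model_fw_names, model_record_load, model_kick) are made up for illustration. It assumes a name only reaches the global list if the device recorded a successful load before the kick, and that each name is added once.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX 8

/* Names a device has successfully requested (stand-in for devres entries). */
struct model_device {
	int n_loaded;
	const char *loaded[MODEL_MAX];
};

/* Global list of names to cache (stand-in for fwc->fw_names). */
struct model_fw_names {
	int n;
	const char *names[MODEL_MAX];
};

static bool model_entry_found(const struct model_fw_names *f, const char *name)
{
	for (int i = 0; i < f->n; i++)
		if (strcmp(f->names[i], name) == 0)
			return true;
	return false;
}

/* Record a successful load on the device; returns false if the table is full. */
static bool model_record_load(struct model_device *d, const char *name)
{
	if (d->n_loaded == MODEL_MAX)
		return false;
	d->loaded[d->n_loaded++] = name;
	return true;
}

/* The "kick": copy each recorded name into the global list, once per name. */
static int model_kick(const struct model_device *d, struct model_fw_names *f)
{
	int added = 0;

	for (int i = 0; i < d->n_loaded; i++) {
		if (model_entry_found(f, d->loaded[i]) || f->n == MODEL_MAX)
			continue;
		f->names[f->n++] = d->loaded[i];
		added++;
	}
	return added;
}
```

In this model a kick on a device with no recorded loads adds nothing, which is why the test driver would have to request the firmware once first, if my reading of the real code is right.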

kill_pending_fw_fallback_reqs(true), device_cache_fw_images(), and
disable_firmware() could be folded into a helper, then a debugfs interface
could kick that into action to put us in cache mode, as
device_cache_fw_images() changes the cache state to FW_LOADER_START_CACHE, and
when this is done you'll notice that assign_firmware_buf() does:

	/*
	 * After caching firmware image is started, let it piggyback
	 * on request firmware.
	 */
	if (!driver_data_param_nocache(&data_params->priv_params) &&
	    buf->fwc->state == FW_LOADER_START_CACHE) {
		if (fw_cache_piggyback_on_request(buf->fw_id))
			kref_get(&buf->ref);
	}

Which adds an incoming request to the cache. The first request that adds this cache
entry would be triggered by device_cache_fw_images() after the cache state is enabled:

static void device_cache_fw_images(void)
{
	...
	fwc->state = FW_LOADER_START_CACHE;
	dpm_for_each_dev(NULL, dev_cache_fw_image);
	...
}

This only applies to devices that have PM enabled.

Subsequent requests then look up the cache through _request_firmware_prepare()
when fw_lookup_and_allocate_buf() is called. __fw_lookup_buf() really should be
renamed to something that reflects this is a cache lookup. In fact if you find
anything else that needs renaming to make it clear please feel free to send patches
for it.

We want to test that when caching is enabled, the cache is actually used.

Note that disable_firmware() above on the notifier *does* disable subsequent firmware
lookups but this is only *if* the lookup fails with _request_firmware_prepare():

_request_firmware(const struct firmware **firmware_p, const char *name,
		  struct driver_data_params *data_params,
		  struct device *device)
{
	struct firmware *fw = NULL;
	int ret;

	if (!firmware_p)
		return -EINVAL;

	if (!name || name[0] == '\0') {
		ret = -EINVAL;
		goto out;
	}

	ret = _request_firmware_prepare(&fw, name, device, data_params);
	if (ret <= 0) /* error or already assigned */
		goto out;

	if (!firmware_enabled()) {
		WARN(1, "firmware request while host is not available\n");
		ret = -EHOSTDOWN;
		goto out;
	}
	...
}

The idea is that _request_firmware_prepare() will have picked up the cache and
enabled use of that while the infrastructure for disk lookups is disabled. The
caching effect is lifted later on the same notifier fw_pm_notify() and it also
schedules a clearing of the cached firmwares with device_uncache_fw_images_delay().
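
The ordering described above can be modelled in userspace roughly as follows. This is only a sketch and not the loader itself: model_loader, model_source and model_request are hypothetical names, and the cache is reduced to a single name. It shows that a cached name is served even while disk lookups are disabled, and that only a cache miss hits the -EHOSTDOWN path.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified loader state. */
struct model_loader {
	bool firmware_enabled;    /* false while disk lookups are disabled */
	const char *cached_name;  /* single cached image name, or NULL */
};

/* Where the image came from; for illustration only. */
enum model_source { MODEL_FROM_NONE, MODEL_FROM_CACHE, MODEL_FROM_DISK };

/*
 * Same ordering as the quoted _request_firmware(): argument check,
 * then cache lookup, then the firmware_enabled() gate, then disk lookup.
 */
static int model_request(const struct model_loader *l, const char *name,
			 enum model_source *src)
{
	*src = MODEL_FROM_NONE;

	if (!name || name[0] == '\0')
		return -EINVAL;

	/* Cache hit: served regardless of firmware_enabled. */
	if (l->cached_name && strcmp(l->cached_name, name) == 0) {
		*src = MODEL_FROM_CACHE;
		return 0;
	}

	/* Cache miss while disk lookups are disabled. */
	if (!l->firmware_enabled)
		return -EHOSTDOWN;

	*src = MODEL_FROM_DISK;
	return 0;
}
```

A test built on the real interface would want to check the same three outcomes: cache hit while disabled, miss while disabled, and a normal disk lookup once re-enabled.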

device_uncache_fw_images_delay() will be called for PM_POST_RESTORE, which means the hibernation restore hit an error. How about the successful restore case: will the calling driver free its firmware_buf after loading?

To me this last part smells like a possible source of issues (not sure) if we
suspend/resume a lot in a short period of time. This theory could be tested by
toggling the debugfs interface I suggested on/off while having requests of
different types blast in.

Agreed, it might create an issue if the system gets into restore_prepare again before device_uncache_fw_images_delay() clears the cache. Why do we need the 10 * MSEC_PER_SEC delay? In theory, fwc->name_lock should protect this case, though.

This is the sort of testing which would really help here.

Likewise, if you are extending functionality please consider ways to break it :)

I understand the need to test the firmware caching part. For the non-caching test case, will it be enough to test that the no_cache setting prevents the firmware name from being added to the fwc->fw_names list?
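
The check I have in mind could look like this in a userspace model. It is a sketch under my own assumptions, not the kernel code: model_cache, model_name_cached and model_assign_buf are hypothetical stand-ins for the cache state, the name lookup and the piggyback step.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX_NAMES 8
#define MODEL_NAME_LEN 64

/* Hypothetical stand-ins for the loader's cache states. */
enum model_state { MODEL_NO_CACHE_STATE, MODEL_START_CACHE };

struct model_cache {
	enum model_state state;
	int count;
	char names[MODEL_MAX_NAMES][MODEL_NAME_LEN];
};

static bool model_name_cached(const struct model_cache *c, const char *name)
{
	for (int i = 0; i < c->count; i++)
		if (strcmp(c->names[i], name) == 0)
			return true;
	return false;
}

/*
 * Models the piggyback check: returns true when this request added a
 * new cache entry, false when it was skipped (no_cache set, caching not
 * started, name already present, or table full).
 */
static bool model_assign_buf(struct model_cache *c, const char *name,
			     bool no_cache)
{
	if (no_cache || c->state != MODEL_START_CACHE)
		return false;
	if (model_name_cached(c, name) || c->count == MODEL_MAX_NAMES)
		return false;

	strncpy(c->names[c->count], name, MODEL_NAME_LEN - 1);
	c->names[c->count][MODEL_NAME_LEN - 1] = '\0';
	c->count++;
	return true;
}
```

The test case would then request the same name with and without no_cache while caching is started, and compare whether the name shows up in the list.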

and test against it. Please think about these things carefully, it's what will
change the stability of our loader infrastructure for the better long term.

Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-fpga" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
