Re: [PATCH v2] eeprom: at24: Fix unexpected timeout under high load

From: Bartosz Golaszewski
Date: Fri Aug 17 2018 - 05:01:12 EST


2018-08-16 19:45 GMT+02:00 Mark Jonas <mark.jonas@xxxxxxxxxxxx>:
> From: Wang Xin <xin.wang7@xxxxxxxxxxxx>
>
> Within at24_loop_until_timeout the timestamp used for timeout checking
> is recorded after the I2C transfer and sleep_range(). Under high CPU
> load either the execution time for I2C transfer or sleep_range() could
> actually be larger than the timeout value. Worst case the I2C transfer
> is only tried once because the loop will exit due to the timeout
> although the EEPROM is now ready.
>
> To fix this issue the timestamp is recorded at the beginning of each
> iteration. That is, before I2C transfer and sleep. Then the timeout
> is actually checked against the timestamp of the previous iteration.
> This makes sure that even if the timeout is reached, there is still one
> more chance to try the I2C transfer in case the EEPROM is ready.
>
> Example:
>
> If you have a system which combines high CPU load with repeated EEPROM
> writes you will run into the following scenario.
>
> - System makes a successful regmap_bulk_write() to EEPROM.
> - System wants to perform another write to EEPROM but EEPROM is still
> busy with the last write.
> - Because of high CPU load the usleep_range() will sleep more than
> 25 ms (at24_write_timeout).
> - Within the over-long sleeping the EEPROM finished the previous write
> operation and is ready again.
> - at24_loop_until_timeout() will detect timeout and won't try to write.
>
> Signed-off-by: Wang Xin <xin.wang7@xxxxxxxxxxxx>
> Signed-off-by: Mark Jonas <mark.jonas@xxxxxxxxxxxx>
> ---
> Changes in v2:
> - Remove loop macro
> - Accept superfluous sleep in case of timeout
> - Add concrete example to commit message
> ---
> drivers/misc/eeprom/at24.c | 43 ++++++++++++++++++++++---------------------
> 1 file changed, 22 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
> index f5cc517..37a5c8b 100644
> --- a/drivers/misc/eeprom/at24.c
> +++ b/drivers/misc/eeprom/at24.c
> @@ -106,23 +106,6 @@ static unsigned int at24_write_timeout = 25;
> module_param_named(write_timeout, at24_write_timeout, uint, 0);
> MODULE_PARM_DESC(at24_write_timeout, "Time (in ms) to try writes (default 25)");
>
> -/*
> - * Both reads and writes fail if the previous write didn't complete yet. This
> - * macro loops a few times waiting at least long enough for one entire page
> - * write to work while making sure that at least one iteration is run before
> - * checking the break condition.
> - *
> - * It takes two parameters: a variable in which the future timeout in jiffies
> - * will be stored and a temporary variable holding the time of the last
> - * iteration of processing the request. Both should be unsigned integers
> - * holding at least 32 bits.
> - */
> -#define at24_loop_until_timeout(tout, op_time) \
> - for (tout = jiffies + msecs_to_jiffies(at24_write_timeout), \
> - op_time = 0; \
> - op_time ? time_before(op_time, tout) : true; \
> - usleep_range(1000, 1500), op_time = jiffies)
> -
> struct at24_chip_data {
> /*
> * these fields mirror their equivalents in
> @@ -308,13 +291,22 @@ static ssize_t at24_regmap_read(struct at24_data *at24, char *buf,
> /* adjust offset for mac and serial read ops */
> offset += at24->offset_adj;
>
> - at24_loop_until_timeout(timeout, read_time) {
> + timeout = jiffies + msecs_to_jiffies(at24_write_timeout);
> + do {
> + /*
> + * The timestamp shall be taken before the actual operation
> + * to avoid a premature timeout in case of high CPU load.
> + */
> + read_time = jiffies;
> +
> ret = regmap_bulk_read(regmap, offset, buf, count);
> dev_dbg(&client->dev, "read %zu@%d --> %d (%ld)\n",
> count, offset, ret, jiffies);
> if (!ret)
> return count;
> - }
> +
> + usleep_range(1000, 1500);
> + } while (time_before(read_time, timeout));
>
> return -ETIMEDOUT;
> }
> @@ -358,14 +350,23 @@ static ssize_t at24_regmap_write(struct at24_data *at24, const char *buf,
> regmap = at24_client->regmap;
> client = at24_client->client;
> count = at24_adjust_write_count(at24, offset, count);
> + timeout = jiffies + msecs_to_jiffies(at24_write_timeout);
> +
> + do {
> + /*
> + * The timestamp shall be taken before the actual operation
> + * to avoid a premature timeout in case of high CPU load.
> + */
> + write_time = jiffies;
>
> - at24_loop_until_timeout(timeout, write_time) {
> ret = regmap_bulk_write(regmap, offset, buf, count);
> dev_dbg(&client->dev, "write %zu@%d --> %d (%ld)\n",
> count, offset, ret, jiffies);
> if (!ret)
> return count;
> - }
> +
> + usleep_range(1000, 1500);
> + } while (time_before(write_time, timeout));
>
> return -ETIMEDOUT;
> }
> --
> 2.7.4
>

Looks good to me, I'll test it after v4.19-rc1 is tagged.

Bart