Re: [PATCH v1] block, bfq: set default slice_idle to zero for SSDs

From: Paolo Valente
Date: Mon Nov 18 2019 - 09:18:14 EST




> On 13 Nov 2019, at 15:53, Pradeep P V K <ppvk@xxxxxxxxxxxxxx> wrote:
>
> With the default slice idle time of 8 ms, we have seen a few time-bounded
> applications (sensors) on the v4.19 kernel time out during multimedia
> tests (audio and video playback, etc.), leading to reboots and crashes.
> The timeout configured for these applications (sensors) is 20 seconds.
>
> In crash dumps, we saw that some synchronous requests from sensors and
> other applications had been sitting in their bfq_queues for 12-20 seconds.
>
> Idling in anticipation of future nearby I/O requests, and waiting for
> submitted requests to complete, delays the selection and scheduling of
> the next bfq_queue, thereby hurting time-bounded applications.
>
> After setting slice_idle to zero,

As explained in the comments that your patch modifies, idling is
essential for controlling I/O. So, unfortunately, your change is
unacceptable.

I would recommend that you analyze carefully why this anomaly occurs
with a non-zero slice_idle. I'd be willing to help with that.

Alternatively, if you don't want to waste your time finding out the
cause of this problem, then just set slice_idle to 0 manually for your
application.
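
For instance, assuming BFQ is the active scheduler for your device
(sda below is just a placeholder for your actual device), idling can
be disabled at run time through the slice_idle sysfs attribute:

echo 0 > /sys/block/sda/queue/iosched/slice_idle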

> we did not see any crashes during our 72 hours of testing, and I/O
> throughput also increased.
>
> The following FIO benchmark results were taken from a run on a local SSD.
>
> Random reads on the v4.19 kernel:
>
> Idling   IOPS   avg-lat (us)   stddev     bw
> ----------------------------------------------------
> On       4136   1189.07        17221.65   16.9MB/s
> Off      7246   670.11         1054.76    29.7MB/s
>

This is anomalous too. Probably the kernel you are using lacks the BFQ
commits made after 4.19 (around one hundred commits, IIRC).
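
If it helps, a quick way to see which of those commits you are missing
(assuming you have a mainline tree with release tags at hand) is:

git log --oneline v4.19.. -- block/bfq-iosched.c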

Thanks,
Paolo

> fio --name=temp --size=5G --time_based --ioengine=sync \
> --randrepeat=0 --direct=1 --invalidate=1 --verify=0 \
> --verify_fatal=0 --rw=randread --blocksize=4k \
> --group_reporting=1 --directory=/data --runtime=10 \
> --iodepth=64 --numjobs=5
>
> The following code changes were made based on [1], [2], and [3].
>
> [1] https://lkml.org/lkml/2018/11/1/1285
> [2] Commit 41c0126b3f22 ("block: Make CFQ default to IOPS mode on
> SSDs")
> [3] Commit 0bb979472a74 ("cfq-iosched: fix the setting of IOPS mode on
> SSDs")
>
> Signed-off-by: Pradeep P V K <ppvk@xxxxxxxxxxxxxx>
> ---
> Documentation/block/bfq-iosched.rst |  7 ++++---
> block/bfq-iosched.c                 | 13 +++++++++++++
> block/elevator.c                    |  2 ++
> include/linux/elevator.h            |  1 +
> 4 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/block/bfq-iosched.rst b/Documentation/block/bfq-iosched.rst
> index 0d237d4..244f4ca 100644
> --- a/Documentation/block/bfq-iosched.rst
> +++ b/Documentation/block/bfq-iosched.rst
> @@ -329,9 +329,10 @@ slice_idle
>
> This parameter specifies how long BFQ should idle for next I/O
> request, when certain sync BFQ queues become empty. By default
> -slice_idle is a non-zero value. Idling has a double purpose: boosting
> -throughput and making sure that the desired throughput distribution is
> -respected (see the description of how BFQ works, and, if needed, the
> +slice_idle is a non-zero value for rotational devices.
> +Idling has a double purpose: boosting throughput and making
> +sure that the desired throughput distribution is respected
> +(see the description of how BFQ works, and, if needed, the
> papers referred there).
>
> As for throughput, idling can be very helpful on highly seeky media
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 0319d63..9c994d1 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -6514,6 +6514,18 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
> 	return -ENOMEM;
> }
>
> +static void bfq_registered_queue(struct request_queue *q)
> +{
> +	struct elevator_queue *e = q->elevator;
> +	struct bfq_data *bfqd = e->elevator_data;
> +
> +	/*
> +	 * Default to IOPS mode with no idling for SSDs
> +	 */
> +	if (blk_queue_nonrot(q))
> +		bfqd->bfq_slice_idle = 0;
> +}
> +
> static void bfq_slab_kill(void)
> {
> 	kmem_cache_destroy(bfq_pool);
> @@ -6761,6 +6773,7 @@ static ssize_t bfq_low_latency_store(struct elevator_queue *e,
> 		.init_hctx = bfq_init_hctx,
> 		.init_sched = bfq_init_queue,
> 		.exit_sched = bfq_exit_queue,
> +		.elevator_registered_fn = bfq_registered_queue,
> 	},
>
> 	.icq_size = sizeof(struct bfq_io_cq),
> diff --git a/block/elevator.c b/block/elevator.c
> index 076ba73..b882d25 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -504,6 +504,8 @@ int elv_register_queue(struct request_queue *q, bool uevent)
> 			kobject_uevent(&e->kobj, KOBJ_ADD);
>
> 		e->registered = 1;
> +		if (e->type->ops.elevator_registered_fn)
> +			e->type->ops.elevator_registered_fn(q);
> 	}
> 	return error;
> }
> diff --git a/include/linux/elevator.h b/include/linux/elevator.h
> index 901bda3..23dcc35 100644
> --- a/include/linux/elevator.h
> +++ b/include/linux/elevator.h
> @@ -50,6 +50,7 @@ struct elevator_mq_ops {
> 	struct request *(*next_request)(struct request_queue *, struct request *);
> 	void (*init_icq)(struct io_cq *);
> 	void (*exit_icq)(struct io_cq *);
> +	void (*elevator_registered_fn)(struct request_queue *q);
> };
>
> #define ELV_NAME_MAX (16)
> --
> 1.9.1
>