Re: [PATCH v2] blktrace: re-write setting q->blk_trace

From: Jeff Moyer
Date: Thu Oct 29 2015 - 15:36:08 EST


Davidlohr Bueso <dave@xxxxxxxxxxxx> writes:

>>The patch itself looks ok, but it doesn't seem to apply to a recent
>>kernel tree. It appears as though it is white-space damaged. Would you
>>mind re-sending it?
>
> Hmm sorry about that. I thought I had based it against that day's tip/master.
> Anyway, here is v3 which is against tip/master as of 338e29ba93639138fafb9fb5ba946fd99a512aae.

Thanks.

Reviewed-by: Jeff Moyer <jmoyer@xxxxxxxxxx>


> 8<--------------------------------------------------------------------------------
> From: Davidlohr Bueso <dave@xxxxxxxxxxxx>
> Date: Thu, 29 Oct 2015 12:13:24 -0700
> Subject: [PATCH -tip v3] blktrace: re-write setting q->blk_trace
>
> This is really about simplifying the double xchg patterns into
> a single cmpxchg, with the same logic. Other than the immediate
> cleanup, there are some subtleties this change deals with:
>
> (i) While the load of the old bt is fully ordered wrt everything,
> ie:
>
> old_bt = xchg(&q->blk_trace, bt); [barrier]
> if (old_bt)
> (void) xchg(&q->blk_trace, old_bt); [barrier]
>
> blk_trace could still be changed between the xchg and the old_bt
> load. Note that this description is merely theoretical and afaict
> very small, but doing everything in a single context with cmpxchg
> closes this potential race.
>
> (ii) Ordering guarantees are obviously kept with cmpxchg.
>
> (iii) Gets rid of the hacky-by-nature (void)xchg pattern.
>
> Signed-off-by: Davidlohr Bueso <dbueso@xxxxxxx>
> ---
> v3: rebased ontop of today's -tip.
> minor changelog addition.
>
> kernel/trace/blktrace.c | 16 +++++-----------
> 1 file changed, 5 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 90e72a0..e3a2618 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -437,7 +437,7 @@ int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> struct block_device *bdev,
> struct blk_user_trace_setup *buts)
> {
> - struct blk_trace *old_bt, *bt = NULL;
> + struct blk_trace *bt = NULL;
> struct dentry *dir = NULL;
> int ret;
>
> @@ -519,11 +519,8 @@ int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> bt->trace_state = Blktrace_setup;
>
> ret = -EBUSY;
> - old_bt = xchg(&q->blk_trace, bt);
> - if (old_bt) {
> - (void) xchg(&q->blk_trace, old_bt);
> + if (cmpxchg(&q->blk_trace, NULL, bt))
> goto err;
> - }
>
> if (atomic_inc_return(&blk_probes_ref) == 1)
> blk_register_tracepoints();
> @@ -1481,7 +1478,7 @@ static int blk_trace_remove_queue(struct request_queue *q)
> static int blk_trace_setup_queue(struct request_queue *q,
> struct block_device *bdev)
> {
> - struct blk_trace *old_bt, *bt = NULL;
> + struct blk_trace *bt = NULL;
> int ret = -ENOMEM;
>
> bt = kzalloc(sizeof(*bt), GFP_KERNEL);
> @@ -1497,12 +1494,9 @@ static int blk_trace_setup_queue(struct request_queue *q,
>
> blk_trace_setup_lba(bt, bdev);
>
> - old_bt = xchg(&q->blk_trace, bt);
> - if (old_bt != NULL) {
> - (void)xchg(&q->blk_trace, old_bt);
> - ret = -EBUSY;
> + ret = -EBUSY;
> + if (cmpxchg(&q->blk_trace, NULL, bt))
> goto free_bt;
> - }
>
> if (atomic_inc_return(&blk_probes_ref) == 1)
> blk_register_tracepoints();
> --
> 2.1.4
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/