Re: [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls
From: Greg KH
Date: Wed Apr 29 2020 - 05:49:42 EST
On Wed, Apr 29, 2020 at 07:46:26AM +0000, Luis Chamberlain wrote:
> We use one blktrace per request_queue, that means one per the entire
> disk. So we cannot run one blktrace on say /dev/vda and then /dev/vda1,
> or just two calls on /dev/vda.
>
> We check for concurrent setup only at the very end of the blktrace setup though.
>
> If we try to run two concurrent blktraces on the same block device the
> second one will fail, and the first one seems to go on. However when
> one tries to kill the first one one will see things like this:
>
> The kernel will show these:
>
> ```
> debugfs: File 'dropped' in directory 'nvme1n1' already present!
> debugfs: File 'msg' in directory 'nvme1n1' already present!
> debugfs: File 'trace0' in directory 'nvme1n1' already present!
> ``
>
> And userspace just sees this error message for the second call:
>
> ```
> blktrace /dev/nvme1n1
> BLKTRACESETUP(2) /dev/nvme1n1 failed: 5/Input/output error
> ```
>
> The first userspace process #1 will also claim that the files
> were taken underneath their nose as well. The files are taken
> away form the first process given that when the second blktrace
> fails, it will follow up with a BLKTRACESTOP and BLKTRACETEARDOWN.
> This means that even if go-happy process #1 is waiting for blktrace
> data, we *have* been asked to take teardown the blktrace.
>
> This can easily be reproduced with break-blktrace [0] run_0005.sh test.
>
> Just break out early if we know we're already going to fail, this will
> prevent trying to create the files all over again, which we know still
> exist.
>
> [0] https://github.com/mcgrof/break-blktrace
> Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> ---
> kernel/trace/blktrace.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 5c52976bd762..383045f67cb8 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -4,6 +4,8 @@
> *
> */
>
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> #include <linux/kernel.h>
> #include <linux/blkdev.h>
> #include <linux/blktrace_api.h>
> @@ -516,6 +518,11 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> */
> strreplace(buts->name, '/', '_');
>
> + if (q->blk_trace) {
> + pr_warn("Concurrent blktraces are not allowed\n");
> + return -EBUSY;
You have access to a block device here, please use dev_warn() instead
here for that, that makes it obvious as to what device a "concurrent
blktrace" was attempted for.
thanks,
greg k-h