Re: [PATCH v8 04/12] kernfs: add initial failure injection support

From: Luis Chamberlain
Date: Mon Oct 11 2021 - 16:44:30 EST


On Tue, Oct 05, 2021 at 12:47:22PM -0700, Kees Cook wrote:
> On Mon, Sep 27, 2021 at 09:37:57AM -0700, Luis Chamberlain wrote:
> > This adds initial failure injection support to kernfs. We start
> > off with debug knobs which when enabled allow test drivers, such as
> > test_sysfs, to then make use of these to try to force certain
> > difficult races to take place with a high degree of certainty.
> >
> > This only adds runtime code *iff* the new bool CONFIG_FAIL_KERNFS_KNOBS is
> > enabled in your kernel. If you don't have this enabled this provides
> > no new functional. When CONFIG_FAIL_KERNFS_KNOBS is disabled the new
> > routine kernfs_debug_should_wait() ends up being transformed to if
> > (false), and so the compiler should optimize these out as dead code
> > producing no new effective binary changes.
> >
> > We start off with enabling failure injections in kernfs by allowing us to
> > alter the way kernfs_fop_write_iter() behaves. We allow for the routine
> > kernfs_fop_write_iter() to wait for a certain condition in the kernel to
> > occur, after which it will sleep a predefined amount of time. This lets
> > kernfs users to time exactly when it want kernfs_fop_write_iter() to
> > complete, allowing for developing race conditions and test for correctness
> > in kernfs.
> >
> > You'd boot with this enabled on your kernel command line:
> >
> > fail_kernfs_fop_write_iter=1,100,0,1
> >
> > The values are <interval,probability,size,times>, we don't care for
> > size, so for now we ignore it. The above ensures a failure will trigger
> > only once.
> >
> > *How* we allow for this routine to change behaviour is left to knobs we
> > expose under debugfs:
> >
> > # ls -1 /sys/kernel/debug/kernfs/config_fail_kernfs_fop_write_iter/
>
> I'd expect this to live under /sys/kernel/debug/fail_kernfs, like the
> other fault injectors.

Yes I see, thanks will fix up!

> > diff --git a/Documentation/fault-injection/fault-injection.rst b/Documentation/fault-injection/fault-injection.rst
> > index 4a25c5eb6f07..d4d34b082f47 100644
> > --- a/Documentation/fault-injection/fault-injection.rst
> > +++ b/Documentation/fault-injection/fault-injection.rst
> > @@ -28,6 +28,28 @@ Available fault injection capabilities
> >
> > injects kernel RPC client and server failures.
> >
> > +- fail_kernfs_fop_write_iter
> > +
> > + Allows for failures to be enabled inside kernfs_fop_write_iter(). Enabling
> > + this does not immediately enable any errors to occur. You must configure
> > + how you want this routine to fail or change behaviour by using the debugfs
> > + knobs for it:
> > +
> > + # ls -1 /sys/kernel/debug/kernfs/config_fail_kernfs_fop_write_iter/
> > + wait_after_active
> > + wait_after_mutex
> > + wait_at_start
> > + wait_before_mutex
>
> This should be split up and detailed in the "debugfs entries" section
> below here.

Done!

> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 1b4cefcb064c..fadfd961ad80 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -10384,7 +10384,7 @@ M: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > M: Tejun Heo <tj@xxxxxxxxxx>
> > S: Supported
> > T: git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git
> > -F: fs/kernfs/
> > +F: fs/kernfs/*
> > F: include/linux/kernfs.h
> >
> > KEXEC
> > diff --git a/fs/kernfs/Makefile b/fs/kernfs/Makefile
> > index 4ca54ff54c98..bc5b32ca39f9 100644
> > --- a/fs/kernfs/Makefile
> > +++ b/fs/kernfs/Makefile
> > @@ -4,3 +4,4 @@
> > #
> >
> > obj-y := mount.o inode.o dir.o file.o symlink.o
> > +obj-$(CONFIG_FAIL_KERNFS_KNOBS) += failure-injection.o
> > diff --git a/fs/kernfs/failure-injection.c b/fs/kernfs/failure-injection.c
> > new file mode 100644
> > index 000000000000..4130d202c13b
> > --- /dev/null
> > +++ b/fs/kernfs/failure-injection.c
>
> I'd name this fault_inject.c, which matches the more common case:
>
> $ find . -type f -name '*fault*inject*.c'
> ./fs/nfsd/fault_inject.c
> ./drivers/nvme/host/fault_inject.c
> ./drivers/scsi/ufs/ufs-fault-injection.c
> ./lib/fault-inject.c
> ./lib/fault-inject-usercopy.c

Sure, done.

> > +int __kernfs_debug_should_wait_kernfs_fop_write_iter(bool evaluate)
> > +{
> > + if (!evaluate)
> > + return 0;
> > +
> > + return should_fail(&fail_kernfs_fop_write_iter, 0);
> > +}
>
> Every caller ends up doing the wait, so how about just including that
> here instead? It should make things much less intrusive and more readable.
>
> And for the naming, other fault injectors use "should_fail_$topic", so
> maybe better here would be something like may_wait_kernfs(...).

In case anyone is reading Hail Mary by Andy Weir: "Yes yes yes!"

Indeed, that's a great idea. Changed!

> > +
> > +DECLARE_COMPLETION(kernfs_debug_wait_completion);
> > +EXPORT_SYMBOL_NS_GPL(kernfs_debug_wait_completion, KERNFS_DEBUG_PRIVATE);
> > +
> > +void kernfs_debug_wait(void)
> > +{
> > + unsigned long timeout;
> > +
> > + timeout = wait_for_completion_timeout(&kernfs_debug_wait_completion,
> > + msecs_to_jiffies(3000));
> > + if (!timeout)
> > + pr_info("%s waiting for kernfs_debug_wait_completion timed out\n",
> > + __func__);
> > + else
> > + pr_info("%s received completion with time left on timeout %u ms\n",
> > + __func__, jiffies_to_msecs(timeout));
> > +
> > + /**
> > + * The goal is wait for an event, and *then* once we have
> > + * reached it, the other side will try to do something which
> > + * it thinks will break. So we must give it some time to do
> > + * that. The amount of time is configurable.
> > + */
> > + msleep(kernfs_config_fail.sleep_after_wait_ms);
> > + pr_info("%s ended\n", __func__);
> > +}
>
> All the uses of "__func__" here seems redundant; I would drop them.

Alright, and I also added the pr_fmt define which I forgot.

> > diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
> > index 60e2a86c535e..4479c6580333 100644
> > --- a/fs/kernfs/file.c
> > +++ b/fs/kernfs/file.c
> > @@ -259,6 +259,9 @@ static ssize_t kernfs_fop_write_iter(struct kiocb *iocb, struct iov_iter *iter)
> > const struct kernfs_ops *ops;
> > char *buf;
> >
> > + if (kernfs_debug_should_wait(kernfs_fop_write_iter, at_start))
> > + kernfs_debug_wait();
>
> So this could just be:
>
> may_wait_kernfs(kernfs_fop_write_iter, at_start);

Yup! Thanks!

> > diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
> > index f9cc912c31e1..9e3abf597e2d 100644
> > --- a/fs/kernfs/kernfs-internal.h
> > +++ b/fs/kernfs/kernfs-internal.h
> > +#define __kernfs_config_wait_var(func, when) \
> > + (kernfs_config_fail. func ## _fail.wait_ ## when)
> ^^ ^ ^
> nit: needless spaces

Trimmed.

Luis