[RFC v2 0/5] blktrace: fix use after free
From: Luis Chamberlain
Date: Thu Apr 09 2020 - 17:45:44 EST
This series fixes a use after free on block trace. This v2 adjusts the
commit log feedback from the first iteration, and also expands on the
series to include a few additional fixes which would be needed for us
to continue with a synchronous request_queue removal. The refcount added
for blktrace resolves the kobject issues pointed out by yukuai. Note
that CONFIG_DEBUG_KOBJECT_RELEASE purposely was added to create
situations where drivers misbehave, and so you should not use it to
test expected userspace behaviour, but just to catch possible kernel
issues. For details refer to the commit which introduced it, which
actually helps a bit more than just reading the kconfig description,
its commit was c817a67ecba7 ("kobject: delayed kobject release: help
find buggy drivers"). This series also fixes a small build issue
discovered by 0-day.
The QUEUE_FLAG_DEFER_REMOVAL flag is added as part of the last patch,
just in case for now. However, given creative use of refcounting
I don't think we need it anymore. An example use case of creative
use of refcounting is provided for mm/swapfile.
I've extended break-blktrace [0] with 3 test cases which now pass
for the most part:
run_0001.sh
run_0002.sh
run_0003.sh
The exception to this is when we get an EBUSY on loopback removal. This
only happens every now and then, and upon further investigation, I
suspect this is happening due to the same race Dave Chinner ran into
with using loopback devices and fstests, which made explicit loopback
destruction lazy via commit a1ecac3b0656 ("loop: Make explicit loop
device destruction lazy"). Further eyeballs on this are appreciated,
perhaps break-blktrace can be extended a bit to account for this.
After a bit of brushing up, I am considering just upstreaming this
as a self tests for blktrace, instead of keeping this out of tree.
Worth noting as well was that it seems odd we didn't consider the
userspace impact of commit dc9edc44de6c ("block: Fix a blk_exit_rl()
regression") merged on v4.12 moved, as that deferral work sure did
have an impact what userspace can expect upon device removal or races
on addition/removal. Its not clear if mentioning any of this on the
commit logs is worth it... Shouldn't have that deferral been a
userspace regression?
If you want this on a git tree you can find it on my 20200409-blktrace-fix-uaf
branch on kernel.org based on linux-next next-20200409.
Feedback, reviews, rants are all greatly appreciated.
[0] https://github.com/mcgrof/break-blktrace
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20200409-blktrace-fix-uaf
Luis Chamberlain (5):
block: move main block debugfs initialization to its own file
blktrace: fix debugfs use after free
blktrace: ref count the request_queue during ioctl
mm/swapfile: refcount block and queue before using
blkcg_schedule_throttle()
block: revert back to synchronous request_queue removal
block/Makefile | 1 +
block/blk-core.c | 9 +-------
block/blk-debugfs.c | 27 ++++++++++++++++++++++
block/blk-mq-debugfs.c | 5 -----
block/blk-sysfs.c | 43 +++++++++++++++++++++++++++++-------
block/blk.h | 17 ++++++++++++++
include/linux/blkdev.h | 7 +++++-
include/linux/blktrace_api.h | 1 -
kernel/trace/blktrace.c | 25 ++++++++++++---------
mm/swapfile.c | 11 +++++++++
10 files changed, 112 insertions(+), 34 deletions(-)
create mode 100644 block/blk-debugfs.c
--
2.25.1