On Tue, Dec 10, 2019 at 04:06:44PM -0500, Sasha Levin wrote:
From: Omar Sandoval <osandov@xxxxxx>
[ Upstream commit 9be490f1e15c34193b1aae17da58e14dd9f55a95 ]
Currently, end_workqueue_fn() frees the end_io_wq entry (which embeds
the work item) and then calls bio_endio(). This is another potential
instance of the bug in "btrfs: don't prematurely free work in
run_ordered_work()".
In particular, the endio call may depend on other work items. For
example, btrfs_end_dio_bio() can call btrfs_subio_endio_read() ->
__btrfs_correct_data_nocsum() -> dio_read_error() ->
submit_dio_repair_bio(), which submits a bio that is also completed
through a end_workqueue_fn() work item. However,
__btrfs_correct_data_nocsum() waits for the newly submitted bio to
complete, thus it depends on another work item.
This example currently usually works because we use different workqueue
helper functions for BTRFS_WQ_ENDIO_DATA and BTRFS_WQ_ENDIO_DIO_REPAIR.
However, it may deadlock with stacked filesystems and is fragile
overall. The proper fix is to free the work item at the very end of the
work function, so let's do that.
Reviewed-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
Signed-off-by: Omar Sandoval <osandov@xxxxxx>
Reviewed-by: David Sterba <dsterba@xxxxxxxx>
Signed-off-by: David Sterba <dsterba@xxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
The were more patches in the series, all contain "don't prematurely free
work in" and were part of a rework of async work processing. They're
fixing a very uncommon usecase, so if there's desire to backport them
the whole series needs to go in.
In the autosel list, there are only 2 and without the important fix.
c495dcd6fbe1 btrfs: don't prematurely free work in run_ordered_work()
9be490f1e15c btrfs: don't prematurely free work in end_workqueue_fn()
e732fe95e4ca btrfs: don't prematurely free work in reada_start_machine_worker()
57d4f0b86327 btrfs: don't prematurely free work in scrub_missing_raid56_worker()
a0cac0ec961f btrfs: get rid of unique workqueue helper functions
- this is only a cleanup that removes code obsoleted by the fixes above,
probably out of scope of stable
I have intentionally not tagged the patches for stable, the usecase is
is specific to one user (FB), the known reproducer is only their
workload and the fixes are in their kernel already.
So if there's desire to add the patches to stable trees, then it has to
be the whole series, but I don't see a strong reason for it.