Re: [PATCH v2] blk: improve order of bio handling in generic_make_request()

From: Lars Ellenberg
Date: Fri Mar 10 2017 - 13:51:19 EST


On Fri, Mar 10, 2017 at 04:07:58PM +0100, Jack Wang wrote:
> On 10.03.2017 15:55, Mikulas Patocka wrote:
> > On Fri, 10 Mar 2017, Mike Snitzer wrote:
> >> On Fri, Mar 10 2017 at 7:34am -0500,
> >> Lars Ellenberg <lars.ellenberg@xxxxxxxxxx> wrote:
> >>
> >>>> --- a/block/blk-core.c
> >>>> +++ b/block/blk-core.c
> >>>> @@ -1975,7 +1975,14 @@ generic_make_request_checks(struct bio *bio)
> >>>> */
> >>>> blk_qc_t generic_make_request(struct bio *bio)
> >>>> {
> >>>> - struct bio_list bio_list_on_stack;
> >>>> + /*
> >>>> + * bio_list_on_stack[0] contains bios submitted by the current
> >>>> + * make_request_fn.
> >>>> + * bio_list_on_stack[1] contains bios that were submitted before
> >>>> + * the current make_request_fn, but that haven't been processed
> >>>> + * yet.
> >>>> + */
> >>>> + struct bio_list bio_list_on_stack[2];
> >>>> blk_qc_t ret = BLK_QC_T_NONE;
> >>>
> >>> May I suggest that, if you intend to assign something that is not a
> >>> plain &(struct bio_list), but a &(struct bio_list[2]),
> >>> you change the task member so it is renamed (current->bio_list vs
> >>> current->bio_lists, plural, is what I did last year).
> >>> Or you will break external modules, silently, and horribly (or,
> >>> rather, they won't notice, but break the kernel).
> >>> Examples of such modules would be DRBD, ZFS, quite possibly others.

> > It's better to make external modules not compile than to silently
> > introduce bugs in them. So yes, I would rename that.
> >
> > Mikulas
>
> Agree, better rename current->bio_list to current->bio_lists
>
> Regards,
> Jack

Thank you.

(I don't know if some one does, but...)
Thing is: *IF* some external thing messes with
current->bio_list in "interesting" ways, and not just the
"I don't care, one level of real recursion fixes this for me"
pattern of
struct bio_list *tmp = current->bio_list;
current->bio_list = NULL;
submit_bio()
current->bio_list = tmp;
you get a chance of stack corruption,
without even as much as a compiler warning.

Which is why I wrote https://lkml.org/lkml/2016/7/8/469
...

Instead, I suggest to distinguish between recursive calls to
generic_make_request(), and pushing back the remainder part in
blk_queue_split(), by pointing current->bio_lists to a
struct recursion_to_iteration_bio_lists {
struct bio_list recursion;
struct bio_list queue;
}

By providing each q->make_request_fn() with an empty "recursion"
bio_list, then merging any recursively submitted bios to the
head of the "queue" list, we can make the recursion-to-iteration
logic in generic_make_request() process deepest level bios first,
and "sibling" bios of the same level in "natural" order.

...

Cheers,

Lars