Re: block: flush queued bios when the process blocks

From: Mikulas Patocka
Date: Tue Oct 06 2015 - 10:10:40 EST

On Tue, 6 Oct 2015, Mike Snitzer wrote:

> On Tue, Oct 06 2015 at 9:28am -0400,
> Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>
> >
> >
> > On Mon, 5 Oct 2015, Mike Snitzer wrote:
> >
> > > Mikulas,
> > >
> > > Could it be that cond_resched() wasn't unplugging? As was
> > > recently raised in this thread: https://lkml.org/lkml/2015/9/18/378
> > > Chris Mason's patch from that thread fixed this issue... I _think_ Linus
> > > has since committed Chris' work but I haven't kept my finger on the
> > > pulse of that issue.
> >
> > I think it doesn't matter (regarding correctness) whether cond_resched
> > unplugs or not. If it didn't unplug, the process will be scheduled later,
> > and it will eventually reach the point where it unplugs.
>
> Couldn't the original deadlock you fixed (as described in your first
> patch) manifest when a new process is scheduled?
>
> > > FYI, I've put rebased versions of your 2 patches in my wip branch, see:
> > > http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=wip
> > >
> > > I tweaked the 2nd patch that adds bio_list to plug so that
> > > generic_make_request's check for in_generic_make_request isn't racy
> > > (your original patch could happen to have current->plug set but
> > > in_generic_make_request not yet set).
> >
> > I don't recommend that second patch
> > (http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/commit/?h=wip&id=5e740c2e45a767d8d6ef8ca36b0db705ef6259c4).
> > The patch just complicates things without adding any value. It's also not
> > correct because it plugs bios at places where bios aren't supposed to be
> > plugged.
>
> Avoiding another hook to the scheduler is a requirement (from Jens).
> "without adding any value": it offers a different strategy for recording
> bios to the bio_list by making it part of the plug. The plug doesn't
> actually block bios the way requests are plugged.

Unfortunately, it does (when the second patch is applied).

> What am I missing?

Bios allocated with bio_kmalloc can't be unplugged because they lack the
rescuer workqueue (they shouldn't be used by stacking drivers; they
can only be used by top-level mm code). If you plug those bios at
incorrect places (i.e. between blk_start_plug and blk_finish_plug), you
are introducing another deadlock possibility.


> Avoiding another hook to the scheduler is a requirement (from Jens).

So, you can add a flag that is set when either current->plug or
current->bio_list is non-NULL and if the flag is set, run a function that
unplugs both current->plug and current->bio_list.
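
A rough user-space sketch of that flag idea, not kernel code. The names
task_model, io_pending, task_update_io_pending and task_flush_before_block
are hypothetical, and the two structs are simplified stand-ins for the real
kernel types; it only illustrates that the blocking path tests one flag and
then drains both queues:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; the real kernel types differ. */
struct blk_plug { int nr_requests; };   /* plugged requests */
struct bio_list { int nr_bios; };       /* queued bios */

struct task_model {
	struct bio_list *bio_list;
	struct blk_plug *plug;
	bool io_pending;                /* hypothetical combined flag */
};

/* Recompute the flag whenever either pointer changes. */
static void task_update_io_pending(struct task_model *t)
{
	t->io_pending = (t->plug != NULL) || (t->bio_list != NULL);
}

/*
 * Called where the task is about to block: one flag test on the fast path,
 * both queues drained on the slow path. Returns the number of items flushed.
 */
static int task_flush_before_block(struct task_model *t)
{
	int flushed = 0;

	if (!t->io_pending)
		return 0;

	/* The flag must never be set with both pointers NULL. */
	assert(t->plug != NULL || t->bio_list != NULL);

	if (t->plug) {
		flushed += t->plug->nr_requests;
		t->plug->nr_requests = 0;
	}
	if (t->bio_list) {
		flushed += t->bio_list->nr_bios;
		t->bio_list->nr_bios = 0;
	}
	return flushed;
}
```

The cost of this variant is that every place that sets or clears either
pointer has to keep the flag in sync.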

But anyway - if you look at struct task_struct, you see that "bio_list"
and "plug" fields are next to each other - so they will likely be in the
same cacheline. So testing current->bio_list in the schedule function has
no performance overhead because the cacheline was already loaded when
testing current->plug.
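
The layout argument can be checked with offsetof. The following is a
user-space sketch only: task_layout is a hypothetical stand-in for
task_struct, the 128 bytes of padding are arbitrary, and a 64-byte
cacheline and a cacheline-aligned struct are both assumptions. Adjacent
fields can still straddle a cacheline boundary, which is why I say
"likely":

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHELINE_BYTES 64      /* assumption: typical x86 cacheline size */

/* Hypothetical stand-in for task_struct with the two fields adjacent. */
struct task_layout {
	char other_fields[128]; /* placeholder for everything before them */
	void *bio_list;
	void *plug;
};

/*
 * True if two byte offsets fall into the same cacheline, assuming the
 * struct itself starts on a cacheline boundary.
 */
static bool same_cacheline(size_t off_a, size_t off_b)
{
	return off_a / CACHELINE_BYTES == off_b / CACHELINE_BYTES;
}
```

For this layout, same_cacheline(offsetof(struct task_layout, bio_list),
offsetof(struct task_layout, plug)) holds; offsets 56 and 64 would not
share a line.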

Mikulas