Re: BLKSECDISCARD ioctl and hung tasks

From: Salman Qazi
Date: Wed Feb 19 2020 - 17:26:44 EST


On Wed, Feb 19, 2020 at 2:23 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> On Wed, Feb 19, 2020 at 09:54:31AM -0800, Salman Qazi wrote:
> > On Tue, Feb 18, 2020 at 6:55 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Feb 18, 2020 at 08:11:53AM -0800, Jesse Barnes wrote:
> > > > On Fri, Feb 14, 2020 at 7:47 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> > > > > What are the 'other operations'? Are they block IOs?
> > > > >
> > > > > If yes, that is why I suggest to fix submit_bio_wait(), which should cover
> > > > > most of sync bio submission.
> > > > >
> > > > > Anyway, the fix is simple & generic enough, I'd plan to post a formal
> > > > > patch if no one figures out better doable approaches.
> > > >
> > > > Yeah I think any block I/O operation that occurs after the
> > > > BLKSECDISCARD is submitted will also potentially be affected by the
> > > > hung task timeouts, and I think your patch will address that. My only
> > > > concern with it is that it might hide some other I/O "hangs" that are
> > > > due to device misbehavior instead. Yes driver and device timeouts
> > > > should generally catch those, but with this in place we might miss a
> > > > few bugs.
> > > >
> > > > Given the nature of these types of storage devices though, I think
> > > > that's a minor issue and not worth blocking the patch on, given that
> > > > it should prevent a lot of false positive hang reports as Salman
> > > > demonstrated.
> > >
> > > Hello Jesse and Salman,
> > >
> > > One more question about this issue, do you enable BLK_WBT on your test
> > > kernel?
> >
> > It doesn't exist on the original 4.4-based kernel where we reproduced
> > this bug. I am curious how this interacts with this bug.
>
> blk-wbt can throttle discard request and keep discard queue not too
> deep.
>
> However, given block erase is involved in BLKSECDISCARD, I guess blk-wbt
> may not avoid this task hung issue completely.

Thanks for the explanation.

As I said, a single 4K BLKSECDISCARD is enough to cause a 100-second
delay, during which the entire device is unusable. So the queue doesn't
have to be deep at all. It's a single tiny ioctl.

>
>
> Thanks,
> Ming
>