Re: 4.16-rc2+git: pata_serverworks: hanging ata detection thread on HP DL380G3

From: Meelis Roos
Date: Fri Mar 30 2018 - 04:49:12 EST


Added CCs; the start of the thread is at
https://lkml.org/lkml/2018/2/26/165

> > > 4.16 git bootup on HP Proliant DL380 G3 pauses for a minute or two and
> > > then continues with "blocked for more than 120 seconds" message with
> > > libata detection functions in the stack -
> > > async_synchronize_cookie_domain() as the last. It seems to happen during
> > > IDE CD-ROM detection (detected before but registered as sr0 after the
> > > warning). After detection, the eject button on the drive did not work.
> > >
> > >
> > > pata_serverworks is the libata driver in use.
>
> There were no changes to pata_serverworks since 2014 and libata changes
> in v4.16 look obviously correct.
>
> > This is still the same in 4.16.0-rc7-00062-g0b412605ef5f.
>
> Any chance that you could bisect this issue?

Bisected to the following commit:

358f70da49d77c43f2ca11b5da584213b2add29c is the first bad commit
commit 358f70da49d77c43f2ca11b5da584213b2add29c
Author: Tejun Heo <tj@xxxxxxxxxx>
Date: Tue Jan 9 08:29:50 2018 -0800

blk-mq: make blk_abort_request() trigger timeout path

With issue/complete and timeout paths now using the generation number
and state based synchronization, blk_abort_request() is the only one
which depends on REQ_ATOM_COMPLETE for arbitrating completion.

There's no reason for blk_abort_request() to be a completely separate
path. This patch makes blk_abort_request() piggyback on the timeout
path instead of trying to terminate the request directly.

This removes the last dependency on REQ_ATOM_COMPLETE in blk-mq.

Note that this makes blk_abort_request() asynchronous - it initiates
abortion but the actual termination will happen after a short while,
even when the caller owns the request. AFAICS, SCSI and ATA should be
fine with that and I think mtip32xx and dasd should be safe but not
completely sure. It'd be great if people who know the drivers take a
look.

v2: - Add comment explaining the lack of synchronization around
->deadline update as requested by Bart.

Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
Cc: Asai Thambi SP <asamymuthupa@xxxxxxxxxx>
Cc: Stefan Haberland <sth@xxxxxxxxxxxxxxxxxx>
Cc: Jan Hoeppner <hoeppner@xxxxxxxxxxxxxxxxxx>
Cc: Bart Van Assche <Bart.VanAssche@xxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

:040000 040000 b5c8c2fd69850021865071f9641d54ab4fd20a15 e2dbd2a15a6baeec1332cc1416e51d537ff5040a M block
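
For those newly CC-d, here is how I read the behaviour change described in
the commit message, written as a toy model. This is not kernel code. The
class and method names (Request, Queue, issue, abort, run_timeout_work) are
made up for illustration, and the assumption that abort works by expiring
the deadline and kicking the timeout work is my guess from the "->deadline
update" note in the changelog. I have not verified that this is what libata
trips over on this machine.

```python
# Toy model of "abort piggybacks on the timeout path".
# All names here are hypothetical; this only illustrates the ordering
# described in the commit message, not the real blk-mq implementation.

class Request:
    def __init__(self, deadline):
        self.deadline = deadline   # time at which the request times out
        self.done = False          # set once the request is terminated


class Queue:
    def __init__(self):
        self.now = 0                       # stand-in for jiffies
        self.inflight = []                 # requests issued, not yet terminated
        self.timeout_work_pending = False  # stand-in for scheduled timeout work

    def issue(self, timeout):
        rq = Request(self.now + timeout)
        self.inflight.append(rq)
        return rq

    def abort(self, rq):
        # Asynchronous abort: mark the request as already expired and ask
        # for the timeout work to run. The request is NOT terminated here.
        rq.deadline = self.now
        self.timeout_work_pending = True

    def run_timeout_work(self):
        # Runs later, in a different context from the caller of abort().
        self.timeout_work_pending = False
        expired = [rq for rq in self.inflight if rq.deadline <= self.now]
        for rq in expired:
            rq.done = True
            self.inflight.remove(rq)
        return expired


if __name__ == "__main__":
    q = Queue()
    rq = q.issue(timeout=30)
    q.abort(rq)
    print("done right after abort:", rq.done)
    q.run_timeout_work()
    print("done after timeout work:", rq.done)
```

In this model a caller that expects the request to be finished as soon as
abort() returns will find it still in flight until the timeout work has run.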


--
Meelis Roos (mroos@xxxxxxxx)