Re: ata: BUG in ata_sff_hsm_move

From: Dmitry Vyukov
Date: Fri Jan 29 2016 - 08:19:25 EST


On Fri, Jan 29, 2016 at 1:23 PM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> Hello, Dmitry.
>
> On Fri, Jan 29, 2016 at 12:59:49PM +0100, Dmitry Vyukov wrote:
>> > Hmmm... the port interrupt handler checks for IDLE before calling into
>> > hsm_move, so the only explanation would be that something is resetting
>> > it to IDLE in between. ce7514526742 ("libata: prevent HSM state change
>> > race between ISR and PIO") describes and fixes the same problem. The
>> > fix seems correct and I can't find anywhere else where this can
>> > happen. :(
>> >
>> > Can you please post the kernel log leading to the BUG? Also, I don't
>> > think that condition needs to be BUG. I'll change it to WARN.
>>
>> Here are two logs, in both cases no kernel messages prior to the bug:
>> https://gist.githubusercontent.com/dvyukov/5087d633e3620280b6c7/raw/31c9ab1ced92ac5f85cfb15eaf48ec5793c2c3a1/gistfile1.txt
>> https://gist.githubusercontent.com/dvyukov/825b2e3d5fb80ae08a9a/raw/03c5a4f4c4bd9d0a304a71cda2da4c92f4b7f1ba/gistfile1.txt
>
> lol, this is kinda embarrassing. It looks like the poll path wasn't
> doing any locking. Can you please verify the following patch at least
> doesn't crash the machine immediately and if so keep it applied to the
> test kernel so that we can verify that the problem actually goes away?


Great that you managed to debug it without a repro!
I've applied this patch to my tree and will rerun the fuzzer. I will
notify you if I see this warning again.
Thanks.

> diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
> index 608677d..6991efc 100644
> --- a/drivers/ata/libata-sff.c
> +++ b/drivers/ata/libata-sff.c
> @@ -1362,12 +1362,14 @@ static void ata_sff_pio_task(struct work_struct *work)
>  	u8 status;
>  	int poll_next;
>
> +	spin_lock_irq(ap->lock);
> +
>  	BUG_ON(ap->sff_pio_task_link == NULL);
>  	/* qc can be NULL if timeout occurred */
>  	qc = ata_qc_from_tag(ap, link->active_tag);
>  	if (!qc) {
>  		ap->sff_pio_task_link = NULL;
> -		return;
> +		goto out_unlock;
>  	}
>
>  fsm_start:
> @@ -1382,11 +1384,14 @@ static void ata_sff_pio_task(struct work_struct *work)
>  	 */
>  	status = ata_sff_busy_wait(ap, ATA_BUSY, 5);
>  	if (status & ATA_BUSY) {
> +		spin_unlock_irq(ap->lock);
>  		ata_msleep(ap, 2);
> +		spin_lock_irq(ap->lock);
> +
>  		status = ata_sff_busy_wait(ap, ATA_BUSY, 10);
>  		if (status & ATA_BUSY) {
>  			ata_sff_queue_pio_task(link, ATA_SHORT_PAUSE);
> -			return;
> +			goto out_unlock;
>  		}
>  	}
>
> @@ -1403,6 +1408,8 @@ static void ata_sff_pio_task(struct work_struct *work)
>  	 */
>  	if (poll_next)
>  		goto fsm_start;
> +out_unlock:
> +	spin_unlock_irq(ap->lock);
>  }
>
>  /**
>
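
For readers following the thread, here is a minimal userspace sketch of the
locking pattern the patch above introduces: take the lock on entry, drop it
around the sleep, and leave through a single out_unlock label. It is only an
illustration under stated assumptions, not the libata code. The struct, the
enum, and the function names (port, hsm_state, pio_task, irq_handler) are
hypothetical stand-ins, and a pthread mutex plays the role of ap->lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the libata structures; not the kernel API. */
enum hsm_state { HSM_IDLE, HSM_ACTIVE };

struct port {
	pthread_mutex_t lock;	/* plays the role of ap->lock */
	enum hsm_state state;	/* plays the role of the HSM task state */
	bool device_busy;	/* pretend BSY status bit */
};

/* Sleep for roughly 2 ms; stands in for ata_msleep(ap, 2). */
void short_sleep(void)
{
	struct timespec ts = { 0, 2000000L };

	nanosleep(&ts, NULL);
}

/*
 * Polling path. Returns true if the command was completed here, false
 * if there was nothing to do or the device stayed busy.
 */
bool pio_task(struct port *p)
{
	bool done = false;

	pthread_mutex_lock(&p->lock);

	if (p->state == HSM_IDLE)	/* e.g. the command already finished */
		goto out_unlock;

	if (p->device_busy) {
		/* Do not sleep with the lock held: drop it, sleep, retake it. */
		pthread_mutex_unlock(&p->lock);
		short_sleep();
		pthread_mutex_lock(&p->lock);

		if (p->device_busy)	/* still busy: give up for now */
			goto out_unlock;
	}

	p->state = HSM_IDLE;		/* state change happens under the lock */
	done = true;
out_unlock:
	pthread_mutex_unlock(&p->lock);
	return done;
}

/*
 * Interrupt path. The IDLE check and the state change sit in one
 * critical section, so the polling path cannot change the state between
 * the check and the update.
 */
bool irq_handler(struct port *p)
{
	bool handled = false;

	pthread_mutex_lock(&p->lock);
	if (p->state != HSM_IDLE) {
		p->state = HSM_IDLE;
		handled = true;
	}
	pthread_mutex_unlock(&p->lock);
	return handled;
}
```

Every exit from pio_task() goes through out_unlock, which is why the patch
replaces the bare returns with gotos: a return left in place after the lock
is taken would exit with the lock still held.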