Re: [PATCH 2/4] libata: Implement disk shock protection support

From: Elias Oltmanns
Date: Wed Sep 10 2008 - 17:05:45 EST


Tejun Heo <htejun@xxxxxxxxx> wrote:
> Elias Oltmanns wrote:
>>> The correct way to do this is ata_eh_about_to_do(). After that, you
>>> can just look at ehc->i.dev_action[]. Also, you'll need to call
>>> ata_eh_done() later.
>>
>> We have a problem here, I'm afraid, because we may keep looping in EH
>> context and still want to pick up ATA_EH_PARK requests. Imagine that
>> ATA_EH_PARK has been scheduled for device A and the EH thread has
>> reached the call to schedule_timeout_uninterruptible(). Now, ATA_EH_PARK
>> is scheduled for device B on the same port. This will wake up the EH
>> thread, but ATA_EH_PARK is only recorded in link->eh_info, not in
>> link->eh_context.i. ata_eh_about_to_do() will unconditionally clear the
>> flag in eh_info, but checking ehc->i.dev_action afterwards will only
>> tell us whether this flag was set when we entered EH, not whether it had
>> been set since.
>>
>> Should I change ata_eh_about_to_do() so that it will record the action
>> in link->eh_context before clearing it in link->eh_info?
>
> That's what ata_eh_about_to_do() currently does, exactly. Actually,
> that's the whole reason it's there as the described problem exists for
> all other actions too. :-)

Sounds reasonable enough. Much as I regret it, though, I can't find
where this actually happens. Where exactly is the action propagated
from ehi to ehc->i? (I checked next-20080903, v2.6.27-rc5 and
v2.6.26.)
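
To make the ordering question concrete, here is a small userspace model
of the scenario described above. It is only a sketch and not libata
code: every model_* name is hypothetical, and the two "about to do"
variants merely contrast "clear the request in eh_info" with "record it
in the EH context first, then clear it".

```c
#include <string.h>

/* Hypothetical stand-ins for illustration; not the libata definitions. */
#define MODEL_EH_PARK   (1u << 0)
#define MODEL_MAX_DEVS  2

struct model_eh_info {
	unsigned int dev_action[MODEL_MAX_DEVS];
};

struct model_link {
	struct model_eh_info eh_info;      /* requests scheduled from outside EH */
	struct model_eh_info eh_context_i; /* snapshot the EH thread works on */
};

/* Scheduling a request only touches eh_info. */
static void model_schedule(struct model_link *link, int devno,
			   unsigned int action)
{
	link->eh_info.dev_action[devno] |= action;
}

/* Variant A: only clear the pending request in eh_info. */
static void model_about_to_do_clear_only(struct model_link *link, int devno,
					 unsigned int action)
{
	link->eh_info.dev_action[devno] &= ~action;
}

/* Variant B: record the request in the context, then clear it. */
static void model_about_to_do_record(struct model_link *link, int devno,
				     unsigned int action)
{
	link->eh_context_i.dev_action[devno] |=
		link->eh_info.dev_action[devno] & action;
	link->eh_info.dev_action[devno] &= ~action;
}

/*
 * Device 0 is parked on EH entry; a park request for device 1 arrives
 * while EH is already running. Returns nonzero if the context shows
 * the late request for device 1 afterwards.
 */
static int model_late_park_seen(int record)
{
	struct model_link link;
	int devno;

	memset(&link, 0, sizeof(link));

	/* EH entry: take a snapshot of eh_info and reset it. */
	model_schedule(&link, 0, MODEL_EH_PARK);
	link.eh_context_i = link.eh_info;
	memset(&link.eh_info, 0, sizeof(link.eh_info));

	/* Late request for device 1 while EH sleeps in the park loop. */
	model_schedule(&link, 1, MODEL_EH_PARK);

	for (devno = 0; devno < MODEL_MAX_DEVS; devno++) {
		if (record)
			model_about_to_do_record(&link, devno, MODEL_EH_PARK);
		else
			model_about_to_do_clear_only(&link, devno, MODEL_EH_PARK);
	}

	return (link.eh_context_i.dev_action[1] & MODEL_EH_PARK) != 0;
}
```

In this model, model_late_park_seen(0) returns 0 and
model_late_park_seen(1) returns 1, i.e. the late request is visible in
the context only if it is recorded before being cleared.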

On another matter: I don't particularly like the idea that an "EH
complete" message should appear in the logs every time a head unload
request has been processed. Is it safe to set ATA_EHI_QUIET when
scheduling unload requests, or is the risk of missing something
important too high?

>
>>> And it's probably better to have ehc->unloaded_mask instead of
>>> ehc->did_unload_mask and clear it here so that if unload is scheduled
>>> after this point but before EH completes, it does unloading again.
>>> ie. Something like the following.
>>>
>>> ata_eh_done(ATA_EH_UNLOAD);
>>> ehc->i.unloaded_mask &= ~(1 << dev->devno);
>>
>> No need for that because link->eh_context is cleared in
>> ata_scsi_error().
>
> No, for example, later steps of EH could fail in which case eh_recover
> will be retried without going out to ata_scsi_error().

Alright then.
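
For the record, the effect of clearing the per-device bit can be
sketched as follows. Again, this is a userspace model with hypothetical
model_* names, assuming a single unsigned int mask indexed by device
number as in the snippet quoted above. It is not the patch itself.

```c
/* Hypothetical context; stands in for whatever holds unloaded_mask. */
struct model_ctx {
	unsigned int unloaded_mask; /* bit N set: device N already unloaded */
};

static int model_needs_unload(const struct model_ctx *ctx, unsigned int devno)
{
	return !(ctx->unloaded_mask & (1u << devno));
}

static void model_mark_unloaded(struct model_ctx *ctx, unsigned int devno)
{
	ctx->unloaded_mask |= 1u << devno;
}

static void model_clear_unloaded(struct model_ctx *ctx, unsigned int devno)
{
	ctx->unloaded_mask &= ~(1u << devno);
}

/*
 * Two recovery passes over device 0 without the context being reset in
 * between (a retry inside EH). A new unload request arrives between
 * the passes. Returns how many times the heads were unloaded.
 */
static int model_unloads_across_retry(int clear_on_request)
{
	struct model_ctx ctx = { 0 };
	int unloads = 0;

	/* First pass: unload and remember that we did. */
	if (model_needs_unload(&ctx, 0)) {
		unloads++;
		model_mark_unloaded(&ctx, 0);
	}

	/* New request picked up before EH completes. */
	if (clear_on_request)
		model_clear_unloaded(&ctx, 0);

	/* Retry pass: the mask still holds whatever the first pass left. */
	if (model_needs_unload(&ctx, 0)) {
		unloads++;
		model_mark_unloaded(&ctx, 0);
	}

	return unloads;
}
```

Without clearing the bit the retry skips the unload (one unload in
total); with clearing, the second request is honoured as well (two).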

Regards,

Elias