Re: Drives missing at boot

From: Mark Knecht
Date: Mon Aug 02 2010 - 18:07:24 EST


On Thu, Jul 22, 2010 at 5:39 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> Hello,
>
> On 07/21/2010 10:54 PM, Mark Knecht wrote:
>>    Looks like I had a failure today. First one in weeks and only the
>> 3rd or 4th boot with this newer patch file. One of the two drives
>> making a RAID0 wasn't found so /dev/md11 (constructed from /dev/sdd
>> and /dev/sde) couldn't be started. I did a cold reboot and the drive
>> was found.
>>
>>    If it matters, and it probably doesn't, the failure came on a boot
>> which had a scheduled fsck to do of /dev/md5 - my main / drive. I
>> don't see how that would make a difference but I figure why leave the
>> info out. That's why the times are so much larger in the dmesg file.
>> (I think)
>>
>>    dmesg attached. I patched the Gentoo kernel if it makes a
>> difference, same as I did with the earlier patch.
>>
>> mark@c2stable ~ $ uname -a
>> Linux c2stable 2.6.34-gentoo-r2 #1 SMP PREEMPT Sun Jul 18 14:09:48 PDT
>> 2010 x86_64 Intel(R) Core(TM) i7 CPU X 980 @ 3.33GHz GenuineIntel
>> GNU/Linux
>> mark@c2stable ~ $
>
> Hmmm... that's weird.  Can you please make sure the patch is actually
> applied?  Adding a printk("XXX patch applied!\n") near other changes
> usually is easy enough.  Also, can you please apply resume-dbg-1.patch
> too and reproduce the failure and post log?
>
> Thanks.
>
> --
> tejun
>

Hi Tejun,
I'm finally home and trying to get back to this. I'm really a bad
programmer, so I don't know what I've done wrong, but it seems patch
isn't happy with me.

c2stable linux # patch --dry-run -p1 <../ata_piix-sidpr-lock.patch
patching file drivers/ata/ata_piix.c
patch: **** malformed patch at line 13:

c2stable linux #

Here's the change I tried to make to a copy of the file:

c2stable linux # cat ../ata_piix-sidpr-lock.patch
diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index 7409f98..3971bc0 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -158,6 +158,7 @@ struct piix_map_db {
struct piix_host_priv {
const int *map;
u32 saved_iocfg;
+ spinlock_t sidpr_lock; /* FIXME: remove once locking in EH is fixed */
+ printk("MWK - ata_sidpr patch applied!\n");
void __iomem *sidpr;
};

@@ -951,12 +952,15 @@ static int piix_sidpr_scr_read(struct ata_link *link,
unsigned int reg, u32 *val)
{
struct piix_host_priv *hpriv = link->ap->host->private_data;
+ unsigned long flags;

if (reg >= ARRAY_SIZE(piix_sidx_map))
return -EINVAL;

+ spin_lock_irqsave(&hpriv->sidpr_lock, flags);
piix_sidpr_sel(link, reg);
*val = ioread32(hpriv->sidpr + PIIX_SIDPR_DATA);
+ spin_unlock_irqrestore(&hpriv->sidpr_lock, flags);
return 0;
}

@@ -964,12 +968,15 @@ static int piix_sidpr_scr_write(struct ata_link *link,
unsigned int reg, u32 val)
{
struct piix_host_priv *hpriv = link->ap->host->private_data;
+ unsigned long flags;

if (reg >= ARRAY_SIZE(piix_sidx_map))
return -EINVAL;

+ spin_lock_irqsave(&hpriv->sidpr_lock, flags);
piix_sidpr_sel(link, reg);
iowrite32(val, hpriv->sidpr + PIIX_SIDPR_DATA);
+ spin_unlock_irqrestore(&hpriv->sidpr_lock, flags);
return 0;
}

@@ -1566,6 +1573,7 @@ static int __devinit piix_init_one(struct pci_dev *pdev,
hpriv = devm_kzalloc(dev, sizeof(*hpriv), GFP_KERNEL);
if (!hpriv)
return -ENOMEM;
+ spin_lock_init(&hpriv->sidpr_lock);

/* Save IOCFG, this will be used for cable detection, quirk
* detection and restoration on detach. This is necessary
c2stable linux #

Maybe you can shoot back something that's done correctly and I'll
start testing.
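
One possible cause of the "malformed patch at line 13" message: the
first hunk header still reads "@@ -158,6 +158,7 @@", but with the
hand-added printk line the hunk body now holds 8 new-side lines, so
the counts no longer agree by the time patch reaches line 13. The
sketch below is only an illustration, not part of the original patch.
It assumes a unified diff whose context lines keep their leading
space, and the function name mismatched_hunks is made up for this
example. It recounts each hunk body and reports headers that disagree.
Separately, a printk() statement cannot sit inside a struct
definition, so even with corrected counts that line would probably
need to move into a function such as piix_init_one().

```python
import re

# Matches a unified-diff hunk header such as "@@ -158,6 +158,7 @@ ...".
# A missing count defaults to 1, as in the unified diff format.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def mismatched_hunks(patch_text):
    """Return (header_line_no, declared, actual) for hunks whose
    declared (old, new) line counts differ from the hunk body.

    Limitation: file headers are only skipped when each file section
    starts with a "diff " line, as git-style patches do.
    """
    hunks = []  # each entry: [header_line_no, declared, old_seen, new_seen]
    in_hunk = False
    for line_no, line in enumerate(patch_text.splitlines(), start=1):
        m = HUNK_RE.match(line)
        if m:
            declared = (int(m.group(2) or 1), int(m.group(4) or 1))
            hunks.append([line_no, declared, 0, 0])
            in_hunk = True
        elif line.startswith("diff "):
            in_hunk = False  # next file header; stop counting
        elif in_hunk:
            hunk = hunks[-1]
            if line.startswith("+"):
                hunk[3] += 1  # added line: new side only
            elif line.startswith("-"):
                hunk[2] += 1  # removed line: old side only
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                hunk[2] += 1  # context line (or empty line): both sides
                hunk[3] += 1
    return [(n, d, (o, w)) for n, d, o, w in hunks if d != (o, w)]


# Simplified copy of the first hunk above, with context prefixes restored.
SAMPLE = "\n".join([
    "diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c",
    "--- a/drivers/ata/ata_piix.c",
    "+++ b/drivers/ata/ata_piix.c",
    "@@ -158,6 +158,7 @@ struct piix_map_db {",
    " struct piix_host_priv {",
    "  const int *map;",
    "  u32 saved_iocfg;",
    "+ spinlock_t sidpr_lock;",
    "+ printk(\"MWK - ata_sidpr patch applied!\");",
    "  void __iomem *sidpr;",
    " };",
    " ",
]) + "\n"

if __name__ == "__main__":
    print(mismatched_hunks(SAMPLE))
```

On the sample this reports the header at line 4 as declaring (6, 7)
while the body contains (6, 8). Changing the header to "+158,8", or
dropping the extra line, should make the counts agree again.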

I've tried booting a few times. I've had 3 cold boot failures so
far. No warm boot failures. Each time it failed on cold boot, a warm
boot fixed it.

Thanks,
Mark