Actually, yes and no. I reported two separate issues last week. One was the
corruption using the HPT-366, which apparently was a shared IRQ issue.
The other problem I was reporting (lockups with SMP post-2.2.6 with
multiple Promise controllers) is the same one as this, as far as I can
tell. I reported it in good detail in an email to linux-kernel on
9-13-99 titled "IDE + SMP Lockup (no OOPS) in 2.2.12, 2.2.10". I'd be happy
to resend it to anyone who wants the full report.
Note that while I can always cause this box to crash using 2.2.12 or 2.2.10
with SMP enabled, it was (and still is) 100% stable with 2.2.6 plus the
IDE patches, with SMP enabled.
I ran some of the tests you mention, so I'll relay my results:
(For summary: my box is an Abit BP6, dual Celeron 366, 128MB RAM, using the
onboard PIIX4, three PDC20246 controllers, and a 3Com 3c509 ethernet card,
with 11 Maxtor disks.)
> I know about this, but not a clue to resolve the issue.
> I have brought up the issue with Promise and they are working with me.
> There is a duplicate issue with M$'s NT, I hope to get the method of
> resolution to this problem and not "miniport" code.
>
> First test to verify the problem, build another kernel that is UP.
> Include raid and rag the hell out of the box.
> This should not fail, based upon other test reports.
Yes. This system, built in UP mode with RAID, is fully stable.
> If the box throttles in UP, the driver core is RAID UP Stable.
>
> Second, disable RAID under SMP and push a sequential access on the last
> two channels.
I recompiled without RAID and with SMP, and pushed equal load onto one disk
on each channel. I get a hard lockup just as quickly as with RAID.
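For reference, by "equal load" I mean roughly what the sketch below does: one
sequential reader per drive, all started at the same time and left running
until the box locks up. This is not the exact code I used, just an
illustration of the access pattern; the device names are examples for my
setup, and a dd per disk to /dev/null would do the same job.

/*
 * Rough sketch of the load generator: one sequential reader per device,
 * all forked at once.  Device paths are examples only.
 * Build with "gcc -O2 -o hammer hammer.c", run e.g.
 * "./hammer /dev/hdi /dev/hdm".
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BUFSZ (1 << 20)			/* 1MB sequential reads */

static void hammer(const char *dev)
{
	char *buf = malloc(BUFSZ);
	int fd = open(dev, O_RDONLY);

	if (fd < 0 || buf == NULL) {
		perror(dev);
		exit(1);
	}
	for (;;) {
		ssize_t n = read(fd, buf, BUFSZ);
		if (n <= 0)			/* end of disk (or error): */
			lseek(fd, 0, SEEK_SET);	/* rewind and keep reading */
	}
}

int main(int argc, char **argv)
{
	int i;

	for (i = 1; i < argc; i++)	/* one child per device on the cmdline */
		if (fork() == 0)
			hammer(argv[i]);
	while (wait(NULL) > 0)		/* children run until interrupted */
		;
	return 0;
}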
> What is the irq routing table for this board?
> Did you remove all the Promise BIOS chips except for the one that
> registers hde/f/g/h ?
Though I tried many configurations, I was able to duplicate the (very quick)
crashes with no overlapping (shared) IRQs. The best example was with the
following PCI setup:
AGP: Trident VGA (no IRQ)
PCI 1: PDC20246 (hdm/n/o/p without BIOS)
PCI 2: PDC20246 (hdi/j/k/l without BIOS)
PCI 3: empty
PCI 4: empty
PCI 5: PDC20246 (hde/f/g/h with BIOS v1.26)
The previously mentioned email contains a full boot log.
> Have you patched with "ide.2.2.12.19990921.patch.bz2"?
> New Promise OEM support.
All of my tests were done right before this patch was released, so I haven't
tried it yet. But read on; I have been trying to duplicate this lockup on
another box that I can use for tracking it down.
> You are required to set these for more than two cards.
> CONFIG_BLK_DEV_PDC202XX=y
> PDC202XX_FORCE_BURST_BIT=y
All my tests did have these set.
Ok, so what have I been doing since I first reported this?
The box that I reported these lockups against is basically a production box.
It's an 80 gig RAID array, so you know we must be using it for something ;)
To make things worse, it is colocated at an ISP. After I reported the
problems, I pretty much had to set it to UP mode and let it run.
However, I have two other Abit BP6-based development systems, and my intent
is to try to duplicate these results on one of them. They are local to me
and can be messed with without worries.
They are both Tekram-based SCSI systems, however, so some modifications were
needed. I have temporarily claimed two 14GB IBM 7200rpm UDMA33 drives for
testing, and I ordered two PDC20246s to round out the picture.
So far, I haven't been able to get this configuration to crash, but I fear
it's for lack of volume running through the controllers. I hooked one IBM
drive up to each of the two PDC controllers and ran equal load on both
disks, but saw no crashes.
I am going to break apart the rest of my computers here in an attempt to
borrow three more IDE drives (two Maxtors and a Quantum). I'm going to
hook these up so that I have one disk as hda and one disk on each of the
Promise channels. I'll then test with the vanilla kernels and with the
ide.2.2.12.19990921.patch.bz2 patch and see if I can get it to crash. My
guess is that five drives, working most of the channels, will be able to
produce the same results.
I'll be doing these tests in the next few days, so I will relay my results.
Tom