Re: Odd 2.0.35 problems with APM on VIA motherboards

Andrew Derrick Balsa (andrebalsa@altern.org)
Thu, 3 Sep 1998 14:04:02 +0200


Hi Hans,

On Thu, 03 Sep 1998, Hans wrote:
....
>> I would recommend testing 2.0.35 with Jumbo-9, but _without_ APM first.
>>
>I don't have apm support enabled in my kernel. I can try crashing it with
>hd-sleeping enabled and apm disabled in my bios.

I see. Note that AFAIK enabling power management features in the BIOS and then
using a Linux kernel without APM means you don't get any power management at
all, except for the hard disk time-out feature, which is controlled by the hard
disk's embedded microprocessor itself (the BIOS probably sends it a command at
boot time). I would advise you to turn off the BIOS power management completely
and set up the hard disk spin-down timeout using hdparm, called from
your rc.local file.
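
To make the hdparm suggestion concrete, here is a minimal sketch of how the
value passed to "hdparm -S" relates to a timeout in seconds, following the
encoding described in hdparm(8). The device name /dev/hda, the 10 minute
timeout and the helper names are my assumptions for illustration only; adjust
them for your system. The helper only builds the command line, it does not
run anything.

```python
import shlex


def spindown_code(seconds):
    """Return the hdparm -S value for a spin-down timeout given in seconds.

    Encoding per hdparm(8): 0 disables the timeout, 1..240 count units of
    5 seconds (5 s to 20 min), 241..251 count units of 30 minutes
    (30 min to 5.5 h). The special values 252..255 are not produced here.
    """
    if seconds <= 0:
        return 0
    if seconds <= 240 * 5:
        # Up to 20 minutes: nearest multiple of 5 seconds, at least one unit.
        return max(1, round(seconds / 5))
    # Above 20 minutes: nearest multiple of 30 minutes, clamped to 1..11 units.
    units = round(seconds / 1800)
    return 240 + min(max(units, 1), 11)


def hdparm_command(device, seconds):
    """Build (but do not run) the command line one could put in rc.local."""
    return "hdparm -S {} {}".format(spindown_code(seconds), shlex.quote(device))


# Hypothetical example: first IDE disk, spin down after 10 minutes idle.
print(hdparm_command("/dev/hda", 10 * 60))
```

The printed line is what would go into rc.local. Timeouts above 20 minutes
are coarse, since the drive only accepts them in 30 minute steps.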

>> In David's machine (VIA MVP3 motherboard), the problem also occurred with an
>> unpatched 2.0.35 kernel, and is certainly APM related.
>>
>> In Hans' machine (VP2 motherboard), the hang is triggered every time the hd
>> spins up, but only when UDMA is enabled (Hans, are you certain this is so? Can
>> you test 2.0.35 without Jumbo?).
>>
>If you read my first two posts you would know that:
>-I've tried 2.0.35 2.0.34 and 2.0.33 without apm kernel support,
> but with apm enabled in my bios and the crash didn't happen.
>-I've tried 2.0.35-jumbo-9 with dma2 and the crash didn't happen
>-I've tried 2.0.35-jumbo-9 and a crash happens if:
>X is running
>The hd is sleeping
>I log in on a text console
>after entering my passwd the hd spins up
>I switch to X during the spinup -> Crash
>
You are probably making too many file requests during disk spin-up. Perhaps a
queue overflows, either in the drive firmware or in the kernel code.

Important
=========
Note that _exactly_ the same code runs in the kernel whether you use DMA
mode 2 or UDMA. Since you only get a problem when using UDMA, the Jumbo-9 kernel
code can't be the source of the problem, so we have to look elsewhere.

>I've also had 1 hd -spinup crash without a console switch but that was
>the only one without the above circumstances.
>
>If I'm already logged in, and wait for the hd to fall asleep, then do an
>ls to wake it up and during the spinning up switch nothing happens.
>
>It only is 100% reproducible after entering my passwd
>
>I guess this is just timing related and I'm glad that it also
>happens on other systems in different ways. But on my system it only
>happens with udma (and thus jumbo 9) when switching to X
>at least that is the only reproducible hang.

OK, I guess we are seeing some complex interaction between UDMA timing and
the hard disk spin-up delay.

>> The solution could be as simple as adding an extra delay in the APM kernel
>> code to allow for correct hard disk spin up, but really I am just guessing.

That was a wrong guess on my part, too. The kernel APM code doesn't care about
hd spin-down/up.
>
>> So (in principle) the problem does not seem to be Jumbo-9 related.
>>
>Hmm it is kinda since jumbo enables udma which crashes in my case.
>Then again if it also happens to other people with just apm without jumbo
>It might be that jumbo is just triggering it.

No, I am sorry, but I think I mixed things up because both you and David have
VIA chipset based motherboards.

David's problem _is_ APM related, but your problem is just hard disk spin
down/up + UDMA related.

If you _really_ want to find out where the problem is, the first step would be
to get a different UDMA hard disk model/brand, and test it in exactly the same
circumstances/hardware setup. Since you now have a repeatable crash, it will be
immediately obvious whether the hard disk microcontroller firmware or the kernel
code is guilty.

My present guess is a problem with the hard disk firmware, but drives in the
Quantum Fireball series are exceptionally reliable, so I guess I am wrong on
this too. Still, changing the hard disk would be a good thing to try. The next
step is swapping the motherboard for a newer VIA motherboard. If the
problem disappears, then we know we have a problem in your present
motherboard/BIOS combination.

I will try to repeat your exact crash procedure here on one of my systems
and see what happens. (I am using IBM UDMA hard disks and SiS5598 motherboards.)

Cheers,
--
Andrew D. Balsa
andrebalsa@altern.org
