Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was:Re: Cannot format floppies under kernel 2.6.*?)

From: Mark Hounschell
Date: Tue Dec 22 2009 - 19:22:30 EST


On 12/22/2009 06:37 PM, Pallipadi, Venkatesh wrote:
> On Tue, 2009-12-22 at 09:57 -0800, Mark Hounschell wrote:
>> On 12/22/2009 12:38 PM, Linus Torvalds wrote:
>>>
>>> [ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for
>>> details, but Mark is basically chasing down a situation where the floppy
>>> driver seems to have trouble formatting floppies, and it happened
>>> between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a
>>> memory block transfers the wrong value for the first byte of the block.
>>>
>>> Which should be impossible, but whatever. Some part of the system has a
>>> cached buffer that isn't flushed.
>>>
>>> What gets _you_ guys involved is that Mark cannot reproduce the bug if
>>> HPET is disabled in the BIOS or by using 'nohpet'. He found that out by
>>> pure luck while bisecting, because some time during his bisect, his
>>> machine wouldn't even boot with HPET.
>>>
>>> So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But
>>> 2.6.28 (and current -git) does not. Any ideas? ]
>>>
>>> On Tue, 22 Dec 2009, Mark Hounschell wrote:
>>>>
>>>> Ok, I may have something that might help.
>>>>
>>>> # git bisect bad
>>>> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
>>>> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
>>>> Author: venkatesh.pallipadi@xxxxxxxxx <venkatesh.pallipadi@xxxxxxxxx>
>>>> Date: Fri Sep 5 18:02:18 2008 -0700
>>>>
>>>> x86: HPET_MSI Initialise per-cpu HPET timers
>>>>
>>>> Initialize a per CPU HPET MSI timer when possible. We retain the HPET
>>>> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
>>>> setup the remaining HPET timers as per CPU MSI based timers. This per CPU
>>>> timer will eliminate the need for timer broadcasting with IRQ 0 when there
>>>> is non-functional LAPIC timer across CPU deep C-states.
>>>>
>>>> If there are more CPUs than number of available timers, CPUs that do not
>>>> find any timer to use will continue using LAPIC and IRQ 0 broadcast.
>>>>
>>>> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@xxxxxxxxx>
>>>> Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx>
>>>> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>>>>
>>>> And of course this was the first commit that I could not boot if I had hpet
>>>> enabled. To get this one to boot (single user mode only) I had to add the
>>>> quiet cmdline option and apply the following patch to arch/x86/kernel/hpet.c
>>>>
>>>> commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a
>>>>
>>>> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
>>>>  {
>>>>
>>>>  	if (request_irq(dev->irq, hpet_interrupt_handler,
>>>> -			IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
>>>> +			IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
>>>>  		return -1;
>>>>
>>>>  	disable_irq(dev->irq);
>>>>
>>>> AND add the quiet cmdline option.
>>>
>>> Ok, so we know why HPET didn't boot for you, and that was fixed later (by
>>> that 5ceb1a04). But is this also when the floppy started mis-behaving?
>>>
>>
>> Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when the floppy stops
>> working and also when I could no longer boot with hpet enabled.
>
>
> I am missing something here. Is commit 26afe5f2 where the system does not
> boot with HPET, or is it where the floppy stops working when you boot
> with HPET enabled?
>

Both, as it happens. Commit 26afe5f2 is where the machine stops booting
with hpet enabled, and commit 5ceb1a04 is where it starts booting _again_.
So I took that patch (5ceb1a04) and applied it on top of 26afe5f2 to be
able to boot with hpet enabled. I had to use the quiet option to get to a
login prompt, but that is where the floppy format first fails, just as it
does in 2.6.28 and up.

> Can you try "idle=halt" with both .27 and .28, and send the
> /proc/interrupts output in each case? With that option, we should be
> using the local APIC timer, and PIT, HPET, or HPET with MSI should not
> really matter. Does it still fail with .28 with that option?
>

Yes, I will try that for you but will have to wait until the morning. Sorry.
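
For comparing the two /proc/interrupts captures, something along these
lines might help. It is only a rough sketch, assuming the usual layout (a
row label, one count per CPU, then the controller and handler names). The
function names and the "hpet|timer" filter are made up for illustration
and may need adjusting for the machine under test.

```python
import re
import sys

# Rows whose text mentions "hpet" or "timer" are treated as timer-related.
# This filter is an assumption for illustration, not a kernel convention.
TIMER_PATTERN = re.compile(r"hpet|timer", re.IGNORECASE)


def timer_rows(interrupts_text):
    """Return the lines of a /proc/interrupts dump that look timer-related."""
    return [line for line in interrupts_text.splitlines()
            if TIMER_PATTERN.search(line)]


def timer_sources(interrupts_text):
    """Map each timer-related row label to its description with counts dropped."""
    sources = {}
    for line in timer_rows(interrupts_text):
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Skip the leading per-CPU interrupt counts.
        while fields and fields[0].isdigit():
            fields.pop(0)
        sources[label.strip()] = " ".join(fields)
    return sources


if __name__ == "__main__":
    # Usage: pass one saved capture per kernel, e.g. one from .27 and one from .28.
    for path in sys.argv[1:]:
        with open(path) as capture:
            print(path, timer_sources(capture.read()))
```

Saving "cat /proc/interrupts" to a file under each kernel and passing both
files to the script prints one label-to-description mapping per capture,
so a change in which timer rows are present is easy to spot.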

Regards
Mark

