Re: 2.6.21-rc7-mm1 BUG at kernel/sched-clock.c:175 init_sched_clock()

From: Tejun Heo
Date: Wed Apr 25 2007 - 05:15:36 EST


Andrew Morton wrote:
>> No, that would break host probing. The port is in frozen (controller
>> initialized and IRQs masked off) state, so it's not allowed to take
>> interrupt.
>
> Sounds dodgy. What happens if the IRQ line is shared with some other
> device? We'll enter ahci_interrupt(). We've alread set up ->port_map and
> ->n_ports so we're wholly dependent upon that read from HOST_IRQ_STAT
> returning zeroes.

Shared IRQ is okay because ahci has pending IRQ register which is masked
by ->freeze(). Controllers which don't have such feature check qc
status to determine whether it's expecting IRQ, so it's okay for them
too. So, AFAIK, libata is okay with shared IRQs.

> /* sigh. 0xffffffff is a valid return from h/w */
>
> if that happens we're dead, aren't we?

Yes, we are. :-)

>> If interrupt triggers at this point, it's low level driver
>> bug. Berck, how reliably can you reproduce this problem? Can you post
>> the result of 'lspci -nn'?
>
> Berck doesn't apepar to be sharing that IRQ, so it'll be something else.

I'm pretty sure Berck's problem is caused by ->freeze() not clearing
pending IRQ bit. We haven't been too strict about ->freeze().
Considering the above 0xffffffff case, I agree it's better to fix it in
the core instead of fixing each LLD. Will brew up another patch.

Thanks.

--
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/