Re: [patch] SMP race fix [was Re: SMP lockup & 3c509 on 2.2.x [aka. the Deadly 'ping -f']]

=?iso-8859-1?Q?Taneli_V=E4h=E4kangas?= (taneli@firmament.fi)
Thu, 6 May 1999 00:03:08 +0300 (EEST)


Hello, all!

On Wed, 5 May 1999, Andrea Arcangeli wrote:

[lotsa snipped]
> To convince still more myself and you now that I have a bit more of time
> I'll try write a little SMP trace of what I think it's happening:
>
> start with 0 irq running
>
> CPU0 CPU1
> ---------- -------------
> edge_io_apic_IRQ
> set inprogress bit
> handle_irq_event()
>
> edge_io_apic_IRQ
> set pending bit
> return nested irq
>
> disable_irq()
> spin_lock(controller)
> set disabled bit
> handle_irq_event returns
> spin_lock(controller)
> spin_unlock(controller)
> ah inprogress is set so do a
> synchronize_irq() nono ;)
> global_irq_count == 0 so return
> here the code think that no irqN
> are running on the other CPUs
> got the controller lock
> ah but pending is still true
> so unlock(controller) and then
> issue a new handle_irq_event()
> spin_lock(eth0)
> spin_lock(eth0)
> finished critical section
> spin_unlock(eth0)
> go ahead in critical section
> enable_irq(eth0-irq)
> clear inprogress and disabled bits
> so allow other irqs to run
> edge_io_apic_IRQ
> disabled and inprogress was clear
> go ahead and set inprogress
> handle_irq_event
> spin_lock(eth0) -> deadlock

This is how I figured it, too. I'm not sure if the patch does work (didn't
read yet), but the above scenario looks exactly the same I thought ...

If the patches look ok, I'll try them the next thing tomorrow morning
(can't try pinging the other host is in my little brother's room and he's
sleeping with computer turned off ;)

BTW, I think Linus meant that your patch won't work if an interrupt
handler calls disable_irq(its_own_irq). But as I said I didn't read it
yet ...

> Still make sense to me. Subtle I know, but so it's been far more fun ;).
>
> Maybe I am dreaming...

I don't think you are.

Taneli <taneli@firmament.fi>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/