Re: x86-64 preemption fix from IRQ and BKL in 2.6.12-rc1-mm2

From: Christophe Saout
Date: Sun Mar 27 2005 - 13:10:03 EST


Am Sonntag, den 27.03.2005, 19:26 +0200 schrieb Andi Kleen:

> > preempt_schedule_irq is not an i386 specific function and seems to take
> > special care of BKL preemption and since reiserfs does use the BKL to do
> > certain things I think this actually might be the problem...?
>
> Hmm, preempt_schedule_irq is not in mainline as far as I can see.
> My patches are always for mainline; i dont do a special
> patch kit for -mm*

PREEMPT_BKL has been in mainline since 2.6.11-rc1, preempt_schedule_irq
made it in 2.6.11-rc3. Please look here:
http://linux.bkbits.net:8080/linux-2.6/search/?expr=preempt_schedule_irq&search=ChangeSet+comments

For i386 the first change was to switch to preempt_schedule in this code
path: http://linux.bkbits.net:8080/linux-2.6/patch@xxxxxxxxxxxx

preempt_schedule takes care of setting PREEMPT_ACTIVE and resetting it
afterwards, so I removed that from the assembler code.

Then preempt_schedule_irq has been introduced to move the sti/cli back
around the call to schedule:
http://linux.bkbits.net:8080/linux-2.6/patch@xxxxxxxxxxxx

So in the end the only thing that the patch I proposed was doing is to
*additionally* handle the PREEMPT_BKL case so that schedule doesn't
accidentally release the BKL semaphore when it shouldn't because we are
preempting and nobody explicitly called schedule.

Several other archs have done the same. No bug has shown up until the
recent -mm kernel where the execution of this code path actually became
possible (the "jc -> jnc" fix some lines above).

> It looks like a unfortunate interaction with some other patches
> in mm. Andrew, can you disable CONFIG_PREEMPT on x86-64 in
> mm for now?

These things are in 2.6.11 (except that they never got called because of
the wrong interrupt flag check in the IRQ handler).

> > Unfortunately I don't have a amd64 machine to play with, so can somebody
> > please check this?
>
> How did you generate the crash dumps above then?

Well, nobody minds if I play with a webserver in the middle of the
night, as long as it works during the day. Shoot me. :)

Both servers are running fine since I applied my patch last night.

Now that I looked into it I think that it's obviously the correct
solution.

Attachment: signature.asc
Description: This is a digitally signed message part