[patch] entry.S fix. [was: Re: scheduling problem?]

Ingo Molnar (mingo@chiara.csoma.elte.hu)
Fri, 17 Dec 1999 17:12:51 +0100 (CET)


--79888902-1964851429-945447171=:2004
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Fri, 17 Dec 1999, William Montgomery wrote:

> [...] however, there is a related
> problem which can occur. It is possible for a SCHED_OTHER task
> to be running in user mode when a rtc_interrupt occurs which
> sends a SIGIO to a SCHED_FIFO task, the SCHED_FIFO task is put
> on the run queue and the need_resched flag of the SCHED_OTHER task
> is set but the ret_with_resched is not run and the SCHED_OTHER task
> can continue for several milliseconds before schedule is run.
>
> The above scenario can also occur when setitimer is used as a
> timing source and a SIGALRM is sent. Maybe we should *always*
> run ret_with_resched upon ret_from_intr and not only when in
> supervisor mode? Any reasons this would cause problems?
> Maybe also after running bottom halfs (in the case of setitimer)?

are you sure you are describing the right scenario? I do not doubt that
you see some kind of bug, but detecting ->need_resched == 1 after
user-space has been interrupted is fairly common - this is how timer
interrupts work.

there is a bug in this area though: we do not re-run the need_resched
check _after_ running do_signal(). We always run the check, even if
user-space was interrupted, but the problem is that it has to be rerun
after delivering a signal too. A preliminary fix against 2.3.34-pre1 is
attached; does it solve your problems? [the patch works just fine here]

the fix is a bit subtle: we must not naively rerun the need_resched
check after do_signal(), because that can cause signal recursion. The
patch does its own, signal-local need_resched check after do_signal().
[schedule() runs bottom halves, so there is no need to recheck for bottom
halves.]
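
as a rough illustration only, the control flow before and after the patch
can be modelled in user-space C. This is a sketch, not kernel code: the
struct, the function names and the wakeup_in_signal knob are all made up
for the example, and it assumes that need_resched becomes set while
do_signal() is running (for example from an interrupt that wakes a
higher-priority task).

```c
#include <stdbool.h>

/* Hypothetical model of a task on the return-to-user path.
 * None of these names are kernel APIs. */
struct task_model {
    bool need_resched;      /* models current->need_resched */
    bool sigpending;        /* models current->sigpending */
    bool wakeup_in_signal;  /* assumption: a wakeup arrives during do_signal() */
    int  schedule_calls;    /* how often the model called schedule() */
    int  signal_calls;      /* how often the model called do_signal() */
};

/* Stand-in for schedule(): clears the flag and counts the call. */
void model_schedule(struct task_model *t)
{
    t->need_resched = false;
    t->schedule_calls++;
}

/* Stand-in for do_signal(): delivers the signal and, if the knob is
 * set, lets need_resched become set while the signal is handled. */
void model_do_signal(struct task_model *t)
{
    t->sigpending = false;
    t->signal_calls++;
    if (t->wakeup_in_signal) {
        t->wakeup_in_signal = false;
        t->need_resched = true;
    }
}

/* Before the patch: need_resched is checked, then signals are
 * delivered, then we return to user space without looking again. */
void ret_to_user_old(struct task_model *t)
{
    while (t->need_resched)
        model_schedule(t);
    if (t->sigpending)
        model_do_signal(t);
    /* restore_all: need_resched may still be set here */
}

/* After the patch: a signal-local loop (signal_resched in the patch)
 * reruns only the need_resched check, not the signal check, so signal
 * handling is not re-entered. */
void ret_to_user_new(struct task_model *t)
{
    while (t->need_resched)
        model_schedule(t);
    if (t->sigpending) {
        model_do_signal(t);
        while (t->need_resched)   /* signal_resched: */
            model_schedule(t);
    }
    /* restore_all: need_resched is clear here */
}
```

in this model, starting with sigpending and wakeup_in_signal both set, the
old path returns with need_resched still set, while the new path calls
model_schedule() once before returning and never calls model_do_signal() a
second time.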

btw., this patch plus the hlt patch fixes the SysKonnect gigabit ethernet
latency fluctuation I noticed a couple of days ago: latency over a real
gigabit network is now stable at 70 microseconds [it was fluctuating
between 200 and 300 usecs before].

Ingo

--79888902-1964851429-945447171=:2004
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="signal-2.3.34-A1"
Content-Transfer-Encoding: 7BIT
Content-ID: <Pine.LNX.4.10.9912171712510.2004@chiara.csoma.elte.hu>
Content-Description:
Content-Disposition: attachment; filename="signal-2.3.34-A1"

--- linux/arch/i386/kernel/entry.S.orig	Fri Dec 17 07:48:24 1999
+++ linux/arch/i386/kernel/entry.S	Fri Dec 17 07:59:45 1999
@@ -222,7 +222,14 @@
 	jne v86_signal_return
 	xorl %edx,%edx
 	call SYMBOL_NAME(do_signal)
-	jmp restore_all
+signal_resched:
+	cmpl $0,need_resched(%ebx)
+	je restore_all
+	/*
+	 * do not recurse signal handlers. This is the slow path.
+	 */
+	call SYMBOL_NAME(schedule)
+	jmp signal_resched
 
 	ALIGN
 v86_signal_return:
@@ -230,7 +237,7 @@
 	movl %eax,%esp
 	xorl %edx,%edx
 	call SYMBOL_NAME(do_signal)
-	jmp restore_all
+	jmp signal_resched
 
 	ALIGN
 tracesys:
--79888902-1964851429-945447171=:2004--
