Re: [patch] Re: Timeout overflow in select()

Andrea Arcangeli (andrea@e-mind.com)
Thu, 26 Nov 1998 13:17:07 +0100 (CET)


On Thu, 26 Nov 1998, Stephane Belmon wrote:

>I don't find it clear at all, and I'm not sure it works. I think this kind
>of code should be made clear, since it deals with overflows by its very
>nature. I don't see what you mean there:

>+ nsec_tmp = nsec + (1000000000UL+HZ-1)/HZ - 1;
>+ if (nsec_tmp > nsec)
>+ nsec = nsec_tmp;

tv_nsec could be 0xffffffff. If you add a little value to 0xffffffff you
get 0x000000??. So this check the wraparound of the nsec.

>"A moment of consideration" shows that (nsec_tmp > nsec) is false only
>when you have an overflow in the addition in the previous line. I'm sorry,
>but if that's what you meant, that's clear as mud :-)

Yes it' s what I mean. What does it mean "clear as mud"? What I should
change to fix the problem you are reporting?

>Also, returning MAX_JIFFY from the xxx_to_jiffies() functions might be OK,
>or maybe not. What is that constant supposed to mean? We have
>MAX_SCHEDULE_TIMEOUT already, and they seem to have the same value
>(MAX_LONG). So basically, when you have an overflow, timespec_to_jiffies

I tried to include <linux/sched.h> but the compiler doesn' t allow to use
((unsigned) MAX_SCHEDULE_TIMEOUT) instead of (~0UL >> 1). The right thing
to do would be to use ((unsigned) MAX_SCHEDULE_TIMEOUT) instead of
(~0UL>>1) of course... Try and let me know...

>will return something that means "forever", which is not the same as "as
>long as you can do, but not forever". That might not be what some people

This is true but it' s the right thing to do according to me... Really we
could do two helper function, one that return MAX_JIFFY-1; and one that
returns MAX_JIFFY;.

>calling select() would like. They could prefer to be woken up, only to
>find out that it's not time yet. And go back to sleep until it _is_ time.
>
>It's not clear at all what jiffies_to_timespec() should return when passed
>a MAX_SCHEDULE_TIMEOUT; an answer coherent with the select() interface

When we pass MAX_SCHEDULE_TIMEOUT should be fine. If we pass an highter
value it' s not fine and need fixing.

>would be "a NULL *timespec", but that's not an option. There's also the
>eternal question: are jiffies signed or unsigned?

Woops it has to be unsigned I think. This because jiffies in such function
is really a delta... Thanks for the bugfix.

>(As a general note, I find MAX_SCHEDULE_TIMEOUT == LONG_MAX to be a bad
>idea. In some ways, it's good, because you can compare another timeout to
>it without a special case. But that gives a sense of "ease of use" that
>will make you do things like ... add 1 to it. Ooops. Ideally, noone would
>be looking into these structs directly but would use nice bulletproof
>macros or functions. Actually, I would even do stuff on time_t's and
>jiffies through macros/functions)
>
>There are many other similar problems in your patch.
>
>Finally, I'd say that this is pretty long. Can't some of these be
>expressed in terms of others, instead of cut&paste? It would also be less
>error-prone.

Yes this is true. I' ll try to do that soon. Right now I only fixed the
unsigned in jiffies_to_timeval.

>(b) The patch for select.c seems sound (of course it depends on the rest,
>but per-se I like it).
>
>(c) I'm not sure what's the deal about signal.c. Your patch:
>- timeout = (timespec_to_jiffies(&ts)
>- + (ts.tv_sec || ts.tv_nsec));
>+ timeout = timespec_to_jiffies(&ts);
>
>I guess substracting 1 to the timeout when it's not (0,0) was intentional
>(that code was too "smart", so good riddance anyway).

I understand the old code this way: "if the parameters are not null and
timespec_to_jiffies() would return 0, set the timeout to 1". I added this
check to time???_to_jiffies() itself:

+ retval = nsec + sec;
+ if (retval < 0)
+ return MAX_JIFFY;
+ else if (!retval)
+ return value->tv_sec || value->tv_nsec;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ return retval;

If I am misunderstooding the signal.c case let me know.

>(d) About the sched.c patch: for me this has nothing to do with the
>problem at hand. The change comment is also misleading; schedule_timeout
>was there before your patch. And what are you patching against? That's not

I implemented schedule_timeout() some weeks ago from scratch, Linus then
included it and I forget to add a line of credits in sched.c. I am not
worryed by the line of credits, but I would be heppy to see it there ;-)

>2.1.129 I think.

Are you getting rejectes? I guess no... If `patch` is complaining about
line offsets, it's because I cut out by hand some not relevant part of my
current sched.c after doing the diff.

>Finally: to get back to the original problem: you and I seem to agree that
>it's not worth looping around the do_select() in sys_select() to get huge
>and accurate timeouts, even in the face of a large HZ. I haven't received
>any answers for my "mini-poll" so far. I'll wait a couple days and I'll
>send something with an explicit subject. But in the meantime, what do you
>think?

I am not worried about the inteface because first or before
MAX_JIFFY/MAX_SCHEDULE_TIMEOUT will be higher then any kind of human
rasonable timeout. x86 uses HZ == 100 in the meantime ;-)

I' ll let you know when I' ll have done the cleanup using macros (this
evening or tomorrow).

Andrea Arcangeli

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/