Re: [patch] change futex_wait() to hrtimers

From: Nick Piggin
Date: Mon Mar 12 2007 - 07:52:25 EST


On Mon, Mar 12, 2007 at 12:38:29PM +0100, Ingo Molnar wrote:
>
> * Nick Piggin <npiggin@xxxxxxx> wrote:
>
> > > the issue is this: your fix reduces the effects of the bug but it is
> > > still fundamentally incomplete because of the use of timer_list. So
> >
> > But using schedule_timeout is not a bug. Userspace timeouts are always
> > defined to be "at least".
>
> but what you are adding isnt a plain schedule_timeout(), it is a restart
> block handling loop. And for those restart blocks that relate to
> timeouts, we only use hrtimers. I am not making this up to annoy you:
> take a look at all the current restart block handlers - they are hrtimer
> based, for exactly this reason.

So why do you say it is fundamentally incomplete?

> > > instead of trying to fix the bug the wrong way, please try to fix it
> > > the right way, ontop of an already existing and tested patch, ok?
> > > That also enables the other neat stuff Thomas talked about.
> >
> > Well that's nice, but I have a bugfix here which probably needs to get
> > backported to stable kernels and distro kernels.
>
> yes but your patch already exists for them which they can pick up.
>
> really, this is a common Linux principle: fix it completely and fix it
> the right way. You are applying it yourself on a daily basis when having
> the maintainer hat on =B-)

I still didn't get anything wrong pointed out with the patch, though.

I'm not arguing against using hrtimers here to fix it the "right way".

> > It should be just as easy to rebase the hrtimer patch on top of my
> > fix. Considering that you've had it for a year, I don't think it needs
> > to be added right before my fix.
>
> your latest patch looks quite kludgy, exactly due to the issues that
> were mentioned.

I don't see what is kludgy, unless you consider converting to and from
absolute timeouts. But the userspace API is relative time based, so
hrtimers doesn't change that.

> > > hm. I'm wondering how this wasnt noticed sooner - this futex_wait
> > > behavior has been there for like forever.
> >
> > People ignore LTP test failures, and programs probably try to avoid
> > exercising the nuances of the unix signal API, I guess.
>
> then there's no rush and lets do this the right way, ok?

There is no rush to use hrtimers. I would have thought it fairly important
to actually reach correctness, though. We're not talking about completely
changing the design of something such that it will take a lot of work to
"fix it properly". If that were the issue, then I would consider the
hrtimer conversion as part of the fix.

And if you talk about doing it the right way, then I don't think it is
strictly the right way to reimplement the function, including known bugs,
to be slightly more efficient, and *then* fixing those bugs. I'd actually
consider it better to fix the bugs first, not only because of the backport
issue, but because it generally makes it easier to track the injection and
removal points of bugs in the history.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/