Re: [PATCH 1/7] Freezer: Read PF_BORROWED_MM in a nonracy way

From: Rafael J. Wysocki
Date: Sat May 12 2007 - 06:40:57 EST


On Saturday, 12 May 2007 11:01, Rafael J. Wysocki wrote:
> On Saturday, 12 May 2007 03:24, Linus Torvalds wrote:
> >
> > On Sat, 12 May 2007, Oleg Nesterov wrote:
> > >
> > > However, in my opininon THAT PATCH has nothing to do with this problem.
> > > It just improves the code that we already have.
> >
> > Sure.
> >
> > However, I think it does it THE WRONG WAY, and doesn't actually fix the
> > much deeper problems with the freezer, as shown by the fact that the lock
> > is *still* broken for other cases.
>
> The other cases don't lead to the specific issue this patch is meant to
> prevent. Namely, that if a kernel thread is identified as a user space task by
> the freezer it may be frozen prematurely.
>
> And yes, this only is a problem because we freeze kernel threads, which may be
> avoidable, but we've been doing that for more than two years and it's a big
> change to stop doing so. We can't just say overnight that we won't be freezing
> kernel threads from now on without al least checking if that doesn't lead to
> user-visible problems.
>
> Thus IMO it's reasonable to fix the potential issue with the current code and
> think about changing the design *later*. Still, I'm not so attached to this
> patch and if you think that it should be dropped (which IMO is wrong), then so
> be it.

Sorry, I was wrong, because ...

> That said:
>
> > So, here's a summary:
> >
> > - we should not take the lock inside the function, because taking it
> > there is fundamentally wrong, and leaves all the *other* races in
> > place.
>
> The other races don't lead to the same (wrong) result.
>
> > - if you actually want to solve the other races, the lock needs to be
> > taken by the caller, in which case taking it in the callee is obviously
> > (again) wrong.
> >
> > - or then, we accept that the race wasn't fixed AT ALL, and you add other
> > code to _other_ places to handle the case where you froze the wrong
> > thread (or didn't freeze the right one).
>
> There are no other cases, AFAICS, in which we can freeze a wrong thead.

... user space tasks that call deamonize() can also be frozen prematurely.
We didn't take this possibility into consideration before, which was obviously
wrong.

Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/