Re: [PATCH] kernel/exit.c: call read_unlock() when failure occursafter already called read_lock() in do_wait().

From: Chen Gang
Date: Sat Oct 05 2013 - 12:49:22 EST



Firstly, thank you for your so much reply (especially spending your much
expensive time resources).

But it seems I am too hurry: I have sent 2 patches to you (I start
finding them after supper).


My details reply is at the bottom of this mail, please check, thanks.

On 10/05/2013 11:48 PM, Al Viro wrote:
> On Sat, Oct 05, 2013 at 03:33:40PM +0800, Chen Gang wrote:
>
>> Oh, it is my fault, this is incorrect patch. Hmm... I realize a mistake
>> of me: I have said "when finding issues, I need consider about LTP in q4
>> 2013, need let it can be tested by LTP".
>
> Not really. The mistake was different.
>
>> And you feel "this is getting too frequent...", can you provide my
>> failure/succeed ratio?
>>
>> Or for a short proof: next, I will try to find 2 patches by reading code
>> within "./kernel" sub-directory, if all of them are incorrect, I will
>> *never* send patches again by reading code. Is it OK?
>
> Hell, no. It is exactly the wrong outcome. Reading code *does* catch
> bugs and assuming that testing would've found all of them is absolutely
> wrong. Let me describe what I would've done when running into a place
> like that (and yes, it's certainly calling for investigation - the
> fact that you've noticed it is a Good Thing(tm)).
>
> OK, so we have a loop entered with rwlock held, with read_unlock() done
> on a normal exit from the loop and several exits via goto not doing
> read_unlock(). That certainly deserves a closer look - if the functions
> called there are locking-neutral, we are missing read_unlock on these
> exits. Might be a bug. So far, so good. OTOH, the functions called
> there might do read_unlock() themselves, so what we need is to find
> out if it really can happen.
>
> The loop in question is
> do {
> retval = do_wait_thread(wo, tsk);
> if (retval)
> goto end;
>
> retval = ptrace_do_wait(wo, tsk);
> if (retval)
> goto end;
>
> if (wo->wo_flags & __WNOTHREAD)
> break;
> } while_each_thread(current, tsk);
>
> If we can show a scenario with goto end; happening without read_unlock,
> we are done. How would that happen? do_wait_thread() or ptrace_do_wait()
> returning non-zero with tasklist_lock held. Let's look at them:
>
> /*
> * Do the work of do_wait() for one thread in the group, @tsk.
> *
> * -ECHILD should be in ->notask_error before the first call.
> * Returns nonzero for a final return, when we have unlocked tasklist_lock.
> * Returns zero if the search for a child should continue; then
> * ->notask_error is 0 if there were any eligible children,
> * or another error from security_task_wait(), or still -ECHILD.
> */
> static int do_wait_thread(struct wait_opts *wo, struct task_struct *tsk)
> {
> struct task_struct *p;
>
> list_for_each_entry(p, &tsk->children, sibling) {
> int ret = wait_consider_task(wo, 0, p);
> if (ret)
> return ret;
> }
>
> return 0;
> }
>
> Hmm... it calls wait_consider_task() a bunch of times, returns the first
> non-zero it gets or returns 0 if all of them have done so. So the same
> question arises for wait_consider_task() - when can *that* return non-zero
> with tasklist_lock still held? The comment above it says "returns nonzero
> for a final return when we have unlocked tasklist_lock", which would seem
> to imply that nonzero returns are supposed to happen with tasklist_lock
> dropped; still, the comment might not match the code, so we need to dig
> on. The second function (ptrace_do_wait()) is just like that, only with
> wait_consider_task() getting 1 as the second argument instead of 0.
>
> Next target: wait_consider_task(). Any place where it returns non-zero
> with lock held would be a bug; so would any place where it returns zero
> without a lock. What returns do we have there? Well, there's a bunch
> of explicit return 0. There's also
> if (!ret)
> return ret;
> in the beginning, which is also returning 0. On three exits we have
> something different returned:
> 1) return wait_task_zombie(wo, p);
> 2) ret = wait_task_stopped(wo, ptrace, p);
> if (ret)
> return ret;
> 3)
> return wait_task_continued(wo, p);
>
> So we have 3 targets now - wait_task_{zombie,stopped,continued} (plus
> verifying that everything else called there is locking-neutral, but that
> can wait).
>
> wait_task_zombie() has a bunch of return 0,
> return wait_noreap_copyout(wo, p, pid, uid, why, status); (that's another
> function to review) *and* this:
> /*
> * Now we are sure this task is interesting, and no other
> * thread can reap it because we set its state to EXIT_DEAD.
> */
> read_unlock(&tasklist_lock);
> no manipulations of tasklist_lock on the path to that, but here we definitely
> unlock it. After that point we do a bunch of put_user() (which is obviously
> a good reason for read_unlock()) and return retval. Without any attempts
> to relock. Hmm... Can retval be zero here? The last place touching it
> is
> if (!retval)
> retval = pid;
> and if it's really a PID, this code is guaranteed to end up with non-zero
> retval (PIDs can't be zero). Looking for a place where pid had been
> calculated seems to indicate that a PID it is - one belonging to p.
> OK, so we still hadn't found that scenario yet - at least on this path
> the thing unlock and returns non-zero. Next target: wait_noreap_copyout().
>
> On the path to it we have this:
> read_unlock(&tasklist_lock);
> if ((exit_code & 0x7f) == 0) {
> why = CLD_EXITED;
> status = exit_code >> 8;
> } else {
> why = (exit_code & 0x80) ? CLD_DUMPED : CLD_KILLED;
> status = exit_code & 0x7f;
> }
> return wait_noreap_copyout(wo, p, pid, uid, why, status);
> so again tasklist_lock is dropped and we'd better check if there's a way
> for wait_noreap_copyout() to return 0 - that would be another locking
> problem. A look at that function shows that
> * it does a bunch of put_user()
> * it ends with the same if (!retval) retval = pid;, so no zero
> returns from it
> * it seems to be locking-neutral
> So we seem to be OK on that path as well.
>
> No other manipulations of tasklist_lock is seen in wait_task_zombie(),
> so provided that the rest of stuff called there is locking-neutral, we
> ought to be OK.
>
> Now what? wait_task_stopped() is the next unreviewed here... A big
> comment in front of that one (we are in the core kernel, but don't
> count on having those for everything and *definitely* don't count on
> having those elsewhere). Says, among other things,
> * CONTEXT:
> * read_lock(&tasklist_lock), which is released if return value is
> * non-zero. Also, grabs and releases @p->sighand->siglock.
> *
> * RETURNS:
> * 0 if wait condition didn't exist and search for other wait conditions
> * should continue. Non-zero return, -errno on failure and @p's pid on
> * success, implies that tasklist_lock is released and wait condition
> * search should terminate.
>
> So it looks like the intended rules are "return with lock held all along
> if you are returning zero, return with lock dropped otherwise". Again,
> that doesn't spare us a read through the code - the comments might not
> match it. Again, we have a bunch of return 0, and then this:
> pid = task_pid_vnr(p);
> why = ptrace ? CLD_TRAPPED : CLD_STOPPED;
> read_unlock(&tasklist_lock);
> tasklist_lock not touched prior to that point. Then either return
> wait_noreap_copyout() (we'd already read that one, locking-neutral,
> never returns 0) or a bunch of put_user(), followed by
> if (!retval)
> retval = pid;
> ..
> BUG_ON(!retval);
> return retval;
>
> OK, modulo locking neutrality of the other stuff called there, we are done
> with that case as well.
>
> Next one? Aha, wait_task_continued(). Smaller that wait_task_stopped(),
> the same picture (with slightly less ornate comment along the same lines
> in front ;-) Again, the comments might not match the current code or be
> there, so we still need to RTFS, but it's small and follows the same
> layout.
>
> Note that at *any* point during the above we might have found a bug.
> Moreover, even though it's a core kernel, I honestly wasn't sure we
> _won't_ find one. We hadn't, but it was a useful reading, all the same.
>
> Next question is "could these functions have more usual calling conventions?"
> put_user(), of course, is blocking, so we can't hold an rwlock over that.
> So if we do those put_user() there, we have to drop it. Would it make
> sense to retake it on such exits? Probably not - we'd need to branch out
> of that loop in those cases anyway, and the first thing done there would
> be read_unlock() as in your patch. Keeping the locking uniform makes
> sense when codepaths converge... A more interesting question is whether
> we need those put_user() to be there. We fills siginfo at wo->wo_infop,
> an integer at wo->wo_status and rusage at wo->wo_rusage. We might've
> filled that information in kernelspace objects and copied it to userland
> after return to do_wait(). That might be worth looking into, but it's
> a separate story.
>
> The functions involved definitely deserve comments about the calling
> conventions being unusual wrt tasklist_lock. They've got such comments,
> though, and I'm not sure if I could improve those. Might be worth
> a couple of inline comments inside the do_wait() loop...
>
> Note that if a bug had turned out to be real, commit message would definitely
> needed either "... and the function called there do not touch tasklist_lock"
> or "... and in the following case we get non-zero value returned with
> tasklist_lock still held". Said that, once we'd found paths returning
> non-zero with tasklist_lock dropped, the fix would pretty much have to be
> adding a read_unlock() on the path missing it, because there's no way
> for caller to tell which of these paths had been taken.
>
> And yes, the above is fairly typical code review work; one has to do that
> kind of things on a regular basis in unfamiliar code without useful (or
> truthful) comments, with real bugs getting caught in process. _All_ of
> the above had been by reading the code, BTW. Being able to spot suspicious
> places is a crucial part of such work, but you need to follow with reasoning
> about the code you've spotted.
>
> We all had been there, complete with much more embarrassing "I've found that
> really obvious bug; who the hell writes that kind of idiotic code?" postings,
> followed by "oops, I've completely misread it" kind of retractions. Not fun,
> especially for those of us who tend to make, er, pointed comments when
> dealing with stupid bugs. Yours was comparatively benign; such things
> will happen now and then. It's inevitable when one does read the code looking
> for bugs, and we really, really need people doing that. I can't stress that
> hard enough - we need more people doing code review. Moreover, IMO that's
> the best way to learn the kernel. I'm probably biased in that, since this
> is how I begin working with unfamiliar areas in it (as the matter of fact,
> that's how I started with Linux - RTFS through VFS, learning and looking
> for bugs at the same time). But we definitely need more people doing that
> kind of stuff.
>

In fact, I have an internal fault which did not tell outside (it's an
attitude issue):

If send a patch only by reading code, after finish making patch, I
should leave it alone in 15 -- 30 minutes, and later, I will check it
again to evaluate whether can be sent.

But for this patch, I did not delay 15 minutes (not check again). Until
after your first reply, I realized I really need check it again, and
know about it is an incorrect patch.


> The thing you really don't want to happen would be perception that you
> just look for local patterns and stop at that. That's what I was refering
> to - ISTR several threads where you seemed to go "this code looks suspicious,
> let's fix it" without bothering to look at the stuff it calls, etc.
>

Hmm... I really like reviewing code with patterns, but not only with
local patterns. In fact, I feel the original implementation is also a
standard 'pattern' for a little complex lock using (not quite complex).


> Again, spotting patterns is a very good thing - you can't do code review
> without being able to do that. However, it must be followed either with
> reasoning about the code involved or with searching for somebody familiar
> with that thing and asking to explain. The latter might very well end up
> with nobody showing up, in which case you have to do analysis yourself
> anyway ;-/
>
>

Thank you for your encouraging (especially spent your much expensive
time resources).

At least, before send patch, I should check them again after 15 minutes.


And I am just learning test tools (LTP), and next, I will try to read
the kernel code which can be tested by test tools (LTP).


Thanks.
--
Chen Gang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/