Re: Race in RPC code

From: Jakob Oestergaard (jakob@unthought.net)
Date: Fri Feb 07 2003 - 08:44:46 EST


On Fri, Feb 07, 2003 at 02:21:50PM +0100, Trond Myklebust wrote:
> >>>>> " " == Jakob Oestergaard <jakob@unthought.net> writes:
>
>
> > We don't know whether req has been modified between the
> > assignment and the spin_lock.
>
> It had better not be. If it is, then I want to know where so that we
> can fix it.
>
> req->rq_xprt is set up when the request is initialized. It
> is not meant to change until the request gets released. This again
> should not happen while the request is still on the wait queue.
>
> IOW the fix you propose would just be papering over another problem.

Any suggestions as to how it could happen?
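
To make the window concrete, here is a minimal user-space sketch of the
pattern in question. The names req_sketch, xprt_sketch and lock_and_verify
are hypothetical stand-ins, not the kernel's sunrpc definitions, and a
pthread mutex plays the role of the spinlock. It only detects that
rq_xprt changed between the assignment and the lock; as noted above,
detecting it would not address whatever changed it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; NOT the kernel's definitions. */
struct xprt_sketch {
    pthread_mutex_t lock;         /* plays the role of the transport spinlock */
};

struct req_sketch {
    struct xprt_sketch *rq_xprt;  /* set at init, meant to stay fixed until release */
};

/*
 * The caller has already done "snap = req->rq_xprt" without holding a lock.
 * Take the lock on that snapshot, then check that the request still points
 * at the same transport.
 *
 * Returns 1 with snap->lock held if the pointer is unchanged.
 * Returns 0 with snap->lock released if the pointer changed in the window.
 */
static int lock_and_verify(struct req_sketch *req, struct xprt_sketch *snap)
{
    assert(req != NULL && snap != NULL);

    pthread_mutex_lock(&snap->lock);
    if (req->rq_xprt != snap) {
        /* Invariant violated: something rewrote rq_xprt under us. */
        pthread_mutex_unlock(&snap->lock);
        return 0;
    }
    return 1;
}
```

A caller would snapshot req->rq_xprt, call lock_and_verify(), and log or
BUG on a 0 return, which at least tells us whether the pointer moved.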

The box runs huge compile jobs all day long (2-3 compilers concurrently,
each using >100MB of memory), and we have never had a GCC sig11 error.
It has 512 MB of ECC memory (with ECC enabled), so I seriously doubt
that we have a memory corruption problem.

The panic has happened once, just today.

I will be happy to try other solutions, but I can't verify whether they
work - I mean, if the box runs another few months without crashing that
doesn't really prove anything...

Thanks for commenting!

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:


