Re: EB164 extreme weirdness

Linus Torvalds (torvalds@cs.helsinki.fi)
Wed, 10 Apr 1996 20:52:53 +0300 (EET DST)


On Wed, 10 Apr 1996, Ian Pratt wrote:
>
> We have three EB164's which all exhibit some very strange
> behaviour. We often find that simple commands like 'uname','tar',
> 'mv', 'id', 'cp' fail sporadically, even when invoked with
> trivial or no arguments e.g. 'uname','id','tar --v', 'mv --v'.

I've actually seen this myself on my eb164, with "uname" and "touch".

I thought it was a library issue, because it has gone away for me since I
recompiled those binaries (actually, I recompiled all the shellutils etc
at the same time), and I ignored the problem.

However, the symptoms certainly _sound_ like there is something wrong in
the context invalidate code for the eb164, and maybe the reason I haven't
seen it after the recompile is just luck rather than a library issue.

> When a particular command has decided to get itself into this
> state, invoking it repeatedly will cause it to fail with a memory
> violation at the same PC. Leave the shell alone for 10 seconds
> and try it again, and the command will magically work. Repeating
> it immediately will cause it to fail at the same PC again. Once
> into this state, the behaviour is totally repeatable!

When it happened with "touch" for me, I could make the problem go away by
simply running another command in between. However, for other reasons I
suspected that it was an argument/environment problem, so I just took
that as a confirmation of my suspicion (bash will modify the environment
variable "_" for different commands, and for some strange reason I
thought that would make a difference. I probably need to have my brain
checked out some day).

> Switching shell e.g. sh to bash or vice versa can cause the
> problem to go away. While the problem is being exhibited on one
> virtual console, it can be fine on another. We do get a core when
> they fail, but the version of gdb we have fails to understand
> core files - Has anyone fixed this? Running things under gdb
> invariably makes them work fine.

That was one reason I suspected it was an environment thing: running the
thing under gdb will result in a different argument/environment setup.
Have you recompiled your shell too?

> Can anyone shed any light on this please ?

I guess I need to reconsider the implications of my earlier problems.
Looks like maybe there is something that results in incorrect TLB entries
under some circumstances. Probably a missing invalidate somewhere, exposed
by the ev5's ASN-marked tlb entries, which can stay cached across context
switches.

Linus