Re: more info on dcache corruption in 2.1.76 [now 2.1.78]

Ben Woodard (bwoodard@cisco.com)
Thu, 08 Jan 1998 08:39:13 -0800


I have upgraded to 2.1.78 and the same problem with the dcache
becoming corrupt is occurring. However, a pattern may be beginning to
emerge. It is kind of interesting: it is as though I had been looking
in the wrong direction all this time. I was looking at the programs
that I was running and what they were doing just before the problem in
the dcache appeared. But every time, what was going on with the
computer was completely different. What I didn't realize was that it
was happening at about the same time each day, and I am beginning to
think that this is the key to the problem and why I hit it so
regularly.

See, each day I get up and go to work, and while at work I am really
in multitask mode. I am doing a dozen things at once. I have lots of
applications open and lots of windows on my computer doing
things. Then I take my computer home and read email and mess around
with things. Completely different environment. I have very few windows
open and I am focusing on basically one thing for a long period of
time. It is during these times that the dcache becomes corrupt. It has
never happened while I am at work and my computer is under load, but
it happens to me almost every night after I have simplified my
environment quite a bit.

The thing that occurred to me is that my virtual memory usage while at
work is significantly higher. Now if I understand it correctly, the
kernel uses whatever memory is free for its own caches and so competes
with applications for memory. At work the kernel is probably on the
losing side, and caches such as the dcache are probably kind of
small. However, when I go home I close down a huge number of windows
and a bunch of applications, and so the pool of memory available for
things like the dcache expands. It occurs to me that this might be
where the problem is: in the code that grows the dcache.

My regular fluctuations in memory usage are probably what makes the
bug appear so regularly. If, as I suspect, the problem is tied to the
amount of kernel memory available for the dcache, that would also
explain the time lag between when I shut down all my apps on leaving
work and the time that the problem in the dcache appears.
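
One way to check this from user space would be to log free memory at
intervals and line the numbers up against the time the corruption
shows up. The following is only a rough sketch of that idea, not
anything I have run against this bug. It assumes the memory figures
come from /proc/meminfo-style text with lines of the form "Key: N kB";
the helper name meminfo_kb and the key "MemFree" are assumptions made
for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the value in kB for `key` (for example
 * "MemFree") from meminfo-style text, or -1 if the key is not found
 * or has no number after it.
 */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen;
    const char *line = text;

    assert(text != NULL && key != NULL);
    klen = strlen(key);

    while (line && *line) {
        /* Match "key:" only at the start of a line. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;

            /* %ld skips the padding between the colon and the number. */
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

Reading the file into a buffer once a minute, calling this on it, and
writing the result to a log with a timestamp should be enough to show
whether the corruption follows a rise in free memory.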

It appears to me that there are several places in the dcache code
where memory usage information is printk'd, but they are all bracketed
by #ifdef DCACHE_DEBUG. I think I am going to define that and see
whether I can find any pattern between the usage of dcache memory and
this bug. If you can think of any other places where I should put
printk's, let me know. I'm sorry I am not more help. I am just
beginning to look around in the kernel, and the dcache seems like a
fairly complicated structure that I have yet to fully understand.
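
For anyone not familiar with the pattern, this is roughly what
"bracketed by #ifdef DCACHE_DEBUG" amounts to. The sketch is a
user-space illustration only, not the actual dcache code: fprintf
stands in for printk, and the cache_stats structure, its fields, and
cache_grow are all hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build with -DDCACHE_DEBUG to compile the debug output in. */
#ifdef DCACHE_DEBUG
#define DPRINTK(...) fprintf(stderr, __VA_ARGS__)
#else
#define DPRINTK(...) do { } while (0)
#endif

/* Hypothetical counters; the real kernel bookkeeping may differ. */
struct cache_stats {
    long entries;   /* total entries in the cache */
    long unused;    /* entries not currently in use */
};

/* Record a growth event and report it when debugging is compiled in. */
void cache_grow(struct cache_stats *s, long added)
{
    assert(s != NULL);

    s->entries += added;
    s->unused += added;
    DPRINTK("dcache: grew by %ld, entries=%ld unused=%ld\n",
            added, s->entries, s->unused);
}
```

Without the define the macro expands to nothing, so the messages cost
nothing in a normal build. That is also why nobody sees them unless
the define is turned on.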

Hope this information helps,
-ben