Re: Memory leaking behaviour in 2.6.20.11, reiserfs related?

From: Rob Mueller
Date: Mon Aug 06 2007 - 00:27:25 EST


This is pretty much a vanilla kernel, with just one patch to work
around a deadlock problem in the reiserfs_file_write code that I
think isn't fixed.

http://lists.linuxcoding.com/kernel/2006-q1/msg32508.html

So that sounds like a reiserfs bug.

Yes, and it was definitely still there in 2.6.16. I don't know whether it's been fixed since; I haven't heard anything more from anyone on it.

BUG: at fs/reiserfs/inode.c:2868 reiserfs_releasepage()
[<c0199ecb>] reiserfs_releasepage+0xa3/0xa8
...
[<c010383b>] kernel_thread_helper+0x7/0x1c
=======================

And so does that.

These messages are "new", in that I think we've only seen them since upgrading to 2.6.20; they weren't in 2.6.16. They do seem like a reiserfs bug, but I haven't seen any confirmation of that either. I thought they might be related to the leak behaviour.

Re: 2.6.22-rc6-mm1 + leak patches

I haven't had a chance to test these. Given that you're not sure they'll even be helpful, is there something else we can test first before going down this path?

Quite a few people are using reiserfs and yours is the only report of this
which I can recall. Can you think of any reason why your setup differs
from most other people's?

No, not really. The one patch I mentioned previously is the only difference from a vanilla kernel.

Some other things that are interesting/strange:

1. After rebooting that machine, I found it in the same memory leaked state 17 hours later, so it doesn't even take a day to leak all that memory.
2. Although it ended up in the same leaked state after just 17 hours, the machine is still running fine a week later. It seems to reach a "steady state" where it has lots of leaked memory, but that doesn't cause the machine to swap or do anything particularly crazy; it just sits in that state.
3. We use the exact same kernel on some Prescott Xeon based machines with 8G of memory, and they don't display the same problem at all. The problem only seems to be occurring on our newer 12G Woodcrest Xeon based machines. For example:

[root@imap8 ~]$ uname -a
Linux imap8 2.6.20.11-reiserfix-fai #1 SMP Wed May 23 09:40:20 UTC 2007 i686 GNU/Linux
[root@imap8 ~]$ cat /proc/cpuinfo | grep 'model name'
model name : Intel(R) Xeon(TM) CPU 3.00GHz
model name : Intel(R) Xeon(TM) CPU 3.00GHz
[root@imap8 ~]$ free
             total       used       free     shared    buffers     cached
Mem:       8308908    8034260     274648          0     493724    2008128
-/+ buffers/cache:    5532408    2776500
Swap:      2048276      61820    1986456
[root@imap8 ~]$ ps auxw | wc -l
1538

[root@imap9 ~]$ uname -a
Linux imap9 2.6.20.11-reiserfix-fai #1 SMP Thu May 10 01:57:03 UTC 2007 i686 GNU/Linux
[root@imap9 ~]$ cat /proc/cpuinfo | grep 'model name'
model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
[root@imap9 ~]$ free
             total       used       free     shared    buffers     cached
Mem:      12466848   12419764      47084          0     463564    1550232
-/+ buffers/cache:   10405968    2060880
Swap:      2048276      69828    1978448
[root@imap9 ~]$ ps auxw | wc -l
1523

Actually, maybe the other machines are displaying the same problem and I just wasn't as aware of it, because it doesn't make the machine do anything crazy. I realised that there was definitely a problem with the new machines because they were running a similar number and mix of processes to the other machines, but seemed to be using twice as much memory!
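
As a rough check of the "twice as much memory" estimate, the figures from the two `free` outputs above can be compared directly. This is only a minimal sketch: it assumes the "-/+ buffers/cache" used figure is computed as used minus buffers minus cached (which matches the numbers printed above), and the dictionary layout and function name are made up for illustration.

```python
# Values copied from the `free` output above, in kB.
hosts = {
    "imap8": {"used": 8034260, "buffers": 493724, "cached": 2008128},
    "imap9": {"used": 12419764, "buffers": 463564, "cached": 1550232},
}


def used_excluding_cache(mem):
    """Memory in use once buffers and page cache are subtracted.

    Assumption: this mirrors the 'used' column of the
    '-/+ buffers/cache' line printed by free.
    """
    return mem["used"] - mem["buffers"] - mem["cached"]


for name, mem in hosts.items():
    print(f"{name}: {used_excluding_cache(mem)} kB used excluding buffers/cache")

# Ratio of non-cache memory in use, 12G machine versus 8G machine.
ratio = used_excluding_cache(hosts["imap9"]) / used_excluding_cache(hosts["imap8"])
print(f"imap9 / imap8: {ratio:.2f}")
```

The ratio only compares the two snapshots shown here; it says nothing about where the extra memory on imap9 is going.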

Rob
