On Tue, Oct 11, 2011 at 03:34:48PM +0200, Christoph Hellwig wrote:
>> A sidenote that Anders forgot.. the system was stable for very long time,
>
> This is core VM code, and operates purely on on-stack variables except
> for the page cache radix tree nodes / pages. So this either could be a
> core VM bug that no one has noticed yet, or memory corruption. Can you
> run memtest86 on the box?

Unfortunately not, as it is a production server. Pulling it out to memtest 256G
properly would take too long. But it seems unlikely to me that it should be
memory corruption. The machine has been running with the same (ecc) memory for
more than a year and neither the service processor nor the kernel (according to
dmesg) has caught anything before this. It would be a rare (though I admit not
impossible) coincidence if we got catastrophic, undetected memory corruption a
week after attaching a new raid controller with a new disk array.