On Wed, Apr 25 2007, Brad Campbell wrote:

Okay, I left all debugging patches in, disabled all kernel debugging .config stuff and gave it a spin with our usual "killer" workload (as far as batch systems are repeatable anyway) and so far there was not a single glitch or message, so I preliminarily conclude that the bug is squashed. The final word will come once my 1800 batch jobs are processed and I have my particle physics result ;-)

Jens Axboe wrote:
It looks to be extremely rare. Aliases are extremely rare, front merges
are rare. And you need both to happen with the details you outlined. But
it's a large user base, and we've had 3-4 reports on this in the past
months. So it obviously does happen. I could not make it trigger without
doctoring the unplug code when I used aio.
Well, not that rare on this particular machine (I had a case where I could
reproduce it in less than an hour of normal use previously on this box),
and I've had it occur a number of times on my servers, I just never
reported it before as I never took the time to set up a serial console and
capture the oops.
Extremely rare in the sense that it takes md and some certain conditions
to happen for it to trigger. So for most people it'll be extremely rare,
and for others (such as yourself) that hit it, it won't be so rare :-)
Here's a fix for it, confirmed.
Shall I leave the other debugging in, apply this and run it for a few hard
days?
Yes, that would be perfect!