Re: [Bug] Reproducible data corruption on i5-3340M: Please continueyour great work! :-)

From: Linus Torvalds
Date: Thu Aug 15 2013 - 20:34:04 EST


On Thu, Aug 15, 2013 at 4:05 PM, Ben Tebulin <tebulin@xxxxxxxxxxxxxx> wrote:
>
>> Ben, please test. I'm worried that the problem you see is something
>> even more fundamentally wrong with the whole "oops, must flush in the
>> middle" logic, but I'm _hoping_ this fixes it.
>
> It's gone.
>
> Really!
>
> I git-fsck'ed successfully around 30 times in a row.
> And even all the other things still seem to work ;-)

Goodie. I think I'm just going to commit it (with the speling fixes
for other architectures) asap. It's bigger than I'd like, but it's a
lot simpler than the alternatives of trying to figure out exactly
which call chain got things wrong with the previous confusing model.

Thanks for bisecting and testing.

> Honestly I have to confess that I'm deeply impressed how this finally
> worked out: I just threw a particular, innocent-looking commit hash and
> nothing more into the round.

Being able to bisect the exact commit that introduced the bad behavior
is *very* powerful debugging aid, and in fact the smaller and more
innocent-looking the bisected commit is, the easier it generally is to
then say "ok, it must be related to this one particular issue". So the
bisection really pinpointed the area. After that it was just a matter
of reading the source code and seeing what looked suspicious.

I'll probably delay committing it until tomorrow, in the hope that
somebody using one of the other architectures will at least ack that
it compiles. I'm re-attaching the patch (with the two "logn" -> "long"
fixes) just to encourage that. Hint hint, everybody..

Linus

Attachment: patch.diff
Description: Binary data