Re: [bug] ata subsystem related crash with latest -git

From: Linus Torvalds
Date: Wed Oct 17 2007 - 15:31:43 EST




On Wed, 17 Oct 2007, Ingo Molnar wrote:
>
> nope, this did not help. First bootup went fine, second bootup crashed
> again (see below), without hitting the BUG_ON().

I think you'll always hit it if you have a scatter-gather list that is
exactly filled up.

Why? Because those things do "sg_next()" on the last entry, and as
mentioned, that ends up actually accessing one past the end - even if the
end result is not actually ever *used* (because we just effectively
incremented it to past the last entry when the code was done with the SG
list).

So I think the sg_next() interface is fundamentally mis-designed. It
should do the scatter-gather link following on *starting* to use the SG
entry, not after finishing with it.

Put another way: I suspect pretty much every single sg_next() out there is
likely to hit this issue. The way that blk_rq_map_sg() fixed its problem
was exactly to move the "sg_next()" to *before* the use of the SG (and
even that one is somewhat bogus, in that it just blindly assumes that the
first entry is not a link entry).

I suspect the "the next entry is a link" bit should be in the *previous*
entry, and then sg_next() could look like

if (next_is_link(sg))
sg = sg_chain_ptr(sg+1);
else
sg++;
return sg;

and that would work.

The alternative is to always make sure to allocate one more SG entry than
required, so that the last entry is always either the link, or an unused
entry!

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/