Re: bio linked list corruption.
From: Linus Torvalds
Date: Tue Oct 18 2016 - 20:29:14 EST
On Tue, Oct 18, 2016 at 5:10 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Adding Andy to the cc, because this *might* be triggered by the
> vmalloc stack code itself. Maybe the re-use of stacks showing some
> problem? Maybe Chris (who can't see the problem) doesn't have
> CONFIG_VMAP_STACK enabled?
I bet it's the plug itself that is the stack address. In fact, it's
probably that mq_list head pointer.
I think every single user of block plugging uses the pattern
struct blk_plug plug;
blk_start_plug(&plug);
and then we'll have
INIT_LIST_HEAD(&plug->mq_list);
which initializes that mq_list head to point at itself, i.e. at a stack address.
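To make the self-pointing head concrete, here is a minimal user-space sketch. The list_head layout and INIT_LIST_HEAD() follow the kernel's list.h in spirit; "struct fake_plug", plug_head_is_self_pointing() and demo_plug_init() are hypothetical stand-ins for illustration, not the real struct blk_plug or any block-layer code.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Same logic as the kernel helper: an empty list head points at itself. */
static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* Hypothetical stand-in for struct blk_plug; only the list head matters. */
struct fake_plug {
	struct list_head mq_list;
};

/* Returns true if a freshly initialized on-stack plug points at itself. */
static bool plug_head_is_self_pointing(void)
{
	struct fake_plug plug;	/* lives on this function's stack */

	INIT_LIST_HEAD(&plug.mq_list);
	return plug.mq_list.next == &plug.mq_list &&
	       plug.mq_list.prev == &plug.mq_list;
}

/* Usage: both pointers stored in the head are stack addresses. */
static void demo_plug_init(void)
{
	assert(plug_head_is_self_pointing());
}
```

So any list pointer that ends up referring to the head of an on-stack plug is necessarily a stack address.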
So when we see something like this:
list_add corruption. prev->next should be next (ffffe8ffff806648),
but was ffffc9000067fcd8. (prev=ffff880503878b80)
and it comes from
list_add_tail(&rq->queuelist, &plug->mq_list);
which will expand to
__list_add(new, head->prev, head)
which in this case *should* be:
__list_add(&rq->queuelist, plug->mq_list.prev, &plug->mq_list);
so in fact we *should* have "next" be a stack address.
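The expansion above can be checked with a small user-space model. This is only a sketch: checked_list_add() is a hypothetical stand-in for the debug variant of __list_add() that returns false instead of printing a warning, and last_next_arg is an illustrative global used to record which "next" the helper was handed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* Illustrative only: remembers the "next" argument of the last insertion. */
static struct list_head *last_next_arg;

/* Models __list_add() with its sanity checks; false means "corruption". */
static bool checked_list_add(struct list_head *new,
			     struct list_head *prev,
			     struct list_head *next)
{
	last_next_arg = next;
	if (next->prev != prev || prev->next != next)
		return false;	/* the kernel would warn here */
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
	return true;
}

/* Adding at the tail means inserting between head->prev and head. */
static bool list_add_tail(struct list_head *new, struct list_head *head)
{
	return checked_list_add(new, head->prev, head);
}

/* For every tail insertion, "next" is the list head itself. */
static bool tail_add_uses_head_as_next(void)
{
	struct list_head head, a, b;
	bool ok;

	INIT_LIST_HEAD(&head);
	ok = list_add_tail(&a, &head) && last_next_arg == &head;
	ok = ok && list_add_tail(&b, &head) && last_next_arg == &head;
	return ok && head.prev == &b && a.next == &b && b.next == &head;
}

/* If the last entry's ->next is clobbered, the check fails. */
static bool clobbered_prev_next_is_detected(void)
{
	struct list_head head, a, b, bogus;

	INIT_LIST_HEAD(&head);
	INIT_LIST_HEAD(&bogus);
	if (!list_add_tail(&a, &head))
		return false;
	a.next = &bogus;	/* simulate corruption of prev->next */
	return !list_add_tail(&b, &head);
}
```

In this model, with the head on the stack, "next" is always the stack address of the head, and a healthy prev->next must be equal to it.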
So that debug message is really really odd. I would expect "next" to be
the stack address (because we're adding to the tail of the list, so
"next" is the list head itself), but the corruption printout says that
the "but was" value is the stack address, and "next" isn't.
Weird. The "but was" value actually looks like what the right address
should look like, but the actual "next" address (which *should* be just
"&plug->mq_list" and really should be on the stack) looks bogus.
I'm now very confused.
Linus