Re: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5

From: Pete Wyckoff
Date: Sat Dec 22 2007 - 10:25:33 EST


dave@xxxxxxxxxxxxxx wrote on Fri, 21 Dec 2007 17:18 -0500:
> On Fri, 2007-12-21 at 16:52 -0500, David Dillow wrote:
> > I'm getting the following oops when doing the following commands:
> >
> > modprobe ib_srp
> > <add targets(s) to ib_srp using sysfs>
> > rmmod ib_srp
> > modprobe ib_srp
> > <OOPS>
> >
> > I'm going to try and track down how the list is getting corrupted; it
> > looks like attribute_container_list in
> > drivers/base/attribute_container.c is the one getting corrupted.
>
> Ok, found the culprit, now to figure out the motive and fix it.
>
> ib_srp's srp_cleanup_module calls srp_release_transport(), which calls
> transport_container_unregister() for the rport_attr_cont member of
> struct srp_internal.
>
> That last unregister call is returning -EBUSY, but it gets ignored, and
> the list node gets erased (or just reused) when the module's text/memory
> is free'd.
>
> Now, to see if ib_srp should be waiting for everything to be destroyed
> before calling srp_release_transport(), or if it is just not removing
> some attributes properly.

I don't see where srp_cleanup_module() is calling srp_remove_host().
That is the likely way that transport devices should be made to go
away. Something on the order of srp_remove_work().

Or srp_remove_one() except with a call to srp_remove_host() may be
necessary. In fact, maybe just adding that call will fix it, as
ib_unregister_client should drive the remove function. Guesses, all
this.

-- Pete
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/