Re: [RFC PATCH] slow-work: add (module*)work->owner to fix races with module clients

From: Gregory Haskins
Date: Wed Jun 24 2009 - 18:02:21 EST


David Howells wrote:
> Gregory Haskins <ghaskins@xxxxxxxxxx> wrote:
>
>
>> I found this while working on KVM. I actually posted this patch with a KVM
>> series yesterday and standalone earlier today, but neither seems to have
>> made it to the lists. I suspect there is an issue with git-mail/postfix
>> on my system.
>>
>
> Also, your mail client has damaged the whitespace in the patch.
>

Yeah, sorry about that. When git-mail was failing I cut-n-pasted into
Thunderbird and it munged the patch a bit. v2 should be better, as it came
out of git directly after I fixed the postfix misconfig.

>
>> struct slow_work {
>> + struct module *owner;
>>
>
> Can you add it to slow_work_ops instead?
>

Yeah, that makes sense.

>
>> work->ops->put_ref(work);
>> + barrier(); /* ensure that put_ref is not re-ordered with module_put */
>> + module_put(work->owner);
>>
>
> Ummm... Can it be? module_put() and put_ref() are both out of line - surely
> the compiler isn't allowed to reorder them? If it's the CPU doing it then
> barrier() isn't going to save you.
>

Good point. I added that at the last minute without engaging my brain.
:) Will remove.

> Note, however, that work may not be dereferenced like this after put_ref() is
> called, unless you're sure that there's still a reference outstanding.
>

Yeah, I noticed that too immediately after sending. It should be better
in v2 (which should be in your inbox already).
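
For reference, the ordering problem can be shown with a rough userspace
sketch. This is not the v2 patch and not kernel code; owner_model, job_model
and the model_* helpers are made-up names for illustration, and the counter
decrement only stands in for module_put(). The point is just that the owner
pointer is read before put_ref(), because the work item may be gone afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct module; only the refcount is modelled. */
struct owner_model {
	int refcnt;
};

/* Hypothetical stand-in for struct slow_work. */
struct job_model {
	struct owner_model *owner;
};

/* Models a client put_ref() that drops the last reference and frees the item. */
static void model_put_ref(struct job_model *job)
{
	free(job);
}

/* Release path: cache the owner first, then never touch job again. */
static void model_release(struct job_model *job)
{
	struct owner_model *owner = job->owner; /* read before the put */

	model_put_ref(job);                     /* job may be freed here */

	assert(owner->refcnt > 0);
	owner->refcnt--;                        /* stands in for module_put(owner) */
}
```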

>> + if (!try_module_get(work->owner))
>> + goto cant_get_mod;
>>
>
> Note that this may result in a module getting stuck in unloading. It may need
> to do some work to complete the unload, and this will prevent that.
>

Can we put a stake in the ground and say that you can only call
slow_work_enqueue() from a module if you know that at least one
reference to the module is already held? This seems like a core
requirement anyway.

The follow-up question would be: if so, should we use __module_get()
instead of try_module_get() to annotate that (in addition to a comment,
of course)?
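
To make the difference concrete, here is a rough userspace model of the two
calls as I understand them. It is a sketch, not kernel code: mod_model and the
model_* functions are hypothetical names, and the assert only approximates the
"caller already holds a reference" rule that __module_get() relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct module. */
struct mod_model {
	int refcnt;     /* references currently held */
	bool unloading; /* set once unload has started */
};

/* Models try_module_get(): refuses once unload has begun. */
static bool model_try_get(struct mod_model *m)
{
	if (m == NULL)
		return true;   /* built-in code has no module to pin */
	if (m->unloading)
		return false;  /* an enqueue would fail here during unload */
	m->refcnt++;
	return true;
}

/* Models __module_get(): cannot fail, but needs an existing reference. */
static void model_get(struct mod_model *m)
{
	if (m == NULL)
		return;
	assert(m->refcnt > 0); /* the proposed enqueue precondition */
	m->refcnt++;
}

/* Models module_put(). */
static void model_put(struct mod_model *m)
{
	if (m == NULL)
		return;
	assert(m->refcnt > 0);
	m->refcnt--;
}
```

With the unconditional variant, a module that is already unloading but still
holds a reference can keep queuing the work it needs to finish, which is the
case David raised above.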

> A better way might be to have put_ref() return, say, a pointer to a completion
> struct, and if not NULL, have the caller of put_ref() call complete() on it.
> That way you don't need to touch the module count, but can have something in
> put_ref() keep track of when the last object is released and have its caller
> invoke a completion to celebrate this fact.
>

That sounds interesting, but I am not sure whether we would run into a
similar conundrum, or whether it would be awkward to manage. I am on a
conf-call ATM so I can't think clearly enough to tell for sure. ;) Let me
give it some thought and get back to you, though.
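
In the meantime, this is how I read the suggestion, as a userspace sketch
only. None of it is existing slow-work API: completion_model, client_model,
item_model and the model_* functions are invented for illustration, and the
real put_ref() currently returns void.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct completion. */
struct completion_model {
	bool done;
};

static void model_complete(struct completion_model *c)
{
	c->done = true;
}

/* Per-client state: counts outstanding items, owns the completion. */
struct client_model {
	int objects;
	struct completion_model all_released;
};

/* Hypothetical stand-in for a work item belonging to a client. */
struct item_model {
	struct client_model *client;
};

/* put_ref() variant: returns the completion only when the last item goes. */
static struct completion_model *model_put_ref(struct item_model *item)
{
	struct client_model *c = item->client;

	assert(c->objects > 0);
	if (--c->objects == 0)
		return &c->all_released;
	return NULL;
}

/* What the slow-work thread would do once it has finished with an item. */
static void model_thread_release(struct item_model *item)
{
	struct completion_model *done = model_put_ref(item);

	/* item must not be dereferenced past this point */
	if (done != NULL)
		model_complete(done);
}
```

The module's exit path would then wait on all_released before returning, so
the module count is never touched by the slow-work core.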

Thanks David!
-Greg

