Re: Question about (or bug in?) the kobject implementation

From: Alan Stern
Date: Fri Feb 27 2004 - 23:03:38 EST


On Fri, 27 Feb 2004, Greg KH wrote:

> Seriously, once kobject_del() is called, you can't safely call
> kobject_get() anymore on that object.
>
> If you can think of a way we can implement this in the code to prevent
> people from doing this, please send a patch. We've been getting by
> without such a "safeguard" so far...

The problem is unsolvable. Let me explain...

We're actually discussing two different questions here.

A. Is it okay to call kobject_add() after calling kobject_del() --
this was my original question.

B. Can we prevent people from doing kobject_get() after the kobject's
refcount has dropped to 0?

Your earlier response amounted to saying that A isn't good because it
might cause B to happen; once kobject_del() has returned it's possible
that the refcount is 0. But this begs the real question. Suppose we
_know_ that the refcount isn't 0, say because earlier we did an unmatched
kobject_get(). Under those circumstances should it be legal to call
kobject_add() after calling kobject_del()? This is question A'.


Question B can be divided into two subcases.

B1. The code calling kobject_get() knows that the kobject hasn't been
deallocated yet. For example, it might be the cleanup routine
itself calling kobject_get().

Such a thing is legal in Java, but you probably don't want to sanction
such pranks in the driver model. So let's forget about B1. Your big
concern really seems to be:

B2. Everything else; the code calling kobject_get() doesn't know
whether the kobject has been deallocated.

This really is a programming error. It means that kobject_get() has been
passed a possibly stale pointer. Ipso facto, the call to kobject_put()
that decremented the refcount to 0 was made too early, while there were
still active pointers to the kobject floating around.

It's impossible to prevent people from making programming errors or
dereferencing stale pointers. It doesn't matter _what_ code you put in
kobject_get() -- it will crash when given a pointer to a kobject whose
cleanup routine has already run and deallocated the storage.

The best you can do is call people's attention to such errors and fail the
operation gracefully whenever possible (i.e., when it doesn't generate an
addressing error). My personal choice would be to change kobject_get() as
follows:

struct kobject * kobject_get(struct kobject * kobj)
{
if (kobj) {
if (atomic_read(&kobj->refcount) == 0) {
WARN_ON(1);
return NULL;
}
atomic_inc(&kobj->refcount);
}
return kobj;
}

I think that's about the best you can do.

And what's the answer to A'?

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/