Re: [PATCH] proc: sysctl: fix double drop_sysctl_table()

From: Al Viro
Date: Fri Dec 21 2018 - 22:18:10 EST

On Sat, Dec 22, 2018 at 10:12:21AM +0800, Yafang Shao wrote:
> The callers of insert_header() will drop sysctl table whatever
> insert_header() successes or fails.
> So we can't call drop_sysctl_table() in insert_header().
> The call sites of insert_header() are all in fs/proc/proc_sysctl.c.

Except that at least insert_links() does this:

err = insert_header(core_parent, links);
if (err)
with that drop clearly paired with explicit increment of nreg. So your
patch appears to break at least that one.

Looking at the rest of callers... __register_sysctl_table() demonstrates
the same pattern - explicit increment of nreg, then insert_header(),
then *unconditional* drop undoing that increment.

The third and last caller (get_subdir()) appears to be different.
We do insert_header(), if it succeeds we bump nreg on new and
unconditionally drop a reference to dir, as well as new.

Let's look at the callers of that sucker.

/* Reference moved down the diretory tree get_subdir */

/* Find the directory for the ctl_table */
for (name = path; name; name = nextname) {
int namelen;
nextname = strchr(name, '/');
if (nextname) {
namelen = nextname - name;
} else {
namelen = strlen(name);
if (namelen == 0)

dir = get_subdir(dir, name, namelen);
if (IS_ERR(dir))
goto fail;

if (insert_header(dir, header))
goto fail_put_dir_locked;

Aha... So we are traversing a tree and it smells like get_subdir()
expects dir to be pinned by the caller, drops that reference and
either fails or returns the next tree node pinned.

_IF_ that matches the behaviour of get_subdir(), the code above
makes sense -
grab dir
for each non-empty pathname component
next = get_subdir(dir, component)
// dir got dropped
if get_subdir has failed
goto fail
// next got grabbed
dir = next
insert_header(dir, header)
drop dir
if insert_header was successful
return header
// all references grabbed by the above are dropped by now

So let's look at get_subdir() from refcounting POV (ignoring the locking
for now):
subdir = find_subdir(dir, name, namelen);
if (!IS_ERR(subdir))
goto found;
if (PTR_ERR(subdir) != -ENOENT)
goto failed;
yeccchhhhhhh.... What's wrong with if (subdir == ERR_PTR(-ENOENT))?
Anyway, find_subdir() appears to be refcounting-neutral.
new = new_dir(set, name, namelen);
OK, looks like we are creating a new object here. What's the value of .nreg
in it? new_dir() itself doesn't seem to set it, but the thing it calls at
the end (init_header()) does set it to 1. OK, so we'd got a reference to
new object, our counter being 1. On failure it appears to return NULL.
subdir = ERR_PTR(-ENOMEM);
if (!new)
goto failed;
right, so if allocation in new_dir() has failed, we are buggering off to the
same 'failed' label as on other exits. In failure case new_dir() is
refcounting-neutral, so we are in the same environment.
/* Was the subdir added while we dropped the lock? */
subdir = find_subdir(dir, name, namelen);
if (!IS_ERR(subdir))
goto found;
if (PTR_ERR(subdir) != -ENOENT)
goto failed;
Interesting... So we can get to 'failed' in some cases when new_dir()
has not failed.
/* Nope. Use the our freshly made directory entry. */
err = insert_header(dir, &new->header);
subdir = ERR_PTR(err);
if (err)
goto failed;
Looks like it expects insert_header() to be refcounting-neutral in failure
subdir = new;

OK, we can get here from 3 places:
* subdir found by lookup before we even tried to allocate new.
new is NULL, subdir has just gotten a reference grabbed.
* subdir found by re-lookup after new had been allocated.
new is non-NULL and we are holding one reference to it, subdir has
just gotten a reference grabbed and it's not equal to new.
* new got successfully inserted into dir; subdir is equal
to new and we'd just grabbed the second reference to it.

Looks like in all those cases we have a reference grabbed on subdir
*and* a reference grabbed on new (if non-NULL). Covers all 3 cases.,,

... and now we rejoin the failure paths (a lousy label name, that - we
get here on success as well).
if (IS_ERR(subdir))
drop reference to dir (the one that caller had grabbed for us)
if new is not NULL
drop reference to new.
return subdir.

OK, in success case we have ended up with the total effect
drop dir
grab subdir
Holds in all 3 cases above. On failure, we have dropped the reference
grabbed by new_dir (if any) and dropped dir. That matches the behaviour
guessed by the look of the caller, and it definitely expects
insert_header() to be refcounting-neutral on failures.

And AFAICS, insert_header() is that. Before your patch, that is...

The bottom line is, proposed behaviour change appears to be broken for all


PS: I was going to add that get_subdir() was in bad need of comments, until
I actually looked around. And right in front of that function we have this:
* get_subdir - find or create a subdir with the specified name.
* @dir: Directory to create the subdirectory in
* @name: The name of the subdirectory to find or create
* @namelen: The length of name
* Takes a directory with an elevated reference count so we know that
* if we drop the lock the directory will not go away. Upon success
* the reference is moved from @dir to the returned subdirectory.
* Upon error an error code is returned and the reference on @dir is
* simply dropped.

So it *IS* documented, description does match the behaviour and
I should've bloody well checked if it's there first. And verified
that it does match the code, of course, but that's generally
simpler than deriving the behaviour from code and trying to come
up with sane description of such.