Re: kernel BUG at fs/inode.c:LINE!

From: Al Viro
Date: Wed Apr 10 2019 - 22:23:10 EST


On Thu, Apr 11, 2019 at 08:50:17AM +0800, Ian Kent wrote:
> On Wed, 2019-04-10 at 14:41 +0200, Dmitry Vyukov wrote:
> > On Wed, Apr 10, 2019 at 2:12 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Apr 10, 2019 at 08:07:15PM +0800, Ian Kent wrote:
> > >
> > > > > I'm unable to find a branch matching the line numbers.
> > > > >
> > > > > Given that, on the face of it, the scenario is impossible I'm
> > > > > seeking clarification on what linux-next to look at for the
> > > > > sake of accuracy.
> > > > >
> > > > > So I'm wondering if this testing done using the master branch
> > > > > or one of the daily branches one would use to check for conflicts
> > > > > before posting?
> > > >
> > > > Sorry those are tags not branches.
> > >
> > > FWIW, that's next-20181214; it is what master had been in mid-December
> > > and master is rebased every day. Can it be reproduced with the current
> > > tree?
> >
> > From the info on the dashboard we know that it happened only once on
> > d14b746c (the second one is result of reproducing the first one). So
> > it was either fixed or just hard to trigger.
>
> Looking at the source of tag next-20181214 in linux-next-history I see
> this is mistake I made due to incorrect error handling which I fixed
> soon after (there was in fact a double iput()).

Right - "autofs: fix possible inode leak in autofs_fill_super()" had been
broken (and completely pointless), leading to double iput() in that failure
case. And yes, double iput() can trigger that BUG_ON(), and with non-zero
odds do so with that stack trace.

As far as I'm concerned, case closed - bug had been in a misguided "fix"
for inexistent leak (coming from misreading the calling conventions for
d_make_root()), introduced in -next at next-20181130 and kicked out of
there in next-20181219. Dropped by Ian's request in
Message-ID: <66d497c00cffb3e4109ca0d5287c8277954d7132.camel@xxxxxxxxxx>
which has fixed that crap. Moreover, that posting had been in reply to
that very syzcaller report, AFAICS.

I don't know how to tell the bot to STFU and close the report in this
situation; up to you, folks.

As an aside, the cause of that bug is that d_make_root() calling conventions
are insufficiently documented. All we have is

||[mandatory]
|| d_alloc_root() is gone, along with a lot of bugs caused by code
||misusing it. Replacement: d_make_root(inode). The difference is,
||d_make_root() drops the reference to inode if dentry allocation fails.

in Documentation/filesystems/porting, and that's not good enough. Anyone
willing to take a shot at that? FWIW, the calling conventions are:

d_make_root(inode) normally allocates and returns a new dentry.
On failure NULL is returned. A reference to inode is consumed in all
cases (on success it is transferred to new dentry, on failure it is
dropped), so failure handling does not need anything done to inode.
d_make_root(NULL) quietly returns NULL, which further simplifies the
error handling in typical caller. Usually it's something like
inode = foofs_new_inode(....);
s->s_root = d_make_inode(inode);
if (!s->s_root)
bugger off, no need to undo inode allocation
success
We do not need to check if foofs_new_inode() has returned NULL and we
do not need any special cleanups in case of failure - not for the
undoing the inode allocation.

If anyone cares to convert that into coherent (and printable) documentation,
patches are welcome...