Re: mm: GPF in bdi_put

From: Al Viro
Date: Tue Feb 28 2017 - 14:05:12 EST


On Tue, Feb 28, 2017 at 06:55:55PM +0100, Dmitry Vyukov wrote:

> I am also seeing the following crashes on
> linux-next/8d01c069486aca75b8f6018a759215b0ed0c91f0. Do you think it's
> the same underlying issue?

Yes.
1) Any attempt to mount -t bdev will fail, as it should.
2) The bdevfs instance created by that attempt will be immediately
destroyed (again, as it should).
3) The sole inode ever created for that instance (its root directory)
will be destroyed in the process (again, as it should).
4) That inode has never had ->bd_bdi initialized - the value stored
there will be whatever garbage kmem_cache_alloc() left behind.
5) bdev_evict_inode() will be called for that inode, and if the
aforementioned garbage happens to be not equal to &noop_backing_dev_info,
the pointer will be passed to bdi_put(); see the sketch below.
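
To make 4 and 5 concrete, this is the shape of the code involved,
condensed from fs/block_dev.c in that tree (from memory - locking,
truncation and list handling are elided, so don't treat it as a
verbatim quote):

static struct inode *bdev_alloc_inode(struct super_block *sb)
{
        struct bdev_inode *ei = kmem_cache_alloc(bdev_cachep, GFP_KERNEL);

        if (!ei)
                return NULL;
        /* nothing on this path ever stores to ei->bdev.bd_bdi, so the
         * field keeps whatever the slab allocator left in it */
        return &ei->vfs_inode;
}

static void bdev_evict_inode(struct inode *inode)
{
        struct block_device *bdev = &BDEV_I(inode)->bdev;

        /* ... truncation and bd_list removal elided ... */
        if (bdev->bd_bdi != &noop_backing_dev_info)
                bdi_put(bdev->bd_bdi);  /* uninitialized pointer in the
                                         * mount -t bdev case above */
}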

If that inode happens to reuse memory previously occupied by the bdev
inode of a looked-up but never opened block device, it will still have
->bd_bdi equal to &noop_backing_dev_info, so the crap does not trigger
every time. That's what the junk (recvmsg/ioctl/etc.) in your reproducer
affects. The specific effects of bdi_put() will, of course, depend upon
the actual garbage found there - a silent decrement of the refcount of an
existing bdi, setting things up for a later use-after-free, outright
memory corruption, etc.
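
And bdi_put() itself is (from memory) nothing more than

void bdi_put(struct backing_dev_info *bdi)
{
        /* drop a reference; release_bdi() frees the thing on the last one */
        kref_put(&bdi->refcnt, release_bdi);
}

i.e. a kref_put() on whatever pointer it has been handed, so all bets
are off once the argument is stale junk.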

_Any_ stack trace of form sys_mount() -> ... -> bdev_evict_inode() ->
bdi_put() -> <barf> is almost certainly the same bug.

I would still like to hear from Jan about why we do that bdi_put() from
bdev_evict_inode() rather than in __blkdev_put(). My preference would be
to do it there (and reset ->bd_bdi to &noop_backing_dev_info) when
->bd_openers hits 0, and to drop that code from bdev_evict_inode()
entirely - something along the lines of the sketch below.
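
Roughly like this (untested, just to show where I'd put it; the rest of
__blkdev_put() is elided):

static void __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
{
        /* ... */
        if (!--bdev->bd_openers) {
                /* ... existing teardown (sync_blockdev(), kill_bdev(), ...) ... */

                /* drop the bdi reference once the last opener is gone,
                 * instead of leaving it for bdev_evict_inode() */
                if (bdev->bd_bdi != &noop_backing_dev_info) {
                        bdi_put(bdev->bd_bdi);
                        bdev->bd_bdi = &noop_backing_dev_info;
                }
        }
        /* ... */
}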

Objections?