Re: [PATCH 2/2] dax: fix bdev NULL pointer dereferences

From: Ross Zwisler
Date: Sun Jan 31 2016 - 01:12:18 EST


> On Jan 30, 2016, at 7:32 PM, Matthew Wilcox <willy@xxxxxxxxxxxxxxx> wrote:
>
>> On Fri, Jan 29, 2016 at 10:01:13PM -0800, Dan Williams wrote:
>>> On Fri, Jan 29, 2016 at 9:28 PM, Matthew Wilcox <willy@xxxxxxxxxxxxxxx> wrote:
>>> If we store the PFN of the underlying page instead, we don't have this
>>> problem. Instead, we have a different problem: the device going
>>> away under us. I'm trying to find the code which tears down PTEs when
>>> the device goes away, and I'm not seeing it. What do we do about user
>>> mappings of the device?
>>
>> I deferred the dax tear down code until next cycle as Al rightly
>> pointed out some needed re-works:
>>
>> https://lists.01.org/pipermail/linux-nvdimm/2016-January/003995.html
>
> Thanks; I eventually found it in my email somewhere over the Pacific.
>
> I did probably 70% of the work needed to switch the radix tree over to
> storing PFNs instead of sectors. It seems viable, though it's a big
> change from where we are today:

At one point I had kaddrs in the radix tree, so I could just pull the addresses out
and flush them. That would save us a pfn -> kaddr conversion before each flush.

Is there a reason to store pfns instead of kaddrs in the radix tree?

>
> fs/dax.c | 415 +++++++++++++++++++++++----------------------
> include/linux/dax.h | 3 +-
> include/linux/pfn_t.h | 33 +++-
> include/linux/radix-tree.h | 9 -
> 4 files changed, 236 insertions(+), 224 deletions(-)
>
> I'll try and get that finished off this week.
>
> One concrete and easily-separable piece is that dax_clear_blocks() has
> the wrong signature. It currently takes an inode & block as parameters;
> it has no way of finding out the correct block device. Its only two
> callers are filesystems (ext2 and xfs). Those filesystems should be
> passing the block_device instead of the inode. But without the inode,
> we can't convert a block number to a sector number, so we also need
> to pass the sector number, not the block number. It still has type
> sector_t, annoyingly.
>
> @@ -63,12 +238,11 @@ static void dax_unmap_atomic(struct block_device *bdev,
> * and hence this means the stack from this point must follow GFP_NOFS
> * semantics for all operations.
> */
> -int dax_clear_blocks(struct inode *inode, sector_t block, long _size)
> +int dax_clear_blocks(struct block_device *bdev, sector_t sector, long size)
> {
> - struct block_device *bdev = inode->i_sb->s_bdev;
> struct blk_dax_ctl dax = {
> - .sector = block << (inode->i_blkbits - 9),
> - .size = _size,
> + .sector = sector,
> + .size = size,
> };
>
> might_sleep();
>
> but I haven't looked at doing the conversion of xfs or ext2 to use that
> new interface.