Re: [PATCH 2/2] dax: fix bdev NULL pointer dereferences
From: Dan Williams
Date: Fri Jan 29 2016 - 19:18:37 EST
On Fri, Jan 29, 2016 at 3:34 PM, Ross Zwisler
<ross.zwisler@xxxxxxxxxxxxxxx> wrote:
> On Fri, Jan 29, 2016 at 11:28:15AM -0700, Ross Zwisler wrote:
>> On Thu, Jan 28, 2016 at 01:38:58PM -0800, Christoph Hellwig wrote:
>> > On Thu, Jan 28, 2016 at 12:35:04PM -0700, Ross Zwisler wrote:
>> > > There are a number of places in dax.c that look up the struct block_device
>> > > associated with an inode. Previously this was done by just using
>> > > inode->i_sb->s_bdev. This is correct for inodes that exist within the
>> > > filesystems supported by DAX (ext2, ext4 & XFS), but when running DAX
>> > > against raw block devices this value is NULL. This causes NULL pointer
>> > > dereferences when these block_device pointers are used.
>> >
>> > It's also wrong for an XFS file system with a RT device..
>> >
>> > > +#define DAX_BDEV(inode) (S_ISBLK(inode->i_mode) ? I_BDEV(inode) \
>> > > + : inode->i_sb->s_bdev)
>> >
>> > .. but this isn't going to fix it. You must use a bdev returned by
>> > get_blocks or a similar file system method.
>>
>> I guess I need to go off and understand if we can have DAX mappings on such a
>> device. If we can, we may have a problem - we can get the block_device from
>> get_block() in I/O path and the various fault paths, but we don't have access
>> to get_block() when flushing via dax_writeback_mapping_range(). We avoid
>> needing it in the normal case by storing the sector results from get_block() in
>> the radix tree.
>>
>> /me is off to play with RT devices...
>
> Well, RT devices are completely broken as far as I can see. I've reported the
> breakage to the XFS list. Anything I do that triggers a RT block allocation
> in XFS causes a lockdep splat + a kernel BUG - I've tried regular pwrite(),
> xfs_rtcp and mmap() + write to address. Not a new bug either - happens just
> the same with v4.4. Happens with both PMEM and BRD, and has no relationship
> to whether I'm using DAX or not.
>
> Does it work for this patch to go in as-is since it fixes an immediate OOPS
> with raw block devices + DAX, and when RT devices are alive again I'll figure
> out how to make them work too?
Can we step back and be clear about which lookups should be coming
from get_blocks()? Which ones are critical, and which ones do we just
opportunistically look up for a debug print?
Right now xfs and ext4 are basically disagreeing on whether
get_blocks() reliably sets ->bh_bdev, and checking for a raw
block-device inode in dax_clear_blocks() does not make sense. So this
all seems a bit confused.
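
To make the distinction concrete, here is a minimal userspace sketch of
the two lookup styles discussed above. It is not kernel code: the structs
are toy stand-ins, and is_blk, rt_bdev, dax_bdev_from_inode() and
toy_get_block() are hypothetical names made up for illustration. The only
assumptions carried over from the thread are that the DAX_BDEV() macro
picks between I_BDEV(inode) and inode->i_sb->s_bdev, and that a
get_block()-style callback can report the device through the buffer head
(the field is b_bdev in struct buffer_head).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; NOT the real definitions. */
struct block_device { const char *name; };
struct super_block { struct block_device *s_bdev; };

struct inode {
	int is_blk;                   /* models S_ISBLK(inode->i_mode) */
	struct block_device *i_bdev;  /* models I_BDEV(inode) */
	struct super_block *i_sb;
	struct block_device *rt_bdev; /* hypothetical: realtime device, if any */
};

struct buffer_head {
	struct block_device *b_bdev;
	unsigned long b_blocknr;
};

/* Models the DAX_BDEV() macro from the patch: decide from the inode alone. */
static struct block_device *dax_bdev_from_inode(const struct inode *inode)
{
	return inode->is_blk ? inode->i_bdev : inode->i_sb->s_bdev;
}

/*
 * Hypothetical get_block()-style callback: the file system decides which
 * device backs the block and reports it through bh->b_bdev.
 */
static int toy_get_block(const struct inode *inode, unsigned long iblock,
			 struct buffer_head *bh)
{
	assert(bh != NULL);
	bh->b_blocknr = iblock;
	if (inode->is_blk)
		bh->b_bdev = inode->i_bdev;
	else if (inode->rt_bdev)
		bh->b_bdev = inode->rt_bdev;
	else
		bh->b_bdev = inode->i_sb->s_bdev;
	return 0;
}
```

In this toy model, a file on a realtime device gets the right bdev only
from toy_get_block(); dax_bdev_from_inode() returns the main data device
for it, which is the case Christoph pointed out. For a raw block-device
inode both helpers agree.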