Re: [PATCH 2/2] block, dax: fix lifetime of in-kernel dax mappings

From: Dan Williams
Date: Mon Oct 05 2015 - 21:57:26 EST


On Wed, Sep 30, 2015 at 4:35 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Tue, Sep 29, 2015 at 08:41:36PM -0400, Dan Williams wrote:
>> The DAX implementation needs to protect new calls to ->direct_access()
>> and usage of its return value against unbind of the underlying block
>> device. Use blk_queue_enter()/blk_queue_exit() to either prevent
>> blk_cleanup_queue() from proceeding, or fail the dax_map_bh() if the
>> request_queue is being torn down.
>>
>> Cc: Jens Axboe <axboe@xxxxxxxxx>
>> Cc: Christoph Hellwig <hch@xxxxxx>
>> Cc: Boaz Harrosh <boaz@xxxxxxxxxxxxx>
>> Cc: Dave Chinner <david@xxxxxxxxxxxxx>
>> Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
>> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>> ---
>> block/blk.h | 2 -
>> fs/dax.c | 130 +++++++++++++++++++++++++++++++-----------------
>> include/linux/blkdev.h | 2 +
>> 3 files changed, 85 insertions(+), 49 deletions(-)
>>
>> diff --git a/block/blk.h b/block/blk.h
>> index 5b2cd393afbe..0f8de0dda768 100644
>> --- a/block/blk.h
>> +++ b/block/blk.h
>> @@ -72,8 +72,6 @@ void blk_dequeue_request(struct request *rq);
>> void __blk_queue_free_tags(struct request_queue *q);
>> bool __blk_end_bidi_request(struct request *rq, int error,
>> unsigned int nr_bytes, unsigned int bidi_bytes);
>> -int blk_queue_enter(struct request_queue *q, gfp_t gfp);
>> -void blk_queue_exit(struct request_queue *q);
>> void blk_freeze_queue(struct request_queue *q);
>>
>> static inline void blk_queue_enter_live(struct request_queue *q)
>> diff --git a/fs/dax.c b/fs/dax.c
>> index bcfb14bfc1e4..7ce002bb60d0 100644
>> --- a/fs/dax.c
>> +++ b/fs/dax.c
>> @@ -63,12 +63,42 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
>> }
>> EXPORT_SYMBOL_GPL(dax_clear_blocks);
>>
>> -static long dax_get_addr(struct buffer_head *bh, void __pmem **addr,
>> - unsigned blkbits)
>> +static void __pmem *__dax_map_bh(const struct buffer_head *bh, unsigned blkbits,
>> + unsigned long *pfn, long *len)
>
> Please don't use bufferheads for this. Please pass an inode, the
> block and length to map, similar to dax_clear_blocks().
>
> Why? Because dax_clear_blocks() needs to do this "mapping" too,
> and it is called from contexts where there are no bufferheads.
> There's a good chance we'll need more mapping contexts like this in
> future, so let's not propagate bufferheads deeper into this code
> than we absolutely need to.
>
> We should be trying to limit/remove bufferheads in the DAX code, not
> propagating them deeper into the code...

So I gave this a try, but ran into a roadblock: get_block() performs
the inode-to-bdev conversion, and that is filesystem specific.
However, I'll at least stop passing the bh into this map routine, and
will call it dax_map_atomic(), which is more accurate.
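
To make the intended lifetime rule concrete, here is a rough userspace
model of the enter/exit pattern described in the changelog. It is only
a sketch and not the kernel code: every mock_* name, the struct layout
and the offset/length signature are assumptions made for illustration,
and it is single-threaded. In particular, mock_queue_cleanup() merely
reports whether teardown could finish, whereas the changelog describes
blk_cleanup_queue() being prevented from proceeding.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a request_queue: only what this sketch needs. */
struct mock_queue {
	int usage;	/* outstanding enter references */
	bool dying;	/* set once teardown has started */
	char *pmem;	/* base of the fake persistent-memory range */
	size_t size;	/* size of that range in bytes */
};

/* Models blk_queue_enter(): take a reference, or fail during teardown. */
static int mock_queue_enter(struct mock_queue *q)
{
	if (q->dying)
		return -ENODEV;
	q->usage++;
	return 0;
}

/* Models blk_queue_exit(): drop the reference taken by enter. */
static void mock_queue_exit(struct mock_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/* Models teardown: may only complete once no mappings are live. */
static bool mock_queue_cleanup(struct mock_queue *q)
{
	q->dying = true;
	return q->usage == 0;
}

/*
 * Hypothetical map routine that takes an offset and length instead of a
 * buffer_head. A non-NULL return holds a queue reference until unmap.
 */
static char *mock_dax_map_atomic(struct mock_queue *q, size_t off, size_t len)
{
	if (mock_queue_enter(q) < 0)
		return NULL;
	if (off > q->size || len > q->size - off) {
		/* Out of range: drop the reference before failing. */
		mock_queue_exit(q);
		return NULL;
	}
	return q->pmem + off;
}

/* Releases the reference held by a successful map. */
static void mock_dax_unmap_atomic(struct mock_queue *q, char *addr)
{
	if (addr)
		mock_queue_exit(q);
}
```

The point of pairing map with unmap is that the returned address is
only used while the queue reference is held, and that new mappings
fail once teardown has begun.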