Re: [PATCH v2 2/5] dax: increase granularity of dax_clear_blocks() operations
From: Williams, Dan J
Date: Thu Oct 22 2015 - 18:57:45 EST
On Thu, 2015-10-22 at 17:04 -0400, Jeff Moyer wrote:
> Dan Williams <dan.j.williams@xxxxxxxxx> writes:
>
> > dax_clear_blocks is currently performing a cond_resched() after every
> > PAGE_SIZE memset. We need not check so frequently, for example md-raid
> > only calls cond_resched() at stripe granularity. Also, in preparation
> > for introducing a dax_map_atomic() operation that temporarily pins a dax
> > mapping move the call to cond_resched() to the outer loop.
>
> There's nothing wrong with the mechanics here, but why bother? I only
> see 1 caller in the kernel, and that caller passes in
> 1<<inode->i_blkbits for the size (so 1 page or less). Did you plan to
> add other callers? I don't see them in this particular patch set.
>
> Again, I'm not taking issue with the patch, I'm just wondering what
> motivated the change.
The motivation is the subsequent patch to wrap all touches of pmem
within a dax_map_atomic() / dax_unmap_atomic() pairing. If I just do
the straightforward conversion of this function to dax_map_atomic() it
looks something like this:
> diff --git a/fs/dax.c b/fs/dax.c
> index 5dc33d788d50..fa2a2a255d3a 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -40,9 +40,9 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
> unsigned long pfn;
> long count;
>
> - count = bdev_direct_access(bdev, sector, &addr, &pfn, size);
> - if (count < 0)
> - return count;
> + addr = __dax_map_atomic(bdev, sector, size, &pfn, &count);
> + if (IS_ERR(addr))
> + return PTR_ERR(addr);
> BUG_ON(size < count);
> while (count > 0) {
> unsigned pgsz = PAGE_SIZE - offset_in_page(addr);
> @@ -56,6 +56,7 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
> sector += pgsz / 512;
> cond_resched();
> }
> + dax_unmap_atomic(bdev, addr);
> } while (size);
>
> wmb_pmem();
The problem is that intervening call to cond_resched(). I later want to
inject an rcu_read_lock()/unlock() pair to allow flushing active
dax_map_atomic() usages at driver teardown time [1]. But, I think the
patch stands alone as a cleanup outside of that admittedly hidden
motivation.
[1]: "mm, pmem: devm_memunmap_pages(), truncate and unmap ZONE_DEVICE
pages"
https://lists.01.org/pipermail/linux-nvdimm/2015-October/002406.html
N§²æ¸yú²X¬¶ÇvØ)Þ{.nÇ·¥{±êX§¶¡Ü}©²ÆzÚj:+v¨¾«êZ+Êzf£¢·h§~Ûÿû®w¥¢¸?¨è&¢)ßfùy§m
á«a¶Úÿ0¶ìå