Re: [PATCH v2] dax: fix NULL pointer in __dax_pmd_fault()

From: Dave Chinner
Date: Tue Sep 22 2015 - 19:30:39 EST


On Tue, Sep 22, 2015 at 02:25:19PM -0700, Dan Williams wrote:
> On Tue, Sep 22, 2015 at 2:13 PM, Andrew Morton
> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Tue, 22 Sep 2015 13:36:22 -0600 Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> wrote:
> >
> >> The following commit:
> >>
> >> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for
> >> DAX")
> >>
> >> moved some code in __dax_pmd_fault() that was responsible for zeroing
> >> newly allocated PMD pages. The new location didn't properly set up
> >> 'kaddr', though, so when run this code resulted in a NULL pointer BUG.
> >>
> >> Fix this by getting the correct 'kaddr' via bdev_direct_access().
> >
> > Why the heck didn't gcc warn?
> >
> > I had a fiddle:
> >
> > --- a/fs/dax.c~a
> > +++ a/fs/dax.c
> > @@ -529,15 +529,18 @@ int __dax_pmd_fault(struct vm_area_struc
> > unsigned long pmd_addr = address & PMD_MASK;
> > bool write = flags & FAULT_FLAG_WRITE;
> > long length;
> > - void __pmem *kaddr;
> > + void *kaddr;
> > pgoff_t size, pgoff;
> > sector_t block, sector;
> > unsigned long pfn;
> > int result = 0;
> >
> > +// printk("%p\n", kaddr);
> > +
> > /* Fall back to PTEs if we're going to COW */
> > if (write && !(vma->vm_flags & VM_SHARED))
> > return VM_FAULT_FALLBACK;
> > + printk("%p\n", kaddr);
> > /* If the PMD would extend outside the VMA */
> > if (pmd_addr < vma->vm_start)
> > return VM_FAULT_FALLBACK;
> >
> > gcc warns about the first printk, but not about the second. So that
> > "if (...) return ..." seems to have defeated gcc uninitialized-var
> > detection. wtf?
> >
> >> --- a/fs/dax.c
> >> +++ b/fs/dax.c
> >> @@ -569,8 +569,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
> >> if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
> >> goto fallback;
> >>
> >> + sector = bh.b_blocknr << (blkbits - 9);
> >> +
> >> if (buffer_unwritten(&bh) || buffer_new(&bh)) {
> >> int i;
> >> +
> >> + length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
> >> + bh.b_size);
> >> + if (length < 0) {
> >> + result = VM_FAULT_SIGBUS;
> >> + goto out;
> >> + }
> >> + if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
> >> + goto fallback;
> >> +
> >> for (i = 0; i < PTRS_PER_PMD; i++)
> >> clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
> >> wmb_pmem();
> >
> > hm, that's a lot of copy-n-paste. Do we really need to run
> > bdev_direct_access() twice? Will `kaddr' and `pfn' change?
> >
>
> They shouldn't change, but I'm working on a fix for handling the race
> of unbinding the pmem device while that kaddr is in use (unbind
> invalidates kaddr).

Exactly what does "unbinding the pmem device" mean, and why can
(parts of) the pmem device "go away" when there are active
references to it?

> The proposal is a dax_map_bh()/dax_unmap_bh()
> interface to temporarily pin the mapping around each usage.

Which mapping? The bufferhead maps file offset to filesystem block
addresses, so I'm not sure what problem you are actually refering
to here...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/