Re: [PATCH v2 2/4] mm/hugetlb: Setup hugetlb_falloc during fallocate hole punch

From: Mike Kravetz
Date: Tue Oct 20 2015 - 21:03:10 EST

Next message: Ken Xue: "Re: [PATCH 1/2] i2c: designware: register clkdev during acpi device configuration"
Previous message: Rob Herring: "Re: [PATCH v4 1/4] dt-bindings: Document the STM32 DMA bindings"
In reply to: Dave Hansen: "Re: [PATCH v2 2/4] mm/hugetlb: Setup hugetlb_falloc during fallocate hole punch"
Next in thread: Mike Kravetz: "[PATCH v2 4/4] mm/hugetlb: Unmap pages to remove if page fault raced with hole punch"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 10/20/2015 05:11 PM, Dave Hansen wrote:
> On 10/20/2015 04:52 PM, Mike Kravetz wrote:
>>      if (hole_end > hole_start) {
>>              struct address_space *mapping = inode->i_mapping;
>> +            DECLARE_WAIT_QUEUE_HEAD_ONSTACK(hugetlb_falloc_waitq);
>> +            /*
>> +             * Page faults on the area to be hole punched must be stopped
>> +             * during the operation. Initialize struct and have
>> +             * inode->i_private point to it.
>> +             */
>> +            struct hugetlb_falloc hugetlb_falloc = {
>> +                    .waitq = &hugetlb_falloc_waitq,
>> +                    .start = hole_start >> hpage_shift,
>> +                    .end = hole_end >> hpage_shift
>> +            };
> ...
>> @@ -527,6 +550,12 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
>>                                              hole_end >> PAGE_SHIFT);
>>              i_mmap_unlock_write(mapping);
>>              remove_inode_hugepages(inode, hole_start, hole_end);
>> +
>> +            spin_lock(&inode->i_lock);
>> +            inode->i_private = NULL;
>> +            wake_up_all(&hugetlb_falloc_waitq);
>> +            spin_unlock(&inode->i_lock);
>
> I see the shmem code doing something similar. But, in the end, we're
> passing the stack-allocated 'hugetlb_falloc_waitq' over to the page
> faulting thread. Is there something subtle that keeps
> 'hugetlb_falloc_waitq' from becoming invalid while the other task is
> sleeping?
>
> That wake_up_all() obviously can't sleep, but it seems like the faulting
> thread's finish_wait() *HAS* to run before wake_up_all() can return.
>

The 'trick' is noted in the comment in the shmem_fault code:

/*
 * shmem_falloc_waitq points into the shmem_fallocate()
 * stack of the hole-punching task: shmem_falloc_waitq
 * is usually invalid by the time we reach here, but
 * finish_wait() does not dereference it in that case;
 * though i_lock needed lest racing with wake_up_all().
 */

The faulting thread's wait entry is removed from the waitq when it is
awakened by wake_up_all(): DEFINE_WAIT() installs
autoremove_wake_function() as the wake callback. See DEFINE_WAIT() and
the supporting code in the faulting thread. Because the entry is already
off the list, finish_wait() in the faulting thread does not access the
waitq that was/is on the stack.
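
A rough userspace model of that behaviour, in case it helps. This is not
kernel code: the sim_* names are made up for illustration, there is no
locking or scheduling, and it only mirrors the list handling that
DEFINE_WAIT(), wake_up_all() and finish_wait() rely on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Circular doubly linked list, modelled on the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

void sim_list_init(struct list_head *h) { h->next = h; h->prev = h; }
bool sim_list_empty(const struct list_head *h) { return h->next == h; }

void sim_list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

void sim_list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    sim_list_init(n);
}

/* Hypothetical stand-in for the on-stack queue head of the hole puncher. */
struct sim_waitq {
    struct list_head task_list;
    unsigned locks_taken;       /* counts how often the head was touched */
};

/* Hypothetical stand-in for the DEFINE_WAIT() entry of the faulting thread. */
struct sim_wait {
    struct list_head task_list; /* first member, so a cast recovers the entry */
    bool woken;
};

void sim_init_waitq(struct sim_waitq *q)
{
    sim_list_init(&q->task_list);
    q->locks_taken = 0;
}

void sim_prepare_to_wait(struct sim_waitq *q, struct sim_wait *w)
{
    w->woken = false;
    sim_list_add_tail(&w->task_list, &q->task_list);
}

/* Wake every waiter and unlink it, as autoremove_wake_function() would. */
void sim_wake_up_all(struct sim_waitq *q)
{
    while (!sim_list_empty(&q->task_list)) {
        struct sim_wait *w = (struct sim_wait *)q->task_list.next;
        sim_list_del_init(&w->task_list);
        w->woken = true;
    }
}

/*
 * Returns true if the queue head had to be touched. An entry that the
 * waker already unlinked returns early and never dereferences q.
 */
bool sim_finish_wait(struct sim_waitq *q, struct sim_wait *w)
{
    if (sim_list_empty(&w->task_list))
        return false;
    q->locks_taken++;           /* the real code takes q->lock here */
    sim_list_del_init(&w->task_list);
    return true;
}
```

After sim_wake_up_all() the entry points only at itself, so
sim_finish_wait() can be handed a queue pointer that is no longer valid
(NULL in this model) without ever looking at it. The model is single
threaded, so it says nothing about the race that i_lock covers.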

At least I've convinced myself it works this way. :)

--
Mike Kravetz
