Re: [PATCH 14/15] mm: Align THP mappings for non-DAX

From: William Kucharski
Date: Tue Oct 01 2019 - 12:08:51 EST




> On Oct 1, 2019, at 8:20 AM, Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
>
> On Tue, Oct 01, 2019 at 06:18:28AM -0600, William Kucharski wrote:
>>
>>
>> On 10/1/19 5:32 AM, Kirill A. Shutemov wrote:
>>> On Tue, Oct 01, 2019 at 05:21:26AM -0600, William Kucharski wrote:
>>>>
>>>>
>>>>> On Oct 1, 2019, at 4:45 AM, Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
>>>>>
>>>>> On Tue, Sep 24, 2019 at 05:52:13PM -0700, Matthew Wilcox wrote:
>>>>>>
>>>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>>>> index cbe7d0619439..670a1780bd2f 100644
>>>>>> --- a/mm/huge_memory.c
>>>>>> +++ b/mm/huge_memory.c
>>>>>> @@ -563,8 +563,6 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>>>>>>
>>>>>> if (addr)
>>>>>> goto out;
>>>>>> - if (!IS_DAX(filp->f_mapping->host) || !IS_ENABLED(CONFIG_FS_DAX_PMD))
>>>>>> - goto out;
>>>>>>
>>>>>> addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE);
>>>>>> if (addr)
>>>>>
>>>>> I think you are reducing ASLR without any real indication that THP is relevant
>>>>> for the VMA. We need to know if any huge page allocation will be
>>>>> *attempted* for the VMA or the file.
>>>>
>>>> Without a properly aligned address the code will never even attempt allocating
>>>> a THP.
>>>>
>>>> I don't think rounding an address to one that would be properly aligned to map
>>>> to a THP if possible is all that detrimental to ASLR and without the ability to
>>>> pick an aligned address it's rather unlikely anyone would ever map anything to
>>>> a THP unless they explicitly designate an address with MAP_FIXED.
>>>>
>>>> If you do object to the slight reduction of the ASLR address space, what
>>>> alternative would you prefer to see?
>>>
>>> We need to know by this point whether THP is allowed for this
>>> file/VMA/process/whatever. Meaning that we do not give up ASLR entropy for
>>> nothing.
>>>
>>> For instance, if THP is disabled globally, there is no reason to align the
>>> VMA to the THP requirements.
>>
>> I understand, but this code is in thp_get_unmapped_area(), which is only called
>> if THP is configured and the VMA can support it.
>>
>> I don't see it in Matthew's patchset, so I'm not sure if it was inadvertently
>> missed in his merge or if he has other ideas for how it would eventually be
>> called, but in my last patch revision the code calling it in do_mmap()
>> looked like this:
>>
>> #ifdef CONFIG_RO_EXEC_FILEMAP_HUGE_FAULT_THP
>> /*
>> * If THP is enabled, it's a read-only executable that is
>> * MAP_PRIVATE mapped, the length is larger than a PMD page
>> * and either it's not a MAP_FIXED mapping or the passed address is
>> * properly aligned for a PMD page, attempt to get an appropriate
>> * address at which to map a PMD-sized THP page, otherwise call the
>> * normal routine.
>> */
>> if ((prot & PROT_READ) && (prot & PROT_EXEC) &&
>> (!(prot & PROT_WRITE)) && (flags & MAP_PRIVATE) &&
>> (!(flags & MAP_FIXED)) && len >= HPAGE_PMD_SIZE) {
>
> len and MAP_FIXED are already handled by thp_get_unmapped_area().
>
> if ((prot & (PROT_READ|PROT_WRITE|PROT_EXEC)) == (PROT_READ|PROT_EXEC) &&
> (flags & MAP_PRIVATE)) {

It is, but I wanted to avoid even calling it if conditions weren't right.

Checking twice is suboptimal, but I didn't want to alter the existing use of
the routine for anon THP.
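
For reference, the pre-check I have in mind can be written as a single
predicate. This is only a userspace sketch: ro_exec_thp_candidate() and
SKETCH_HPAGE_PMD_SIZE are hypothetical names, not kernel symbols, and the
2 MiB PMD size is an assumption (x86-64 with 4 KiB base pages). It uses the
masked comparison on prot, which is equivalent to testing PROT_READ,
PROT_EXEC and !PROT_WRITE separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>	/* PROT_* and MAP_* constants */

/* Assumption: 2 MiB PMD size, as on x86-64 with 4 KiB base pages. */
#define SKETCH_HPAGE_PMD_SIZE	(2UL << 20)

static_assert((SKETCH_HPAGE_PMD_SIZE & (SKETCH_HPAGE_PMD_SIZE - 1)) == 0,
	      "PMD size must be a power of two");

/*
 * Hypothetical helper (not a kernel function): returns true when a mapping
 * request is a read-only executable MAP_PRIVATE mapping, is not MAP_FIXED,
 * and is long enough to hold at least one PMD-sized page.
 */
static bool ro_exec_thp_candidate(int prot, int flags, size_t len)
{
	const int rwx = PROT_READ | PROT_WRITE | PROT_EXEC;

	/* Readable and executable, but not writable. */
	if ((prot & rwx) != (PROT_READ | PROT_EXEC))
		return false;

	/* Only private mappings qualify. */
	if (!(flags & MAP_PRIVATE))
		return false;

	/* The caller chose the address; nothing for us to pick. */
	if (flags & MAP_FIXED)
		return false;

	return len >= SKETCH_HPAGE_PMD_SIZE;
}
```

Only when this returns true would the caller go on to ask
thp_get_unmapped_area() for an address.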

>
>
>> addr = thp_get_unmapped_area(file, addr, len, pgoff, flags);
>>
>> if (addr && (!(addr & ~HPAGE_PMD_MASK))) {
>
> This check is broken.
>
> For instance, if pgoff is one, (addr & ~HPAGE_PMD_MASK) has to be equal to
> PAGE_SIZE to have a chance to get a huge page in the mapping.
>

If the address isn't PMD-aligned, we will never be able to map it with a THP
anyway.

The current code is designed to only map a THP if the VMA allows for it and
it can map the entire THP starting at an aligned address.

You can't map a THP at the PMD level at an address that isn't PMD aligned.

Perhaps I'm missing a use case here.
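
To make sure we are talking about the same thing, here is a small userspace
sketch of the two conditions as I understand them. The names are
hypothetical and the sizes are assumptions (4 KiB pages, 2 MiB PMDs). The
first helper is the check in my patch: the returned address itself is
PMD-aligned. The second is my reading of your objection: the address and
the file offset must have the same offset within a PMD, so that some
PMD-aligned chunk of the file lands on a PMD-aligned virtual address
further into the mapping.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumptions: 4 KiB base pages and 2 MiB PMDs (x86-64 defaults). */
#define SKETCH_PAGE_SHIFT	12UL
#define SKETCH_PMD_SIZE		(1UL << 21)
#define SKETCH_PMD_MASK		(~(SKETCH_PMD_SIZE - 1))

static_assert((SKETCH_PMD_SIZE & (SKETCH_PMD_SIZE - 1)) == 0,
	      "PMD size must be a power of two");

/* The check from the patch: is the start address itself PMD-aligned? */
static bool addr_is_pmd_aligned(unsigned long addr)
{
	return (addr & ~SKETCH_PMD_MASK) == 0;
}

/*
 * The alternative condition: do the address and the file offset share the
 * same offset within a PMD? pgoff is in units of base pages.
 */
static bool addr_congruent_with_pgoff(unsigned long addr, unsigned long pgoff)
{
	unsigned long off = pgoff << SKETCH_PAGE_SHIFT;

	return (addr & ~SKETCH_PMD_MASK) == (off & ~SKETCH_PMD_MASK);
}
```

With pgoff == 1, an address such as 0x40001000 fails the first check but
passes the second, while 0x40000000 passes the first and fails the second.
With pgoff == 0 the two checks agree.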

>> /*
>> * If we got a suitable THP mapping address, shut off
>> * VM_MAYWRITE for the region, since it's never what
>> * we would want.
>> */
>> vm_maywrite = 0;
>
> Wouldn't it break uprobe, for instance?

I'm not sure; does uprobe allow COW to insert the probe even for mappings
explicitly marked read-only?

Thanks,
Bill