Re: [Question] Missing data after DMA read transfer - mm issue with transparent huge page?

From: Nicolas Morey-Chaisemartin
Date: Thu May 12 2016 - 09:30:42 EST




On 05/12/2016 at 11:36 AM, Jerome Glisse wrote:
> On Thu, May 12, 2016 at 08:07:59AM +0200, Nicolas Morey-Chaisemartin wrote:
>>
>> On 05/11/2016 at 04:51 PM, Jerome Glisse wrote:
>>> On Wed, May 11, 2016 at 01:15:54PM +0200, Nicolas Morey Chaisemartin wrote:
>>>> On 05/10/2016 at 12:01 PM, Jerome Glisse wrote:
>>>>> On Tue, May 10, 2016 at 09:04:36AM +0200, Nicolas Morey Chaisemartin wrote:
>>>>>> On 05/03/2016 at 12:11 PM, Jerome Glisse wrote:
>>>>>>> On Mon, May 02, 2016 at 09:04:02PM -0700, Hugh Dickins wrote:
>>>>>>>> On Fri, 29 Apr 2016, Nicolas Morey Chaisemartin wrote:
>>>> [...]
>>>>>> Hi,
>>>>>>
>>>>>> I backported the patch to 3.10 (had to copy-paste the pmd_protnone definition from 4.5) and it's working!
>>>>>> I'll open a ticket in the Red Hat tracker to try and get this fixed in RHEL7.
>>>>>>
>>>>>> I have a dumb question though: how can we end up in the NUMA/misplaced-memory code on a single-socket system?
>>>>>>
>>>>> This patch is not a fix. Do you see a bug message in the kernel log? Because if
>>>>> you do, that means we have a bigger issue.
>>>>>
>>>>> You did not answer one of my previous questions: do you call get_user_pages
>>>>> with write = 1 as a parameter?
>>>>>
>>>>> Also, it would be a lot easier if you were testing with the latest 4.6 or 4.5,
>>>>> not the RHEL kernel, as they are far apart and what might look like the same issue
>>>>> on both might be totally different bugs.
>>>>>
>>>>> If you only really care about the RHEL kernel, then open a bug with Red Hat and
>>>>> you can add me in bug-cc <jglisse@xxxxxxxxxx>
>>>>>
>>>>> Cheers,
>>>>> Jérôme
>>>> I finally managed to get a proper setup.
>>>> I built a vanilla 4.5 kernel from the git tree using the CentOS 7 config; my test fails as usual.
>>>> I applied your patch and rebuilt => still fails, and no new messages in dmesg.
>>>>
>>>> Now that I don't have to go through the RPM repackaging, I can try out things much quicker if you have any ideas.
>>>>
>>> Is it still an issue if you boot with transparent_hugepage=never?
>>>
>>> Also, to simplify the investigation, force write to 1 all the time, no matter what.
>>>
>>> Cheers,
>>> Jérôme
>> With transparent_hugepage=never I can't see the bug anymore.
>>
> Can you test https://patchwork.kernel.org/patch/9061351/ with 4.5
> (does not apply to 3.10) and without transparent_hugepage=never?
>
> Jérôme

It fails with 4.5 + this patch, and with 4.5 + this patch + yours.
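
For anyone reproducing this, it may help to confirm which THP mode is actually active before each run, since the result depends on whether transparent_hugepage=never was passed at boot. A minimal sketch, assuming the usual sysfs file /sys/kernel/mm/transparent_hugepage/enabled, which marks the active mode with square brackets. The helper names parse_thp_mode and current_thp_mode are made up for illustration and are not part of the test discussed in this thread:

```python
from pathlib import Path

# Usual sysfs location of the THP mode; assumed to be present on the test machine.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode from e.g. 'always madvise [never]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None  # no bracketed entry found


def current_thp_mode(path=THP_ENABLED):
    """Read the active THP mode, or None if the file cannot be read."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

Logging this value next to each test result should make it easier to tell the runs with and without transparent_hugepage=never apart.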

Nicolas