Re: AMD TLB errata, (Was: [PATCH RFC] mm: add MAP_EXCLUSIVE to create exclusive user mappings)
From: Tom Lendacky
Date: Fri Nov 15 2019 - 09:13:00 EST

On 10/29/19 7:39 AM, Peter Zijlstra wrote:
> On Tue, Oct 29, 2019 at 02:00:24PM +0300, Kirill A. Shutemov wrote:
>> On Tue, Oct 29, 2019 at 09:56:02AM +0100, Peter Zijlstra wrote:
>>> On Tue, Oct 29, 2019 at 09:43:18AM +0300, Kirill A. Shutemov wrote:
>>>> But some CPUs don't like to have two TLB entries for the same memory with
>>>> different sizes at the same time. See for instance AMD erratum 383.
>>>>
>>>> Getting it right would require making the range not present, flush TLB and
>>>> only then install huge page. That's what we do for userspace.
>>>>
>>>> It will not fly for the direct mapping. There is no reasonable way to
>>>> exclude other CPUs from accessing the range while it's not present (call
>>>> stop_machine()? :P). Moreover, the range may contain the code that is doing
>>>> the collapse or the data required for it...
>>>>
>>>> BTW, looks like current __split_large_page() in pageattr.c is susceptible
>>>> to the errata. Maybe we can get away with the easy way...
>>>
>>> As you write above, there is just no way we can have a (temporary) hole
>>> in the direct map.
>>>
>>> We are careful about that other errata, and make sure both translations
>>> are identical wrt everything else.
>>
>> It's not clear if it is enough to avoid the issue. "under a highly specific
>> and detailed set of conditions" is not a very specific set of conditions :P
>
> Yeah, I know ... :/ Tom is there any chance you could shed a little more
> light on that errata?

I talked with some of the hardware folks. If you maintain the same bits
in the large and small pages (aside from the large page bit) until the
flush, then the erratum should not occur.

The erratum really applies to mappings that end up with different attribute
bits being set. Even then, it doesn't fail every time. There are other
conditions required to make it fail.

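
To make the "same bits" condition concrete, here is a minimal user-space
sketch, not the kernel's actual __split_large_page() code. It models
splitting one 2M entry into 512 4K entries that carry identical attribute
bits. All names (split_large_entry(), attrs_match(), the PTE_* macros) are
hypothetical and exist only for illustration. It assumes the usual x86-64
layout, where bit 7 is the large page bit in a 2M entry, and the PAT bit
sits at bit 12 in a 2M entry but at bit 7 in a 4K entry, so it has to be
relocated rather than copied.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the x86-64 page table format. */
#define PTE_PSE        (1ULL << 7)   /* large page bit in a 2M entry */
#define PTE_PAT_4K     (1ULL << 7)   /* PAT bit position in a 4K entry */
#define PTE_PAT_LARGE  (1ULL << 12)  /* PAT bit position in a 2M entry */
#define ADDR_MASK_2M   0x000FFFFFFFE00000ULL
#define ADDR_MASK_4K   0x000FFFFFFFFFF000ULL
#define PTES_PER_2M    512

/* Attribute bits of a 2M entry, expressed in the 4K entry layout. */
static uint64_t large_attrs_as_4k(uint64_t large)
{
    /* Drop the address, the large page bit and the large-format PAT bit. */
    uint64_t attrs = large & ~ADDR_MASK_2M & ~PTE_PSE & ~PTE_PAT_LARGE;

    /* Relocate PAT from bit 12 to bit 7 so the memory type is unchanged. */
    if (large & PTE_PAT_LARGE)
        attrs |= PTE_PAT_4K;
    return attrs;
}

/* Fill 512 4K entries covering the same range with the same attributes. */
static void split_large_entry(uint64_t large, uint64_t small[PTES_PER_2M])
{
    uint64_t base = large & ADDR_MASK_2M;
    uint64_t attrs = large_attrs_as_4k(large);

    assert(large & PTE_PSE);  /* only meaningful for a large entry */
    for (int i = 0; i < PTES_PER_2M; i++)
        small[i] = (base + (uint64_t)i * 4096) | attrs;
}

/* Return 1 if every 4K entry carries the large entry's attributes. */
static int attrs_match(uint64_t large, const uint64_t small[PTES_PER_2M])
{
    uint64_t attrs = large_attrs_as_4k(large);

    for (int i = 0; i < PTES_PER_2M; i++)
        if ((small[i] & ~ADDR_MASK_4K) != attrs)
            return 0;
    return 1;
}
```

In this model, any attribute change (say, clearing the write bit on a
subset of the 4K entries) would only be applied after the split and the
TLB flush, so a stale large TLB entry and a new small one never disagree
on anything but size.
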
Thanks,
Tom