Re: [GIT PULL] parisc huge page support for v4.4

From: Helge Deller
Date: Sat Dec 26 2015 - 07:33:00 EST


On 26.12.2015 13:09, Mikulas Patocka wrote:
>> On Tue, 24 Nov 2015, Helge Deller wrote:
>>> * Mikulas Patocka <mpatocka@xxxxxxxxxx>:
>>>> On Tue, 24 Nov 2015, Helge Deller wrote:
>>>>>> Hi
>>>>>>
>>>>>> Since the kernel 4.4-rc2 I'm getting frequent boot failures on PA-RISC.
>>>>>> When I revert this patchset, the crashes are gone.
>>>>>
>>>>>> [ 3.296666] CPU(s): 4 out of 4 PA8900 (Shortfin) at 1000.000000 MHz online
>>>>>
>>>>> Hi Mikulas,
>>>>>
>>>>> Yes, I've seen this as well.
>>>>> It affects only the PA8900 CPUs, while all PA8500-PA8700 machines seem to work fine.
>>>>> I do have a temporary 3-line patch to avoid the crashes which I'll push to my tree shortly.
>>>>> I'm still investigating why it only affects the PA8900 CPUs, but I assume
>>>>> it's related to the cache aliasing of those CPUs.
>>>>> I'll keep you updated.
>>>>>
>>>>> Helge
>>>>
>>>> The PA-RISC specification doesn't allow aliasing on non-equivalent
>>>> addresses. Can the kernel map a piece of kernel data to another virtual
>>>> address? If yes, we can't use big pages to map kernel data.
>>>
>>> Can you please try the two patches below?
>>> The first one disables mapping kernel text/data on huge pages on
>>> PA8800/PA8900 CPUs. Patch works for me on my Mako PA8800.
>>>
>>> Independent of my huge page patch, the second patch disables the TLB
>>> flush optimization we added earlier. It seems calling flush_tlb_all()
>>> doesn't reliably flush the TLBs on all CPUs, so it's better to fall back
>>> to the loop implementation.
>>>
>>> Helge
>>
>> The kernel with these patches works fine so far.
>>
>> Mikulas
>
> BTW, I looked at this in arch/parisc/mm/hugetlbpage.c:set_huge_pte_at
> "*ptep = entry;" and it seems like a bad bug. PA-RISC doesn't have atomic
> instructions to modify page table entries, so it takes a spinlock in the
> TLB handler and modifies the page table entry non-atomically. If you modify
> the page table entry without the spinlock, you may race with the TLB
> handler on another CPU and your modification may be lost.

Right.
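
To spell out the race described above, here is a minimal userspace sketch.
It is not the kernel code: pte_t, PTE_ACCESSED, tlb_lock and all function
names are hypothetical stand-ins, and a C11 atomic_flag plays the role of
pa_tlb_lock. The first replay function walks through the bad interleaving
by hand (handler loads, other CPU stores without the lock, handler stores
back); the second shows the same two updates serialized by the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t pte_t;              /* hypothetical PTE type for this sketch */
#define PTE_ACCESSED 0x1ULL          /* hypothetical "accessed" bit */

/* Stand-in for pa_tlb_lock: a simple test-and-set spinlock. */
static atomic_flag tlb_lock = ATOMIC_FLAG_INIT;

static void tlb_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&tlb_lock, memory_order_acquire))
        ;                            /* spin until the flag was clear */
}

static void tlb_lock_release(void)
{
    atomic_flag_clear_explicit(&tlb_lock, memory_order_release);
}

/* What the TLB handler does: a non-atomic read-modify-write under the lock. */
static void handler_mark_accessed(pte_t *ptep)
{
    tlb_lock_acquire();
    *ptep |= PTE_ACCESSED;
    tlb_lock_release();
}

/* A PTE store that takes the same lock as the handler. */
static void set_pte_locked(pte_t *ptep, pte_t entry)
{
    tlb_lock_acquire();
    *ptep = entry;
    tlb_lock_release();
}

/* Replays the bad interleaving step by step on a single thread. */
static pte_t replay_unlocked_race(pte_t old_entry, pte_t new_entry)
{
    pte_t pte = old_entry;
    pte_t seen = pte;                /* handler: load the PTE (holding the lock) */
    pte = new_entry;                 /* other CPU: "*ptep = entry" without the lock */
    pte = seen | PTE_ACCESSED;       /* handler: store back, overwriting new_entry */
    return pte;
}

/* Same two updates, but serialized because both sides take the lock. */
static pte_t replay_locked(pte_t old_entry, pte_t new_entry)
{
    pte_t pte = old_entry;
    handler_mark_accessed(&pte);
    set_pte_locked(&pte, new_entry);
    return pte;
}
```

In the unlocked replay the final value is the old entry plus the accessed
bit, so the new entry is lost. With the lock taken on both sides, the new
entry survives whichever side runs first.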

> The comment says something about double locking on pa_tlb_lock, but
> pa_tlb_lock isn't held when that function is called.

I have a work-in-progress patch for that in one of my trees, e.g.:
http://git.kernel.org/cgit/linux/kernel/git/deller/parisc-linux.git/commit/?h=parisc-next&id=5c76b525cbdb097401f46522b27b1eb6244f34f9
It's only lightly tested, though.

Helge
