Re: [PATCH] mm: Fix the problem of mips architecture Oops

From: zhanglianjie
Date: Mon Jun 28 2021 - 01:52:29 EST




On 2021-06-28 09:17, Jiaxun Yang wrote:

On 2021/6/28 at 9:07 AM, zhanglianjie wrote:


On 2021-06-25 21:39, Thomas Bogendoerfer wrote:
On Thu, Jun 24, 2021 at 11:22:12AM +0800, zhanglianjie wrote:
The cause of the problem is as follows:
1. When running cat /sys/devices/system/memory/memory0/valid_zones,
    test_pages_in_a_zone() is called.
2. test_pages_in_a_zone() looks up the zone starting from start_pfn = 0.
    The smallest pfn of the numa node on the mips architecture is 128,
    and the pages for the preceding pfns 0~127 are not
    initialized (page->flags is 0xFFFFFFFF).
3. The nid and zonenum obtained from page_zone(pfn_to_page(0)) are out
    of bounds for the corresponding arrays in
&NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)],
    so accessing the members of the out-of-bounds zone misbehaves,
    resulting in an Oops.
Therefore, it is necessary to reserve the pages between pfn 0 and the
minimum pfn to prevent the Oops.
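
As an illustration of step 3, here is a minimal userspace sketch of why an
all-ones page->flags decodes to out-of-range indices. The field widths, the
shifts, the MAX_NR_ZONES and MAX_NUMNODES values, and the helper names
decode_nid(), decode_zonenum() and indices_in_bounds() are all hypothetical;
the real layout depends on the kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags layout, for illustration only. */
#define NODE_BITS     6u                        /* field can encode 0..63 */
#define ZONE_BITS     2u                        /* field can encode 0..3  */
#define NODE_SHIFT    (32u - NODE_BITS)         /* node id in the top bits */
#define ZONE_SHIFT    (NODE_SHIFT - ZONE_BITS)  /* zone number just below  */
#define MAX_NUMNODES  4u                        /* hypothetical array size */
#define MAX_NR_ZONES  3u                        /* hypothetical array size */

/* Extract the node id field from a 32-bit flags word. */
static unsigned int decode_nid(uint32_t flags)
{
	return (flags >> NODE_SHIFT) & ((1u << NODE_BITS) - 1u);
}

/* Extract the zone number field from a 32-bit flags word. */
static unsigned int decode_zonenum(uint32_t flags)
{
	return (flags >> ZONE_SHIFT) & ((1u << ZONE_BITS) - 1u);
}

/* True only if both decoded indices fit inside their arrays. */
static bool indices_in_bounds(uint32_t flags)
{
	return decode_nid(flags) < MAX_NUMNODES &&
	       decode_zonenum(flags) < MAX_NR_ZONES;
}
```

With this assumed layout, decode_nid(0xFFFFFFFF) yields 63 and
decode_zonenum(0xFFFFFFFF) yields 3, so both indices point past the end of
their arrays, which matches the out-of-bounds access described above.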

Signed-off-by: zhanglianjie <zhanglianjie@xxxxxxxxxxxxx>
---
  arch/mips/kernel/setup.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 23a140327a0b..f1da2b2ba5e9 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -653,6 +653,8 @@ static void __init arch_mem_init(char **cmdline_p)
       */
      memblock_set_current_limit(PFN_PHYS(max_low_pfn));

+    memblock_reserve(0, PAGE_SIZE * NODE_DATA(0)->node_start_pfn);
+

Which platform needs this? This looks like it would be better fixed in
the platform memory registration code.

Thomas.


I hit this problem on the Loongson platform.

I checked a Loongson 3A4000 board (Lemote-A1901) with UEFI firmware, and there the region is reserved by the firmware.

Hmm, you'd better contact the vendor to fix the firmware. If that's not possible, then work around it in arch/mips/loongson64/numa.c.

Thanks.

- Jiaxun





I will try to contact the manufacturer; however, they cannot be reached for the time being. I have resubmitted a patch following your suggestion.
Thank you very much for your help.

I would also like to ask: how did you check that the region is reserved by the UEFI firmware?

The machine information I tested is as follows:
1. Lemote board
- hardware information:
Loongson 3A4000 board LEMOTE-LS3A4000-7A1000-1w-V01-pc.
- pagesize is 16k.
2. THTF board
- hardware information:
Loongson 3A4000 board THTF-LS3A4000-7A1000-1W-VB1-ML4A
- pagesize is 16k.
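
For reference, a small sketch of the size of the range that the original
patch reserves, PAGE_SIZE * node_start_pfn. It assumes the start pfn of 128
from the patch description applies to these boards together with the 16k
page size listed above; reserved_bytes() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_16K 14u	/* 16k pages: 1 << 14 bytes */

/* Bytes covered by pfns [0, node_start_pfn) for a given page shift. */
static uint64_t reserved_bytes(uint64_t node_start_pfn, unsigned int page_shift)
{
	return node_start_pfn << page_shift;
}
```

Under these assumptions, reserved_bytes(128, PAGE_SHIFT_16K) is 2097152
bytes, i.e. the first 2 MiB of physical memory.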



--
Regards,
Zhang Lianjie