RE: [PATCH v1] init: avoid race condition of update page table in kernel init

From: Jianyong Wu
Date: Wed Oct 20 2021 - 07:55:06 EST




> -----Original Message-----
> From: David Hildenbrand <david@xxxxxxxxxx>
> Sent: Wednesday, October 20, 2021 5:05 PM
> To: Jianyong Wu <Jianyong.Wu@xxxxxxx>; akpm@xxxxxxxxxxxxxxxxxxxx;
> mhiramat@xxxxxxxxxx; peterz@xxxxxxxxxxxxx
> Cc: rostedt@xxxxxxxxxxx; vbabka@xxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> Anshuman Khandual <Anshuman.Khandual@xxxxxxx>; Justin He
> <Justin.He@xxxxxxx>; nd <nd@xxxxxxx>
> Subject: Re: [PATCH v1] init: avoid race condition of update page table in
> kernel init
>
> >> But why does it matter on arm64? Can you describe how the exact race
> >> triggers?
> >
> > I don't know much about how x86 does in memory map. Let me show you
> how the race happens on arm64.
> > When virtio-mem workqueue is triggered, arch_memory_add will be called
> where the related page table will be created. The call chain is
> arch_memory_add->__create_pgd_mapping->alloc_init_pud. As the memory
> add may take for serval seconds, it may be concurrent with mark_rodata_ro,
> in which page tables are created either.
>
> > The race can occur in alloc_init_pud. See below:
> >
> /***************************************************************
> ************/
> > Virtio-mem workqueue thread
> mark_rodata_ro thread
> > {
> > ...
> > pudp = pud_set_fixmap_offset(p4dp, addr); // set fixmap
> > do {
> > pud_t old_pud = READ_ONCE(*pudp);
> > ...
> > } while (pudp++, addr = next, addr != end); pudp =
> pud_set_fixmap_offset; //set fixmap
> > pud_clear_fixmap(); // clear fixmap do {
> > }
> pud_t old_pud = READ_ONCE(*pudp);//CRASH
> >
>
> I still don't quite understand how that race can even exist. I assume it's due
> to the weird semantics of the "fixmap". (whatever that is :) ) I don't see
> anything similar happen on other archs, especially x86-64 and s390x, which
> I'm familiar with.
>
> s390x similarly to x86-64 code uses a vmem_mutex to serialize add/remove
> in the direct map and a cpa_mutex to serialize attribute changes (and
> splitting of large mappings).
>

Yeah, I see that there is a spin lock when update page table in x86.
OK, let me poke Anshuman about this.
If this issue won't affect all arches, should we only fix it on arm64? @Anshuman Khandual

Thanks
Jianyong


> The right should be to teach arm64 mmu code that direct mapping updates
> might be concurrent, and that two instances might try messing with the
> fixmap concurrently.
>
>
> On a similar topic: I think we might want to reclaim compeltely empty page
> tables when unplugging memory; I suspect that we also have to mess with
> the fixmap then, whem removing page tables. But I feel like the whole fixmap
> machinery is still a big black box for me.
>

> --
> Thanks,
>
> David / dhildenb