Re: [PATCH v16 00/15] arm64: MMU enabled kexec relocation

From: Catalin Marinas
Date: Thu Sep 16 2021 - 05:37:49 EST


On Thu, Aug 26, 2021 at 11:03:21AM -0400, Pavel Tatashin wrote:
> On Tue, Aug 24, 2021 at 2:06 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > > Enable MMU during kexec relocation in order to improve reboot performance.
> > >
> > > If kexec functionality is used for a fast system update, with a minimal
> > > downtime, the relocation of kernel + initramfs takes a significant portion
> > > of reboot.
> > >
> > > The reason for slow relocation is because it is done without MMU, and thus
> > > not benefiting from D-Cache.
> >
> > The performance improvements are indeed significant on some platforms
> > (going from 7s to ~40ms), so I think the merging the series is worth it.
> > Some general questions so I better understand the impact:
> >
> > - Is the kdump path affected in any way? IIUC that doesn't need any
> > relocation but we should also make sure we don't create the additional
> > page table unnecessarily (should keep as much memory intact as
> > possible). Maybe that's already handled.
>
> Because kdump does not need relocation, we do not reserve pages for
> the page table in the kdump reboot case. In fact, with this series,
> kdump reboot becomes more straightforward as we skip the relocation
> function entirely, and jump directly into the crash kernel (or
> purgatory if kexec tools loaded them).
>
> > - What happens if trans_pgd_create_copy() fails to allocate memory. Does
> > it fall back to an MMU-off relocation?
>
> In case we are so low on memory that trans_pgd_create_copy() fails to
> allocate the linear map that uses the large pages (the size of the
> page table is tiny) the kexec fails during kexec load time (not during
> reboot time), as out of memory. The MMU enabled kexec reboot is always
> on, and we should not have several ways to do kexec reboot as it makes
> the kexec reboot unpredictable in terms of performance, and also prone
> to bugs by having a common MMU enabled path and less common path when
> we are low on memory which is never tested.

I think this makes sense, especially since it will fail during the kexec
load time rather than reboot.

I'm ok in principle with this series but I'd need to convince James
Morse to have a another look since he followed it more closely than me.
Could you please rebase it against 5.15-rc1?

Thanks.

--
Catalin