Re: [PATCH] kexec: Fix reboot race during device_shutdown()
From: Eric W. Biederman
Date: Mon Oct 09 2023 - 11:30:53 EST
Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> writes:
> On Mon, Oct 2, 2023 at 2:18 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> [..]
>> > > Such freezing is already being done if kernel supports KEXEC_JUMP and
>> > > kexec_image->preserve_context is true. However, doing it if either of these are
>> > > not true prevents crashes/races.
>> >
>> > The KEXEC_JUMP case is something else entirely. It is supposed to work
>> > like suspend to RAM. Maybe reboot should as well, but I am
>> > uncomfortable making a generic device fix kexec specific.
>>
>> I see your point of view. I think regular reboot should also be fixed
>> to avoid similar crash possibilities. I am happy to make a change for
>> that similar to this patch if we want to proceed that way.
>>
>> Thoughts?
>
> Just checking how we want to proceed, is the consensus that we should
> prevent kernel crashes without relying on userspace stopping all
> processes? Should we fix regular reboot syscall as well and not just
> kexec reboot?

It just occurred to me there is something very fishy about all of this.

What userspace do you have using kexec (not kexec on panic) that doesn't
perform the same userspace shutdown as a normal reboot?

Quite frankly such a userspace is buggy, and arguably that is where you
should start fixing things. That way you can get the orderly shutdown
of userspace daemons/services along with an orderly shutdown of
everything the kernel is responsible for.

At the kernel level a kexec reboot and a normal reboot have been
deliberately kept as close as possible. Which is why I say we should
fix it in reboot.

Eric