Re: [PATCH v2 0/7] KSTATE: a mechanism to migrate some part of the kernel state across kexec
From: Andrey Ryabinin
Date: Tue Mar 11 2025 - 08:26:16 EST
On Tue, Mar 11, 2025 at 3:28 AM Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>
> Hi Andrey,
>
> On Mon, Mar 10, 2025 at 5:04 AM Andrey Ryabinin <arbn@xxxxxxxxxxxxxxx> wrote:
> > Each driver/subsystem has to solve this problem in their own way.
> > Also if we use fdt properties for individual fields, that might be wasteful
> > in terms of used memory, as these properties use strings as keys.
> >
> > KSTATE solves the same problem in a more elegant way, like this:
> > struct kstate_description a_state = {
> >         .name = "a_struct",
> >         .version_id = 1,
> >         .id = KSTATE_TEST_ID,
> >         .state_list = LIST_HEAD_INIT(a_state.state_list),
> >         .fields = (const struct kstate_field[]) {
> >                 KSTATE_BASE_TYPE(i, struct a, int),
> >                 KSTATE_BASE_TYPE(s, struct a, char [10]),
> >                 KSTATE_POINTER(p_ulong, struct a),
> >                 KSTATE_PAGE(page, struct a),
> >                 KSTATE_END_OF_LIST()
> >         },
> > };
>
> Hmm, this still requires manual effort to implement, so potentially
> a lot of work given how many drivers we have in-tree.
>
We are not going to make every possible driver able to persist its state.
I think the main target is the VFIO driver, which also implies PCI/IOMMU.
Besides, we'll need to persist only some fields of the struct, not the
entire thing.
There is no way to automate such decisions, so there will be some
manual effort anyway.
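
To illustrate what "only some fields" means, here is a minimal userspace sketch. It is not the KSTATE implementation: struct field_desc, FIELD(), save_fields() and restore_fields() are hypothetical names made up for this example, and the 'scratch' member of struct a is invented to stand in for runtime-only data. The idea is just that the author lists the fields worth saving, and everything else is left out of the saved image.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example struct: only 'i' and 's' are worth persisting. */
struct a {
	int i;
	char s[10];
	void *scratch;		/* runtime-only, deliberately not saved */
};

/* Hypothetical field descriptor; NOT the actual struct kstate_field. */
struct field_desc {
	size_t offset;		/* offset of the member inside the struct */
	size_t size;		/* size of the member in bytes */
};

#define FIELD(type, member) \
	{ offsetof(type, member), sizeof(((type *)0)->member) }
#define FIELD_END { 0, 0 }

/* The manual part: a human decides which members go into the list. */
static const struct field_desc a_fields[] = {
	FIELD(struct a, i),
	FIELD(struct a, s),
	FIELD_END
};

/* Copy only the listed fields into buf; returns the bytes written. */
static size_t save_fields(const struct field_desc *f, const void *obj,
			  unsigned char *buf, size_t len)
{
	size_t pos = 0;

	for (; f->size; f++) {
		assert(pos + f->size <= len);
		memcpy(buf + pos, (const char *)obj + f->offset, f->size);
		pos += f->size;
	}
	return pos;
}

/* Copy the listed fields back; unlisted members are left untouched. */
static size_t restore_fields(const struct field_desc *f, void *obj,
			     const unsigned char *buf, size_t len)
{
	size_t pos = 0;

	for (; f->size; f++) {
		assert(pos + f->size <= len);
		memcpy((char *)obj + f->offset, buf + pos, f->size);
		pos += f->size;
	}
	return pos;
}
```

Saving a struct a through a_fields writes sizeof(int) + 10 bytes, and restoring into a zeroed struct a brings back 'i' and 's' while 'scratch' stays NULL.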
> And those KSTATE_* macros look a lot like BTF:
> https://docs.kernel.org/bpf/btf.html
>
> So, any possibility to reuse BTF here?
Perhaps, but I don't see it right away. I'll think about it.
> Note, BTF is automatically generated by pahole, no manual effort is required.
Nothing will save us from the manual effort of deciding which parts of the
data we want to save, so there has to be some way to mark that data.
Also, the same C type may represent different kinds of data. For example,
we may have an address of some persistent data (in the linear mapping)
stored as an 'unsigned long address'.
Because of KASLR we can't copy 'address' by value; we need to save
it as an offset from PAGE_OFFSET
and add the new kernel's PAGE_OFFSET on restore.
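
A rough userspace sketch of that rebasing follows. It is not code from the patch set: struct linear_map, addr_to_offset() and offset_to_addr() are hypothetical names, and page_offset stands in for each kernel's PAGE_OFFSET.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a kernel's linear mapping. In the kernel the
 * base would be PAGE_OFFSET, which KASLR may place differently in the old
 * and the new kernel.
 */
struct linear_map {
	uint64_t page_offset;	/* base of the linear mapping */
	uint64_t size;		/* bytes covered by the mapping */
};

/* Save side: turn a linear-map address into a base-independent offset. */
static uint64_t addr_to_offset(const struct linear_map *old_map, uint64_t addr)
{
	assert(addr >= old_map->page_offset);
	assert(addr - old_map->page_offset < old_map->size);
	return addr - old_map->page_offset;
}

/* Restore side: rebase the saved offset onto the new kernel's mapping. */
static uint64_t offset_to_addr(const struct linear_map *new_map, uint64_t off)
{
	assert(off < new_map->size);
	return new_map->page_offset + off;
}
```

So an 'unsigned long' holding a plain counter can be copied as-is, while one holding a linear-map address has to go through this kind of conversion. The C type alone can't tell the two apart, which is why the field needs an explicit annotation.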