Re: [PATCH v2 4/4] livepatch/shadow: Add garbage collection of shadow variables

From: Josh Poimboeuf
Date: Tue Jan 31 2023 - 16:17:40 EST


On Tue, Jan 31, 2023 at 03:23:58PM +0100, Petr Mladek wrote:
> > > +
> > > void *klp_shadow_get(void *obj, struct klp_shadow_type *shadow_type);
> > > void *klp_shadow_alloc(void *obj, struct klp_shadow_type *shadow_type,
> > > size_t size, gfp_t gfp_flags, void *ctor_data);
> >
> > I find the type-based interface to be unnecessarily clunky and
> > confusing:
> >
> > - It obscures the fact that uniqueness is determined by ID, not by type.
> >
> > - It's weird to be passing around the type even after it has been
> > registered: e.g., why doesn't klp remember the ctor/dtor?
>
> ctor/dtor must be implemented in each livepatch. klp_shadow_register()
> would need to remember all registered implementations.
> klp_shadow_alloc/get/free would need to find and choose one.
> It would add an extra overhead and non-determines.

There's really no need to even "register" the constructor. It can
continue to just be passed as needed to the alloc functions.

(Side note, I don't see why klp_shadow_alloc() even needs a constructor
as it's typically used when its corresponding object gets allocated, so
its initialization doesn't need to be atomic. klp_shadow_get_or_alloc()
on the other hand, does need a constructor since it's used for attaching
to already-existing objects.)

The thing really complicating this whole mess of an API is the
destructor. The problem is that it can change when replacing
livepatches, so we can't just remember it in the registration.

At least at Red Hat, I don't think we've ever used a shadow destructor.
Its real world use seems somewhere between rare and non-existent.

I guess a destructor would be needed if the constructor (or some other
initialization) allocated additional memory associated with the shadow
variable. That data would need to be freed.

But in that case couldn't an additional shadow variable be used for the
additional memory? Then we could just get rid of this idea of the
destructor and this API could become much more straightforward.

Alternatively, each livepatch could just have an optional global
destructor which is called for every object which needs destructing. It
could then use the ID to decide what needs done (or pass it off to a
more specific destructor).

> > - I don't understand the exposed klp_shadow_{un}register() interfaces.
> > What's the point of forcing their use if there's no garbage
> > collection?
>
> I probably do not understand it correctly.
>
> Normal livepatch developers do not need to use klp_shadow_{un}register().
> They are called transparently in __klp_enable_patch()/klp_complete_transition()
> and klp_module_coming()/klp_cleanup_module_patches_limited().
>
> The main reason why they are exposed via include/linux/livepatch.h
> and EXPORT_SYMBOL_GPL() is the selftest lib/livepatch/test_klp_shadow_vars.c.
>
> Well, the selftest probably could be bundled into a livepatch to go
> around this.

In that case, at the very least they should be prefixed with underscores
and the function comment should make it clear they shouldn't be called
under normal usage.

> > - It's internally confusing in klp to have both 'type' and 'type_reg'.
>
> I do not like it much either.
>
> An idea. We could replace "bool registered" with "struct list_head
> register_node" in struct klp_shadow_type. Then we could use it
> for registration.
>
> All the registered types will be in a global list (klp_type_register).
> klp_shadow_unregister() would do the garbage collection when it
> removed the last entry with the given id from the global list.

Yeah, that does sound better (assuming we decide to keep this type
thing).

> > It seems like overkill to have both klp_shadow_hash and
> > klp_shadow_types. For example 'struct klp_shadow' could have a link to
> > its type and then klp_shadow_type_get_reg could iterate through the
> > klp_shadow_hash.
>
> I guess that there is a misunderstanding. We need two databases:
>
> + One for the registered shadow_types. I does the reference
> counting. It counts the number of livepatched objects
> (livepatches) that might the shadow type. It says _when_ the garbage
> collection must be done.
>
> + Second for existing shadow variables. It is needed for finding
> the shadow data. It says _what_ variables have to freed when
> the garbage collection is being proceed.

Yup, not sure what I was thinking, please ignore...

--
Josh