Re: [RFC PATCH 2/2] livepatch: Clear relocation targets on a module removal

From: Petr Mladek
Date: Thu Sep 05 2019 - 07:10:04 EST


On Wed 2019-09-04 21:50:55, Josh Poimboeuf wrote:
> On Wed, Sep 04, 2019 at 10:49:32AM +0200, Petr Mladek wrote:
> > I wonder what is necessary for a productive discussion on Plumbers:
> >
> > + Josh would like to see what code can get removed when late
> > handling of modules gets removed. I think that it might be
> > partially visible from Joe's blue-sky patches.
>
> Yes, and I like what I see. Especially the removal of the .klp.arch
> nastiness!

Could we get rid of it?

Is there any other way to get access to static variables
and functions from the livepatched code?

> I think the .klp.arch sections are the big ones:
>
> .klp.arch.altinstructions
> .klp.arch.parainstructions
> .klp.arch.jump_labels (doesn't exist yet)
>
> And that's just x86...
>
> And then of course there's the klp coming/going notifiers which have
> also been an additional source of complexity.
>
> > + Do we use them in livepatches? How often?
>
> I don't have a number, but it's very common to patch a function which
> uses jump labels or alternatives.

Really? My impression is that both alternatives and jump_labels
are used in hot paths. I would expect them mostly in core code
that is always loaded.

Alternatives are often used in assembly that we are not able
to livepatch anyway.

Or are they spread widely via some macros or inlined functions?


> > + How often new problematic features appear?
>
> I'm not exactly sure what you mean, but it seems that anytime we add a
> new feature, we have to try to wrap our heads around how it interacts
> with the weirdness of late module patching.

I agree that we need to think about it and that it complicates things.
Still, I think that this is never the biggest problem.

I would be more concerned about arch-specific features that might need
special handling in the livepatch code. Everyone talks only about
alternatives and jump_labels that were added a long time ago.


> > Anyway, it might rule out some variants so that we could better
> > concentrate on the acceptable ones. Or come with yet another
> > proposal that would avoid the real blockers.
>
> I'd like to hear more specific negatives about Joe's recent patches,
> which IMO, are the best option we've discussed so far.

I discussed this approach with our project manager. He was not very
excited about this solution. His first concern was that it would block
attaching USB devices. Admins use them when taking care of
the servers. And there might be other scenarios where a new module
needs to be loaded to resolve some situation.

Customers understand livepatching as a way to secure the system
without an immediate reboot and with minimal (invisible) effect
on the workload. They might be quite surprised when the system
suddenly blocks their "normal" workflow.

As Miroslav said, no solution is perfect. We need to find the most
acceptable compromise. It seems that you are more concerned about
saving code, reducing complexity and risk. I am more concerned
about user satisfaction.

It is almost impossible to predict the effects on user satisfaction
because users have different workflows, use cases, expectations,
and tolerance.

We could better estimate the technical side of each solution:

+ implementation cost
+ maintenance cost
+ risks
+ possible improvements and hardening
+ user visible effects
+ complications and limits when creating livepatches


From my POV, the most problematic part is the arch-specific code.
It is hard to maintain and we do not have it fully under
control.

And I do not believe that we could remove all arch-specific code
if we do not allow delayed livepatching of modules.

Best Regards,
Petr