Re: [RFC PATCH 0/2] livepatch: Add support for hybrid mode
From: Miroslav Benes
Date: Fri Jan 31 2025 - 08:18:34 EST
> >
> > + What exactly is meant by frequent replacements (busy loop?, once a minute?)
>
> The script:
>
> #!/bin/bash
> while true; do
>     yum install -y ./kernel-livepatch-6.1.12-0.x86_64.rpm
>     ./apply_livepatch_61.sh # it will sleep 5s
>     yum erase -y kernel-livepatch-6.1.12-0.x86_64
>     yum install -y ./kernel-livepatch-6.1.6-0.x86_64.rpm
>     ./apply_livepatch_61.sh # it will sleep 5s
> done

Applying a live patch is a slow path. It is not expected to run
frequently (in a relative sense). If you stress it like this, an impact
is to be expected, especially on a large, busy system.

> >
> > > Other potential risks may also arise
> > > due to inconsistencies or race conditions during transitions.
> >
> > What inconsistencies and race conditions do you have in mind, please?
>
> I have explained it at
> https://lore.kernel.org/live-patching/Z5DHQG4geRsuIflc@xxxxxxxxxxxxxxx/T/#m5058583fa64d95ef7ac9525a6a8af8ca865bf354
>
> klp_ftrace_handler
>     if (unlikely(func->transition)) {
>         WARN_ON_ONCE(patch_state == KLP_UNDEFINED);
>     }
>
> Why is WARN_ON_ONCE() placed here? What issues have we encountered in the past
> that led to the decision to add this warning?

A safety measure for something which really should not happen.
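
To illustrate the pattern, here is a minimal user-space sketch, not the
kernel code. All names in it (warn_once, select_version, the sketch_*
enums) are hypothetical, and the fallback taken on an undefined state is
an assumption of the sketch rather than a description of what
klp_ftrace_handler does. It only shows the idea: check an invariant that
should always hold, report it once if it is ever violated, and carry on.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum sketch_state { SKETCH_UNDEFINED = -1, SKETCH_UNPATCHED = 0, SKETCH_PATCHED = 1 };
enum sketch_choice { USE_ORIGINAL, USE_PATCHED };

/* Counts emitted warnings so the once-only behavior can be observed. */
static int warnings_printed;

/*
 * Report a violated invariant the first time it is seen, stay silent on
 * repeats, and always return the condition so callers can branch on it.
 */
static bool warn_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Decide which version of a function the current task should run. */
static enum sketch_choice select_version(bool in_transition, enum sketch_state state)
{
	static bool warned;

	if (in_transition) {
		/* Assumption of this sketch: use the original code on an impossible state. */
		if (warn_once(state == SKETCH_UNDEFINED, &warned,
			      "undefined patch state during transition"))
			return USE_ORIGINAL;
		if (state == SKETCH_UNPATCHED)
			return USE_ORIGINAL;
	}
	return USE_PATCHED;
}
```

Calling select_version(true, SKETCH_UNDEFINED) repeatedly prints the
warning only on the first call, so a broken invariant gets noticed
without flooding the log from a hot path.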

> > The main advantage of atomic replace is that it simplifies
> > maintenance and debugging.
>
> Is it worth the high overhead on production servers?

Yes, because the overhead once a live patch is applied is negligible.

> Can you provide examples of companies that use atomic replacement at
> scale in their production environments?

At least SUSE uses it as a solution for its customers. Not many problems
have been reported since we started ~10 years ago.

Regards,
Miroslav