RE: [PATCH V2 3/3] efi: Use efi_rts_workqueue to invoke EFI Runtime Services

From: Prakhya, Sai Praneeth
Date: Mon Mar 05 2018 - 19:56:35 EST


> > Presently, efi_runtime_services() are executed by firmware in process
> > context. To execute efi_runtime_service(), kernel switches the page
> > directory from swapper_pgd to efi_pgd. However, efi_pgd doesn't have
> > any user space mappings. A potential issue could be, for instance, an
> > NMI interrupt (like perf) trying to profile some user data while in efi_pgd.
> >
> > A solution for this issue could be to use kthread to run
> > efi_runtime_service(). When a user/kernel thread requests to execute
> > efi_runtime_service(), kernel off-loads this work to kthread which in
> > turn uses efi_pgd. Anything that tries to touch user space addresses
> > while in kthread is terminally broken. This patch adds support to efi
> > subsystem to handle all calls to efi_runtime_services() using a work
> > queue (which in turn uses kthread).
> >
> > Implementation summary:
> > -----------------------
> > 1. When user/kernel thread requests to execute efi_runtime_service(),
> > enqueue work to efi_rts_workqueue.
> > 2. Caller thread waits until the work is finished because it's
> > dependent on the return status of efi_runtime_service().
> >
> > Semantics to pack arguments in efi_runtime_work (has void pointers):
> > 1. If argument is a pointer (of any type), pass it as is.
> > 2. If argument is a value (of any type), address of the value is
> > passed.
> >
> > Introduce a handler function (called efi_call_rts()) that
> > a. understands efi_runtime_work and
> > b. invokes the appropriate efi_runtime_service() with the
> > appropriate arguments
> >
> > Semantics followed by efi_call_rts() to understand efi_runtime_work:
> > 1. If argument was a pointer, recast it from void pointer to original
> > pointer type.
> > 2. If argument was a value, recast it from void pointer to original
> > pointer type and dereference it.
> >
> > pstore writes could potentially be invoked in interrupt context and it
> > uses set_variable<>() and query_variable_info<>() to store logs. If we
> > invoke efi_runtime_services() through efi_rts_wq while in atomic()
> > kernel issues a warning ("scheduling wile in atomic") and prints stack
> > trace. One way to overcome this is to not make the caller process wait
> > for the worker thread to finish. This approach breaks pstore i.e. the
> > log messages aren't written to efi variables. Hence, pstore calls
> > efi_runtime_services() without using efi_rts_wq or in other words
> > efi_rts_wq will be used unconditionally for all the
> > efi_runtime_services() except set_variable<>() and
> > query_variable_info<>()
>
>
> Is there a place in the system reboot path where we can try to flush these
> asynchronous pstore writes from interrupt context?

I don't think so because, the issue is not with the pstore writes but with pstore
using efi as backing store. Anything could register as pstore backend, eg: RAM,
ACPI-ERST etc.. and AFAIK, they donât use work queues to store logs. Now that
efi_runtime_services() uses work queues, we unfortunately have to have this hack.

> It seems unfortunate that
> we need to have this wide exception for all
> set_variable() calls.

True, basically any efi_runtime_service() that might get called in interrupt context.
I am not very happy to have the hack too, but didnât find other way.

Either that or switch to an explicit "emergency mode" where
> we stop caring about protecting the system from EFI runtime code because
> we're already crashing.

Should we care about extra warning (scheduling while in atomic) when we are already
crashing? This sounds kind of debatable. I will wait for feedback from community it they
think it's OK or maybe a better solution.

Regards,
Sai