Re: [RFC] Add BPF_SYNCHRONIZE bpf(2) command

From: Daniel Colascione
Date: Fri Jul 06 2018 - 23:23:29 EST


On Fri, Jul 6, 2018 at 7:54 PM, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
> On Fri, Jul 06, 2018 at 06:56:16PM -0700, Daniel Colascione wrote:
>> BPF_SYNCHRONIZE waits for any BPF programs active at the time of
>> BPF_SYNCHRONIZE to complete, allowing userspace to ensure atomicity of
>> RCU data structure operations with respect to active programs. For
>> example, userspace can update a map->map entry to point to a new map,
>> use BPF_SYNCHRONIZE to wait for any BPF programs using the old map to
>> complete, and then drain the old map without fear that BPF programs
>> may still be updating it.
>>
>> Signed-off-by: Daniel Colascione <dancol@xxxxxxxxxx>
>> ---
>> include/uapi/linux/bpf.h | 1 +
>> kernel/bpf/syscall.c | 14 ++++++++++++++
>> 2 files changed, 15 insertions(+)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index b7db3261c62d..4365c50e8055 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -98,6 +98,7 @@ enum bpf_cmd {
>> BPF_BTF_LOAD,
>> BPF_BTF_GET_FD_BY_ID,
>> BPF_TASK_FD_QUERY,
>> + BPF_SYNCHRONIZE,
>> };
>>
>> enum bpf_map_type {
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index d10ecd78105f..60ec7811846e 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -2272,6 +2272,20 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>> if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
>> return -EPERM;
>>
>> + if (cmd == BPF_SYNCHRONIZE) {
>> + if (uattr != NULL || size != 0)
>> + return -EINVAL;
>> + err = security_bpf(cmd, NULL, 0);
>> + if (err < 0)
>> + return err;
>> + /* BPF programs are run with preempt disabled, so
>> + * synchronize_sched is sufficient even with
>> + * RCU_PREEMPT.
>> + */
>> + synchronize_sched();
>> + return 0;
>
> I don't think it's necessary. sys_membarrier() can do this already
> and some folks use it exactly for this use case.

True --- implementation-wise, today. The point of the patch is to sort
out contractual guarantees. membarrier isn't documented to have the
BPF_SYNCHRONIZE effect right now, although it does. If we want to
extend membarrier's contract, great, but that might make it more
expensive than necessary for callers who, well, just want a memory
barrier. It's worthwhile to have a dedicated BPF command for this
task, especially considering the simplicity of the implementation.