Re: [RFC] Add BPF_SYNCHRONIZE bpf(2) command

From: Joel Fernandes
Date: Sat Jul 14 2018 - 14:18:49 EST


On Tue, Jul 10, 2018 at 08:40:19PM -0700, Alexei Starovoitov wrote:
[..]
> > The kernel program might do:
> >
> > =====
> > const int current_map_key = 1;
> > void *current_map = bpf_map_lookup_elem(outer_map, &current_map_key);
> >
> > int stats_key = 42;
> > uint64_t *stats_value = bpf_map_lookup_elem(current_map, &stats_key);
> > __sync_fetch_and_add(stats_value, 1);
> > =====
> >
> > If a userspace does:
> >
> > 1. Write new fd to outer_map[1].
> > 2. Call BPF_SYNC_MAP_ACCESS.
> > 3. Start deleting everything in the old map.
> >
> > How can we guarantee that the __sync_fetch_and_add will not add to the
> > old map?
>
> without any changes to the kernel sys_membarrier will work.
> And that's what folks use already.
> BPF_SYNC_MAP_ACCESS implemented via synchronize_rcu() will work
> as well whether in the current implementation where rcu_lock/unlock
> is done outside of the program and in the future when
> rcu_lock/unlock are called by the program itself.

Cool, Alexei and Lorenzo, sounds great to me. Daniel, do you want to send a
follow-up patch with the BPF_SYNC_MAP_ACCESS changes then?
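
For anyone following along, here is a rough userspace sketch of the three
steps quoted above, with sys_membarrier standing in for step 2. I am taking
Alexei's word that MEMBARRIER_CMD_SHARED is enough for currently running
programs. The callback type and the names swap_then_drain, demo_publish and
demo_drain are made up for illustration; a real tool would presumably call
something like libbpf's bpf_map_update_elem() and bpf_map_delete_elem() in
their place.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical callback type: one step of the map swap, 0 on success. */
typedef int (*map_step_fn)(void *ctx);

/*
 * Step 2: ask the kernel to wait until all running threads have passed
 * through a barrier. MEMBARRIER_CMD_SHARED is kept as an alias of
 * MEMBARRIER_CMD_GLOBAL in newer headers, so it builds on both.
 */
static int membarrier_shared(void)
{
	return (int)syscall(__NR_membarrier, MEMBARRIER_CMD_SHARED, 0);
}

/*
 * Publish the new inner map, wait, and only then drain the old one.
 * If the barrier fails we must not touch the old map, because a program
 * that looked it up before the swap may still be adding to it.
 */
static int swap_then_drain(map_step_fn publish_new, map_step_fn drain_old,
			   void *ctx)
{
	int err;

	err = publish_new(ctx);		/* 1. write new fd to outer_map[1] */
	if (err)
		return err;

	err = membarrier_shared();	/* 2. wait for in-flight programs */
	if (err)
		return err;

	return drain_old(ctx);		/* 3. read and delete the old map */
}

/* Stand-in callbacks that only record which steps ran (assumption). */
static int demo_publish(void *ctx) { *(int *)ctx |= 1; return 0; }
static int demo_drain(void *ctx)   { *(int *)ctx |= 2; return 0; }
```

The only point of the wrapper is the ordering: the drain step is unreachable
unless both the publish and the barrier succeeded.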

> > Will the verifier automatically
> > hold the RCU lock for as long as a pointer to an inner map is valid?
>
> the verifier will guarantee the equivalency of future explicit
> lock/unlock by the program vs current situation of implicit
> lock/unlock by the kernel.
> The verifier will track that bpf_map_lookup_elem() is done
> after rcu_lock and that the value returned by this helper is
> not accessed after rcu_unlock. Baby steps of dataflow analysis.

Nice!

By the way, just curious: I was briefly going through kernel/bpf/arraymap.c.
How do you protect against load/store tearing of values during array map
updates and lookups?

For example, if userspace reads an array map at a particular index while
another CPU is updating it, userspace can read a partial or half-updated
value, right? Since rcu_read_lock is in use, I was hoping to find something
like rcu_assign_pointer there to protect readers against concurrent updates.
Thanks for any clarification.
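
To make the concern concrete, here is a small single-threaded userspace
sketch. It is not the arraymap.c code: struct stats and all function names
are invented, the "reader" is simulated by taking a snapshot halfway through
an in-place copy, and C11 release/acquire atomics stand in for
rcu_assign_pointer/rcu_dereference.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-field map value, wider than one machine word. */
struct stats {
	uint64_t packets;
	uint64_t bytes;
};

/*
 * In-place update: copy the new value field by field and let a "reader"
 * look at the slot in between. The returned snapshot mixes the new
 * packets count with the old bytes count, i.e. a torn value.
 */
static struct stats torn_snapshot(struct stats *slot,
				  const struct stats *new_val)
{
	struct stats seen;

	memcpy(&slot->packets, &new_val->packets, sizeof(slot->packets));
	seen = *slot;		/* concurrent reader would run here */
	memcpy(&slot->bytes, &new_val->bytes, sizeof(slot->bytes));
	return seen;
}

/* Pointer publication: readers see either the old or the new object. */
static _Atomic(struct stats *) published;

static void publish(struct stats *fresh)
{
	/* Release: the contents of *fresh are visible before the pointer. */
	atomic_store_explicit(&published, fresh, memory_order_release);
}

static struct stats read_published(void)
{
	/* Acquire pairs with the release store in publish(). */
	struct stats *p = atomic_load_explicit(&published,
					       memory_order_acquire);
	return *p;
}
```

With the pointer swap a reader never sees a mix of two values, but the old
object can only be reused or freed after a grace period, which the sketch
leaves out.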

Regards,

- Joel