Re: [RFC] Add BPF_SYNCHRONIZE bpf(2) command

From: Lorenzo Colitti
Date: Tue Jul 10 2018 - 22:46:36 EST


On Wed, Jul 11, 2018 at 8:52 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> we need to make sure we have a detailed description of BPF_SYNC_MAP_ACCESS
> in uapi/bpf.h, since I feel the confusion regarding its usage is starting already.
> This new cmd will only make sense for map-in-map type of maps.
> Expecting that BPF_SYNC_MAP_ACCESS somehow implies the end of
> the program or does some other map synchronization is not correct.
> Commit log of this patch got it right:
> """
> For example, userspace can update a map->map entry to point to a new map,
> use BPF_SYNCHRONIZE to wait for any BPF programs using the old map to
> complete, and then drain the old map without fear that BPF programs
> may still be updating it.
> """

+1 for detailed documentation. For example, consider what happens if
we have two maps, one active and one standby, and a map-in-map with
one element that points to the currently active map.
The kernel program might do:

=====
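const int current_map_key = 1;
void *current_map = bpf_map_lookup_elem(outer_map, &current_map_key);
if (!current_map)       /* lookups can return NULL */
        return 0;

int stats_key = 42;
uint64_t *stats_value = bpf_map_lookup_elem(current_map, &stats_key);
if (stats_value)        /* pass the pointer itself, not its address */
        __sync_fetch_and_add(stats_value, 1);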
=====

If userspace then does:

1. Write new fd to outer_map[1].
2. Call BPF_SYNC_MAP_ACCESS.
3. Start deleting everything in the old map.

How can we guarantee that the __sync_fetch_and_add will not add to the
old map? If it does, we'll lose data. Will the verifier automatically
hold the RCU lock for as long as a pointer to an inner map is valid?
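
For concreteness, here is a rough sketch of the userspace side of
steps 1-3 above. It only pins down the ordering I have in mind.
Everything in it is hypothetical: BPF_SYNC_MAP_ACCESS is just a
proposal in this thread, so the three operations are callbacks rather
than real bpf(2) calls, and the fake_* helpers exist only to record
the order in which they run.

```c
/*
 * Hypothetical callbacks standing in for the real bpf(2) calls.
 * The sync step is abstract because BPF_SYNC_MAP_ACCESS is only
 * proposed in this thread and is not in any uapi header.
 */
struct swap_ops {
        int (*update_outer)(int outer_fd, int key, int new_inner_fd);
        int (*sync_map_access)(void);
        int (*drain)(int old_inner_fd);
};

/*
 * Swap the active inner map, wait, then drain the old map.
 * Returns 0 on success or the first non-zero error. The old map is
 * never drained unless both the update and the sync succeeded.
 */
static int swap_and_drain(const struct swap_ops *ops, int outer_fd, int key,
                          int old_inner_fd, int new_inner_fd)
{
        int err;

        err = ops->update_outer(outer_fd, key, new_inner_fd);  /* step 1 */
        if (err)
                return err;

        err = ops->sync_map_access();                           /* step 2 */
        if (err)
                return err;

        return ops->drain(old_inner_fd);                        /* step 3 */
}

/* Recording stubs (illustration only): log which step ran, in order. */
static int call_log[3];
static int call_count;
static int fake_sync_result;    /* value fake_sync() will return */

static int fake_update(int outer_fd, int key, int new_inner_fd)
{
        (void)outer_fd; (void)key; (void)new_inner_fd;
        call_log[call_count++] = 1;
        return 0;
}

static int fake_sync(void)
{
        call_log[call_count++] = 2;
        return fake_sync_result;
}

static int fake_drain(int old_inner_fd)
{
        (void)old_inner_fd;
        call_log[call_count++] = 3;
        return 0;
}

static const struct swap_ops fake_ops = {
        .update_outer    = fake_update,
        .sync_map_access = fake_sync,
        .drain           = fake_drain,
};
```

Note that even with this ordering enforced in userspace, the question
above still stands: the sketch says nothing about whether a program
that looked up the old inner map before step 1 can still be adding to
it after step 2 returns.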