From: Daniel Colascione
Date: Tue Aug 14 2018 - 16:37:19 EST

On Fri, Aug 10, 2018 at 3:52 PM, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
> On Tue, Jul 31, 2018 at 02:36:39AM -0700, Daniel Colascione wrote:
>> > An API command name
>> > such as BPF_SYNCHRONIZE_MAP_TO_MAP_REFERENCES is simply non-generic, and
>> > exposes specific map details (here: map-in-map) into the UAPI whereas it
>> > should reside within a specific implementation instead similar to other ops
>> > we have for maps.
>> But synchronize isn't conceptually a command that applies to a
>> specific map. It waits on all references. Did you address my point
>> about your proposed map-specific interface requiring redundant
>> synchronize_rcu calls in the case where we swap multiple maps and want
>> to wait for all the references to drain? Under my proposal, you'd just
>> BPF_SYNCHRONIZE_WHATEVER and call synchronize_rcu once. Under your
>> proposal, we'd make it a per-map operation, so we'd issue one
>> synchronize_rcu per map.
> optimizing for multi-map sync sounds like premature optimization.

Maybe, but the per-map proposal is less efficient *and* more
complicated! I don't want to spend more code just to go slower.
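
To make the cost difference concrete, here is a rough userspace sketch,
not kernel code. Every name in it (fake_wait_for_grace_period,
fake_swap_inner_map, the two swap_then_* helpers) is made up for
illustration, and the "wait" is only a counter standing in for a blocking
synchronize_rcu-style grace period. It assumes each wait covers every
swap issued before it, which is the property the global command relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a blocking grace-period wait.
 * It only counts calls; a real wait would block the caller. */
static unsigned long grace_period_waits;

static void fake_wait_for_grace_period(void)
{
    grace_period_waits++;
}

/* Hypothetical stand-in for replacing one inner-map slot.
 * The swap itself never waits. */
static void fake_swap_inner_map(int *slot, int new_map_id)
{
    *slot = new_map_id;
}

/* Global command: swap all n slots, then wait once for all of them.
 * Returns the number of waits issued. */
unsigned long swap_then_sync_once(int *slots, const int *new_ids, size_t n)
{
    assert(n == 0 || (slots != NULL && new_ids != NULL));
    grace_period_waits = 0;
    for (size_t i = 0; i < n; i++)
        fake_swap_inner_map(&slots[i], new_ids[i]);
    if (n > 0)
        fake_wait_for_grace_period();
    return grace_period_waits;
}

/* Per-map command: each swapped slot is followed by its own wait.
 * Returns the number of waits issued. */
unsigned long swap_then_sync_per_map(int *slots, const int *new_ids, size_t n)
{
    assert(n == 0 || (slots != NULL && new_ids != NULL));
    grace_period_waits = 0;
    for (size_t i = 0; i < n; i++) {
        fake_swap_inner_map(&slots[i], new_ids[i]);
        fake_wait_for_grace_period();
    }
    return grace_period_waits;
}
```

With n maps swapped, the first helper issues one wait and the second
issues n, and the per-map version also needs the extra plumbing to name
a map in each call.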

> I believe the only issue being discussed is user space doesn't know
> when it's ok to start draining the inner map when it was replaced
> by bpf_map_update syscall command with another map, right?


> If we agree on that, should bpf_map_update handle it then?
> Wouldn't it be much easier to understand and use from user pov?
> No new commands to learn. map_update syscall replaced the map
> and old map is no longer accessed by the program via this given map-in-map.

Maybe, but only behind a new BPF_SYNCHRONIZE flag for BPF_MAP_UPDATE_ELEM
and BPF_MAP_DELETE_ELEM. Otherwise, it seems wrong to make every user of
these commands pay for synchronization that only a few will need.
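
As a rough illustration of the opt-in idea, the sketch below models an
update call that waits only when the caller asks for it. Nothing here is
an existing kernel or libbpf interface: HYPO_UPDATE_SYNC, its bit value,
struct hypo_update_result, and hypo_map_update are all hypothetical, and
the wait is a counter rather than a real grace period.

```c
#include <stdint.h>

/* Hypothetical flag bit; the value is arbitrary and not part of any UAPI. */
#define HYPO_UPDATE_SYNC (1u << 31)

/* Hypothetical result type: the new slot value and how many waits were paid. */
struct hypo_update_result {
    int value;
    unsigned int waits;
};

/* Model of an update that synchronizes only when HYPO_UPDATE_SYNC is set.
 * Callers that pass no flag never pay for a wait. */
struct hypo_update_result hypo_map_update(int *slot, int new_value,
                                          uint32_t flags)
{
    struct hypo_update_result res = { 0, 0 };

    *slot = new_value;
    res.value = *slot;

    if (flags & HYPO_UPDATE_SYNC)
        res.waits = 1; /* stands in for one blocking grace-period wait */

    return res;
}
```

A caller that passes 0 gets the plain update with no wait; a caller that
passes HYPO_UPDATE_SYNC pays for exactly one wait on that call.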

> But if the replaced map is used directly or it sits in some other
> map-in-map slot the progs can still access it.
> My issue with DanielC SYNC cmd that it exposes implementation details

What implementation details? The command semantics are defined
entirely in terms of existing user-visible primitives.

> and introduces complex 'synchronization' semantics. To majority of
> the users it won't be obvious what is being synchronized.
> My issue with DanielB WAIT_REF map_fd cmd that it needs to wait for all refs
> to this map to be dropped. I think combination of usercnt and refcnt
> can answer that, but feels dangerous to sleep potentially forever
> in a syscall until all prog->map references are gone, though such
> cmd is useful beyond map-in-map use case.

In what scenarios?

In any case, can we submit _something_?