Re: Using union-find in BPF verifier (was: Enhance union-find with KUnit tests and optimization improvements)

From: Kuan-Wei Chiu
Date: Fri Oct 18 2024 - 20:08:06 EST


On Thu, Oct 17, 2024 at 03:10:50PM +0800, Shung-Hsi Yu wrote:
> Michal mentioned lib/union_find.c during a discussion. I think we may
> have a use for in BPF verifier (kernel/bpf/verifier.c) that could
> further simplify the code. Eduard (who wrote the code shown below)
> probably would have a better idea.
>
> On Mon, Oct 07, 2024 at 06:19:10AM GMT, Tejun Heo wrote:
> > On Mon, Oct 07, 2024 at 11:28:27PM +0800, Kuan-Wei Chiu wrote:
> > > This patch series adds KUnit tests for the union-find implementation
> > > and optimizes the path compression in the uf_find() function to achieve
> > > a lower tree height and improved efficiency. Additionally, it modifies
> > > uf_union() to return a boolean value indicating whether a merge
> > > occurred, enhancing the process of calculating the number of groups in
> > > the cgroup cpuset.
> >
> > I'm not necessarily against the patchset but this probably is becoming too
> > much polishing for something which is only used by cpuset in a pretty cold
> > path. It probably would be a good idea to concentrate on finding more use
> > cases.
>
> In BPF verifier we do the following to identify the outermost loop in a
> BPF program.
>
> static struct bpf_verifier_state *get_loop_entry(struct bpf_verifier_state *st)
> {
> struct bpf_verifier_state *topmost = st->loop_entry, *old;
>
> while (topmost && topmost->loop_entry && topmost != topmost->loop_entry)
> topmost = topmost->loop_entry;
>
> while (st && st->loop_entry != topmost) {
> old = st->loop_entry;
> st->loop_entry = topmost;
> st = old;
> }
> return topmost;
> }
>
> static void update_loop_entry(struct bpf_verifier_state *cur, struct bpf_verifier_state *hdr)
> {
> struct bpf_verifier_state *cur1, *hdr1;
>
> cur1 = get_loop_entry(cur) ?: cur;
> hdr1 = get_loop_entry(hdr) ?: hdr;
>
> if (hdr1->branches && hdr1->dfs_depth <= cur1->dfs_depth) {
> cur->loop_entry = hdr;
> hdr->used_as_loop_entry = true;
> }
> }
>
> Squinting a bit get_loop_entry() looks quite like uf_find() and
> update_loop_entry() looks quite link uf_union(). So perhaps we could get
> a straight-forward conversion here.
>
>From a quick glance, it seems that there are still some differences
between update_loop_entry() and uf_union(). If we want to use
lib/union_find.c, we would need a new union function to ensure the
merge order, i.e., ensuring that a.parent = b but not b.parent = a.

Additionally, used_as_loop_entry can be replaced with
uf_find(a) == a && a->rank != 1.

However, I'm not entirely sure if this would make the code easier to
understand compared to the current implementation.

Regards,
Kuan-Wei