Re: [PATCH 2/2] perf test: Test FASYNC with watermark wakeups.

From: Ian Rogers
Date: Fri Feb 23 2024 - 13:01:59 EST


On Fri, Feb 23, 2024 at 9:35 AM Kyle Huey <me@xxxxxxxxxxxx> wrote:
>
> On Thu, Feb 22, 2024 at 11:54 AM Arnaldo Carvalho de Melo
> <acme@xxxxxxxxxx> wrote:
> >
> > On Thu, Feb 22, 2024 at 09:55:36AM -0800, Kyle Huey wrote:
> > > On Wed, Feb 21, 2024 at 10:36 AM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> > > > On Wed, Feb 21, 2024 at 9:52 AM Kyle Huey <me@xxxxxxxxxxxx> wrote:
> > > > > + if (fcntl(fd, F_SETFL, FASYNC)) {
> > > > > + pr_debug("failed F_SETFL FASYNC %d\n", errno);
> > > > > + goto cleanup;
> > > > > + }
> >
> > > > Thanks for the work! The perf tool and perf test should run on older
> > > > kernels ideally without failure. I'm assuming this would fail on an
> > > > older kernel. Could we make the return value skip in that case?
> >
> > > Ah, hmm, I wasn't aware of that. This would fail on an older kernel,
> > > yes. It's not possible to distinguish between an older kernel and a
> > > kernel where this fix broke (at least not without hardcoding in an
> > > expected good kernel version, which seems undesirable), so that would
> >
> > Take a look at:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/tools/perf/tests/sigtrap.c?id=650e0bde43f35bb675e87e30f679a57cfa22e0e5
> >
> > To see how introspecting using BTF can be used to figure out if internal
> > data structures have what is needed, or if functions with some specific
> > arguments are present, etc, for sigtrap we have, in the patch above:
> >
> > - TEST_ASSERT_EQUAL("unexpected sigtraps", ctx.signal_count, NUM_THREADS * ctx.iterate_on);
> > + expected_sigtraps = NUM_THREADS * ctx.iterate_on;
> > +
> > + if (ctx.signal_count < expected_sigtraps && kernel_with_sleepable_spinlocks()) {
> > + pr_debug("Expected %d sigtraps, got %d, running on a kernel with sleepable spinlocks.\n",
> > + expected_sigtraps, ctx.signal_count);
> > + pr_debug("See https://lore.kernel.org/all/e368f2c848d77fbc8d259f44e2055fe469c219cf.camel@xxxxxx/\n";);
> > + return TEST_SKIP;
> > + } else
> > + TEST_ASSERT_EQUAL("unexpected sigtraps", ctx.signal_count, expected_sigtraps);
> >
> > And then:
> >
> > +static bool kernel_with_sleepable_spinlocks(void)
> > +{
> > + const struct btf_member *member;
> > + const struct btf_type *type;
> > + const char *type_name;
> > + int id;
> > +
> > + if (!btf__available())
> > + return false;
> > +
> > + id = btf__find_by_name_kind(btf, "spinlock", BTF_KIND_STRUCT);
> > + if (id < 0)
> > + return false;
> > +
> > + // Only RT has a "lock" member for "struct spinlock"
> > + member = __btf_type__find_member_by_name(id, "lock");
> > + if (member == NULL)
> > + return false;
> > +
> > + // But check its type as well
> > + type = btf__type_by_id(btf, member->type);
> > + if (!type || !btf_is_struct(type))
> > + return false;
> > +
> > + type_name = btf__name_by_offset(btf, type->name_off);
> > + return type_name && !strcmp(type_name, "rt_mutex_base");
> > +}
> >
> > > mean the test would always return ok or skip, not ok or fail. Is that
> > > ok?
> >
> > It should return:
> >
> > Ok if the kernel has what is needed and the test passes
> > Skip if the test fails and the kernel doesn't have what is needed
> > FAIL if the test fails and the kernel HAS what is needed.
> >
> > 'perf test sigtrap' also checks for the presence of the feature it
> > requires:
> >
> > static bool attr_has_sigtrap(void)
> > {
> > int id;
> >
> > if (!btf__available()) {
> > /* should be an old kernel */
> > return false;
> > }
> >
> > id = btf__find_by_name_kind(btf, "perf_event_attr", BTF_KIND_STRUCT);
> > if (id < 0)
> > return false;
> >
> > return __btf_type__find_member_by_name(id, "sigtrap") != NULL;
> > }
> >
> > fd = sys_perf_event_open(&attr, 0, -1, -1, perf_event_open_cloexec_flag());
> > if (fd < 0) {
> > if (attr_has_sigtrap()) {
> > pr_debug("FAILED sys_perf_event_open(): %s\n",
> > str_error_r(errno, sbuf, sizeof(sbuf)));
> > } else {
> > pr_debug("perf_event_attr doesn't have sigtrap\n");
> > ret = TEST_SKIP;
> > }
> > goto out_restore_sigaction;
> > }
> >
> > - Arnaldo
>
> I think perhaps I'm barking up the wrong tree here. This seems like a
> ton of work just to write a regression test. Maybe I should be doing
> this in tools/testing/selftests instead?

The problem is detecting support for the feature in the kernel. The
BTF approach isn't that bad, a couple of finds, but I think in this
case there isn't anything to be found to indicate the feature is
present. I like the perf test as perf tests are a form of
documentation. Perhaps just using TEST_SKIP here (rather than
TEST_FAIL) is best and the skip_reason can be a presumed lack of
kernel support.

Thanks,
Ian

> - Kyle