Re: selftests/sched_ext: enq_last_no_enq_fails testcase fails

From: Tejun Heo
Date: Wed Oct 23 2024 - 17:46:43 EST


Hello,

On Wed, Oct 23, 2024 at 10:13:19PM +0530, Vishal Chourasia wrote:
...
> static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
> {
> ...
> 	ret = validate_ops(ops);
> 	if (ret)
> 		goto err_disable;
> ...
> err_disable:
> 	mutex_unlock(&scx_ops_enable_mutex);
> 	/*
> 	 * Returning an error code here would not pass all the error information
> 	 * to userspace. Record errno using scx_ops_error() for cases
> 	 * scx_ops_error() wasn't already invoked and exit indicating success so
> 	 * that the error is notified through ops.exit() with all the details.
> 	 *
> 	 * Flush scx_ops_disable_work to ensure that error is reported before
> 	 * init completion.
> 	 */
> 	scx_ops_error("scx_ops_enable() failed (%d)", ret);
> 	kthread_flush_work(&scx_ops_disable_work);
> 	return 0;
> }
>
> validate_ops() correctly reports the error, but the err_disable path
> ultimately returns zero.

Yeah, this is because the failure is now communicated through the scheduler
unload path, which has richer error reporting. The exit is triggered
immediately, but loading itself still succeeds. We need to update the test
framework to detect this failure mode too.
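
As a rough illustration of what such a check might look like, here is a
minimal, self-contained sketch. It does not use the real selftest or libbpf
API: exit_info, mock_enable() and load_failed() are hypothetical stand-ins
that only model the behavior described above, where enable returns 0 and the
failure shows up solely in the exit information recorded on the unload path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the exit information recorded via ops.exit(). */
enum exit_kind { EXIT_NONE = 0, EXIT_ERROR = 1 };

struct exit_info {
	enum exit_kind kind;
	char msg[128];
};

/*
 * Hypothetical model of the enable path: a validation failure is recorded
 * in the exit info, and the function still returns 0.
 */
static int mock_enable(bool ops_valid, struct exit_info *ei)
{
	assert(ei != NULL);

	ei->kind = EXIT_NONE;
	ei->msg[0] = '\0';

	if (!ops_valid) {
		ei->kind = EXIT_ERROR;
		snprintf(ei->msg, sizeof(ei->msg),
			 "scx_ops_enable() failed (%d)", -EINVAL);
	}

	return 0;
}

/*
 * A test should treat the load as failed if either the return value is
 * non-zero or an exit was recorded during loading.
 */
static bool load_failed(int ret, const struct exit_info *ei)
{
	assert(ei != NULL);

	return ret != 0 || ei->kind != EXIT_NONE;
}
```

In this model, a test that expects the load to fail would call
load_failed() right after the enable step instead of looking at the return
value alone.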

Thanks.

--
tejun