Re: [PATCH] sched/headers: Fix sched_setattr userspace compilation breakage

From: Linus Torvalds
Date: Thu May 28 2020 - 22:18:03 EST


On Thu, May 28, 2020 at 6:45 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> glibc's <sched.h> already defines struct sched_param (which is a POSIX
> struct), so my inclusion of <linux/sched/types.h> above which is a UAPI
> header exported by the kernel, breaks because the following commit moved
> sched_param into the UAPI:
> e2d1e2aec572a ("sched/headers: Move various ABI definitions to <uapi/linux/sched/types.h>")
>
> Simply reverting that part of the patch also fixes it, like below. Would
> that be an acceptable fix? Then I can go patch glibc to get struct
> sched_attr by including the UAPI's <linux/sched/types.h>. Otherwise, I
> suspect glibc will also break if it tried to include the UAPI header.

Hmm.

Reverting that commit makes some sense as a "it broke things", and
yes, if this was some recent change that caused problems with user
headers, that would be what we should do (at least to then think about
it a bit more).

But that commit was done three years ago and you're the first person
to report breakage.

So for all I know, modern glibc source bases have already fixed
themselves up, and take advantage of the new UAPI location. Or they
just did that kernel header sync many years ago, and will fix it up
the next time they do a header sync.

So then reverting things (or adding the __KERNEL__ guard) would only
break _those_ cases instead and make for only more problems.

Basically, I think you should treat this as a glibc header bug, not a
kernel header bug.

And when you say

> The reason is, since <sched.h> did not provide struct sched_attr as the
> manpage said, so I did the include of uapi's linux/sched/types.h myself:

instead of starting to include the kernel uapi header files - that
interact at a deep level with those system header files - you should
just treat it as a glibc bug.

And then you can either work around it locally, or make a glibc
bug-report and hope it gets fixed that way.

The "work around it locally" might be something like a
"glibc-sched-h-fixup.h" header file that does

#ifndef SCHED_FIXUP_H
#define SCHED_FIXUP_H
#include <sched.h>
#include <linux/types.h>	/* for __u32, __s32, __u64 */

/* This is documented to come from <sched.h>, but doesn't */
struct sched_attr {
	__u32 size;

	__u32 sched_policy;
	__u64 sched_flags;

	/* SCHED_NORMAL, SCHED_BATCH */
	__s32 sched_nice;

	/* SCHED_FIFO, SCHED_RR */
	__u32 sched_priority;

	/* SCHED_DEADLINE */
	__u64 sched_runtime;
	__u64 sched_deadline;
	__u64 sched_period;

	/* Utilization hints */
	__u32 sched_util_min;
	__u32 sched_util_max;
};
#endif /* SCHED_FIXUP_H */

in your build environment (possibly with configure magic etc to find
the need for this fixup, depending on how fancy you want to be).
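
For illustration only, here is a rough sketch of how such a local definition might be used together with a raw syscall, assuming the installed glibc provides no wrapper for sched_getattr. The names local_sched_attr and local_sched_getattr are made up for this example (they are not from glibc or the kernel), and the struct uses <stdint.h> types instead of __u32/__u64 so that it does not depend on any kernel header or clash with a struct sched_attr that the system <sched.h> may already define.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical local copy of the layout shown above. The name is an
 * assumption chosen to avoid colliding with any system definition.
 */
struct local_sched_attr {
	uint32_t size;

	uint32_t sched_policy;
	uint64_t sched_flags;

	/* SCHED_NORMAL, SCHED_BATCH */
	int32_t sched_nice;

	/* SCHED_FIFO, SCHED_RR */
	uint32_t sched_priority;

	/* SCHED_DEADLINE */
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;

	/* Utilization hints */
	uint32_t sched_util_min;
	uint32_t sched_util_max;
};

/* The layout is kernel ABI, so catch accidental padding at compile time. */
static_assert(sizeof(struct local_sched_attr) == 56,
	      "unexpected size for local_sched_attr");
static_assert(offsetof(struct local_sched_attr, sched_util_min) == 48,
	      "unexpected offset for sched_util_min");

/*
 * Hypothetical helper: invoke the sched_getattr system call directly.
 * pid 0 means the calling thread. Returns 0 on success, or -1 with
 * errno set on failure.
 */
static long local_sched_getattr(pid_t pid, struct local_sched_attr *attr,
				unsigned int flags)
{
	return syscall(SYS_sched_getattr, pid, attr,
		       (unsigned int)sizeof(*attr), flags);
}
```

A caller would zero a struct local_sched_attr, pass its address to local_sched_getattr(0, &attr, 0), and check the return value before reading any field. Whether the utilization-hint fields are filled in depends on the running kernel, so code built this way should not assume they are.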

Because when we have a change that is three+ years old, we can't
reasonably change the kernel back again without then likely just
breaking some other case that depends on that uapi file that has been
there for the last few years.

glibc and the kernel aren't developed in sync, so glibc generally
takes a snapshot of the kernel headers and then works with that. That
allows glibc developers to work around any issues they have with our
uapi headers (we've had lots of namespace issues, for example), but it
also means that the system headers aren't using some "generic kernel
UAPI headers". They are using a very _particular_ set of kernel uapi
headers from (likely) several years ago, and quite possibly then
further edited too.

Which is why you can't then mix glibc system headers that are years
old with kernel headers that are modern (or vice versa).

Well, with extreme luck and/or care you can. But not in general.
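
As a rough way to see the mismatch on a given build machine, something like the following sketch can report which glibc and which installed kernel UAPI headers a program is being compiled against. The struct and function names (header_versions, probe_header_versions) are invented for this example. Note the limitation: LINUX_VERSION_CODE describes the kernel headers installed on the system, not whatever snapshot glibc itself was synced against, so this only hints at the gap.

```c
#include <assert.h>
#include <stdio.h>
#include <gnu/libc-version.h>	/* gnu_get_libc_version() */
#include <linux/version.h>	/* LINUX_VERSION_CODE */

/* Hypothetical container for the versions visible at build and run time. */
struct header_versions {
	const char *glibc_runtime;	/* glibc actually loaded at run time */
	int glibc_major;		/* glibc headers used at compile time */
	int glibc_minor;
	unsigned int uapi_major;	/* installed kernel UAPI headers */
	unsigned int uapi_minor;
};

/* Hypothetical helper: collect the version information in one place. */
static struct header_versions probe_header_versions(void)
{
	struct header_versions v;

	v.glibc_runtime = gnu_get_libc_version();
	v.glibc_major = __GLIBC__;
	v.glibc_minor = __GLIBC_MINOR__;
	/* LINUX_VERSION_CODE packs major, minor, sublevel into one int. */
	v.uapi_major = (LINUX_VERSION_CODE >> 16) & 0xff;
	v.uapi_minor = (LINUX_VERSION_CODE >> 8) & 0xff;
	return v;
}

/* Print the collected versions in a human-readable form. */
static void print_header_versions(const struct header_versions *v)
{
	printf("glibc headers: %d.%d (runtime %s)\n",
	       v->glibc_major, v->glibc_minor, v->glibc_runtime);
	printf("kernel UAPI headers: %u.%u\n",
	       v->uapi_major, v->uapi_minor);
}
```

Calling print_header_versions() on the result of probe_header_versions() in a small test program shows both numbers side by side. A large gap between them does not prove anything is broken, but it is a hint that including both sets of headers in one translation unit may run into conflicts like the sched_param one.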

Linus