Re: [tip: sched/urgent] sched: Fix RANDSTRUCT build fail

From: Guenter Roeck
Date: Mon Jun 22 2020 - 10:37:06 EST


On 6/20/20 9:32 AM, Linus Torvalds wrote:
> On Fri, Jun 19, 2020 at 8:14 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>>
>> On Wed, Jun 10, 2020 at 10:34:16AM -0000, tip-bot2 for Peter Zijlstra wrote:
>>> The following commit has been merged into the sched/urgent branch of tip:
>>>
>>> Commit-ID: bfb9fbe0f7e70ec5c8e51ee55b6968d4dff14456
>>> Gitweb: https://git.kernel.org/tip/bfb9fbe0f7e70ec5c8e51ee55b6968d4dff14456
>>> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>> AuthorDate: Wed, 10 Jun 2020 12:14:09 +02:00
>>> Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>> CommitterDate: Wed, 10 Jun 2020 12:30:19 +02:00
>>>
>>> sched: Fix RANDSTRUCT build fail
>>>
>>> As a temporary build fix, the proper cleanup needs more work.
>>>
>>> Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
>>> Reported-by: Eric Biggers <ebiggers@xxxxxxxxxx>
>>> Suggested-by: Eric Biggers <ebiggers@xxxxxxxxxx>
>>> Suggested-by: Kees Cook <keescook@xxxxxxxxxxxx>
>>> Fixes: a148866489fb ("sched: Replace rq::wake_list")
>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>>
>> Hi, can this please get sent to Linus before -rc2? With a148866489fb in
>> -rc1, all the CI with the GCC plugins installed have been failing their
>> all*config builds. This entered -next 9 days ago (and fixed the -next
>> builds), but Linus's tree is still failing:
>
> Ugh.
>
> I actually think the problem goes deeper than that.
>
> The code expects the list entries to be of type 'call_single_data_t'
>
> Then they damn well should be that type.
>
> Note how "call_single_data_t" also implies certain alignment rules
> that the hack in 'struct task_struct' does *not* have, and while that
> doesn't matter on x86, it could matter on other architectures.
>

Yes, that came up before. At least, with this patch, I don't see any compile
or boot test failures. This patch wasn't supposed to be a clean solution
(Peter is working on that) but a quick fix that can be applied easily
and at least improves the situation.

As it is, the mainline kernel currently relies on having RANDSTRUCT
disabled, and still has all the other problems you mentioned here,
so it is definitely much worse.

On the other side, a148866489fb has been in the tree for more
than three weeks now. Maybe we can wait more time for a more
comprehensive solution which hopefully doesn't introduce other
problems.

Guenter

> So no, I don't think Peter's patch is correct. It may make the build
> pass, but that "check the offsets between two fields" is not
> sufficient.
>
> Now, if we could create a new
>
> struct __call_single_list_entry {
> struct llist_node llist;
> unsigned int flags;
> } call_single_list_entry_t;
>
> and use that as part of thecall_single_data_t and only use that for
> tyhe traversal of the list, then that would avoid the alignment issue
> and the waste of space in struct task_struct.
>
> Hmm?
>
> Peter/Ingo?
>
> Linus
>