Re: [syzbot] [kernel?] WARNING in audit_log_start
From: Thomas Gleixner
Date: Tue Sep 03 2024 - 17:08:04 EST
On Tue, Sep 03 2024 at 12:24, Kees Cook wrote:
> On Tue, Sep 03, 2024 at 03:22:17PM -0400, Paul Moore wrote:
>> > > might_alloc include/linux/sched/mm.h:337 [inline]
>> > > slab_pre_alloc_hook mm/slub.c:3987 [inline]
>> > > slab_alloc_node mm/slub.c:4065 [inline]
>> > > kmem_cache_alloc_noprof+0x5d/0x2a0 mm/slub.c:4092
>> > > audit_buffer_alloc kernel/audit.c:1790 [inline]
>> > > audit_log_start+0x15e/0xa30 kernel/audit.c:1912
>> > > audit_seccomp+0x63/0x1f0 kernel/auditsc.c:3007
>>
>> The audit_seccomp() function allocates an audit buffer using
>> GFP_KERNEL, which should be the source of the might_sleep. We can fix
>> that easily enough by moving to GFP_ATOMIC (either for just this code
>> path or all callers, need to check that), but I just want to confirm
>> that we can't sleep here? I haven't dug into the syscall code in a
>> while, so I don't recall all the details, but it seems odd to me that
>> we can't safely sleep here ...
>
> I had a similar question.. this is at syscall entry time. What is
> suddenly different here? We've been doing seccomp logging here for
> years...
Correct.

syscall_enter_from_user_mode() enables interrupts. At that point
preempt_count is 0. So after that the task can sleep and schedule.
Nothing in the call chain leading up to the allocation disables
preemption or interrupts.
From the actual console log:
do not call blocking ops when !TASK_RUNNING; state=2 set at [<ffffffff81908f9e>] audit_log_start+0x37e/0xa30
I have no idea how that state would leak across schedule_timeout().
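To make the distinction concrete, here is a minimal userspace sketch. It is
not kernel code. It assumes the warning comes from a check on the task state
rather than on preempt_count or the interrupt state, and that state=2
corresponds to TASK_UNINTERRUPTIBLE. Every model_* name and the `leak`
parameter are hypothetical and exist only for illustration.

```c
/* Userspace model only; nothing here is a real kernel interface. */
#include <assert.h>
#include <stdbool.h>

enum model_state {
	MODEL_TASK_RUNNING = 0,
	MODEL_TASK_UNINTERRUPTIBLE = 2,	/* assumed meaning of "state=2" */
};

enum model_verdict {
	MODEL_OK,		/* blocking is fine */
	MODEL_ATOMIC_CONTEXT,	/* preemption or interrupts disabled */
	MODEL_NOT_RUNNING,	/* blocking op called while !TASK_RUNNING */
};

struct model_task {
	unsigned int state;
	unsigned int preempt_count;
	bool irqs_disabled;
};

/* Models the two separate complaints a blocking call can trigger. */
enum model_verdict model_might_sleep(const struct model_task *t)
{
	if (t->state != MODEL_TASK_RUNNING)
		return MODEL_NOT_RUNNING;
	if (t->preempt_count != 0 || t->irqs_disabled)
		return MODEL_ATOMIC_CONTEXT;
	return MODEL_OK;
}

/*
 * Stand-in for a timed sleep. A task that comes back from sleeping is
 * expected to be running again. The hypothetical `leak` flag skips that
 * reset so the stale state survives the call.
 */
void model_sleep_with_timeout(struct model_task *t, bool leak)
{
	if (!leak)
		t->state = MODEL_TASK_RUNNING;
}

/* Wait with a non-running state set, then attempt a blocking allocation. */
enum model_verdict model_backlog_wait_then_alloc(bool leak)
{
	/* Syscall entry: interrupts enabled, preempt_count == 0. */
	struct model_task t = {
		.state = MODEL_TASK_RUNNING,
		.preempt_count = 0,
		.irqs_disabled = false,
	};

	t.state = MODEL_TASK_UNINTERRUPTIBLE;
	model_sleep_with_timeout(&t, leak);
	return model_might_sleep(&t);
}

/* Small self-check of the two scenarios. */
void model_self_check(void)
{
	assert(model_backlog_wait_then_alloc(false) == MODEL_OK);
	assert(model_backlog_wait_then_alloc(true) == MODEL_NOT_RUNNING);
}
```

In this model the context at syscall entry never yields
MODEL_ATOMIC_CONTEXT, so switching the allocation to GFP_ATOMIC would only
hide the symptom. The complaint appears only when the state set before the
sleep is still there afterwards.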
Thanks,
tglx