Re: [linus:master] [eventfs] 852e46e239: BUG:unable_to_handle_page_fault_for_address
From: Steven Rostedt
Date: Tue Jan 30 2024 - 15:03:55 EST
On Tue, 30 Jan 2024 11:54:56 -0800
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 30 Jan 2024 at 11:37, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > Do you want me to put this in my urgent branch and mark them for stable,
> > and then send them for 6.8?
>
> Hmm. I think the only one that fixes a _reported_ bug is that [PTCH
> 2/6]. And it turns out that while 'ti->private' really is entirely
> uninitialized (and I still think it's the cause of the kernel test
> robot report that started this thread), the ti->flags field _is_
> initialized to zero in tracefs_alloc_inode().
>
> So even in that [PATCH 2/6], these parts:
>
> - ti->flags |= TRACEFS_EVENT_INODE;
> + ti->flags = TRACEFS_EVENT_INODE;
>
> aren't strictly needed (but aren't wrong either).
>
> The 'make sure to initialize ti->private before exposing the dentry'
> part *definitely* needs to be part of 6.8, though. That has an
> outstanding actually triggered bug report on it.
>
> I do think that tracefs_alloc_inode() should also initialize
> ti->private to NULL, but while that would fix the oops that the test
> robot reported, it wouldn't fix the data-race on any ti->private
> accesses.
>
> So that "ti->private = ei" needs to be done before the d_instantiate()
> (that later became a d_add()) regardless. But not having random fields
> left uninitialized for future subtle bugs would be a good idea too.
I'll add a patch to add __GFP_ZERO to the tracefs inode allocation, and
modify your patch 2 to just move the ti->private = ei;
>
> Anyway.
>
> If you do run the full tracefs tests on the whole series, and there
> are no other major problems, I'll happily take it all for 6.8. And
> yes, even mark it for stable. I think the other bugs are much harder
> to hit, but I do think they exist. And code deletion is always good.
Sounds good.
>
> So give it the full test attention, and *if* it all still looks good
> and there are no new subtle issues that crop up, let's just put this
> saga behind us asap.
Yes, I've been wanting to get away from eventfs for a month now.
Again, I really do appreciate the time you put in to fixing this for me.
I'm going to be backporting this to older Chromebooks as we really need to
cut down on the memory overhead of instances.
-- Steve