Re: [PATCH] ring-buffer: Do not read at &event->array[0] if it crosses the page

From: Tze-nan Wu (吳澤南)
Date: Thu Sep 07 2023 - 11:36:47 EST


On Tue, 2023-09-05 at 12:39 -0400, Steven Rostedt wrote:
>
> On Tue, 5 Sep 2023 22:12:45 +0800
> Tze-nan Wu <Tze-nan.Wu@xxxxxxxxxxxx> wrote:
>
> > While reading from tracing/trace, the ftrace reader rarely encounters
> > a KASAN invalid-access issue.
> > It is likely that the writer has disrupted the ring_buffer that the reader
> > is currently parsing. The KASAN report is as below:
>
> What kernel are you using? There's been recent fixes that handle the race
> between reading the last entry and the writer adding to it. This looks like
> you may have hit that bug.
>

I am currently using kernel version 6.1.24, but I have applied the
ring_buffer.c patches up to kernel version 6.1.52. However, I am still able
to reproduce the issue.

BTW, my device has ARCH=ARM64 and PAGE_SIZE=4K.

> >
> > [name:report&]BUG: KASAN: invalid-access in rb_iter_head_event+0x27c/0x3d0
> > [name:report&]Read of size 4 at addr 71ffff8111a18000 by task xxxx
> > [name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
> > [name:report&]
> > CPU: 2 PID: 380 Comm: xxxx
> > Call trace:
> > dump_backtrace+0x168/0x1b0
> > show_stack+0x2c/0x3c
> > dump_stack_lvl+0xa4/0xd4
> > print_report+0x268/0x9b0
> > kasan_report+0xdc/0x148
> > kasan_tag_mismatch+0x28/0x3c
> > __hwasan_tag_mismatch+0x2c/0x58
> > rb_event_length() [inline]
> > rb_iter_head_event+0x27c/0x3d0
> > ring_buffer_iter_peek+0x23c/0x6e0
> > __find_next_entry+0x1ac/0x3d8
> > s_next+0x1f0/0x310
> > seq_read_iter+0x4e8/0x77c
> > seq_read+0xf8/0x150
> > vfs_read+0x1a8/0x4cc
> >
> > In some edge cases, the ftrace reader could access an invalid address,
> > specifically when reading 4 bytes beyond the end of the current page.
> > While the issue happened, the dump of rb_iter_head_event is shown below:
> >
> > in function rb_iter_head_event:
> > - iter->head = 0xFEC
> > - iter->next_event = 0xFEC
> > - commit = 0xFF0
> > - read_stamp = 0x2955AC46DB0
> > - page_stamp = 0x2955AC2439A
> > - iter->head_page->page = 0x71FFFF8111A17000
> > - iter->head_page->time_stamp = 0x2956A142267
> > - iter->head_page->page->commit = 0xFF0
> > - the content in iter->head_page->page
> > 0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C FFFFFFC0
> >
> > In rb_iter_head_event, the reader will call rb_event_length with argument
> > (struct ring_buffer_event *event = 0x71FFFF8111A17FFC).
>
> As the commit is considered the end of buffer, it should never read past
> that, which the above appears to be doing. That's why I think it hit that
> race.
>

My thought as well: every time the reader tries to access an address in the
next page, the condition below holds in rb_iter_head_event():

	if (iter->page_stamp != iter_head_page->page->time_stamp ||
	    commit > rb_page_commit(iter_head_page))
		goto reset;
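
For context, the ordering in rb_iter_head_event() around that check (paraphrased
from the 6.1 ring_buffer.c and trimmed, so take it as a sketch of the ordering
rather than the exact code) is roughly:

	commit = rb_page_commit(iter_head_page);
	smp_rmb();
	event = __rb_page_index(iter_head_page, iter->head);
	length = rb_event_length(event);	/* KASAN fires in here */
	...
	/* re-check: did the writer touch this page? */
	if (iter->page_stamp != iter_head_page->page->time_stamp ||
	    commit > rb_page_commit(iter_head_page))
		goto reset;

So the re-check does notice that the writer has moved on, but only after
rb_event_length() has already dereferenced the stale event.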

I observed this with the debug patch provided below:

@@ -2378,6 +2378,19 @@ rb_iter_head_event(struct ring_buffer_iter *iter)
 	commit = rb_page_commit(iter_head_page);
 	smp_rmb();
+	if ((iter->head >= 0xFECUL) && commit == 0xFF0UL) {
+		pr_info("rbdbg: cpu = %d, event = 0x%lx, iter->head = 0x%lx, commit = 0xFF0, type = 0x%x, ts = 0x%x, array addr = 0x%lx\n",
+			iter->cpu_buffer->cpu, (unsigned long)event, iter->head,
+			event->type_len, event->time_delta,
+			(unsigned long)(&(event->array[0])));
+		mdelay(500);
+		pr_info("rbdbg2: cpu = %d, event = 0x%lx, iter->head = 0x%lx, commit = 0xFF0, type = 0x%x, ts = 0x%x, array addr = 0x%lx\n",
+			iter->cpu_buffer->cpu, (unsigned long)event, iter->head,
+			event->type_len, event->time_delta,
+			(unsigned long)(&(event->array[0])));
+		if (iter->page_stamp != iter_head_page->page->time_stamp ||
+		    commit > rb_page_commit(iter_head_page))
+			pr_info("rbdbg: writer corrupt reader\n");
+	}
 	event = __rb_page_index(iter_head_page, iter->head);
 	length = rb_event_length(event);

Note that the mdelay(500) in the debug patch makes the issue easier to
reproduce with the same test in my environment; I am now able to reproduce
the issue within 15 minutes with the debug patch applied.
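
For reference, the out-of-page address in the dump above can be reproduced with
plain arithmetic. The program below is only a user-space sketch: it assumes the
usual 16-byte buffer_data_page header (time_stamp + commit in front of data[])
and the 4-byte ring_buffer_event header; the constants are the ones from the dump.

``` rbdbg_addr.c (illustration only)
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Constants taken from the dump; header sizes are assumptions. */
	uintptr_t page   = 0x71FFFF8111A17000UL;  /* iter->head_page->page */
	uintptr_t data   = page + 16;             /* data[] after the 16-byte buffer_data_page header */
	uintptr_t head   = 0xFEC;                 /* iter->head */
	uintptr_t event  = data + head;           /* what __rb_page_index() returns */
	uintptr_t array0 = event + 4;             /* &event->array[0] (4-byte event header) */

	printf("event     = 0x%lx\n", (unsigned long)event);   /* 0x71ffff8111a17ffc */
	printf("&array[0] = 0x%lx\n", (unsigned long)array0);  /* 0x71ffff8111a18000 -> next page */
	return 0;
}
```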

> > Since the content data starting at address 0x71FFFF8111A17FFC is 0xFFFFFFC0,
> > event->type will be interpreted as 0x0, then the reader will try to get the
> > length by accessing event->array[0], which is an invalid address:
> > &event->array[0] = 0x71FFFF8111A18000
> >
> > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@xxxxxxxxxxxx>
> > ---
> > The following patch may not be the final solution; it merely checks whether
> > the address to be accessed is valid within rb_event_length before accessing it.
> > And I am not sure if there is any side effect it could lead to.
> >
> > I am curious about what a better solution for this issue would look like.
> > Should we address the problem from the writer or the reader side?
> >
> > Also I wonder if the cause of the issue is exactly as I suspected.
> > Any suggestion will be appreciated.
>
> I guess this depends on if you have the fixes or not?
>

Yes, I could try to pick the ring_buffer.c patches that are only in mainline
but not in kernel 6.1.52, and run the same test to see if the issue is still
reproducible.
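
To make the idea of the RFC patch quoted above a bit more concrete: one way to
express the check (this is only a sketch of the idea, not the actual patch hunk,
and the helper name is made up) is to refuse to read the length word when it
would sit on the next page:

	/* Sketch only: before rb_event_length() reads event->array[0] (the cases
	 * where the length is stored there), make sure the 4-byte word lives on
	 * the same buffer page as the event header.
	 */
	static inline bool rb_event_array0_on_same_page(struct ring_buffer_event *event)
	{
		unsigned long ev  = (unsigned long)event;
		unsigned long arr = (unsigned long)&event->array[0];

		return (arr & PAGE_MASK) == (ev & PAGE_MASK);
	}

rb_event_length() could then bail out (e.g. the way it already returns -1 for
an undefined padding event) when this returns false.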

> >
> > Test below can reproduce the issue in 2 hours on kernel-6.1.24:
> > $ cd /sys/kernel/tracing/
> > # make the reader and writer race more by resizing the buffer to 8kb
> > $ echo 8 > buffer_size_kb
> > # enable all events
> > $ echo 1 > events/enable
> > # enable trace
> > $ echo 1 > tracing_on
> >
> > # write and run a script that keeps reading trace
> > $ ./read_trace.sh
> >
> > ``` read_trace.sh
> > while :
> > do
> >     cat /sys/kernel/tracing/trace > /dev/null
> > done
> > ```
>
> Thanks, I'll look at that when I finish debugging the eventfs bug.
>
> -- Steve

Also, thanks for your reply,

-- Tze-nan