Re: [PATCH] printk: don't unconditionally shortcut print_time()

From: Petr Mladek
Date: Mon Dec 03 2018 - 09:14:41 EST


On Fri 2018-11-30 23:33:39, Tetsuo Handa wrote:
> and (I can't find the reproducer but) sometimes crashes like
>
> [ 451.988242] BUG: pagefault on kernel address 0xffff8881337fa000 in non-whitelisted uaccess
> [ 451.990465] BUG: unable to handle kernel paging request at ffff8881337fa000
> [ 451.992254] PGD 3602067 P4D 3602067 PUD 3605067 PMD 13ff65067 PTE 800ffffecc805060
> [ 451.994066] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [ 451.995284] CPU: 0 PID: 8147 Comm: a.out Kdump: loaded Not tainted 4.20.0-rc4+ #537
> [ 451.997268] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
> [ 451.999857] RIP: 0010:copy_user_generic_unrolled+0x54/0xc0
> [ 452.001312] Code: 83 e2 3f c1 e9 06 74 4a 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c 8b 5e 18 4c 89 07 4c 89 4f 08 4c 89 57 10 4c 89 5f 18 4c 8b 46 20 <4c> 8b 4e 28 4c 8b 56 30 4c 8b 5e 38 4c 89 47 20 4c 89 4f 28 4c 89
> [ 452.006119] RSP: 0018:ffffc90002573e90 EFLAGS: 00010202
> [ 452.007230] RAX: 0000000000701060 RBX: 0000000000100000 RCX: 0000000000002f02
> [ 452.008792] RDX: 0000000000000000 RSI: ffff8881337f9fd4 RDI: 0000000000644fe0
> [ 452.010399] RBP: 0000000000601060 R08: 8b494e7401000000 R09: 00000b88858b4900
> [ 452.011916] R10: 0f02000000c980f6 R11: b4bb830000025d85 R12: ffff8881337b6054
> [ 452.013351] R13: 0000000000601060 R14: ffff8881337b5fa8 R15: 00000000000000ac
> [ 452.014807] FS: 00007fb3c9eec740(0000) GS:ffff88813b200000(0000) knlGS:0000000000000000
> [ 452.016931] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 452.018234] CR2: ffff8881337fa000 CR3: 0000000139153002 CR4: 00000000001606f0
> [ 452.020038] Call Trace:
> [ 452.020708] _copy_to_user+0x56/0x70
> [ 452.021580] do_syslog+0x47d/0x7b0
> [ 452.022359] ? syscall_trace_enter+0x1c3/0x360
> [ 452.023395] ? trace_hardirqs_on_thunk+0x1a/0x1c
> [ 452.024438] ? lockdep_hardirqs_off+0x73/0xe0
> [ 452.025433] __x64_sys_syslog+0x13/0x20
> [ 452.026363] do_syscall_64+0x4f/0x1f0
> [ 452.027235] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> > /* message fits into buffer, move forward */
> > syslog_idx = log_next(syslog_idx);
>
> But we need to make this "static bool time;" accessible from SYSLOG_ACTION_SIZE_UNREAD case
> because syslog_partial subtraction assumes that msg_print_text() used printk_time value
> as of syslog_print() updated syslog_partial.
>
> while (seq < log_next_seq) {
> struct printk_log *msg = log_from_idx(idx);
>
> error += msg_print_text(msg, true, NULL, 0);
> idx = log_next(idx);
> seq++;
> }
> error -= syslog_partial;
>

Great catch.

Well, IMHO, it does not explain the pagefault above. The copy_to_user()
could come either from syslog_print() or from syslog_print_all(). They
both have their own checks that prevent the buf overflow.

The code is tricky but it looks safe now. Is it possible that
the crash was with some old, still broken, code? Or did you
add some extra changes that you used to trigger a testing
scenario?

Best Regards,
Petr