Re: usb/core: warning in usb_create_ep_devs/sysfs_create_dir_ns

From: Dmitry Vyukov
Date: Tue Dec 13 2016 - 15:34:08 EST


"On Tue, Dec 13, 2016 at 9:09 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 13 Dec 2016, Dmitry Vyukov wrote:
>
>> On Tue, Dec 13, 2016 at 7:38 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>> > On Tue, 13 Dec 2016, Dmitry Vyukov wrote:
>> >
>> >> On Tue, Dec 13, 2016 at 4:52 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>> >> > On Tue, 13 Dec 2016, Dmitry Vyukov wrote:
>> >> >
>> >> >> >> > If it is
>> >> >> >> > not a bug in kernel source code, then it must not produce a WARNING.
>> >> >> >
>> >> >> > What about a memory allocation failure? The memory management part of
>> >> >> > the kernel produces a WARNING message if an allocation fails and the
>> >> >> > caller did not specify __GFP_NOWARN.
>> >> >> >
>> >> >> > There is no way for a driver to guarantee that a memory allocation
>> >> >> > request will succeed -- failure is always an option. But obviously
>> >> >> > memory allocation failures are not bugs in the kernel.
>> >> >> >
>> >> >> > Are you saying that mm/page_alloc.c:warn_alloc() should produce
>> >> >> > something other than a WARNING?
>> >> >>
>> >> >>
>> >> >> The main thing I am saying is that we absolutely need a way for a
>> >> >> human or a computer program to be able to determine if there is
>> >> >> anything wrong with kernel or not.
>> >> > Doesn't it also produce a WARNING under other circumstances?
>> >>
>> >> No.
>> >>
>> >> OOM is not a WARNING and is easily distinguishable from BUG/WARNING.
>> >
>> >> Memory allocator does not print WARNINGs on allocation failures.
>> >
>> > Do you count dev_warn the same as WARN or WARN_ON? What about dev_WARN
>> > or pr_warn() or printk(KERN_WARNING...)? Maybe we're not talking about
>> > the same messages.
>> >
>> > The USB subsystem has got tons of dev_warn() and dev_err() calls.
>> > Relatively few (if any) of them are for kernel bugs.
>>
>>
>> I grep for "WARNING:". It is not possible to understand what function
>> printed messages on console.
>> Here are my current regexps:
>> https://github.com/google/syzkaller/blob/master/report/report.go#L29
>
> Ah, okay.
>
> So the take-home message is that we should use WARN* or dev_WARN or
> related functions only when reporting an actual kernel bug, whereas in
> other circumstances we should avoid mentioning "WARNING" or "BUG" in
> log output. In addition, memory allocations where the size is given by
> the user (and not limited) should always use the __GFP_NOWARN flag.

Yes, such simple convention that allows to "grep" for bugs would be awesome!


> I can audit the parts of the USB stack that I'm familiar with for these
> things. Anything else? What about "ERROR"? Your regexps don't appear
> to search for that.

Well, I don't know. Are there messages that are prefixed with ERROR
and indicate kernel bugs?
I can't possibly manually inspect all generated kernel output. I add
errors on case-by-case basis. BUG: and WARNING: where obvious starting
points. Then I discovered "INFO:", "Unable to handle kernel paging
request", "general protection fault:", "Kernel panic", "kernel BUG",
"Kernel BUG", "divide error:", "invalid opcode:", "unreferenced
object" and "UBSAN:".

There are chances that there are some bugs with ERROR: that I am just missing.
For example my dmesg is full of:

[2536674.524238] vmwrite error: reg 2800 value ffffffffffffffff (err -255)
[2536674.524256] Call Trace:
[2536674.524260] [<ffffffff8172a4bb>] dump_stack+0x64/0x82
[2536674.524267] [<ffffffffa1988ce2>] vmwrite_error+0x2c/0x2e [kvm_intel]
[2536674.524272] [<ffffffffa197d83f>] vmcs_writel+0x1f/0x30 [kvm_intel]
[2536674.524278] [<ffffffffa19847e0>] free_nested.part.64+0x70/0x170
[kvm_intel]
[2536674.524283] [<ffffffffa1984963>] vmx_free_vcpu+0x33/0x70 [kvm_intel]
[2536674.524295] [<ffffffffa0eced24>] kvm_arch_vcpu_free+0x44/0x50 [kvm]
[2536674.524308] [<ffffffffa0ecf892>] kvm_arch_destroy_vm+0xf2/0x1f0 [kvm]
[2536674.524311] [<ffffffff810c9b3d>] ? synchronize_srcu+0x1d/0x20
[2536674.524323] [<ffffffffa0eb7f16>] kvm_put_kvm+0x106/0x1c0 [kvm]
[2536674.524334] [<ffffffffa0eb8031>] kvm_vm_release+0x21/0x30 [kvm]
[2536674.524337] [<ffffffff811c2f34>] __fput+0xe4/0x260
[2536674.524340] [<ffffffff811c30fe>] ____fput+0xe/0x10
[2536674.524343] [<ffffffff8108aa2c>] task_work_run+0xac/0xd0
[2536674.524346] [<ffffffff8106be10>] do_exit+0x2b0/0xa60
[2536674.524349] [<ffffffff8106c63f>] do_group_exit+0x3f/0xa0
[2536674.524352] [<ffffffff8107c740>] get_signal_to_deliver+0x1d0/0x6f0
[2536674.524356] [<ffffffff81014418>] do_signal+0x48/0xa10
[2536674.524359] [<ffffffff8161449e>] ? SYSC_sendto+0x17e/0x1c0
[2536674.524362] [<ffffffff811d49b0>] ? do_vfs_ioctl+0x2e0/0x4c0
[2536674.524365] [<ffffffff81014e49>] do_notify_resume+0x69/0xb0
[2536674.524368] [<ffffffff8173b31a>] int_signal+0x12/0x17

But I have no idea if it is a bug or not. It's kind of printed with
pr_err. But on the other hand it is not printed WARN or BUG. Who
knows?..