Re: [BUG] rtlwifi: a crash in error handling code of rtl_pci_probe()

From: Larry Finger
Date: Tue May 14 2019 - 21:10:53 EST


On 5/14/19 8:07 AM, Jia-Ju Bai wrote:
In rtl_pci_probe(), a crash occurs when request_irq() fails in rtl_pci_intr_mode_legacy(), which is called from rtl_pci_intr_mode_decide().
The crash information is as follows:

[  108.271155] kasan: CONFIG_KASAN_INLINE enabled
[  108.271163] kasan: GPF could be caused by NULL-ptr deref or user memory access
......
[  108.271193] RIP: 0010:cfg80211_get_drvinfo+0xce/0x3b0 [cfg80211]
......
[  108.271235] Call Trace:
[  108.271245]  ethtool_get_drvinfo+0x110/0x640
[  108.271255]  ? cfg80211_get_chan_state+0x7e0/0x7e0 [cfg80211]
[  108.271261]  ? ethtool_get_settings+0x340/0x340
[  108.271268]  ? __read_once_size_nocheck.constprop.7+0x20/0x20
[  108.271279]  ? kasan_check_write+0x14/0x20
[  108.271284]  dev_ethtool+0x272d/0x4c20
[  108.271290]  ? unwind_get_return_address+0x66/0xb0
[  108.271299]  ? __save_stack_trace+0x92/0x100
[  108.271307]  ? ethtool_get_rxnfc+0x3f0/0x3f0
[  108.271316]  ? save_stack+0xa3/0xd0
[  108.271323]  ? save_stack+0x43/0xd0
[  108.271331]  ? ftrace_graph_ret_addr+0x2d/0x170
[  108.271338]  ? ftrace_graph_ret_addr+0x2d/0x170
[  108.271346]  ? ftrace_graph_ret_addr+0x2d/0x170
[  108.271354]  ? update_stack_state+0x3b2/0x670
[  108.271361]  ? update_stack_state+0x3b2/0x670
[  108.271370]  ? __read_once_size_nocheck.constprop.7+0x20/0x20
[  108.271379]  ? unwind_next_frame.part.5+0x19f/0xa60
[  108.271388]  ? bpf_prog_kallsyms_find+0x3e/0x270
[  108.271396]  ? is_bpf_text_address+0x1a/0x30
[  108.271408]  ? kernel_text_address+0x11d/0x130
[  108.271416]  ? __kernel_text_address+0x12/0x40
[  108.271423]  ? unwind_get_return_address+0x66/0xb0
[  108.271431]  ? __save_stack_trace+0x92/0x100
[  108.271440]  ? save_stack+0xa3/0xd0
[  108.271448]  ? udp_ioctl+0x35/0xe0
[  108.271457]  ? inet_ioctl+0x100/0x320
[  108.271466]  ? inet_stream_connect+0xb0/0xb0
[  108.271475]  ? alloc_file+0x60/0x480
[  108.271483]  ? alloc_file_pseudo+0x19d/0x270
[  108.271495]  ? sock_alloc_file+0x51/0x170
[  108.271502]  ? __sys_socket+0x12c/0x1f0
[  108.271510]  ? __x64_sys_socket+0x78/0xb0
[  108.271520]  ? do_syscall_64+0xb1/0x2e0
[  108.271529]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  108.271538]  ? kasan_check_read+0x11/0x20
[  108.271548]  ? mutex_lock+0x8f/0xe0
[  108.271557]  ? __mutex_lock_slowpath+0x20/0x20
[  108.271568]  dev_ioctl+0x1fb/0xae0
[  108.271576]  ? dev_ioctl+0x1fb/0xae0
[  108.271586]  ? _copy_from_user+0x71/0xd0
[  108.271594]  sock_do_ioctl+0x1e2/0x2f0
[  108.271602]  ? kmem_cache_alloc+0xf9/0x250
[  108.271611]  ? ___sys_recvmsg+0x5a0/0x5a0
[  108.271621]  ? apparmor_file_alloc_security+0x128/0x7e0
[  108.271630]  ? kasan_unpoison_shadow+0x35/0x50
[  108.271638]  ? kasan_kmalloc+0xad/0xe0
[  108.271652]  ? apparmor_file_alloc_security+0x128/0x7e0
[  108.271662]  ? apparmor_file_alloc_security+0x269/0x7e0
[  108.271670]  sock_ioctl+0x361/0x590
[  108.271678]  ? sock_ioctl+0x361/0x590
[  108.271686]  ? routing_ioctl+0x470/0x470
[  108.271695]  ? kasan_check_write+0x14/0x20
[  108.271703]  ? __mutex_init+0xba/0x130
[  108.271713]  ? percpu_counter_add_batch+0xc7/0x120
[  108.271722]  ? alloc_empty_file+0xae/0x150
[  108.271729]  ? routing_ioctl+0x470/0x470
[  108.271738]  do_vfs_ioctl+0x1ae/0xfe0
[  108.271745]  ? do_vfs_ioctl+0x1ae/0xfe0
[  108.271754]  ? alloc_file_pseudo+0x1ad/0x270
[  108.271762]  ? ioctl_preallocate+0x1e0/0x1e0
[  108.271770]  ? alloc_file+0x480/0x480
[  108.271778]  ? kasan_check_read+0x11/0x20
[  108.271786]  ? __fget+0x24d/0x320
[  108.271794]  ? iterate_fd+0x180/0x180
[  108.271802]  ? fd_install+0x52/0x60
[  108.271812]  ? security_file_ioctl+0x8c/0xb0
[  108.271820]  ksys_ioctl+0x99/0xb0
[  108.271829]  __x64_sys_ioctl+0x78/0xb0
[  108.271839]  do_syscall_64+0xb1/0x2e0
[  108.271857]  ? prepare_exit_to_usermode+0xc8/0x160
[  108.271871]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
......

I checked the driver source code, but cannot find the reason, so I only report the crash...
Can somebody give an explanation about this crash?

This crash is triggered by a runtime fuzzing tool named FIZZER written by us.

Your backtrace does not include any references to rtlwifi routines, and I have no idea what FIZZER does, so it is not possible for me to debug this. If the error you describe happens, the code should end up at label "fail3" in routine rtl_pci_probe(). Insert printk statements after every line of the following, and report the last good point before the error. It is certainly possible that something is being torn down that was never set up. Failure of both MSI and legacy interrupts is very unlikely, and we have probably never hit those conditions.

fail3:
	pci_set_drvdata(pdev, NULL);
	rtl_deinit_core(hw);

fail2:
	if (rtlpriv->io.pci_mem_start != 0)
		pci_iounmap(pdev, (void __iomem *)rtlpriv->io.pci_mem_start);

	pci_release_regions(pdev);
	complete(&rtlpriv->firmware_loading_complete);

fail1:
	if (hw)
		ieee80211_free_hw(hw);
	pci_disable_device(pdev);

	return err;
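
To illustrate what "report the last good point" means, here is a minimal userspace sketch of the checkpoint idea. It is not rtlwifi code and does not call any kernel API: the step names only mirror the teardown calls above, and simulate_teardown(), last_ok and the "faulty" index are hypothetical names made up for this illustration. In the driver, the fprintf() would be a printk() placed after each teardown line.

```c
#include <stdio.h>

/* Names of the teardown calls, in the order the fail3/fail2/fail1 path
 * runs them. These are labels only; nothing here touches real hardware. */
static const char *const steps[] = {
	"pci_set_drvdata",
	"rtl_deinit_core",
	"pci_iounmap",
	"pci_release_regions",
	"complete",
	"ieee80211_free_hw",
	"pci_disable_device",
};

#define NSTEPS (sizeof(steps) / sizeof(steps[0]))

/* Last checkpoint that was reached (hypothetical helper state). */
static const char *last_ok = "none";

/*
 * Walk the teardown steps, logging a checkpoint after each one.
 * 'faulty' is the index of the step that is assumed to crash;
 * pass -1 for a clean run. Returns 0 on a clean run, -1 otherwise.
 */
static int simulate_teardown(int faulty)
{
	size_t i;

	last_ok = "none";
	for (i = 0; i < NSTEPS; i++) {
		if ((int)i == faulty)
			return -1;	/* simulated crash inside this step */
		last_ok = steps[i];
		fprintf(stderr, "teardown: passed %s\n", last_ok);
	}
	return 0;
}
```

If the simulated crash is in step 2 (pci_iounmap), the last line logged names rtl_deinit_core, which would point at pci_iounmap as the first suspect. The same reading applies to the printk output from the real driver.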

Larry