Re: [PATCH] drm: fix amdkfd use-after-free GP fault
From: Oded Gabbay
Date: Tue Nov 28 2017 - 17:43:47 EST
It was sent to Dave Airlie (drm maintainer) to be included in 4.15-rc2
or 4.15-rc3 (depending on when Dave sends his drm fixes).
Oded
On Wed, Nov 29, 2017 at 12:41 AM, Randy Dunlap <rdunlap@xxxxxxxxxxxxx> wrote:
> On 11/13/2017 08:09 AM, Oded Gabbay wrote:
>> On Sat, Nov 11, 2017 at 8:16 AM, Randy Dunlap <rdunlap@xxxxxxxxxxxxx> wrote:
>>> From: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
>>>
>>> Fix GP fault caused by dev_info() reference to a struct device*
>>> after the device has been freed (use after free).
>>> kfd_chardev_exit() frees the device so 'kfd_device' should not
>>> be used after calling kfd_chardev_exit().
>>>
>>> To reproduce, just load the module and then unload it.
>>> Note that %RAX contains repeated 0x6b, which is use-after-free
>>> poisoning.
>>>
>>> [ 946.645809] calling kfd_module_init+0x0/0x1000 [amdkfd] @ 5785
>>> [ 946.646025] CRAT table not found
>>> [ 946.646027] Finished initializing topology ret=0
>>> [ 946.646050] kfd kfd: Initialized module
>>> [ 946.646058] initcall kfd_module_init+0x0/0x1000 [amdkfd] returned 0 after 233 usecs
>>> [ 947.650189] general protection fault: 0000 [#1] PREEMPT SMP
>>> [ 947.650192] Modules linked in: amdkfd(-) amd_iommu_v2 dw_hdmi cec rc_core mxm_wmi ttm dln2 gpio_max730x tps65218 lp3943 mcb crc4 fpga_mgr fpga_bridge fmc fuse ctr ccm af_packet nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT nf_reject_ipv4 iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack libcrc32c ip6table_filter ip6_tables x_tables coretemp hwmon intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel uvcvideo videobuf2_vmalloc aes_x86_64 videobuf2_memops hid_generic videobuf2_v4l2
>>> [ 947.650224] crypto_simd snd_hda_codec_hdmi videobuf2_core usbmouse videodev snd_hda_codec_realtek glue_helper usbhid media hid snd_hda_codec_generic snd_hda_intel arc4 snd_hda_codec cryptd iwldvm sdhci_pci snd_hda_core sdhci mmc_core mac80211 snd_hwdep iTCO_wdt snd_pcm iTCO_vendor_support xhci_pci intel_cstate xhci_hcd i915 snd_seq snd_seq_device ehci_pci snd_timer toshiba_acpi ehci_hcd snd usbcore iwlwifi sparse_keymap e1000e cfg80211 input_leds ptp wmi sr_mod intel_uncore mei_me lpc_ich led_class cdrom usb_common pps_core mei joydev intel_rapl_perf mousedev evdev industrialio toshiba_bluetooth shpchp mac_hid rfkill soundcore serio_raw pcspkr toshiba_haps battery video thermal ac button sg autofs4 [last unloaded: radeon]
>>> [ 947.650259] CPU: 3 PID: 5791 Comm: rmmod Not tainted 4.14.0-rc8 #4
>>> [ 947.650260] Hardware name: TOSHIBA PORTEGE R835/Portable PC, BIOS Version 4.10 01/08/2013
>>> [ 947.650262] task: ffff97144a3f2840 task.stack: ffffa51e409c4000
>>> [ 947.650266] RIP: 0010:__dev_printk+0x29/0x90
>>> [ 947.650267] RSP: 0018:ffffa51e409c7e48 EFLAGS: 00010202
>>> [ 947.650269] RAX: 6b6b6b6b6b6b6b6b RBX: ffffffff97a579c3 RCX: 0000000100140013
>>> [ 947.650270] RDX: ffffa51e409c7e78 RSI: ffff97139e360558 RDI: ffffffff97a579c3
>>> [ 947.650271] RBP: ffffa51e409c7e68 R08: 6b6b6b6b6b6b6b6b R09: ffffa51e409c7e78
>>> [ 947.650272] R10: ffff9714465c44b8 R11: ffff9714465c55e8 R12: 00007fff874111f7
>>> [ 947.650273] R13: 0000000000000800 R14: 00000000006231c0 R15: 0000000000623010
>>> [ 947.650275] FS: 00007fe8a109d700(0000) GS:ffff97144fac0000(0000) knlGS:0000000000000000
>>> [ 947.650276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 947.650277] CR2: 000000000062cc88 CR3: 000000013fd43005 CR4: 00000000000606e0
>>> [ 947.650279] Call Trace:
>>> [ 947.650283] ? kobject_cleanup+0x75/0x170
>>> [ 947.650284] _dev_info+0x57/0x60
>>> [ 947.650288] ? kfree+0xf5/0x140
>>> [ 947.650295] kfd_module_exit+0x37/0x39 [amdkfd]
>>> [ 947.650299] SyS_delete_module+0x14d/0x260
>>> [ 947.650302] ? exit_to_usermode_loop+0x60/0x87
>>> [ 947.650305] entry_SYSCALL_64_fastpath+0x1e/0xa9
>>> [ 947.650307] RIP: 0033:0x7fe8a0beff97
>>> [ 947.650308] RSP: 002b:00007fff8740ffc8 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
>>> [ 947.650310] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fe8a0beff97
>>> [ 947.650311] RDX: 00007fe8a0c56920 RSI: 0000000000000800 RDI: 0000000000623228
>>> [ 947.650312] RBP: 00000000006231c0 R08: 00007fe8a0ea3f20 R09: 00007fff8740ef41
>>> [ 947.650313] R10: 000000002ef31b7d R11: 0000000000000202 R12: 00007fff8740efc0
>>> [ 947.650314] R13: 0000000000000000 R14: 00000000006231c0 R15: 0000000000623010
>>> [ 947.650316] Code: 00 00 55 49 89 d1 48 89 e5 53 48 89 fb 48 83 ec 18 48 85 f6 74 5f 4c 8b 46 50 4d 85 c0 74 2b 48 8b 86 88 00 00 00 48 85 c0 74 25 <48> 8b 08 0f be 7b 01 48 c7 c2 96 0a aa 97 31 c0 83 ef 30 e8 7f
>>> [ 947.650339] RIP: __dev_printk+0x29/0x90 RSP: ffffa51e409c7e48
>>> [ 947.650388] ---[ end trace c41965e147ae98ae ]---
>>>
>>> Signed-off-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
>>> Cc: Oded Gabbay <oded.gabbay@xxxxxxxxx>
>>> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>> ---
>>> drivers/gpu/drm/amd/amdkfd/kfd_module.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> --- lnx-414-rc8.orig/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>> +++ lnx-414-rc8/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>> @@ -24,6 +24,7 @@
>>> #include <linux/sched.h>
>>> #include <linux/moduleparam.h>
>>> #include <linux/device.h>
>>> +#include <linux/printk.h>
>>> #include "kfd_priv.h"
>>>
>>> #define KFD_DRIVER_AUTHOR "AMD Inc. and others"
>>> @@ -138,7 +139,7 @@ static void __exit kfd_module_exit(void)
>>> kfd_topology_shutdown();
>>> kfd_chardev_exit();
>>> kfd_pasid_exit();
>>> - dev_info(kfd_device, "Removed module\n");
>>> + pr_info("amdkfd: Removed module\n");
>>> }
>>>
>>> module_init(kfd_module_init);
>>>
>>>
>>
>> Thanks!
>> Applied to -fixes
>
> Hi and ping.
>
> When do you plan to merge this fix?
>
> thanks,
> --
> ~Randy
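
For readers following the thread: the repeated 0x6b in %RAX mentioned in
the patch description is the byte that the kernel's slab debugging writes
over freed objects (POISON_FREE). Below is a minimal userspace sketch of
why touching the device after kfd_chardev_exit() loads that pattern. It is
an illustration under stated assumptions, not the driver code: the struct
fake_device type, its single field, and all fake_*/demo_* names are
hypothetical, and the "free" is modeled by poisoning a static buffer so
the demo itself contains no real use-after-free.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Byte the kernel's slab debugging writes over freed objects (POISON_FREE). */
#define POISON_FREE 0x6b

/* Hypothetical stand-in for struct device; not the real kernel layout. */
struct fake_device {
    uint64_t driver;   /* plays the role of a pointer the logging path loads */
};

/* Static backing store: the memory stays valid, so the demo is well defined. */
static struct fake_device storage = { .driver = 0x1000 };
static struct fake_device *fake_kfd_device = &storage;

/* Models kfd_chardev_exit(): the object is "freed" and poisoned. */
static void fake_chardev_exit(void)
{
    memset(&storage, POISON_FREE, sizeof(storage));
}

/* Models the field load that a dev_info()-style call would perform. */
static uint64_t load_driver_field(void)
{
    return fake_kfd_device->driver;
}

/* Returns 1 if the value is the repeated-0x6b pattern seen in %RAX and %R08. */
static int looks_poisoned(uint64_t v)
{
    return v == UINT64_C(0x6b6b6b6b6b6b6b6b);
}

/* Runs the buggy ordering: tear down first, then touch the device. */
static int demo_use_after_exit(void)
{
    fake_chardev_exit();
    uint64_t v = load_driver_field();
    printf("loaded 0x%016llx\n", (unsigned long long)v);
    return looks_poisoned(v);
}
```

Calling demo_use_after_exit() from a main() returns 1, since the loaded
field is all 0x6b bytes. In the kernel that value is then dereferenced as
a pointer, which gives the GP fault in __dev_printk. The patch avoids the
load altogether by logging with pr_info(), which takes no device argument.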