[PATCH] drm: fix amdkfd use-after-free GP fault

From: Randy Dunlap
Date: Sat Nov 11 2017 - 01:16:25 EST


From: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>

Fix GP fault caused by dev_info() reference to a struct device*
after the device has been freed (use after free).
kfd_chardev_exit() frees the device so 'kfd_device' should not
be used after calling kfd_chardev_exit().

To reproduce, just load the module and then unload it.
Note that %RAX (and %R08) contain repeated 0x6b bytes, the slab
poison pattern (POISON_FREE) written into freed memory, which
confirms the use after free.
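
For illustration only, the teardown ordering can be sketched in plain
userspace C. Everything named fake_* below is a hypothetical stand-in
(it is not the amdkfd code, and struct fake_device is not the kernel's
struct device); the point is just that the final message must not
depend on an object that the earlier exit call has already freed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same byte value the kernel uses to poison freed slab objects. */
#define POISON_FREE 0x6b

/* Hypothetical stand-in for struct device; not the kernel definition. */
struct fake_device {
	const char *name;
};

/* Plays the role of the global 'kfd_device' pointer. */
static struct fake_device *fake_dev;

/* Hypothetical stand-in for the chardev init path: allocates the device. */
static int fake_chardev_init(void)
{
	fake_dev = malloc(sizeof(*fake_dev));
	if (!fake_dev)
		return -1;
	fake_dev->name = "kfd";
	return 0;
}

/*
 * Hypothetical stand-in for kfd_chardev_exit(): releases the device.
 * The memset only mimics slab poisoning for illustration; the pointer
 * is cleared here so that stale use is easy to detect in this sketch.
 */
static void fake_chardev_exit(void)
{
	if (!fake_dev)
		return;
	memset(fake_dev, POISON_FREE, sizeof(*fake_dev));
	free(fake_dev);
	fake_dev = NULL;
}

/*
 * Teardown in the fixed order: the device is released first, and the
 * final message is formatted without touching it (the analogue of
 * using pr_info() instead of dev_info()).
 * Returns the number of characters written into buf.
 */
static int fake_module_exit(char *buf, size_t len)
{
	assert(buf != NULL);
	fake_chardev_exit();
	/* fake_dev must not be dereferenced from here on. */
	return snprintf(buf, len, "amdkfd: Removed module\n");
}
```

The alternative would be to print the message before releasing the
device; this patch instead keeps the call order and drops the
dependency on the device pointer.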

[ 946.645809] calling kfd_module_init+0x0/0x1000 [amdkfd] @ 5785
[ 946.646025] CRAT table not found
[ 946.646027] Finished initializing topology ret=0
[ 946.646050] kfd kfd: Initialized module
[ 946.646058] initcall kfd_module_init+0x0/0x1000 [amdkfd] returned 0 after 233 usecs
[ 947.650189] general protection fault: 0000 [#1] PREEMPT SMP
[ 947.650192] Modules linked in: amdkfd(-) amd_iommu_v2 dw_hdmi cec rc_core mxm_wmi ttm dln2 gpio_max730x tps65218 lp3943 mcb crc4 fpga_mgr fpga_bridge fmc fuse ctr ccm af_packet nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT nf_reject_ipv4 iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack libcrc32c ip6table_filter ip6_tables x_tables coretemp hwmon intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel uvcvideo videobuf2_vmalloc aes_x86_64 videobuf2_memops hid_generic videobuf2_v4l2
[ 947.650224] crypto_simd snd_hda_codec_hdmi videobuf2_core usbmouse videodev snd_hda_codec_realtek glue_helper usbhid media hid snd_hda_codec_generic snd_hda_intel arc4 snd_hda_codec cryptd iwldvm sdhci_pci snd_hda_core sdhci mmc_core mac80211 snd_hwdep iTCO_wdt snd_pcm iTCO_vendor_support xhci_pci intel_cstate xhci_hcd i915 snd_seq snd_seq_device ehci_pci snd_timer toshiba_acpi ehci_hcd snd usbcore iwlwifi sparse_keymap e1000e cfg80211 input_leds ptp wmi sr_mod intel_uncore mei_me lpc_ich led_class cdrom usb_common pps_core mei joydev intel_rapl_perf mousedev evdev industrialio toshiba_bluetooth shpchp mac_hid rfkill soundcore serio_raw pcspkr toshiba_haps battery video thermal ac button sg autofs4 [last unloaded: radeon]
[ 947.650259] CPU: 3 PID: 5791 Comm: rmmod Not tainted 4.14.0-rc8 #4
[ 947.650260] Hardware name: TOSHIBA PORTEGE R835/Portable PC, BIOS Version 4.10 01/08/2013
[ 947.650262] task: ffff97144a3f2840 task.stack: ffffa51e409c4000
[ 947.650266] RIP: 0010:__dev_printk+0x29/0x90
[ 947.650267] RSP: 0018:ffffa51e409c7e48 EFLAGS: 00010202
[ 947.650269] RAX: 6b6b6b6b6b6b6b6b RBX: ffffffff97a579c3 RCX: 0000000100140013
[ 947.650270] RDX: ffffa51e409c7e78 RSI: ffff97139e360558 RDI: ffffffff97a579c3
[ 947.650271] RBP: ffffa51e409c7e68 R08: 6b6b6b6b6b6b6b6b R09: ffffa51e409c7e78
[ 947.650272] R10: ffff9714465c44b8 R11: ffff9714465c55e8 R12: 00007fff874111f7
[ 947.650273] R13: 0000000000000800 R14: 00000000006231c0 R15: 0000000000623010
[ 947.650275] FS: 00007fe8a109d700(0000) GS:ffff97144fac0000(0000) knlGS:0000000000000000
[ 947.650276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 947.650277] CR2: 000000000062cc88 CR3: 000000013fd43005 CR4: 00000000000606e0
[ 947.650279] Call Trace:
[ 947.650283] ? kobject_cleanup+0x75/0x170
[ 947.650284] _dev_info+0x57/0x60
[ 947.650288] ? kfree+0xf5/0x140
[ 947.650295] kfd_module_exit+0x37/0x39 [amdkfd]
[ 947.650299] SyS_delete_module+0x14d/0x260
[ 947.650302] ? exit_to_usermode_loop+0x60/0x87
[ 947.650305] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 947.650307] RIP: 0033:0x7fe8a0beff97
[ 947.650308] RSP: 002b:00007fff8740ffc8 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
[ 947.650310] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fe8a0beff97
[ 947.650311] RDX: 00007fe8a0c56920 RSI: 0000000000000800 RDI: 0000000000623228
[ 947.650312] RBP: 00000000006231c0 R08: 00007fe8a0ea3f20 R09: 00007fff8740ef41
[ 947.650313] R10: 000000002ef31b7d R11: 0000000000000202 R12: 00007fff8740efc0
[ 947.650314] R13: 0000000000000000 R14: 00000000006231c0 R15: 0000000000623010
[ 947.650316] Code: 00 00 55 49 89 d1 48 89 e5 53 48 89 fb 48 83 ec 18 48 85 f6 74 5f 4c 8b 46 50 4d 85 c0 74 2b 48 8b 86 88 00 00 00 48 85 c0 74 25 <48> 8b 08 0f be 7b 01 48 c7 c2 96 0a aa 97 31 c0 83 ef 30 e8 7f
[ 947.650339] RIP: __dev_printk+0x29/0x90 RSP: ffffa51e409c7e48
[ 947.650388] ---[ end trace c41965e147ae98ae ]---

Signed-off-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
Cc: Oded Gabbay <oded.gabbay@xxxxxxxxx>
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
---
drivers/gpu/drm/amd/amdkfd/kfd_module.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- lnx-414-rc8.orig/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ lnx-414-rc8/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -24,6 +24,7 @@
 #include <linux/sched.h>
 #include <linux/moduleparam.h>
 #include <linux/device.h>
+#include <linux/printk.h>
 #include "kfd_priv.h"

#define KFD_DRIVER_AUTHOR "AMD Inc. and others"
@@ -138,7 +139,7 @@ static void __exit kfd_module_exit(void)
 	kfd_topology_shutdown();
 	kfd_chardev_exit();
 	kfd_pasid_exit();
-	dev_info(kfd_device, "Removed module\n");
+	pr_info("amdkfd: Removed module\n");
 }

module_init(kfd_module_init);