Re: [RFT v2] iommu/amd: use subsys_initcall() on amdv2 iommu

From: Oded Gabbay
Date: Mon Apr 18 2016 - 02:48:47 EST


On Wed, Apr 13, 2016 at 1:07 AM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
> On Mon, Apr 11, 2016 at 03:52:43PM +0200, Christian KÃnig wrote:
>> Am 11.04.2016 um 15:39 schrieb Oded Gabbay:
>> >On Mon, Apr 11, 2016 at 4:28 PM, Christian KÃnig
>> ><christian.koenig@xxxxxxx> wrote:
>> >>Am 09.04.2016 um 02:25 schrieb Luis R. Rodriguez:
>> >>>On Tue, Mar 29, 2016 at 10:41 AM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
>> >>>>We need to ensure amd iommu v2 initializes before
>> >>>>driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
>> >>>>to do this make its init routine a subsys_initcall() which
>> >>>>ensures its load init is called first than modules when
>> >>>>built-in.
>> >>>>
>> >>>>This reverts the old work around implemented through commit
>> >>>>1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
>> >>>>instead of making the dependency implicit by linker order this
>> >>>>makes the ordering requirement explicit through proper kernel
>> >>>>APIs.
>> >>>>
>> >>>>Cc: Oded Gabbay <oded.gabbay@xxxxxxx>
>> >>>>Cc: Christian KÃnig <christian.koenig@xxxxxxx>
>> >>>>Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxxxx>
>> >>
>> >>Sorry for not responding earlier. Just coming back to all the stuff on my TODO list.
>> >>
>> >>Patch is Acked-by: Christian KÃnig <christian.koenig@xxxxxxx>
>> >
>> >Christian,
>> >Just wanted to be sure if you tested this patch-set or not.
>>
>> I did NOT tested it. If AMD IOMMU requires something which will now
>> initialize after the IOMMU module we will obviously run into trouble
>> again.
>>
>> I assumed that the creator of the patch did some testing.
>
> Nope, hence [RTF] Request For Testing.
>
>> >I don't think it should be merged without testing. If you already
>> >tested it than fine. If not, I think I can do it in the next week or
>> >so (just came back from PTO).
>>
>> Yeah, agree totally.
>
> Agreed, please let me know if someone is able to test and confirm
> this works. It should work.
>
> Luis

Hi,
So I finally got to test this patch and it's not working.
The reason is that AMD IOMMUv2 gets initialized *before* AMD IOMMUv1 driver !
So, we get this from dmesg:

[ 0.439612] AMD IOMMUv2 driver by Joerg Roedel <jroedel@xxxxxxx>
[ 0.439614] AMD IOMMUv2 functionality not available on this system

later we get:
[ 1.084749] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 1.084750] AMD-Vi: Extended features: PPR GT IA PC
[ 1.084754] AMD-Vi: Interrupt remapping enabled
[ 1.085125] AMD-Vi: Lazy IO/TLB flushing enabled


And when I run some HSA samples, at the tear-down process stage we get this:

[ 7937.740306] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000020
[ 7937.740932] IP: [<ffffffff819661c2>] mutex_lock+0x12/0x30
[ 7937.741534] PGD 55a57067 PUD 55a56067 PMD 0
[ 7937.742127] Oops: 0002 [#1] SMP
[ 7937.742709] Modules linked in: xt_CHECKSUM iptable_mangle
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge
stp llc ebtable_filter ebtables ip6table_filter ip6_tables
snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi
snd_hda_intel snd_hda_codec kvm_amd kvm snd_hda_core snd_hwdep snd_seq
edac_mce_amd edac_core snd_seq_device irqbypass crct10dif_pclmul
crc32_pclmul ppdev pcspkr crc32c_intel snd_pcm fuse
ghash_clmulni_intel sp5100_tco acpi_cpufreq parport_pc i2c_piix4
snd_timer fam15h_power k10temp video tpm_infineon tpm_tis parport
shpchp snd tpm soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc
serio_raw uas usb_storage r8169 mii fjes
[ 7937.746867] CPU: 2 PID: 2260 Comm: kfdtest Not tainted 4.6.0-rc3 #5
[ 7937.747600] Hardware name: Gigabyte Technology Co., Ltd. To be
filled by O.E.M./F2A88XM-D3H, BIOS F5 01/09/2014
[ 7937.748363] task: ffff88006944d880 ti: ffff88005b734000 task.ti:
ffff88005b734000
[ 7937.749136] RIP: 0010:[<ffffffff819661c2>] [<ffffffff819661c2>]
mutex_lock+0x12/0x30
[ 7937.749918] RSP: 0018:ffff88005b737cc8 EFLAGS: 00010246
[ 7937.750695] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
[ 7937.751491] RDX: 0000000080000000 RSI: ffff880000000000 RDI: 0000000000000020
[ 7937.752286] RBP: ffff88005b737cd0 R08: 0000000000000002 R09: ffffffff815321e2
[ 7937.753084] R10: ffffea00016f71c0 R11: 0000000000000246 R12: ffff880035239680
[ 7937.753884] R13: 0000000000000000 R14: 0000000000000001 R15: ffff88005b737d10
[ 7937.754684] FS: 00007fe0bc68a740(0000) GS:ffff88006d500000(0000)
knlGS:0000000000000000
[ 7937.755504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7937.756328] CR2: 0000000000000020 CR3: 0000000057198000 CR4: 00000000000406e0
[ 7937.757168] Stack:
[ 7937.758004] 0000000000000000 ffff88005b737d80 ffffffff810bae2c
ffff88005b737d48
[ 7937.758872] 0000000000000020 ffff880000000000 0000000000000000
ffff88005b737d38
[ 7937.759737] ffff88005b737d38 ffff88005b737d10 ffff88005b737d10
ffff8800ffffffff
[ 7937.760606] Call Trace:
[ 7937.761474] [<ffffffff810bae2c>] flush_workqueue+0x9c/0x530
[ 7937.762348] [<ffffffff818107a4>] ? amd_iommu_domain_clear_gcr3+0x84/0xa0
[ 7937.763225] [<ffffffff818157d0>] mn_release+0x60/0x70
[ 7937.764107] [<ffffffff81210c64>] __mmu_notifier_release+0x44/0xc0
[ 7937.764998] [<ffffffff811efbca>] exit_mmap+0x15a/0x170
[ 7937.765889] [<ffffffff811e8b33>] ? handle_mm_fault+0x14e3/0x1d50
[ 7937.766784] [<ffffffff814b06d0>] ? n_tty_open+0xd0/0xd0
[ 7937.767677] [<ffffffff81124cac>] ? exit_robust_list+0x5c/0x110
[ 7937.768573] [<ffffffff810a199b>] mmput+0x5b/0x100
[ 7937.769463] [<ffffffff810a8566>] do_exit+0x276/0xb30
[ 7937.770349] [<ffffffff810684a5>] ? __do_page_fault+0x205/0x4d0
[ 7937.771226] [<ffffffff810a8ea7>] do_group_exit+0x47/0xb0
[ 7937.772090] [<ffffffff810a8f24>] SyS_exit_group+0x14/0x20
[ 7937.772941] [<ffffffff819684f2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[ 7937.773801] Code: 83 f8 01 0f 85 6d ff ff ff eb db e8 f9 e4 73 ff
66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8
1e e4 ff ff <f0> ff 0b 79 08 48 89 df e8 c1 fe ff ff 65 48 8b 04 25 c0
d2 00
[ 7937.775671] RIP [<ffffffff819661c2>] mutex_lock+0x12/0x30
[ 7937.776588] RSP <ffff88005b737cc8>
[ 7937.777499] CR2: 0000000000000020
[ 7937.781746] ---[ end trace 46aeb31f738c07ff ]---
[ 7937.781748] Fixing recursive fault but reboot is needed!