[DRM] drm_get_connector_name internal static string buffer

From: Dmitry Malkin
Date: Mon Mar 24 2014 - 10:37:32 EST


Hello guys,

I've been playing with reloading intel gfx driver (i915) in a cycle, for a while,
and at some point I've found a non-deterministic kernel crash with a highly-variable
iteration dependency -- 2 to 200 driver reload iterations.

The apparent race is over the shared internal string buffer in drm_get_connector_name().
It is mostly harmless, due to its results being mostly used for log output, but in at least one
case  -- drm_sysfs_connector_add() -- this leads to a more critical error.

Race scenario:

- drm_sysfs_connector_add()
   - drm_get_connector_name()
vs.
- anything that generates log messages involving DRM connectors
   - drm_get_connector_name()

 (and many other from drm_crtc.c) shares with caller const char* to internal static char buffer.
If something call it from other thread, while main thread strore and use returned pointer it may overwrite connector name.

Here are we go: registering HDMI connecter  (drm_sysfs_connector_add store and use pointer from drm_get_connector_name)
and the same time got VGA connector name down through the stack. (the second thread is upowerd who watch continuously sysfs)

Mar 24 14:23:04 haswell01 kernel: [  417.570043] ------------[ cut here ]------------
Mar 24 14:23:04 haswell01 kernel: [  417.570045] WARNING: CPU: 0 PID: 12700 at /build/buildd/linux-3.13.0/lib/kobject.c:223 kobject_add_internal+0x224/0x330()
Mar 24 14:23:04 haswell01 kernel: [  417.570046] kobject_add_internal failed for card0-VGA-1 with -EEXIST, don't try to register things with the same name in the same directory.
Mar 24 14:23:04 haswell01 kernel: [  417.570047] Modules linked in: i915(+) video drm_kms_helper drm i2c_algo_bit snd_hda_codec_realtek snd_hda_codec_hdmi bnep rfcomm bluetooth x86_pkg_temp_thermal intel_p
owerclamp coretemp kvm_intel snd_hda_codec kvm snd_hwdep snd_pcm hid_generic snd_page_alloc crct10dif_pclmul snd_seq_midi crc32_pclmul ghash_clmulni_intel snd_seq_midi_event usbhid snd_rawmidi hid aesni_in
tel aes_x86_64 lrw gf128mul ppdev glue_helper ablk_helper cryptd snd_seq snd_seq_device snd_timer snd mei_me psmouse lpc_ich soundcore mei mac_hid parport_pc serio_raw nls_iso8859_1 lp parport e1000e ahci
ptp libahci pps_core [last unloaded: video]
Mar 24 14:23:04 haswell01 kernel: [  417.570068] CPU: 0 PID: 12700 Comm: modprobe Tainted: G        W    3.13.0-19-generic #39-Ubuntu
Mar 24 14:23:04 haswell01 kernel: [  417.570069] Hardware name:                  /DQ87PG, BIOS PGQ8710H.86A.0144.2014.0113.1604 01/13/2014
Mar 24 14:23:04 haswell01 kernel: [  417.570069]  0000000000000009 ffff8804051295f8 ffffffff81711075 ffff880405129640
Mar 24 14:23:04 haswell01 kernel: [  417.570071]  ffff880405129630 ffffffff810662cd ffff88040776a410 00000000ffffffef
Mar 24 14:23:04 haswell01 kernel: [  417.570074]  0000000000000000 ffff8804048dcc10 ffff880407769000 ffff880405129690
Mar 24 14:23:04 haswell01 kernel: [  417.570076] Call Trace:
Mar 24 14:23:04 haswell01 kernel: [  417.570078]  [<ffffffff81711075>] dump_stack+0x45/0x56
Mar 24 14:23:04 haswell01 kernel: [  417.570080]  [<ffffffff810662cd>] warn_slowpath_common+0x7d/0xa0
Mar 24 14:23:04 haswell01 kernel: [  417.570081]  [<ffffffff8106633c>] warn_slowpath_fmt+0x4c/0x50
Mar 24 14:23:04 haswell01 kernel: [  417.570082]  [<ffffffff81230083>] ? sysfs_create_dir_ns+0x73/0xc0
Mar 24 14:23:04 haswell01 kernel: [  417.570084]  [<ffffffff8135b9a4>] kobject_add_internal+0x224/0x330
Mar 24 14:23:04 haswell01 kernel: [  417.570086]  [<ffffffff8135bed5>] kobject_add+0x65/0xb0
Mar 24 14:23:04 haswell01 kernel: [  417.570088]  [<ffffffff814874f5>] device_add+0x125/0x640
Mar 24 14:23:04 haswell01 kernel: [  417.570090]  [<ffffffff81487c20>] device_create_groups_vargs+0xe0/0x110
Mar 24 14:23:04 haswell01 kernel: [  417.570092]  [<ffffffff81487cb1>] device_create+0x41/0x50
Mar 24 14:23:04 haswell01 kernel: [  417.570097]  [<ffffffffa0137fc9>] drm_sysfs_connector_add+0x69/0x230 [drm]
Mar 24 14:23:04 haswell01 kernel: [  417.570110]  [<ffffffffa0549ca1>] intel_hdmi_init_connector+0x111/0x260 [i915]
Mar 24 14:23:04 haswell01 kernel: [  417.570119]  [<ffffffffa0541670>] intel_ddi_init+0x270/0x2a0 [i915]
Mar 24 14:23:04 haswell01 kernel: [  417.570130]  [<ffffffffa0533176>] intel_setup_outputs+0x4c6/0x750 [i915]
Mar 24 14:23:04 haswell01 kernel: [  417.570139]  [<ffffffffa05370f7>] intel_modeset_init+0x607/0x8f0 [i915]
Mar 24 14:23:04 haswell01 kernel: [  417.570147]  [<ffffffffa04f90a4>] i915_driver_load+0xbb4/0xe70 [i915]
Mar 24 14:23:04 haswell01 kernel: [  417.570153]  [<ffffffffa0134cd2>] drm_dev_register+0xa2/0x1e0 [drm]
Mar 24 14:23:04 haswell01 kernel: [  417.570158]  [<ffffffffa0136bc2>] drm_get_pci_dev+0x92/0x140 [drm]
Mar 24 14:23:04 haswell01 kernel: [  417.570166]  [<ffffffffa04f567c>] i915_pci_probe+0x3c/0x90 [i915]
Mar 24 14:23:04 haswell01 kernel: [  417.570168]  [<ffffffff8139e0e5>] local_pci_probe+0x45/0xa0
Mar 24 14:23:04 haswell01 kernel: [  417.570170]  [<ffffffff8139f385>] ? pci_match_device+0xc5/0xd0
Mar 24 14:23:04 haswell01 kernel: [  417.570172]  [<ffffffff8139f4a9>] pci_device_probe+0xd9/0x130
Mar 24 14:23:04 haswell01 kernel: [  417.570174]  [<ffffffff8148a7c5>] driver_probe_device+0x125/0x3b0
Mar 24 14:23:04 haswell01 kernel: [  417.570176]  [<ffffffff8148ab23>] __driver_attach+0x93/0xa0
Mar 24 14:23:04 haswell01 kernel: [  417.570178]  [<ffffffff8148aa90>] ? __device_attach+0x40/0x40
Mar 24 14:23:04 haswell01 kernel: [  417.570179]  [<ffffffff81488733>] bus_for_each_dev+0x63/0xa0
Mar 24 14:23:04 haswell01 kernel: [  417.570181]  [<ffffffff8148a17e>] driver_attach+0x1e/0x20
Mar 24 14:23:04 haswell01 kernel: [  417.570183]  [<ffffffff81489d60>] bus_add_driver+0x180/0x250
Mar 24 14:23:04 haswell01 kernel: [  417.570185]  [<ffffffffa01be000>] ? 0xffffffffa01bdfff
Mar 24 14:23:04 haswell01 kernel: [  417.570187]  [<ffffffff8148b1a4>] driver_register+0x64/0xf0
Mar 24 14:23:04 haswell01 kernel: [  417.570189]  [<ffffffffa01be000>] ? 0xffffffffa01bdfff
Mar 24 14:23:04 haswell01 kernel: [  417.570191]  [<ffffffff8139da7c>] __pci_register_driver+0x4c/0x50
Mar 24 14:23:04 haswell01 kernel: [  417.570196]  [<ffffffffa0136d8a>] drm_pci_init+0x11a/0x130 [drm]
Mar 24 14:23:04 haswell01 kernel: [  417.570198]  [<ffffffffa01be000>] ? 0xffffffffa01bdfff
Mar 24 14:23:04 haswell01 kernel: [  417.570205]  [<ffffffffa01be066>] i915_init+0x66/0x68 [i915]
Mar 24 14:23:04 haswell01 kernel: [  417.570207]  [<ffffffff8100214a>] do_one_initcall+0xfa/0x1b0
Mar 24 14:23:04 haswell01 kernel: [  417.570208]  [<ffffffff81058ae3>] ? set_memory_nx+0x43/0x50
Mar 24 14:23:04 haswell01 kernel: [  417.570211]  [<ffffffff810e091d>] load_module+0x12dd/0x1b40
Mar 24 14:23:04 haswell01 kernel: [  417.570213]  [<ffffffff810dc3a0>] ? store_uevent+0x40/0x40
Mar 24 14:23:04 haswell01 kernel: [  417.570215]  [<ffffffff810e12f6>] SyS_finit_module+0x86/0xb0
Mar 24 14:23:04 haswell01 kernel: [  417.570217]  [<ffffffff81721c7f>] tracesys+0xe1/0xe6
Mar 24 14:23:04 haswell01 kernel: [  417.570219] ---[ end trace 8cd466c13137554f ]---

How to reproduce: load/reload i915 driver many times (in my case it happens after 2-200 attempts)
and you will got a sysfs dup warning and then while unloading driver it will crash (because of malformed connectors list):


Mar 24 14:23:16 haswell01 kernel: [  429.326174] BUG: unable to handle kernel NULL pointer dereference at 000000000000002f
Mar 24 14:23:16 haswell01 kernel: [  429.326177] IP: [<ffffffff8122de46>] sysfs_remove_file_ns+0x6/0x20
Mar 24 14:23:16 haswell01 kernel: [  429.326184] PGD 3f588d067 PUD 406361067 PMD 0
Mar 24 14:23:16 haswell01 kernel: [  429.326187] Oops: 0000 [#1] SMP
Mar 24 14:23:16 haswell01 kernel: [  429.326189] Modules linked in: i915 video drm_kms_helper drm i2c_algo_bit snd_hda_codec_realtek snd_hda_codec_hdmi bnep rfcomm bluetooth x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec kvm snd_hwdep snd_pcm hid_generic snd_page_alloc crct10dif_pclmul snd_seq_midi crc32_pclmul ghash_clmulni_intel snd_seq_midi_event usbhid snd_rawmidi hid aesni_intel aes_x86_64 lrw gf128mul ppdev glue_helper ablk_helper cryptd snd_seq snd_seq_device snd_timer snd mei_me psmouse lpc_ich soundcore mei mac_hid parport_pc serio_raw nls_iso8859_1 lp parport e1000e ahci ptp libahci pps_core [last unloaded: video]
Mar 24 14:23:16 haswell01 kernel: [  429.326219] CPU: 0 PID: 13302 Comm: dd Tainted: G        W    3.13.0-19-generic #39-Ubuntu
Mar 24 14:23:16 haswell01 kernel: [  429.326221] Hardware name:                  /DQ87PG, BIOS PGQ8710H.86A.0144.2014.0113.1604 01/13/2014
Mar 24 14:23:16 haswell01 kernel: [  429.326222] task: ffff8803f5aaafe0 ti: ffff880404d20000 task.ti: ffff880404d20000
Mar 24 14:23:16 haswell01 kernel: [  429.326224] RIP: 0010:[<ffffffff8122de46>]  [<ffffffff8122de46>] sysfs_remove_file_ns+0x6/0x20
Mar 24 14:23:16 haswell01 kernel: [  429.326227] RSP: 0018:ffff880404d21d20  EFLAGS: 00010246
Mar 24 14:23:16 haswell01 kernel: [  429.326228] RAX: 0000000000000000 RBX: ffffffffa0160160 RCX: ffffffffa0159356
Mar 24 14:23:16 haswell01 kernel: [  429.326230] RDX: 0000000000000000 RSI: ffffffffa0160140 RDI: ffffffffffffffff
Mar 24 14:23:16 haswell01 kernel: [  429.326231] RBP: ffff880404d21d30 R08: ffffffffa0161120 R09: 000000000000fffe
Mar 24 14:23:16 haswell01 kernel: [  429.326232] R10: 0000000000000000 R11: ffffea0010199f80 R12: ffff880407769000
Mar 24 14:23:16 haswell01 kernel: [  429.326234] R13: ffff880035d46af8 R14: ffff880035d46820 R15: ffffffffffffffed
Mar 24 14:23:16 haswell01 kernel: [  429.326236] FS:  00007f472f4ec740(0000) GS:ffff88041ea00000(0000) knlGS:0000000000000000
Mar 24 14:23:16 haswell01 kernel: [  429.326238] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 24 14:23:16 haswell01 kernel: [  429.326239] CR2: 000000000000002f CR3: 00000004050d3000 CR4: 00000000001407f0
Mar 24 14:23:16 haswell01 kernel: [  429.326241] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 24 14:23:16 haswell01 kernel: [  429.326242] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 24 14:23:16 haswell01 kernel: [  429.326243] Stack:
Mar 24 14:23:16 haswell01 kernel: [  429.326244]  ffff880404d21d30 ffffffff81485a59 ffff880404d21d50 ffffffffa0137eb7
Mar 24 14:23:16 haswell01 kernel: [  429.326247]  ffff880407769000 ffff880035d46800 ffff880404d21d80 ffffffffa05381d0
Mar 24 14:23:16 haswell01 kernel: [  429.326250]  ffff8803f24e4000 ffff880035d46800 ffff8804081b9000 000000000000000c
Mar 24 14:23:16 haswell01 kernel: [  429.326253] Call Trace:
Mar 24 14:23:16 haswell01 kernel: [  429.326257]  [<ffffffff81485a59>] ? device_remove_file+0x19/0x20
Mar 24 14:23:16 haswell01 kernel: [  429.326267]  [<ffffffffa0137eb7>] drm_sysfs_connector_remove+0x57/0x90 [drm]
Mar 24 14:23:16 haswell01 kernel: [  429.326282]  [<ffffffffa05381d0>] intel_modeset_cleanup+0xd0/0x100 [i915]
Mar 24 14:23:16 haswell01 kernel: [  429.326288]  [<ffffffffa04f95f0>] i915_driver_unload+0x290/0x340 [i915]
Mar 24 14:23:16 haswell01 kernel: [  429.326294]  [<ffffffffa01346fc>] drm_dev_unregister+0x2c/0xe0 [drm]
Mar 24 14:23:16 haswell01 kernel: [  429.326299]  [<ffffffffa01347eb>] drm_put_dev+0x3b/0x70 [drm]
Mar 24 14:23:16 haswell01 kernel: [  429.326304]  [<ffffffffa04f558d>] i915_pci_remove+0x1d/0x20 [i915]
Mar 24 14:23:16 haswell01 kernel: [  429.326307]  [<ffffffff8139efbb>] pci_device_remove+0x3b/0xb0
Mar 24 14:23:16 haswell01 kernel: [  429.326310]  [<ffffffff8148a24f>] __device_release_driver+0x7f/0xf0
Mar 24 14:23:16 haswell01 kernel: [  429.326313]  [<ffffffff8148a2e3>] device_release_driver+0x23/0x30
Mar 24 14:23:16 haswell01 kernel: [  429.326315]  [<ffffffff8148905d>] unbind_store+0xbd/0xe0
Mar 24 14:23:16 haswell01 kernel: [  429.326317]  [<ffffffff81488484>] drv_attr_store+0x24/0x40
Mar 24 14:23:16 haswell01 kernel: [  429.326320]  [<ffffffff8122e698>] sysfs_write_file+0x128/0x1c0
Mar 24 14:23:16 haswell01 kernel: [  429.326323]  [<ffffffff811b88c4>] vfs_write+0xb4/0x1f0
Mar 24 14:23:16 haswell01 kernel: [  429.326325]  [<ffffffff811b92f9>] SyS_write+0x49/0xa0
Mar 24 14:23:16 haswell01 kernel: [  429.326328]  [<ffffffff81721c7f>] tracesys+0xe1/0xe6
Mar 24 14:23:16 haswell01 kernel: [  429.326329] Code: 58 c7 81 e8 7d 98 4e 00 48 83 c4 50 89 d8 5b 41 5c 41 5d 5d c3 bb fe ff ff ff eb e0 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 <48> 8b 7f 30 48 8b 36 48 89 e5 e8 bb 22 00 00 5d c3 66 0f 1f 84
Mar 24 14:23:16 haswell01 kernel: [  429.326351] RIP  [<ffffffff8122de46>] sysfs_remove_file_ns+0x6/0x20
Mar 24 14:23:16 haswell01 kernel: [  429.326353]  RSP <ffff880404d21d20>
Mar 24 14:23:16 haswell01 kernel: [  429.326355] CR2: 000000000000002f
Mar 24 14:23:16 haswell01 kernel: [  429.326356] ---[ end trace 8cd466c131375550 ]-----
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/