Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref

From: Christian König
Date: Mon Mar 07 2016 - 16:07:14 EST


Am 07.03.2016 um 21:46 schrieb Greg Kroah-Hartman:
On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
The following patch to radeon_sa_bo_new that
went into 3.10.99

commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
Author: Nicolai Hähnle <nicolai.haehnle@xxxxxxx>
Date:   Fri Feb 5 14:35:53 2016 -0500

    drm/radeon: hold reference to fences in radeon_sa_bo_new

    commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.

is triggering an Oops for me as soon as xscreensaver
starts doing 3D stuff. After reverting this
patch, xscreensaver has been happily running 3D stuff.

Mar 6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
Mar 6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
Mar 6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
Mar 6 18:00:43 sage kernel: Oops: 0002 [#1] SMP

Mar 6 18:00:43 sage kernel: Stack:
Mar 6 18:00:43 sage kernel: ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
Mar 6 18:00:43 sage kernel: ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
Mar 6 18:00:43 sage kernel: 00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
Mar 6 18:00:43 sage kernel: Call Trace:
Mar 6 18:00:43 sage kernel: [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
Mar 6 18:00:43 sage kernel: [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
Mar 6 18:00:43 sage kernel: [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
Mar 6 18:00:43 sage kernel: [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
Mar 6 18:00:43 sage kernel: [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
Mar 6 18:00:43 sage kernel: [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
Mar 6 18:00:43 sage kernel: [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
Mar 6 18:00:43 sage kernel: [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
Mar 6 18:00:43 sage kernel: [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
Mar 6 18:00:43 sage kernel: [<ffffffff8178210f>] tracesys+0xe1/0xe6

$ lspci | grep VGA
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]

Next time, please cc: the people responsible for that patch as well...

I can revert it, but maybe something else is going on here? Do you have
this same problem on 3.14, and 4.5-rc7?

Hi Greg,

Yes, that's an already known issue. Feel free to revert that one for now.

It's on my TODO list to provide a fixed patch for older kernels, but that can take a while.

For background: Nicolai's patch is correct, but it assumes that radeon_fence_unref() can safely take NULL as the fence, which is not the case on older kernels.
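
To illustrate the kind of NULL tolerance being assumed, here is a minimal userspace sketch. It is not the radeon code: struct toy_fence, toy_fence_ref() and toy_fence_unref() are hypothetical names invented for this example, and a plain int stands in for the kernel's reference counter. A helper without the NULL check would dereference the pointer unconditionally, which is the general shape of the fault in the trace above.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted fence (not struct radeon_fence). */
struct toy_fence {
	int refcount;
};

/* Take a reference. A NULL fence is passed through untouched. */
static struct toy_fence *toy_fence_ref(struct toy_fence *fence)
{
	if (fence)
		fence->refcount++;
	return fence;
}

/* Drop a reference and clear the caller's pointer. A NULL fence is a no-op. */
static void toy_fence_unref(struct toy_fence **fence)
{
	struct toy_fence *tmp = *fence;

	*fence = NULL;
	if (tmp)
		tmp->refcount--;
}
```

With helpers written this way, a caller that holds an array of possibly-empty fence slots can ref and unref every slot without checking each one first. A backport to a tree whose helpers lack the check would need either the check added to the helper or an explicit test at each call site.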

Regards,
Christian.


thanks,

greg k-h