Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.
Thomas, has there been any progress on fixing the regression below? I might
have missed something, but from here it looks like this fell through the
cracks.
Makes me wonder if we should temporarily revert the change to fix this
for rc7 and ensure things get at least one week of testing before the final.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
#regzbot poke
On 14.06.24 15:45, Kaplan, David wrote:
No, I did not observe them prior to the broken commit.
-----Original Message-----
From: Thomas Zimmermann <tzimmermann@xxxxxxx>
Sent: Wednesday, June 12, 2024 9:26 AM
To: Linux regressions mailing list <regressions@xxxxxxxxxxxxxxx>
Cc: Petkov, Borislav <Borislav.Petkov@xxxxxxx>;
zack.rusin@xxxxxxxxxxxx; dmitry.osipenko@xxxxxxxxxxxxx; Kaplan, David
<David.Kaplan@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>;
Dave Airlie <airlied@xxxxxxxxxx>; Maarten Lankhorst
<maarten.lankhorst@xxxxxxxxxxxxxxx>; Maxime Ripard
<mripard@xxxxxxxxxx>; LKML <linux-kernel@xxxxxxxxxxxxxxx>; ML dri-devel
<dri-devel@xxxxxxxxxxxxxxxxxxxxx>; spice-devel@xxxxxxxxxxxxxxxxxxxxx;
virtualization@xxxxxxxxxxxxxxx
Subject: Re: [REGRESSION] QXL display malfunction
Hi
Am 12.06.24 um 14:41 schrieb Linux regression tracking (Thorsten Leemhuis):
[CCing a few more people and lists that get_maintainers pointed out
for qxl]
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.
Thomas, from here it looks like this report that apparently is caused
by a change of yours that went into 6.10-rc1 (b33651a5c98dbd
("drm/qxl: Do not pin buffer objects for vmap")) fell through the
cracks. Or was progress made to resolve this and I just missed this?
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker'
hat)
On 03.06.24 04:29, Kaplan, David wrote:
-----Original Message-----
From: Kaplan, David
Sent: Sunday, June 2, 2024 9:25 PM
To: tzimmermann@xxxxxxx; dmitry.osipenko@xxxxxxxxxxxxx; Koenig,
Christian <Christian.Koenig@xxxxxxx>; zach.rusin@xxxxxxxxxxxx
Cc: Petkov, Borislav <Borislav.Petkov@xxxxxxx>;
regressions@xxxxxxxxxxxxxx
Subject: [REGRESSION] QXL display malfunction
Hi,
I am running an Ubuntu 19.10 VM with a tip kernel using QXL video
and I've observed the VM graphics often malfunction after boot,
sometimes failing to load the Ubuntu desktop or even immediately
shutting the guest down.
When it does load, the guest dmesg log often contains errors like
[ 4.303586] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65376256x16777216+0+0
[ 4.586883] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65376256x16777216+0+0
[ 4.904036] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65335296x16777216+0+0
I don't see how these messages are related. Did they already appear before
the broken commit was there?
[ 5.374347] [drm:qxl_release_from_id_locked] *ERROR* failed to find id in release_idr
Is there only one such message in the log? Or multiple/frequent ones.
I would usually only see one.
Could you provide a stack trace of what happens before?
Here's the top of a backtrace when the error occurs:
#0 qxl_release_from_id_locked (qdev=qdev@entry=0xffff88810126e000, id=id@entry=262151)
at drivers/gpu/drm/qxl/qxl_release.c:373
#1 0xffffffff819f5b6a in qxl_garbage_collect (qdev=0xffff88810126e000)
at drivers/gpu/drm/qxl/qxl_cmd.c:222
#2 0xffffffff810e3aa8 in process_one_work (worker=worker@entry=0xffff888101680300,
work=0xffff88810126f340) at kernel/workqueue.c:3231
#3 0xffffffff810e6281 in process_scheduled_works (worker=<optimized out>)
at kernel/workqueue.c:3312
#4 worker_thread (__worker=0xffff888101680300) at kernel/workqueue.c:3393
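For readers less familiar with this path: the backtrace above shows a garbage-collection worker looking up a release by id and failing to find it. The following is only a minimal Python sketch of that general pattern, assuming the lookup behaves like a plain id-to-object table. ReleaseTable, garbage_collect and the log list are hypothetical names for illustration, not qxl driver code, and the sketch does not explain why the id is missing.

```python
# Toy model only: none of these names are real qxl driver APIs.

class ReleaseTable:
    """Stand-in for an id -> release map (the role release_idr plays in the log)."""

    def __init__(self):
        self._by_id = {}
        self._next_id = 1

    def alloc(self, payload):
        """Register a release and return the id it was stored under."""
        rid = self._next_id
        self._next_id += 1
        self._by_id[rid] = payload
        return rid

    def lookup(self, rid):
        """Return the release for rid, or None if the id is unknown."""
        return self._by_id.get(rid)

    def remove(self, rid):
        """Drop the release for rid; unknown ids are ignored."""
        self._by_id.pop(rid, None)


def garbage_collect(table, completed_ids, log):
    """Free every completed id that is known; log the ones that are not."""
    freed = 0
    for rid in completed_ids:
        if table.lookup(rid) is None:
            # Mirrors the shape of the error in the report, not its exact text.
            log.append(f"failed to find id {rid} in release table")
            continue
        table.remove(rid)
        freed += 1
    return freed


table = ReleaseTable()
known = table.alloc("some release")
log = []
# 262151 is the id from frame #0 of the backtrace; it was never registered here.
freed = garbage_collect(table, [known, 262151], log)
print(freed, log)
```

In this toy version the collector simply skips an unknown id and records a message. What the real driver does after logging the error is not covered by this sketch.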
We sometimes draw into the buffer object from the CPU. For accessing the
buffer object's pages from the CPU, only a vmap operation should be
necessary. It appears as if qxl also requires a pin. My guess is that the pin
inserts the buffer-object's host-side pages and the code around
qxl_release_from_id_locked() appears to be garbage-collecting them.
Hence without the pin, the GC complains about inconsistent state.
I bisected the issue down to "drm/qxl: Do not pin buffer objects for vmap"
(b33651a5c98dbd5a919219d8c129d0674ef74299).
Thanks for bisecting. Does it work if you revert that commit?
Yes
Thanks --David Kaplan