Re: 5.15-rc1 i915 blank screen booting on ThinkPads

From: Jani Nikula
Date: Fri Sep 17 2021 - 18:53:01 EST


On Fri, 17 Sep 2021, Matthew Brost <matthew.brost@xxxxxxxxx> wrote:
> On Fri, Sep 17, 2021 at 02:26:48PM -0700, Hugh Dickins wrote:
>> On Thu, 16 Sep 2021, Jani Nikula wrote:
>> > On Thu, 16 Sep 2021, Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
>> > > On 16/09/2021 05:37, Hugh Dickins wrote:
>> > >> Two Lenovo ThinkPads, old T420s (2011), newer X1 Carbon 5th gen (2017):
>> > >> i915 working fine on both up to 5.14, but blank screens booting 5.15-rc1,
>> > >> kernel crashed in some way.
>> ...
>> > > Kernel logs with drm.debug=0xe, with the broken black screen state,
>> > > would probably answer a lot of questions if you could gather it from
>> > > both machines?
>> >
>> > And for that, I think it's best to file separate bugs at [1] and attach
>> > the logs there. It helps keep the info in one place. Thanks.
>> >
>> > BR,
>> > Jani.
>> >
>> > [1] https://gitlab.freedesktop.org/drm/intel/issues/new
>>
>> Thanks for the quick replies: but of course, getting kernel logs was
>> the difficult part, this being bootup, with just a blank screen, and
>> no logging to disk at this stage. I've never needed it before, but
>> netconsole to the rescue.
>>
>> Problem then obvious, both machines now working,
>> please let me skip the bug reports, here's a patch:
>>
>
> Thanks for finding / fixing this Hugh. I will post this patch in a way
> our CI system can understand.

Thanks indeed!

Matt, please get rid of the BUG_ON while at it, and make it a
WARN. Oopsing doesn't do anyone any good.

BR,
Jani.

>
> Matt
>
>> [PATCH] drm/i915: fix blank screen booting crashes
>>
>> 5.15-rc1 crashes with blank screen when booting up on two ThinkPads
>> using i915. Bisections converge convincingly, but arrive at different
>> and surprising "culprits", none of them the actual culprit.
>>
>> netconsole (with init_netconsole() hacked to call i915_init() when
>> logging has started, instead of by module_init()) tells the story:
>>
>> kernel BUG at drivers/gpu/drm/i915/i915_sw_fence.c:245!
>> with RSI: ffffffff814d408b pointing to sw_fence_dummy_notify().
>> I've been building with CONFIG_CC_OPTIMIZE_FOR_SIZE=y, and that
>> function needs to be 4-byte aligned.
>>
>> Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
>> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
>> ---
>>
>> drivers/gpu/drm/i915/gt/intel_context.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> --- a/drivers/gpu/drm/i915/gt/intel_context.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
>> @@ -362,6 +362,7 @@ static int __intel_context_active(struct
>>  	return 0;
>>  }
>>
>> +__aligned(4) /* Respect the I915_SW_FENCE_MASK */
>>  static int sw_fence_dummy_notify(struct i915_sw_fence *sf,
>>  				 enum i915_sw_fence_notify state)
>>  {

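For anyone following along, here is a rough userspace sketch of why the alignment matters and of what "make it a WARN" could look like. It is not the driver code. It assumes the fence keeps its notify callback and two flag bits in a single word, which is what the I915_SW_FENCE_MASK comment in the patch suggests. Every name here (sketch_fence, sketch_fence_init, FENCE_MASK, dummy_notify) is made up for illustration, and the fprintf() merely stands in for a kernel WARN.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: the low two bits of the stored word carry flags, so the
 * callback address must have those bits clear (4-byte alignment). */
#define FENCE_FLAG_BITS 2u
#define FENCE_MASK (~(uintptr_t)((1u << FENCE_FLAG_BITS) - 1))

typedef int (*notify_fn)(int state);

struct sketch_fence {
	uintptr_t fn_and_flags;	/* callback address | flag bits */
};

/* When optimizing for size the compiler may place a function at any byte
 * address; the attribute requests at least 4-byte alignment. */
__attribute__((aligned(4)))
static int dummy_notify(int state)
{
	(void)state;
	return 0;
}

/* Returns 0 on success. On a NULL or misaligned callback it prints a
 * warning and returns -1 instead of aborting. */
static int sketch_fence_init(struct sketch_fence *f, notify_fn fn,
			     unsigned int flags)
{
	uintptr_t addr = (uintptr_t)fn;

	if (!fn || (addr & ~FENCE_MASK)) {
		fprintf(stderr, "warning: notify fn %p is NULL or misaligned\n",
			(void *)addr);
		return -1;
	}

	f->fn_and_flags = addr | (flags & ~FENCE_MASK);
	return 0;
}

/* Recover the callback by masking off the flag bits. */
static notify_fn sketch_fence_fn(const struct sketch_fence *f)
{
	return (notify_fn)(f->fn_and_flags & FENCE_MASK);
}

/* Recover the flag bits. */
static unsigned int sketch_fence_flags(const struct sketch_fence *f)
{
	return (unsigned int)(f->fn_and_flags & ~FENCE_MASK);
}
```

With the attribute in place, sketch_fence_init(&f, dummy_notify, 1) succeeds and the callback and flag can both be read back. Passing an address that is off by one byte is rejected with a warning and an error return, so the caller can fail gracefully rather than taking the whole machine down.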
--
Jani Nikula, Intel Open Source Graphics Center