Re: [Intel-gfx] alderlake crashes (random memory corruption?) with 6.0 i915 / ucode related
From: Jani Nikula
Date: Mon Oct 17 2022 - 09:35:55 EST
On Mon, 17 Oct 2022, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
> Hi,
>
> On 10/17/22 13:19, Thorsten Leemhuis wrote:
>> CCing the regression mailing list, as it should be in the loop for all
>> regressions, as explained here:
>> https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
>
> Yes, sorry about that. I meant to Cc the regressions list, not you personally,
> but the auto-completion picked the wrong address-book entry
> (and I did not notice this).
>
>> On 17.10.22 12:48, Hans de Goede wrote:
>>> On 10/17/22 10:39, Jani Nikula wrote:
>>>> On Mon, 17 Oct 2022, Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> wrote:
>>>>> On Thu, 13 Oct 2022, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>>>>> With 6.0 the following WARN triggers:
>>>>>> drivers/gpu/drm/i915/display/intel_bios.c:477:
>>>>>>
>>>>>> drm_WARN(&i915->drm, min_size == 0,
>>>>>> "Block %d min_size is zero\n", section_id);
>>>>>
>>>>> What's the value of section_id that gets printed?
>>>>
>>>> I'm guessing this is [1] fixed by commit d3a7051841f0 ("drm/i915/bios:
>>>> Use hardcoded fp_timing size for generating LFP data pointers") in
>>>> v6.1-rc1.
>>>>
>>>> I don't think this is the root cause for your issues, but I wonder if
>>>> you could try v6.1-rc1 or drm-tip and see if we've fixed the other stuff
>>>> already too?
>>>
>>> 6.1-rc1 indeed does not trigger the drm_WARN and for now (couple of
>>> reboots, running for 5 minutes now) it seems stable. 6.0.0 usually
>>> crashed during boot (but not always).
>>>
>>> Do you think it would be worthwhile to try 6.0.0 with d3a7051841f0 ?
>
> So I have been trying 6.0.0 with d3a7051841f0, doing a whole bunch of
> reboots + general use, and that seems stable. Then I reverted it and
> the very first boot of the kernel with the revert broke again, so I'm
> pretty sure that d3a7051841f0 fixes things.
>
> So d3a7051841f0 seems to do more than just fix the WARN().

Wow, so I guess we do screw up the parsing royally then. :o

> So let's try to get d3a7051841f0 added to the official stable series
> ASAP (I just noticed that Mark Pearson from Lenovo has already added it
> to Fedora's 6.0.2 build).

I think I'd also pick d3a7051841f0^, i.e. both commits:

d3a7051841f0 ("drm/i915/bios: Use hardcoded fp_timing size for generating LFP data pointers")
4e78d6023c15 ("drm/i915/bios: Validate fp_timing terminator presence")

for stable.

BR,
Jani.

>
> Regards,
>
> Hans
>
--
Jani Nikula, Intel Open Source Graphics Center