4.20.0-rc6-next-20181210, v4.20-rc1: list_del corruption on thinkpad x220, graphics related?

From: Pavel Machek
Date: Wed Dec 12 2018 - 13:29:13 EST


Hi!

> > > > > > > There's one similar for nouveau in Bugzilla, but it seems like a genuine
> > > > > > > memory corruption (1 bit flipped):
> > > > > > >
> > > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=84880
> > > > > > >
> > > > > > > Any extra information would be of use :)
> > > > > > >
> > > > > > > Regards, Joonas
> > > > > > >
> > > > > > > PS. Could you open a bug to Bugzilla, it'll help to collect the
> > > > > > > information in one consolidated place:
> > > > > > >
> > > > > > > https://01.org/linuxgraphics/documentation/how-report-bugs
> > > > > >
> > > > > > I prefer email... certainly for bugs that can't be reproduced.
> > > > >
> > > > > By adding it to the Bugzilla it may be recognized by somebody else
> > > > > who is experiencing a similar issue. Internet points are not deducted
> > > > > for submitting bugs in good faith, even if they get closed as
> > > > > NOTABUG.
> > >
> > > Well, your documentation suggests you'll deduct my internet points:
> > >
> > > Before filing the bug, please try to reproduce your issue with the
> > > latest kernel. Use the latest drm-tip branch from
> > > http://cgit.freedesktop.org/drm-tip and build as instructed on our
> > > Build Guide.
> > >
> > > :-)
> >
> > I'd prefer not to run drm-tip. I'll update to 4.20-rc5+ and see if
> > it re-appears (but it takes a long time to reproduce :-().
>
> Whether or not we can reproduce the issue with drm-tip is a very useful
> datapoint for us. If we cannot reproduce it, it'll be possible to bisect
> which commit fixed it, and backport that. On the other hand, if it's
> still reproducible, we know we're not spending time on something we
> already fixed, and the priority gets a bump.

bisect ... is not practical on something that takes 2 days to reproduce.

> > If you think it is useful, I can try to update my machine to
> > linux-next.
>
> linux-next is closer to drm-tip, so it's better. Do you have some
> specific reason for not wanting to run drm-tip (but linux-next is still
> ok)?

I already have build/update scripts for -next, and I trust -next not
to store screenshots of my desktop in my master boot record :-).

Anyway, it does happen with -next. This time, chromiums were running,
and the crash happened a minute(?) after I exited flightgear. It can be
seen in the logs.

Oh and I might want to mention -- machine was rather deep in swap this
time, as in "mouse jumping when starting fgfs" and "could feel the
chromium being swapped back in". I might have had this situation
before, and just powercycled the machine "because it is so deep in
swap that it will not recover".

top says:

top - 19:18:24 up 2 days, 8:03, 2 users, load average: 3.02, 3.45, 3.21
Tasks: 141 total, 1 running, 86 sleeping, 0 stopped, 2 zombie
%Cpu(s): 18.8 us, 7.6 sy, 3.0 ni, 68.4 id, 1.3 wa, 0.0 hi, 0.9 si, 0.0 st
KiB Mem: 5967968 total, 663244 used, 5304724 free, 48876 buffers
KiB Swap: 1681428 total, 170904 used, 1510524 free. 446280 cached Mem

...but of course that memory is free now that everything has died.

Any ideas? Should I go back to v4.19 to see if it happens there, too?


Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Attachment: delme.gz
Description: application/gzip

Attachment: signature.asc
Description: Digital signature