Re: BTRFS Balance Hard System Crash (Blinking LEDs)
From: Nathan Royce
Date: Fri Mar 26 2021 - 12:52:49 EST
Oh man, I'm hoping things aren't starting to fall apart here.
I was doing my normal routine (TV, browsing, etc.; no filesystem
manipulation) when, out of the blue, "kodi" crashed. That's actually
not all that uncommon, so I fired up "iotop" to confirm that a
coredump was being written, and it was.
I then did something else in the terminal, maybe an "ls", and that came up with:
*****
error while loading shared libraries: /usr/lib/libutil.so.1: ELF file
version does not match current one
*****
Again, it was just out of the blue. Same with other commands like
"coredumpctl" or "sync". Even "pacman -Qo /usr/lib/libutil.so.1"
caused SEGV.
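For anyone wanting to sanity-check the library file itself, here is a rough Python sketch that inspects the start of an ELF header. As far as I understand it, the loader prints that message when a version field in the header is not EV_CURRENT (1), so the sketch checks the magic, e_ident[EI_VERSION] and e_version. The function name check_elf_header is made up for illustration, and a clean result only says the header reads back fine right now, not that the rest of the file (or the RAM it was cached in) is good.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EV_CURRENT = 1  # the only ELF version value defined so far


def check_elf_header(header: bytes):
    """Return a list of problems found in the first 24 bytes of an ELF file."""
    if len(header) < 24:
        return ["file too short to hold an ELF header"]
    problems = []
    if header[:4] != ELF_MAGIC:
        problems.append("bad magic: %r" % header[:4])
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian
    endian = {1: "<", 2: ">"}.get(header[5])
    if endian is None:
        problems.append("bad EI_DATA byte: %d" % header[5])
        endian = "<"  # fall back so the remaining checks can still run
    # e_ident[EI_VERSION] (byte 6)
    if header[6] != EV_CURRENT:
        problems.append("e_ident[EI_VERSION] = %d" % header[6])
    # e_version: 32-bit word at offset 20 in both ELF32 and ELF64
    (e_version,) = struct.unpack_from(endian + "I", header, 20)
    if e_version != EV_CURRENT:
        problems.append("e_version = %d" % e_version)
    return problems
```

Usage would be something like check_elf_header(open("/usr/lib/libutil.so.1", "rb").read(24)); an empty list means nothing looked wrong in those fields.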
Everything seemed fine after I had last booted (minus what I wrote in
my last email).
And the oddest thing is that, like I said before, my system/root stuff
(e.g., /usr/lib/libutil.so.1) runs from my SD card (F2FS, not
BTRFS).
I see the coredumps were written out around 11:05 (they typically take
a long time to write out on a slow SD card), while journalctl shows
the issues starting around 10:56 (newest entries first):
*****
...
Mar 26 11:05:13 computerName systemd-coredump[70088]: Process 70078
(pacman) of user 1000 dumped core.
Stack trace of thread 70078:
#0 0x000075cf62ee58a5
do_lookup_x (ld-linux-x86-64.so.2 + 0xa8a5)
#1 0x000075cf62ee6231
_dl_lookup_symbol_x (ld-linux-x86-64.so.2 + 0xb231)
#2 0x000075cf62ee7dc7
_dl_relocate_object (ld-linux-x86-64.so.2 + 0xcdc7)
#3 0x000075cf62edfcdd
dl_main (ld-linux-x86-64.so.2 + 0x4cdd)
#4 0x000075cf62ef769f
_dl_sysdep_start (ld-linux-x86-64.so.2 + 0x1c69f)
#5 0x000075cf62edd063
_dl_start (ld-linux-x86-64.so.2 + 0x2063)
#6 0x000075cf62edc098
_start (ld-linux-x86-64.so.2 + 0x1098)
...
Mar 26 11:05:10 computerName kernel: Code: b4 24 d0 00 00 00 49 89 df
48 89 44 24 38 48 89 fb 4c 89 5c 24 60 eb 12 0f 1f 44 00 00 49 83 c4
04 83 e2 01 0f 85 f3 05 00 00 <41> 8b 04 24 48 89 c2 48 31 d8 48 d1 e8
75 e4 48 83 ec 08 4c 89 e0
Mar 26 11:05:09 computerName kernel: pacman[70078]: segfault at
75d29d0d5640 ip 000075cf62ee58a5 sp 00007ffffad0e460 error 4 in
ld-2.33.so[75cf62edc000+24000]
...
Mar 26 10:58:59 computerName kernel: Code: 84 e7 05 00 00 44 8b 33 45
85 f5 74 e4 66 0f ef ff 66 0f ef f6 66 0f ef e4 48 89 ef f3 0f 10 44
24 48 66 0f ef db 66 0f ef d2 <66> 0f 42 a0 61 72 70 cb ee 33 bb 14 5f
79 1d 76 e5 28 0f 11 44 24
Mar 26 10:58:59 computerName kernel: chrome[41222]: segfault at
ffffffffcb707262 ip 000077cb36b101ae sp 00007fff250007c0 error 5 in
i965_dri.so[77cb36aa9000+8fa000]
...
Mar 26 10:58:25 computerName plasmashell[43148]: KCrash: Application
Name = kate path = /usr/bin pid = 43148
Mar 26 10:58:25 computerName plasmashell[43148]: KCrash: crashing...
crashRecursionCounter = 2
Mar 26 10:58:07 computerName systemd[1424]: Started Kate - Advanced Text Editor.
...
Mar 26 10:56:51 computerName sudo[69237]: userName : TTY=pts/3 ;
PWD=/<path> ; USER=root ; COMMAND=/usr/bin/iotop
...
Mar 26 10:56:32 computerName kernel: audit: type=1701
audit(1616774192.320:455): auid=1000 uid=1000 gid=1000 ses=3 pid=54221
comm="VideoPlayer" exe="/usr/local/lib/kodi/kodi.bin" sig=11 res=1
Mar 26 10:56:32 computerName kernel: Code: 00 00 01 00 00 00 00 00 00
00 02 04 00 00 48 7b 00 00 10 49 4a d0 48 7b 00 00 00 00 00 00 01 00
00 00 55 00 00 00 00 00 00 00 <d9> d8 8f 34 4f 7b 00 00 00 39 10 d0 48
7b 00 00 00 00 fa 00 00 fa
Mar 26 10:56:32 computerName kernel: VideoPlayer[61823]: segfault at
7b48d0579730 ip 00007b48d0579730 sp 00007b48ff043248 error 15
...
*****
As you can see, pretty much everything was crashing (probably not
surprising if glibc is involved).
Now, like I said, I don't believe it's related to my BTRFS drive, since
the failures point at glibc, which lives on my F2FS drive.
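To make the "segfault at ..." lines above a bit easier to compare, here is a small Python sketch that pulls the fields out of one such line and decodes the "error" value. It assumes the x86 page-fault error-code bit layout and that the kernel prints the error field in hex (so "error 15" is 0x15). The names parse_segfault and decode_error are mine, purely illustrative.

```python
import re

# x86 page-fault error-code bits: (mask, meaning if set, meaning if clear)
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_error(err):
    """Translate the numeric error code into a list of descriptions."""
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        if err & mask:
            parts.append(if_set)
        elif if_clear:
            parts.append(if_clear)
    return parts


def parse_segfault(line):
    """Parse one (possibly line-wrapped) kernel segfault message."""
    m = SEGFAULT_RE.search(" ".join(line.split()))  # undo mail line wrapping
    if m is None:
        return None
    info = m.groupdict()
    info["error_bits"] = decode_error(int(info["err"], 16))  # field is hex
    if info["base"] is not None:
        # Offset of the faulting instruction inside the mapping shown in
        # brackets; that mapping need not start at the beginning of the file.
        info["ip_offset"] = int(info["ip"], 16) - int(info["base"], 16)
    return info
```

Fed the pacman line, this reports a user-mode read of a page that was not present; the VideoPlayer line decodes as a user-mode instruction fetch that hit a protection violation.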
I ended up rebooting (again) and everything seems fine (so far) as I
write this, with a DVR recording playing in kodi.
I don't know what those "kernel: Code:" lines are supposed to be or what they mean for me.
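My best understanding, for what it's worth: a "Code:" line is a hex dump of the bytes around the faulting instruction pointer, with the byte at the IP itself wrapped in <angle brackets>. Here is a minimal Python sketch that splits such a line into raw bytes plus the index of the marked byte. The name parse_code_line is hypothetical, and it assumes every token after "Code:" is a two-digit hex byte. The result could then be fed to a disassembler (the kernel tree ships scripts/decodecode for that purpose).

```python
def parse_code_line(text):
    """Split a 'Code:' dump into (bytes, index of the byte at the faulting IP).

    The index is None if no <xx> marker is present.
    """
    # Keep only what follows "Code:"; split() also undoes mail line wrapping.
    tokens = text.split("Code:", 1)[-1].split()
    data = bytearray()
    ip_index = None
    for tok in tokens:
        if tok.startswith("<") and tok.endswith(">"):
            ip_index = len(data)  # position of the byte the IP points at
            tok = tok[1:-1]
        data.append(int(tok, 16))
    return bytes(data), ip_index
```

For the pacman dump above this gives 64 bytes with the marked byte (0x41) at index 42, i.e. 42 bytes of context before the faulting instruction and 21 after it.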
On Fri, Mar 26, 2021 at 8:29 AM Nathan Royce <nroycea+kernel@xxxxxxxxx> wrote:
>
> *****
> ...I "think" this is where the "emergency" drop out of boot occurred,
> and I just did a "systemctl reboot" which had the next boot succeed.
> Nope, I'm wrong. For whatever reason, this appears to be the boot that
> ended up working (searching for the first "microcode" reference
> indicating the start of a boot).
> Mar 25 21:44:17 computerName kernel: BTRFS critical (device dm-3):
> unable to add free space :-17
...