Re: Hard lockups using 3.10.0

From: Rolf Eike Beer
Date: Sun Aug 11 2013 - 07:10:34 EST


Borislav Petkov wrote:
> On Sun, Aug 11, 2013 at 08:09:19AM +0200, Rolf Eike Beer wrote:
> > Meanwhile I found that there was a hardware defect on this machine.
> > So if it does not happen again I will assume that this was caused by
> > this.
>
> What hardware defect exactly? DIMMs failing...? Probably, since it looks
> like the spinlock gets corrupted and the assertion fires... In any case,
> it would be interesting to know for future reference.

The RAM seems fine. It looks like it is the mainboard or a hard disk. The issues
have been magically gone for 3 weeks, but I have not done any suspend-to-disk
since then. Before that I had suspended the machine in the evening and
resumed it when I came to work. So it's possible that there was some corrupted
data in the image.

This is the SMART output I got from one of the disks yesterday:

Vendor: /0:0:0:0
Product:
User Capacity: 600,332,565,813,390,450 bytes [600 PB]
Logical block size: 774843950 bytes
scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

Eike
