Re: 3.13.5 : rm -rf running forever, one cpu at approx 100%

From: Ken Moffat
Date: Mon Mar 03 2014 - 16:16:43 EST


On Thu, Feb 27, 2014 at 05:52:42AM +0100, Mike Galbraith wrote:
> On Thu, 2014-02-27 at 03:45 +0000, Ken Moffat wrote:
> > On Thu, Feb 27, 2014 at 04:26:35AM +0100, Mike Galbraith wrote:
> > >
> > > I would start with strace to see if a task is looping in userspace, then
> > > move on to perf top -g -p <pid> (or perf record/report) to peek at what
> > > it's up to in the kernel. Once you have the where, trace_printk() is
> > > the best thing since sliced bread (which ranks just below printk()).
> > >
> > > -Mike
> > Thanks. I'll need to build perf.
>
> You may want to build the kernel with frame-pointers too, for easy gdb
> list *0x(hexnum) of *func()+0x(hexoffset) use. Crash is also pretty
> handy both for rummaging live via crash vmlinux /proc/kcore, and for
> leisurely postmortem analysis if you set the box up to crashdump in
> advance, and force a dump (poke sysrq-c or echo c > /proc/sysrq-trigger)
> when you see the bad thing happen. Crash has all kinds of goodies,
> including invocation of gdb.
>
> -Mike
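
For reference, the triage sequence suggested in the quoted advice might look roughly like the sketch below. It only prints the commands rather than running them, since most need root and a live, misbehaving task. The PID 1234 and the ./vmlinux path are placeholders, not values from this thread, and the 10 second sampling window is an arbitrary assumption.

```shell
#!/bin/sh
# Dry-run sketch: prints the triage commands instead of executing them.
# PID and VMLINUX are placeholders (assumptions), not values from this thread.
PID="${1:-1234}"
VMLINUX="${2:-./vmlinux}"

triage_commands() {
    # 1. Is the task looping in userspace? Attach and watch its syscalls.
    echo "strace -p $PID"
    # 2. If not, look at where it spends its time in the kernel.
    echo "perf top -g -p $PID"
    echo "perf record -g -p $PID -- sleep 10"
    echo "perf report"
    # 3. Rummage in the live kernel with crash.
    echo "crash $VMLINUX /proc/kcore"
    # 4. Force a crash dump. This panics the box, so crashdump must
    #    already be set up, and it has to be run as root.
    echo "echo c > /proc/sysrq-trigger"
}

triage_commands
```

Printing the commands keeps the sketch safe to run anywhere; copy the individual lines into a root shell once the problem is actually reproducing.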

In fact, it seems I had already enabled perf, but not built the
userspace part. However, I was motivated to build crash, and a
kernel with frame pointers plus a _lot_ of debug options.
Unsurprisingly, that ran like a dog, and many things seemed to use a
lot of CPU according to top (e.g. 'X', at least for the first
several minutes after it started), but I couldn't replicate the
problem. Also proved that my OpenJDK script doesn't normally trash
/etc/passwd- :)

Then, I went back to the original 3.13.5, proved that perf was
working, and built a new LFS system in chroot, again with 3.13.5.
Booted it as soon as I'd built xorg, then went on through
glib/gtk, firefox, print/photo programs, libreoffice. No problems at
all, apart from the common "needed to use -j3 for gcc, because it's
a phonon". NB I forgot to mention that this is all 64-bit, and the
box has 8GB.

If something does happen here in the next few days (I've still got
to sort out my kde scripts and build that) then I'll reply to this.
But otherwise, I guess it must be cosmic rays, or maybe Gaffer forgot
to feed and muck-out the imps.

ĸen
--
das eine Mal als Tragödie, dieses Mal als Farce