Re: [systemd-devel] [RFC v2 3/6] kthread: warn on kill signal if not OOM

From: James Bottomley
Date: Wed Sep 10 2014 - 09:31:50 EST


On Wed, 2014-09-10 at 12:07 +0200, Ceriel Jacobs wrote:
> Tom Gundersen wrote on 10-09-14 at 08:46:
> >> >Indeed. What I proposed with a multiplier for the timeout for the
> >> >different types of built in commands was deemed complex but saw no
> >> >alternatives proposed despite my interest to work on one and
> >> >clarifications noted that this was a design regression. Not quite sure
> >> >what else I could have done here. I'm interested in learning what the
> >> >better approach is for the future as if we want to marry init + kernel
> >> >we need a smooth way for us to discuss design without getting worked
> >> >up about it, or taking it personally. I really want this to work as I
> >> >personally like systemd so far.
> > How about this: keep the timeout global, but also introduce a
> > (relatively short, say 10 or 15 seconds) timeout after which a warning
> > is printed. Even if nothing is actually killed, having workers (be it
> > insmod or something else) take longer than a couple of seconds is
> > likely a sign that something is seriously off somewhere.

> I don't agree with the statement that something is seriously off when it
> takes more than 10 to 15 seconds.
>
> When probing only one hard disk drive, then I do agree that something is
> seriously off after 10 to 15 seconds.

Really? We keep explaining that arbitrary times are wrong. A while ago
the Adaptec driver used 15s as its bus settle time after the initial
reset (it's now a Kconfig variable set at 5s), and a Parallel bus takes
a minimum of 4s to scan, with the scan having to be done sequentially.
If any probed device is having difficulty, that can escalate way beyond
this, into the tens to hundreds of seconds. If your root disk is on it,
you're waiting or not booting.

> When probing a SAS bus with one hundred hard disk drives in standby
> mode, then I do expect that to take longer than 10 to 15 seconds.

Good luck with that even on SAS if you have a lot of expanders.

For an installed system, you know what you need (usually root and
possibly one other disk like /home), so you spawn all the insertions
asynchronously and then wait for just the devices you need. But since
the alternative is a panic when init isn't found, this wait had better
be quite long (if not forever, given that the consequence of giving up
is guaranteed failure). Everything else can be async but, as I've
pointed out before, it can be async in user space (fire and forget)
instead of in the kernel.
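
A minimal sketch of that pattern, in Python for brevity. This is only
an illustration under assumptions, not what udev or any init system
actually does: the function names, the polling approach and the
poll interval are all made up here. The point is just that the
insertions are started without waiting on them, and the only blocking
wait is on the paths the boot really needs, with no deadline by default.

```python
import os
import subprocess
import time


def spawn_insertions(commands):
    """Start every insertion command and return at once (fire and forget).

    `commands` is a list of argv lists. Hypothetical helper for illustration.
    """
    return [subprocess.Popen(cmd) for cmd in commands]


def wait_for_devices(paths, timeout=None, poll_interval=0.1):
    """Block until every path in `paths` exists.

    timeout=None means wait forever, since giving up on the root device
    only guarantees failure. Returns the set of paths still missing
    (empty on success).
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    missing = set(paths)
    while missing:
        # Re-check only the paths that have not shown up yet.
        missing = {p for p in missing if not os.path.exists(p)}
        if not missing:
            break
        if deadline is not None and time.monotonic() >= deadline:
            break
        time.sleep(poll_interval)
    return missing
```

A caller would pass every insertion command to spawn_insertions() and
then call wait_for_devices() with just the root device (and perhaps
/home), leaving timeout unset. Nothing ever waits on the other
insertions.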

James

