Re: lockups with netconsole on e1000 on media insertion

From: Matt Mackall
Date: Fri Aug 05 2005 - 18:22:02 EST


On Fri, Aug 05, 2005 at 11:56:50PM +0200, Andi Kleen wrote:
> > I still don't like this fix. Yes, you're right, it should eventually
> > give up. But here it gives up way too easily - 5 could easily
> > translate to 5 microseconds. This is analogous to giving up on serial
> > transmit if CTS is down for 5 loops.
> >
> > I'd be much happier if there were some udelay or the like in here so
> > that we're not giving up on such a short timeframe.
>
> Problem is that it could translate to a long aggregate delay
> e.g. when the kernel tries to dump the backlog after console_init.
> That is why I made the delay so short.

But why are we in a hurry to dump the backlog on the floor? Why are we
worrying about the performance of netpoll without the cable plugged in
at all? We shouldn't be optimizing the data loss case.

My primary concern here is that the loop should last for a
non-negligible amount of time. Five iterations is effectively none.
I'd be very surprised if it were even enough for deglitching.
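
To make the point concrete, here is a rough userspace sketch of a
bounded retry loop with a delay between attempts. It is not the netpoll
or e1000 code. Every name in it (fake_link, fake_udelay, fake_try_xmit,
send_with_retry) is made up for illustration, and the clock is simulated
so the behaviour is deterministic. In the kernel the delay would be
something like udelay().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a link that becomes usable after some time. */
struct fake_link {
    unsigned long now_us;      /* simulated clock */
    unsigned long ready_at_us; /* time at which the link accepts a packet */
};

/* Stand-in for udelay(): advances the simulated clock instead of spinning. */
static void fake_udelay(struct fake_link *l, unsigned long us)
{
    l->now_us += us;
}

/* Stand-in for one transmit attempt; succeeds once the link is ready. */
static bool fake_try_xmit(const struct fake_link *l)
{
    return l->now_us >= l->ready_at_us;
}

/*
 * Try to send up to max_tries times, waiting delay_us after each failed
 * attempt. Returns true on success, false once the budget is used up.
 * The worst-case time spent waiting is max_tries * delay_us.
 */
static bool send_with_retry(struct fake_link *l, unsigned int max_tries,
                            unsigned long delay_us)
{
    assert(max_tries > 0);
    for (unsigned int i = 0; i < max_tries; i++) {
        if (fake_try_xmit(l))
            return true;
        fake_udelay(l, delay_us);
    }
    return false;
}
```

With a link that needs 200 us to come up, five back-to-back tries with
no delay all fail, while five tries spaced 50 us apart succeed on the
last one. The try count is the same; only the extent in time differs.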

With serial console, we do polled I/O that runs at the serial rate -
milliseconds per line of output.
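
For a sense of scale, the arithmetic can be sketched as below. The
helper name is hypothetical, and 8N1 framing (10 bits on the wire per
character) is an assumption rather than something fixed by the serial
console code.

```c
#include <assert.h>

/*
 * Microseconds needed to clock out `chars` characters at `baud` bits/s,
 * assuming 8N1 framing (1 start + 8 data + 1 stop = 10 bits per character).
 * The result is truncated to whole microseconds.
 */
static unsigned long long serial_line_time_us(unsigned long baud,
                                              unsigned int chars)
{
    assert(baud > 0);
    return (unsigned long long)chars * 10ULL * 1000000ULL / baud;
}
```

Under those assumptions an 80-character line takes about 6.9 ms at
115200 baud and about 83 ms at 9600 baud.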

> A longer delay would be possible, but then it would need some logic
> to detect down links and not delay on them, then retry later, etc.
> It would all be far more complicated.

I think we could probably have subsequent failures be much shorter
without too much added complexity. But I'm not sure it matters.
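
Something along these lines is what I have in mind, as a userspace
sketch only. The names (sim_port, sim_send, sim_flush) and the budget
values are invented for illustration and the clock is simulated; none
of this is taken from netpoll. The first failure gets the full budget,
and later failures give up quickly until a send succeeds again.

```c
#include <assert.h>
#include <stdbool.h>

/* All names and numbers here are hypothetical. */
#define FULL_BUDGET_US  1000UL /* patience for the first failure */
#define SHORT_BUDGET_US   10UL /* patience once the link has already failed */
#define POLL_STEP_US       5UL /* delay between polls */

struct sim_port {
    unsigned long now_us;   /* simulated clock */
    unsigned long up_at_us; /* link accepts packets from this time on */
    bool last_send_failed;  /* remembered across calls */
};

/* Send one message; returns true if delivered, false if dropped. */
static bool sim_send(struct sim_port *p)
{
    unsigned long budget = p->last_send_failed ? SHORT_BUDGET_US
                                               : FULL_BUDGET_US;
    unsigned long waited = 0;

    for (;;) {
        if (p->now_us >= p->up_at_us) {
            p->last_send_failed = false; /* link is back: full budget again */
            return true;
        }
        if (waited >= budget)
            break;
        p->now_us += POLL_STEP_US; /* stands in for udelay(POLL_STEP_US) */
        waited += POLL_STEP_US;
    }
    p->last_send_failed = true;
    return false;
}

/* Total simulated time spent trying to flush n messages. */
static unsigned long sim_flush(struct sim_port *p, unsigned int n)
{
    assert(p != 0);
    unsigned long start = p->now_us;

    for (unsigned int i = 0; i < n; i++)
        sim_send(p);
    return p->now_us - start;
}
```

With these made-up numbers, flushing a 100-message backlog over a link
that never comes up costs 1000 + 99 * 10 = 1990 us in total, instead of
100000 us if every message waited the full budget.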

--
Mathematics is the supreme nostalgia of our time.