Re: [patch] fix netconsole hang with alt-sysrq-t

From: Jeff Moyer
Date: Fri Aug 06 2004 - 15:35:38 EST


==> Regarding Re: [patch] fix netconsole hang with alt-sysrq-t; Matt Mackall <mpm@xxxxxxxxxxx> adds:

mpm> On Fri, Aug 06, 2004 at 03:29:27PM -0400, Jeff Moyer wrote:
>> Hi, Matt,
>>
>> Here's the patch. Sorry it took me so long, been busy with other work.
>> Two things which need perhaps more thinking, can netpoll_poll be called
>> recursively (it didn't look like it to me)

mpm> It can if the poll function does a printk or the like and wants to
mpm> recurse via netconsole. We could short-circuit that with an in_netpoll
mpm> flag, but let's worry about that separately.

Hmm, ok.
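
For reference, the in_netpoll short-circuit Matt mentions could look
roughly like the user-space sketch below. It is not code from the patch.
The names netpoll_poll_sketch, fake_driver_poll and poll_calls are
hypothetical, and a plain global flag only covers the single-threaded
case; kernel code would have to think about multiple CPUs.

```c
#include <stdbool.h>

/* Hypothetical guard flag: set while a poll is in progress. */
static bool in_netpoll;

/* Counts how many polls actually ran (for illustration only). */
static int poll_calls;

static void fake_driver_poll(void);

/* Poll entry point that refuses to recurse into itself. */
void netpoll_poll_sketch(void)
{
    if (in_netpoll)
        return;             /* already polling: short-circuit */

    in_netpoll = true;
    poll_calls++;
    fake_driver_poll();     /* may try to re-enter, see below */
    in_netpoll = false;
}

/*
 * Stand-in for a driver poll routine that does a printk which is
 * routed back through netconsole, and so re-enters the poll path.
 */
static void fake_driver_poll(void)
{
    netpoll_poll_sketch();
}
```

With the flag in place, the nested call returns immediately and each
top-level call runs the driver poll exactly once.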

>> and do we care about the racy
>> nature of the netpoll_set_trap interface?

mpm> That should probably become an atomic now.

Ouch. I wanted to avoid that, but if we can't, we can't. Will
netpoll_set_trap then do an atomic_inc or an atomic_add? I've only seen it
called with 1 and 0, is that all that was intended?
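
One way the question could be answered, assuming callers only ever pass 1
and 0, is to treat the trap state as a counter: 1 increments, 0
decrements. The sketch below uses C11 atomics in user space as a stand-in
for the kernel's atomic_t with atomic_inc/atomic_dec. The function names
are hypothetical and this is not the code that went into the patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the trap state; zero means "not trapped". */
static atomic_int trapped;

/*
 * Callers pass 1 to enter trap mode and 0 to leave it. Counting
 * instead of storing the value lets enter/leave pairs nest and
 * avoids a lost update when two callers race.
 */
void netpoll_set_trap_sketch(int trap)
{
    if (trap) {
        atomic_fetch_add(&trapped, 1);
    } else {
        int prev = atomic_fetch_sub(&trapped, 1);
        assert(prev > 0);   /* a leave without a matching enter */
    }
}

/* Nonzero while at least one caller still has the trap set. */
int netpoll_trap_sketch(void)
{
    return atomic_load(&trapped) > 0;
}
```

An atomic_add would only be needed if some caller passed values other
than 1 and 0, which is the open question above.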

>> You'll notice that I reverted part of an earlier changeset which caused us
>> to call the hard_start_xmit function even if netif_queue_stopped returned
>> true. This is a bug. I preserved the second part of that patch, which was
>> correct.

mpm> Ok, jgarzik pointed that out to me just a bit ago. I'm not sure if
mpm> we're dealing with the behavior that was intended to address yet
mpm> though. Stelian, can you try giving this a spin?

Well, we kept the second part of the patch, which deals with the
hard_start_xmit routine returning 1. That was a valid bug, I believe.
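
To make the two cases concrete, here is a small user-space model of the
send loop as described above: never call the transmit routine while the
queue is stopped, and retry when the transmit routine returns nonzero.
Everything here (struct fake_dev, the fake_* helpers, max_tries) is a
hypothetical stand-in and not the netpoll code itself.

```c
/* Toy device: how many polls until the queue wakes, and how many
 * times the transmit routine will report "busy" before accepting. */
struct fake_dev {
    int stopped_polls;
    int busy_returns;
    int xmit_calls;
};

static int fake_queue_stopped(const struct fake_dev *dev)
{
    return dev->stopped_polls > 0;
}

/* Polling gives the device a chance to free up its queue. */
static void fake_poll(struct fake_dev *dev)
{
    if (dev->stopped_polls > 0)
        dev->stopped_polls--;
}

/* Returns 0 when the packet was accepted, 1 when the device is busy. */
static int fake_hard_start_xmit(struct fake_dev *dev)
{
    dev->xmit_calls++;
    if (dev->busy_returns > 0) {
        dev->busy_returns--;
        return 1;
    }
    return 0;
}

/* Returns 0 on success, -1 if the packet could not be sent. */
int send_sketch(struct fake_dev *dev, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (fake_queue_stopped(dev)) {
            /* First case: do not transmit into a stopped queue. */
            fake_poll(dev);
            continue;
        }
        if (fake_hard_start_xmit(dev) == 0)
            return 0;
        /* Second case: xmit returned nonzero, poll and retry. */
        fake_poll(dev);
    }
    return -1;
}
```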

>> I've also bumped the budget from 1 to 16. As I mentioned, this was a
>> required change for netdump.

mpm> Should be fine.

>> This patch was tested on my dual hammer test system.

mpm> I'll have to re-rig my kgdb-over-ethernet test setup to test this, but
mpm> it looks good for now.

Yah, and I just noticed we don't want the poll_lock to be per struct
netpoll. It should be a static lock in the netpoll.c file. The problem is
that more than one netpoll object can reference the same ethernet device.
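
A rough illustration of why the lock wants file scope: with a lock per
struct netpoll, two objects pointing at the same device would each take
their own lock and never exclude one another. The sketch below uses a C11
atomic_flag as a toy trylock in place of a kernel spinlock, and all names
in it are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct shared_dev {
    int polls;
};

/* Several of these may reference the same device. */
struct fake_netpoll {
    struct shared_dev *dev;
};

/* One lock for the whole file, not one per struct fake_netpoll. */
static atomic_flag poll_lock = ATOMIC_FLAG_INIT;

bool poll_trylock_sketch(void)
{
    /* test_and_set returns the old value: false means we got it. */
    return !atomic_flag_test_and_set(&poll_lock);
}

void poll_unlock_sketch(void)
{
    atomic_flag_clear(&poll_lock);
}

/* Polls the device unless some other netpoll object holds the lock. */
bool netpoll_poll_locked_sketch(struct fake_netpoll *np)
{
    if (!poll_trylock_sketch())
        return false;
    np->dev->polls++;
    poll_unlock_sketch();
    return true;
}
```

Because the lock is shared, holding it on behalf of one netpoll object
keeps every other object off the device, whichever struct it came
through.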

Thanks,

Jeff