Re: Revert "gro: Fix legacy path napi_complete crash" (was: Re: Linux 2.6.29)

From: Sascha Hauer
Date: Tue Mar 24 2009 - 11:23:34 EST


Hi Ingo,

On Tue, Mar 24, 2009 at 03:39:42PM +0100, Ingo Molnar wrote:
>
> * Robert Schwebel <r.schwebel@xxxxxxxxxxxxxx> wrote:
>
> > On Tue, Mar 24, 2009 at 02:02:02PM +0100, Ingo Molnar wrote:
> > > If the box hung within 15 minutes, the kernel was deemed bad. Using
> > > that method i arrived at this upstream networking fix which was
> > > merged yesterday:
> > >
> > > | 303c6a0251852ecbdc5c15e466dcaff5971f7517 is first bad commit
> > > | commit 303c6a0251852ecbdc5c15e466dcaff5971f7517
> > > | Author: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> > > | Date: Tue Mar 17 13:11:29 2009 -0700
> > > |
> > > | gro: Fix legacy path napi_complete crash
> >
> > This commit breaks nfsroot booting on i.MX27 and other ARM boxes
> > with different network cards here in a reproducible way.
>
> Can you confirm that Herbert's fix (see it below) solves the
> problem?

No, it still doesn't work with that patch applied.

It seems to have something to do with enabling interrupts between
__skb_dequeue() and __napi_complete().

I reverted 303c6a0251852ecbdc5c15e466dcaff5971f7517 and added a

	local_irq_enable(); local_irq_disable();

right before __napi_complete(). That change alone already breaks networking.
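
For illustration only, here is a toy user-space model of one way such an
interrupt window could lose a wakeup. It assumes (nothing in this thread
confirms it) that the legacy netif_rx() path schedules the backlog poll only
when the input queue was empty, and that scheduling has no effect while the
poll is still marked as scheduled. All toy_* names are hypothetical stand-ins,
not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the per-CPU backlog; NOT the kernel's data structure. */
struct toy_backlog {
	int  qlen;       /* packets waiting in the input queue */
	bool scheduled;  /* stands in for "poll is scheduled" */
};

/*
 * Assumed behaviour of the legacy receive path: the poll is only
 * scheduled when the queue was empty, and scheduling an already
 * scheduled poll changes nothing.
 */
static void toy_netif_rx(struct toy_backlog *b)
{
	if (b->qlen == 0)
		b->scheduled = true;
	b->qlen++;
}

/*
 * One poll pass: drain the queue, then mark the poll complete.
 * If irq_window is true, a receive interrupt is simulated between
 * the last (empty) dequeue and the completion step.
 */
static void toy_poll(struct toy_backlog *b, bool irq_window)
{
	assert(b->scheduled);   /* a poll only runs while scheduled */

	while (b->qlen > 0)
		b->qlen--;      /* dequeue and deliver one packet */

	if (irq_window)
		toy_netif_rx(b); /* packet arrives inside the window */

	b->scheduled = false;   /* stands in for __napi_complete() */
}

/* Returns true if packets end up queued with no poll scheduled. */
static bool toy_run(bool irq_window)
{
	struct toy_backlog b = { 0, false };

	toy_netif_rx(&b);          /* first packet schedules the poll */
	toy_poll(&b, irq_window);  /* poll drains and completes */
	toy_netif_rx(&b);          /* a later packet arrives */

	return b.qlen > 0 && !b.scheduled;
}
```

Under these assumptions toy_run(true) ends with packets queued and no poll
scheduled, while toy_run(false) does not. Whether this is what really happens
on the i.MX27 still needs to be confirmed.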


Sascha

>
> Ingo
>
> --------------->
> From b8b66ac07cab1b45aac93e4f406833a1e0d7677e Mon Sep 17 00:00:00 2001
> From: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Date: Tue, 24 Mar 2009 21:35:42 +0800
> Subject: [PATCH] net: Fix netpoll lockup in legacy receive path
>
> When I fixed the GRO crash in the legacy receive path I used
> napi_complete to replace __napi_complete. Unfortunately they're
> not the same when NETPOLL is enabled, which may result in us
> not calling __napi_complete at all.
>
> While this is fishy in itself, let's make the obvious fix right
> now of reverting to the previous state where we always called
> __napi_complete.
>
> Signed-off-by: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Frank Blaschka <blaschka@xxxxxxxxxxxxxxxxxx>
> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> LKML-Reference: <20090324133542.GA29046@xxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
> ---
> net/core/dev.c | 16 +++++++++-------
> 1 files changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index e3fe5c7..523f53e 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2580,24 +2580,26 @@ static int process_backlog(struct napi_struct *napi, int quota)
>  	int work = 0;
>  	struct softnet_data *queue = &__get_cpu_var(softnet_data);
>  	unsigned long start_time = jiffies;
> +	struct sk_buff *skb;
>
>  	napi->weight = weight_p;
>  	do {
> -		struct sk_buff *skb;
> -
>  		local_irq_disable();
>  		skb = __skb_dequeue(&queue->input_pkt_queue);
> -		if (!skb) {
> -			local_irq_enable();
> -			napi_complete(napi);
> -			goto out;
> -		}
>  		local_irq_enable();
> +		if (!skb)
> +			break;
>
>  		napi_gro_receive(napi, skb);
>  	} while (++work < quota && jiffies == start_time);
>
>  	napi_gro_flush(napi);
> +	if (skb)
> +		goto out;
> +
> +	local_irq_disable();
> +	__napi_complete(napi);
> +	local_irq_enable();
>
>  out:
>  	return work;
>

--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |