Re: [PATCH] virtio_net: Remove BUG() to aviod machine dead

From: Leon Romanovsky
Date: Wed Jun 02 2021 - 08:54:15 EST


On Wed, Jun 02, 2021 at 03:14:50PM +0800, Jason Wang wrote:
>
> 在 2021/6/2 下午1:59, Leon Romanovsky 写道:
> > On Tue, May 25, 2021 at 02:19:03PM +0800, Jason Wang wrote:
> > > 在 2021/5/19 下午10:18, Xianting Tian 写道:
> > > > thanks, I submit the patch as commented by Andrew
> > > > https://lkml.org/lkml/2021/5/18/256
> > > >
> > > > Actually, if xmit_skb() returns error, below code will give a warning
> > > > with error code.
> > > >
> > > >     /* Try to transmit */
> > > >     err = xmit_skb(sq, skb);
> > > >
> > > >     /* This should not happen! */
> > > >     if (unlikely(err)) {
> > > >         dev->stats.tx_fifo_errors++;
> > > >         if (net_ratelimit())
> > > >             dev_warn(&dev->dev,
> > > >                  "Unexpected TXQ (%d) queue failure: %d\n",
> > > >                  qnum, err);
> > > >         dev->stats.tx_dropped++;
> > > >         dev_kfree_skb_any(skb);
> > > >         return NETDEV_TX_OK;
> > > >     }
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道:
> > > > > typo in subject
> > > > >
> > > > > On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote:
> > > > > > When met error, we output a print to avoid a BUG().
> > >
> > > So you don't explain why you need to remove BUG(). I think it deserve a
> > > BUG().
> > BUG() will crash the machine and virtio_net is not kernel core
> > functionality that must stop the machine to prevent anything truly
> > harmful and basic.
>
>
> Note that the BUG() here is not for virtio-net itself. It tells us that a
> bug was found by virtio-net.
>
> That is, the one that produces the skb has a bug, usually it's the network
> core.
>
> There could also be the issue of the packet from untrusted source (userspace
> like TAP or packet socket) but they should be validated there.

So it is even worse than I thought. You are saying that in theory untrusted
remote host can crash system. IMHO, It is definitely not the place to put BUG().

I remind you that in-kernel API is build on the promise that data passed
between and calls are safe and already checked. You don't need to set a
protection from the net/core.

Thanks

>
> Thanks
>
>
> >
> > I would argue that code in drivers/* shouldn't call BUG() macros at all.
> >
> > If it is impossible, don't check for that or add WARN_ON() and recover,
> > but don't crash whole system.
> >
> > Thanks
> >
>