Re: [PATCH] hv_netvsc: Make sure out channel is fully opened on send
From: Mohammed Gamal
Date: Thu Mar 15 2018 - 12:24:27 EST
On Wed, 2018-03-14 at 10:22 +0100, Mohammed Gamal wrote:
> On Tue, 2018-03-13 at 12:35 -0700, Stephen Hemminger wrote:
> > On Tue, 13 Mar 2018 20:06:50 +0100
> > Mohammed Gamal <mgamal@xxxxxxxxxx> wrote:
> >
> > > During high network traffic, changes to network interface
> > > parameters such as the number of channels or MTU can cause a
> > > kernel panic with a NULL pointer dereference. This is due to
> > > netvsc_device_remove() being called and deallocating the channel
> > > ring buffers, which can then be accessed by netvsc_send_pkt()
> > > before they are reallocated by netvsc_device_add().
> > >
> > > The patch fixes this problem by checking the channel state and
> > > returning -ENODEV if the channel is not yet opened. We also move
> > > the call to hv_ringbuf_avail_percent(), which may access the
> > > uninitialized ring buffer, until after vmbus_sendpacket().
> > >
> > > Signed-off-by: Mohammed Gamal <mgamal@xxxxxxxxxx>
> > > ---
> > >  drivers/net/hyperv/netvsc.c | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/net/hyperv/netvsc.c
> > > b/drivers/net/hyperv/netvsc.c
> > > index 0265d70..44a8358 100644
> > > --- a/drivers/net/hyperv/netvsc.c
> > > +++ b/drivers/net/hyperv/netvsc.c
> > > @@ -757,7 +757,7 @@ static inline int netvsc_send_pkt(
> > >  	struct netdev_queue *txq = netdev_get_tx_queue(ndev, packet->q_idx);
> > >  	u64 req_id;
> > >  	int ret;
> > > -	u32 ring_avail = hv_ringbuf_avail_percent(&out_channel->outbound);
> > > +	u32 ring_avail;
> > >
> > >  	nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT;
> > >  	if (skb)
> > > @@ -773,7 +773,7 @@ static inline int netvsc_send_pkt(
> > >
> > >  	req_id = (ulong)skb;
> > >
> > > -	if (out_channel->rescind)
> > > +	if (out_channel->rescind || out_channel->state != CHANNEL_OPENED_STATE)
> > >  		return -ENODEV;
> > >
> > >  	if (packet->page_buf_cnt) {
> > > @@ -791,6 +791,7 @@ static inline int netvsc_send_pkt(
> > >  				       VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
> > >  	}
> > >
> > > +	ring_avail = hv_ringbuf_avail_percent(&out_channel->outbound);
> > >
> > >  	if (ret == 0) {
> > >  		atomic_inc_return(&nvchan->queue_sends);
> > >
> >
> > Thanks for your patch. Yes, there are races with the current update
> > logic. The root cause goes higher up in the flow; the send queues
> > should be stopped before netvsc_device_remove is called. Solving it
> > where you tried to is racy and not going to work reliably.
> >
> > Network patches should go to netdev@xxxxxxxxxxxxxxx
> >
> > You can't move the ring_avail check until after the vmbus_sendpacket
> > because that will break the flow control logic.
> >
>
> Why? I don't see ring_avail being used before that point.
Ah, stupid me. vmbus_sendpacket() will write to the ring buffer, so the
ring_avail value read after the send will differ from the expected
pre-send value.
>
> > Instead, you should just move the avail_read check until just after
> > the existing rescind
> > check.
> >
> > Also, you shouldn't need to check for OPENED_STATE, just rescind is
> > enough.
>
> That rarely mitigated the race. The channel->rescind flag is set on
> vmbus exit - called on module unload - and when a rescind offer is
> received from the host, which AFAICT doesn't happen on every call to
> netvsc_device_remove. So it's quite possible that the ring buffer is
> accessed before it's allocated again on channel open - hence the
> check for CHANNEL_OPENED_STATE, which is only set after all vmbus
> data is initialized.
>
Perhaps I haven't been clear enough. The NULL pointer dereference
happens in the call to hv_ringbuf_avail_percent(), which is used to
calculate ring_avail.
So we need to stop the queues before calling it if the channel's ring
buffers haven't been allocated yet, but OTOH we should only stop the
queues based on the value of ring_avail, which leads to a
chicken-and-egg situation.
Is my observation here correct? Please correct me if I am wrong,
Stephen.