RE: [RFC V3 PATCH 18/26] net/netpolicy: set tx queues according to policy

From: Liang, Kan
Date: Tue Sep 13 2016 - 08:23:05 EST




> -----Original Message-----
> From: Tom Herbert [mailto:tom@xxxxxxxxxxxxxxx]
> Sent: Monday, September 12, 2016 4:23 PM
> To: Liang, Kan <kan.liang@xxxxxxxxx>
> Cc: David S. Miller <davem@xxxxxxxxxxxxx>; LKML <linux-
> kernel@xxxxxxxxxxxxxxx>; Linux Kernel Network Developers
> <netdev@xxxxxxxxxxxxxxx>; Kirsher, Jeffrey T <jeffrey.t.kirsher@xxxxxxxxx>;
> Ingo Molnar <mingo@xxxxxxxxxx>; peterz@xxxxxxxxxxxxx; Alexey Kuznetsov
> <kuznet@xxxxxxxxxxxxx>; James Morris <jmorris@xxxxxxxxx>; Hideaki
> YOSHIFUJI <yoshfuji@xxxxxxxxxxxxxx>; Patrick McHardy <kaber@xxxxxxxxx>;
> akpm@xxxxxxxxxxxxxxxxxxxx; Kees Cook <keescook@xxxxxxxxxxxx>;
> viro@xxxxxxxxxxxxxxxxxx; gorcunov@xxxxxxxxxx; John Stultz
> <john.stultz@xxxxxxxxxx>; Alexander Duyck <aduyck@xxxxxxxxxxxx>; Ben
> Hutchings <ben@xxxxxxxxxxxxxxx>; David Decotigny <decot@xxxxxxxxxxxx>;
> Florian Westphal <fw@xxxxxxxxx>; Alexander Duyck
> <alexander.duyck@xxxxxxxxx>; Daniel Borkmann <daniel@xxxxxxxxxxxxx>;
> rdunlap@xxxxxxxxxxxxx; Cong Wang <xiyou.wangcong@xxxxxxxxx>; Hannes
> Frederic Sowa <hannes@xxxxxxxxxxxxxxxxxxx>; Stephen Hemminger
> <stephen@xxxxxxxxxxxxxxxxxx>; Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx>; Brandeburg, Jesse
> <jesse.brandeburg@xxxxxxxxx>; Andi Kleen <andi@xxxxxxxxxxxxxx>
> Subject: Re: [RFC V3 PATCH 18/26] net/netpolicy: set tx queues according to
> policy
>
> On Mon, Sep 12, 2016 at 7:55 AM, <kan.liang@xxxxxxxxx> wrote:
> > From: Kan Liang <kan.liang@xxxxxxxxx>
> >
> > When the device tries to transmit a packet, netdev_pick_tx is called
> > to find the available tx queues. If the net policy is applied, it
> > picks up the assigned tx queue from net policy subsystem, and redirect
> > the traffic to the assigned queue.
> >
> > Signed-off-by: Kan Liang <kan.liang@xxxxxxxxx>
> > ---
> > include/net/sock.h | 9 +++++++++
> > net/core/dev.c | 20 ++++++++++++++++++--
> > 2 files changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/net/sock.h b/include/net/sock.h index
> > e1e9e3d..ca97f35 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -2280,4 +2280,13 @@ extern int sysctl_optmem_max; extern __u32
> > sysctl_wmem_default; extern __u32 sysctl_rmem_default;
> >
> > +/* Return netpolicy instance information from socket. */ static
> > +inline struct netpolicy_instance *netpolicy_find_instance(struct sock
> > +*sk) { #ifdef CONFIG_NETPOLICY
> > + if (is_net_policy_valid(sk->sk_netpolicy.policy))
> > + return &sk->sk_netpolicy; #endif
> > + return NULL;
> > +}
> > #endif /* _SOCK_H */
> > diff --git a/net/core/dev.c b/net/core/dev.c index 34b5322..b9a8044
> > 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3266,6 +3266,7 @@ struct netdev_queue *netdev_pick_tx(struct
> net_device *dev,
> > struct sk_buff *skb,
> > void *accel_priv) {
> > + struct sock *sk = skb->sk;
> > int queue_index = 0;
> >
> > #ifdef CONFIG_XPS
> > @@ -3280,8 +3281,23 @@ struct netdev_queue *netdev_pick_tx(struct
> net_device *dev,
> > if (ops->ndo_select_queue)
> > queue_index = ops->ndo_select_queue(dev, skb, accel_priv,
> > __netdev_pick_tx);
> > - else
> > - queue_index = __netdev_pick_tx(dev, skb);
> > + else {
> > +#ifdef CONFIG_NETPOLICY
> > + struct netpolicy_instance *instance;
> > +
> > + queue_index = -1;
> > + if (dev->netpolicy && sk) {
> > + instance = netpolicy_find_instance(sk);
> > + if (instance) {
> > + if (!instance->dev)
> > + instance->dev = dev;
> > + queue_index = netpolicy_pick_queue(instance, false);
> > + }
> > + }
> > + if (queue_index < 0) #endif
>
> I doubt this produces the intended effect. Several drivers use
> ndo_select_queue (such as mlx4) where there might do something special
> for a few packets but end up called the default handler which
> __netdev_pick_tx for most packets. So in such cases the netpolicy path would
> be routinely bypassed. Maybe this code should be in __netdev_pick_tx.

I will move the code to __netdev_pick_tx in next version.

Thanks,
Kan

>
> Tom
>
> > + queue_index = __netdev_pick_tx(dev, skb);
> > + }
> >
> > if (!accel_priv)
> > queue_index = netdev_cap_txqueue(dev,
> > queue_index);
> > --
> > 2.5.5
> >