Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior

From: Jason Xing
Date: Mon Mar 13 2023 - 21:58:50 EST


On Tue, Mar 14, 2023 at 5:58 AM Kui-Feng Lee <sinquersw@xxxxxxxxx> wrote:
>
>
>
> On 3/11/23 08:36, Jason Xing wrote:
> > From: Jason Xing <kernelxing@xxxxxxxxxxx>
> >
> > When we encounter a performance issue and get lost on how to tune
> > the budget limit and time limit in the net_rx_action() function,
> > we can count the two separately to avoid the confusion.
> >
> > Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx>
> > ---
> > note: this commit is based on the patch at the link below:
> > https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@xxxxxxxxx/
> > ---
> > include/linux/netdevice.h | 1 +
> > net/core/dev.c | 12 ++++++++----
> > net/core/net-procfs.c | 9 ++++++---
> > 3 files changed, 15 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 6a14b7b11766..5736311a2133 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -3157,6 +3157,7 @@ struct softnet_data {
> > /* stats */
> > unsigned int processed;
> > unsigned int time_squeeze;
> > + unsigned int budget_squeeze;
> > #ifdef CONFIG_RPS
> > struct softnet_data *rps_ipi_list;
> > #endif
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 253584777101..bed7a68fdb5d 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6637,6 +6637,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> > unsigned long time_limit = jiffies +
> > usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
> > int budget = READ_ONCE(netdev_budget);
> > + bool is_continue = true;
> > LIST_HEAD(list);
> > LIST_HEAD(repoll);
> >
> > @@ -6644,7 +6645,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> > list_splice_init(&sd->poll_list, &list);
> > local_irq_enable();
> >
> > - for (;;) {
> > + for (; is_continue;) {
> > struct napi_struct *n;
> >
> > skb_defer_free_flush(sd);
> > @@ -6662,10 +6663,13 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> > * Allow this to run for 2 jiffies since which will allow
> > * an average latency of 1.5/HZ.
> > */
> > - if (unlikely(budget <= 0 ||
> > - time_after_eq(jiffies, time_limit))) {
> > + if (unlikely(budget <= 0)) {
> > + sd->budget_squeeze++;
> > + is_continue = false;
> > + }
> > + if (unlikely(time_after_eq(jiffies, time_limit))) {
> > sd->time_squeeze++;
> > - break;
> > + is_continue = false;
> > }
> > }
> >
> > diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
> > index 97a304e1957a..4d1a499d7c43 100644
> > --- a/net/core/net-procfs.c
> > +++ b/net/core/net-procfs.c
> > @@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
> > */
> > seq_printf(seq,
> > "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
> > - "%08x %08x\n",
> > - sd->processed, sd->dropped, sd->time_squeeze, 0,
> > + "%08x %08x %08x %08x\n",
> > + sd->processed, sd->dropped,
> > + 0, /* was old way to count time squeeze */
>
> Should we show an approximate number? For example,
> sd->time_squeeze + sd->budget_squeeze.

Yeah, it does make sense. Let's leave the old way of displaying it untouched.
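
To make the idea concrete, here is a rough userspace sketch, not kernel
code. The struct squeeze_stats and the helpers account_squeeze(),
legacy_squeeze() and format_legacy() are hypothetical names made up for
illustration; only the two counter names come from the patch. It mirrors
the split exit check and shows the sum that could go in the old third
column.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the two counters in softnet_data. */
struct squeeze_stats {
	unsigned int time_squeeze;
	unsigned int budget_squeeze;
};

/*
 * Mirrors the split exit check from the patch: each limit bumps its own
 * counter. Returns true if the poll loop should keep going.
 */
static bool account_squeeze(struct squeeze_stats *st, int budget, bool time_up)
{
	bool keep_going = true;

	if (budget <= 0) {
		st->budget_squeeze++;
		keep_going = false;
	}
	if (time_up) {
		st->time_squeeze++;
		keep_going = false;
	}
	return keep_going;
}

/* Value for the legacy third column: an approximate combined count. */
static unsigned int legacy_squeeze(const struct squeeze_stats *st)
{
	return st->time_squeeze + st->budget_squeeze;
}

/* Formats the legacy column as a zero-padded hex field ("%08x"). */
static int format_legacy(char *buf, size_t len, const struct squeeze_stats *st)
{
	return snprintf(buf, len, "%08x", legacy_squeeze(st));
}
```

Note that the sum is only approximate: when both limits are hit in the
same pass, both counters are bumped, so the sum grows by two where the
old combined counter would have grown by one.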

>
>
> > + 0,
> > 0, 0, 0, 0, /* was fastroute */
> > 0, /* was cpu_collision */
> > sd->received_rps, flow_limit_count,
> > 0, /* was len of two backlog queues */
> > (int)seq->index,
> > - softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
> > + softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
> > + sd->time_squeeze, sd->budget_squeeze);
> > return 0;
> > }
> >