Re: [PATCH net-next] net: ipvs: random start for RR scheduler

From: Julian Anastasov
Date: Mon May 09 2022 - 14:25:50 EST



Hello,

On Mon, 9 May 2022, menglong8.dong@xxxxxxxxx wrote:

> From: Menglong Dong <imagedong@xxxxxxxxxxx>
>
> For now, the start of the RR scheduler is in the order of dest
> service added, it will result in imbalance if the load balance

...order of added destinations,...

> is done in client side and long connect is used.

..."long connections are used". Is this a case where
small number of connections are used? And the two connections
relatively overload the real servers?

> For example, we have client1, client2, ..., client5 and real service
> service1, service2, service3. All clients have the same ipvs config,
> and each of them will create 2 long TCP connect to the virtual
> service. Therefore, all the clients will connect to service1 and
> service2, leaving service3 free.

You mean, there are many IPVS directors with the same
config and each director gets 2 connections? A third connection
would get real server #3, right? Also, are the clients local
to the director?

> Fix this by randomize the start of dest service to RR scheduler when

..."randomizing the starting destination when"

> IP_VS_SVC_F_SCHED_RR_RANDOM is set.
>
> Signed-off-by: Menglong Dong <imagedong@xxxxxxxxxxx>
> ---
> include/uapi/linux/ip_vs.h | 2 ++
> net/netfilter/ipvs/ip_vs_rr.c | 25 ++++++++++++++++++++++++-
> 2 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/ip_vs.h b/include/uapi/linux/ip_vs.h
> index 4102ddcb4e14..7f74bafd3211 100644
> --- a/include/uapi/linux/ip_vs.h
> +++ b/include/uapi/linux/ip_vs.h
> @@ -28,6 +28,8 @@
> #define IP_VS_SVC_F_SCHED_SH_FALLBACK IP_VS_SVC_F_SCHED1 /* SH fallback */
> #define IP_VS_SVC_F_SCHED_SH_PORT IP_VS_SVC_F_SCHED2 /* SH use port */
>
> +#define IP_VS_SVC_F_SCHED_RR_RANDOM IP_VS_SVC_F_SCHED1 /* random start */
> +
> /*
> * Destination Server Flags
> */
> diff --git a/net/netfilter/ipvs/ip_vs_rr.c b/net/netfilter/ipvs/ip_vs_rr.c
> index 38495c6f6c7c..e309d97bdd08 100644
> --- a/net/netfilter/ipvs/ip_vs_rr.c
> +++ b/net/netfilter/ipvs/ip_vs_rr.c
> @@ -22,13 +22,36 @@
>
> #include <net/ip_vs.h>
>
> +static void ip_vs_rr_random_start(struct ip_vs_service *svc)
> +{
> + struct list_head *cur;
> + u32 start;
> +
> + if (!(svc->flags | IP_VS_SVC_F_SCHED_RR_RANDOM) ||

| -> &

> + svc->num_dests <= 1)
> + return;
> +
> + spin_lock_bh(&svc->sched_lock);
> + start = get_random_u32() % svc->num_dests;

Maybe prandom is more appropriate for non-crypto purposes.
Also, not sure if it is a good idea to limit the number of steps,
eg. to 128...

start = prandom_u32_max(min(svc->num_dests, 128U));

or just use

start = prandom_u32_max(svc->num_dests);

Also, this line can be moved before the spin_lock_bh.

> + cur = &svc->destinations;

cur = svc->sched_data;

... and to start from current svc->sched_data because
we are called for every added dest. Better to jump 0..127 steps
ahead, to avoid delay with long lists?
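
	The suggested rotation can be sketched in userspace like this
(a sketch only, not kernel code: list_head and prandom_u32_max are
stubbed for illustration, and rr_random_advance is a hypothetical
name). The cursor advances a bounded random number of steps from its
current position, skipping the list head so it always lands on a
destination entry:

```c
/* Userspace sketch of the rotation suggested above: advance the RR
 * cursor a bounded random number of steps from its current position
 * instead of always walking from the list head. */
#include <assert.h>
#include <stdlib.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Stub for the kernel's prandom_u32_max(): uniform value in [0, ceil). */
static unsigned int prandom_u32_max(unsigned int ceil)
{
	return (unsigned int)(rand() % ceil);
}

/* Advance cur by 0..min(num_dests, 128)-1 steps around the circular
 * list rooted at head, skipping head itself on wrap-around so the
 * cursor always points at a real destination. */
static struct list_head *rr_random_advance(struct list_head *head,
					   struct list_head *cur,
					   unsigned int num_dests)
{
	unsigned int steps = prandom_u32_max(num_dests < 128 ? num_dests : 128);

	while (steps--) {
		cur = cur->next;
		if (cur == head)	/* wrap past the list head */
			cur = cur->next;
	}
	return cur;
}
```

This keeps the walk bounded to at most 128 steps regardless of how
many destinations the service has, which addresses the long-list
delay concern.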

> + while (start--)
> + cur = cur->next;
> + svc->sched_data = cur;
> + spin_unlock_bh(&svc->sched_lock);
> +}
>
> static int ip_vs_rr_init_svc(struct ip_vs_service *svc)
> {
> svc->sched_data = &svc->destinations;
> + ip_vs_rr_random_start(svc);
> return 0;
> }
>
> +static int ip_vs_rr_add_dest(struct ip_vs_service *svc, struct ip_vs_dest *dest)
> +{
> + ip_vs_rr_random_start(svc);
> + return 0;
> +}
>
> static int ip_vs_rr_del_dest(struct ip_vs_service *svc, struct ip_vs_dest *dest)
> {
> @@ -104,7 +127,7 @@ static struct ip_vs_scheduler ip_vs_rr_scheduler = {
> .module = THIS_MODULE,
> .n_list = LIST_HEAD_INIT(ip_vs_rr_scheduler.n_list),
> .init_service = ip_vs_rr_init_svc,
> - .add_dest = NULL,
> + .add_dest = ip_vs_rr_add_dest,
> .del_dest = ip_vs_rr_del_dest,
> .schedule = ip_vs_rr_schedule,
> };
> --
> 2.36.0

Regards

--
Julian Anastasov <ja@xxxxxx>