RE: Slowness forming TIPC cluster with explicit node addresses

From: Jon Maloy
Date: Sun Aug 04 2019 - 17:58:38 EST




> -----Original Message-----
> From: netdev-owner@xxxxxxxxxxxxxxx <netdev-owner@xxxxxxxxxxxxxxx> On
> Behalf Of Chris Packham
> Sent: 2-Aug-19 01:11
> To: Jon Maloy <jon.maloy@xxxxxxxxxxxx>;
> tipc-discussion@xxxxxxxxxxxxxxxxxxxxx
> Cc: netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: Slowness forming TIPC cluster with explicit node addresses
>
> On Mon, 2019-07-29 at 09:04 +1200, Chris Packham wrote:
> > On Fri, 2019-07-26 at 13:31 +0000, Jon Maloy wrote:
> > >
> > >
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: netdev-owner@xxxxxxxxxxxxxxx <netdev-owner@xxxxxxxxxxxxxxx>
> > > > On Behalf Of Chris Packham
> > > > Sent: 25-Jul-19 19:37
> > > > To: tipc-discussion@xxxxxxxxxxxxxxxxxxxxx
> > > > Cc: netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> > > > Subject: Slowness forming TIPC cluster with explicit node
> > > > addresses
> > > >
> > > > Hi,
> > > >
> > > > I'm having problems forming a TIPC cluster between 2 nodes.
> > > >
> > > > These are the basic steps I'm going through on each node.
> > > >
> > > > modprobe tipc
> > > > ip link set eth2 up
> > > > tipc node set addr 1.1.5 # or 1.1.6
> > > > tipc bearer enable media eth dev eth0
> > > eth2, I assume...
> > >
> > Yes, sorry. I keep switching between Ethernet ports for testing,
> > so I hand-edited the email.
> >
> > >
> > > >
> > > >
> > > >
> > > > Then to confirm if the cluster is formed I use tipc link list
> > > >
> > > > [root@node-5 ~]# tipc link list
> > > > broadcast-link: up
> > > > ...
> > > >
> > > > Looking at tcpdump the two nodes are sending packets
> > > >
> > > > 22:30:05.782320 TIPC v2.0 1.1.5 > 0.0.0, headerlength 60 bytes,
> > > > MessageSize
> > > > 76 bytes, Neighbor Detection Protocol internal, messageType Link
> > > > request
> > > > 22:30:05.863555 TIPC v2.0 1.1.6 > 0.0.0, headerlength 60 bytes,
> > > > MessageSize
> > > > 76 bytes, Neighbor Detection Protocol internal, messageType Link
> > > > request
> > > >
> > > > Eventually (after a few minutes) the link does come up
> > > >
> > > > [root@node-6 ~]# tipc link list
> > > > broadcast-link: up
> > > > 1001006:eth2-1001005:eth2: up
> > > >
> > > > [root@node-5 ~]# tipc link list
> > > > broadcast-link: up
> > > > 1001005:eth2-1001006:eth2: up
> > > >
> > > > When I remove the "tipc node set addr" things seem to kick into
> > > > life straight away
> > > >
> > > > [root@node-5 ~]# tipc link list
> > > > broadcast-link: up
> > > > 0050b61bd2aa:eth2-0050b61e6dfa:eth2: up
> > > >
> > > > So there appears to be some difference in behaviour between having
> > > > an explicit node address and using the default. Unfortunately our
> > > > application relies on setting the node addresses.
> > > I do this many times a day, without any problems. If there would be
> > > any time difference, I would expect the 'auto configurable' version
> > > to be slower, because it involves a DAD step.
> > > Are you sure you don't have any other nodes running in your system?
> > >
> > > ///jon
> > >
> > Nope the two nodes are connected back to back. Does the number of
> > Ethernet interfaces make a difference? As you can see I've got 3 on
> > each node. One is completely disconnected, one is for booting over
> > TFTP (only used by U-boot) and the other is the USB Ethernet I'm using
> > for testing.
> >
>
> So I can still reproduce this on nodes that only have one network interface and
> are the only things connected.
>
> I did find one thing that helps
>
> diff --git a/net/tipc/discover.c b/net/tipc/discover.c
> index c138d68e8a69..49921dad404a 100644
> --- a/net/tipc/discover.c
> +++ b/net/tipc/discover.c
> @@ -358,10 +358,10 @@ int tipc_disc_create(struct net *net, struct tipc_bearer *b,
>         tipc_disc_init_msg(net, d->skb, DSC_REQ_MSG, b);
>
>         /* Do we need an address trial period first ? */
> -       if (!tipc_own_addr(net)) {
> +//     if (!tipc_own_addr(net)) {
>                 tn->addr_trial_end = jiffies + msecs_to_jiffies(1000);
>                 msg_set_type(buf_msg(d->skb), DSC_TRIAL_MSG);
> -       }
> +//     }
>         memcpy(&d->dest, dest, sizeof(*dest));
>         d->net = net;
>         d->bearer_id = b->identity;
>
> I think that because duplicate address detection is skipped with
> pre-configured addresses, the shorter init phase is skipped too. Would it
> make sense to unconditionally do the trial step? Or is there some better way
> to get things to transition with pre-assigned addresses?

I am on vacation until the end of next week, so I can't give you any good analysis right now.
Doing the trial step unconditionally doesn't make much sense to me; it would only delay the setup unnecessarily (though only by 1 second).
Can you check the initial value of addr_trial_end when there is a pre-configured address?
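
To illustrate why that initial value could matter, here is a rough userspace sketch, not kernel code. It assumes the trial check is a time_before(jiffies, addr_trial_end) style comparison, that addr_trial_end is still zero when the address is pre-configured, and that the kernel is 32-bit, where jiffies starts 300 seconds before it wraps to zero. HZ = 250 and the names INITIAL_JIFFIES32, time_before32() and in_trial_period() are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick rate; the real HZ depends on the kernel configuration. */
#define HZ 250

/* 32-bit model of the kernel's INITIAL_JIFFIES: the counter starts
 * 300 seconds before it wraps around to zero. */
#define INITIAL_JIFFIES32 ((uint32_t)(-300 * HZ))

/* 32-bit model of time_before(a, b): true if a is earlier than b,
 * using a wrap-safe signed difference. */
static bool time_before32(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical helper: would a node still be treated as being inside
 * its address trial period 'secs' seconds after boot, given the stored
 * addr_trial_end value? */
static bool in_trial_period(uint32_t secs, uint32_t addr_trial_end)
{
	uint32_t now = INITIAL_JIFFIES32 + secs * HZ;

	return time_before32(now, addr_trial_end);
}
```

Under those assumptions, in_trial_period(t, 0) stays true until t reaches 300, which would be in the same range as the "few minutes" reported above, whereas an end value of INITIAL_JIFFIES32 + HZ (what the patched code would store at boot) makes the trial period expire after one second. If the real addr_trial_end is not zero in the pre-configured case, this explanation does not apply.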

///jon