Re: [PATCH RESEND] can: j1939: do not wait 250ms if the same addr was already claimed

From: Devid Antonio Filoni
Date: Tue May 10 2022 - 07:02:27 EST


Hi,

On Tue, 2022-05-10 at 06:26 +0200, Oleksij Rempel wrote:
> Hi,
>
> On Mon, May 09, 2022 at 09:04:06PM +0200, Kurt Van Dijck wrote:
> > On ma, 09 mei 2022 19:03:03 +0200, Devid Antonio Filoni wrote:
> > > This is not explicitly stated in SAE J1939-21 and some tools used for
> > > ISO-11783 certification do not expect this wait.
>
> It will be interesting to know which certification tool do not expect it and
> what explanation is used if it fails?
>
> > IMHO, the current behaviour is not explicitely stated, but nor is the opposite.
> > And if I'm not mistaken, this introduces a 250msec delay.
> >
> > 1. If you want to avoid the 250msec gap, you should avoid to contest the same address.
> >
> > 2. It's a balance between predictability and flexibility, but if you try to accomplish both,
> > as your patch suggests, there is slight time-window until the current owner responds,
> > in which it may be confusing which node has the address. It depends on how much history
> > you have collected on the bus.
> >
> > I'm sure that this problem decreases with increasing processing power on the nodes,
> > but bigger internal queues also increase this window.
> >
> > It would certainly help if you describe how the current implementation fails.
> >
> > Would decreasing the dead time to 50msec help in such case.
> >
> > Kind regards,
> > Kurt
> >
>

The test executed during ISOBUS compliance testing is the following:
after an address has been claimed by a CF (#1), another CF (#2) sends a
message (other than address-claim) using the same address already
claimed by CF #1 as its source address.

As per the ISO 11783-5 standard, if a CF receives a message, other than
the address-claimed message, which uses the CF's own SA, then that CF
(#1):
- shall send the address-claim message to the Global address;
- shall activate a diagnostic trouble code with SPN = 2000+SA and
  FMI = 31.
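
To make the DTC concrete, here is a minimal sketch of how the 4 DTC
bytes could be built for a given SA. The byte layout (SPN in the low 19
bits, FMI in the low 5 bits of the third byte, conversion-method bit
left at 0) is my assumption of the usual J1939-73 encoding, and
conflict_dtc() is just an illustrative name, not something from the
driver.

```python
def conflict_dtc(sa: int, occurrence_count: int) -> bytes:
    """Encode the 4-byte DTC for an address conflict on source address `sa`."""
    if not 0 <= sa <= 0xFF:
        raise ValueError("SA must fit in one byte")
    if not 0 <= occurrence_count <= 0x7F:
        raise ValueError("occurrence count is a 7-bit field")

    spn = 2000 + sa  # SPN = 2000 + SA, as required for the conflict DTC
    fmi = 31         # FMI = 31

    return bytes([
        spn & 0xFF,                       # SPN bits 0..7
        (spn >> 8) & 0xFF,                # SPN bits 8..15
        ((spn >> 16) & 0x07) << 5 | fmi,  # SPN bits 16..18 + 5-bit FMI
        occurrence_count,                 # conversion-method bit assumed 0
    ])


# SA = 0xF0, first occurrence
print(conflict_dtc(0xF0, 1).hex(" ").upper())  # C0 08 1F 01
```

For SA = F0 and an occurrence count of 1 this gives C0 08 1F 01, which
matches data bytes 3 to 6 of the first DM1 frame in the trace further
down.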

After the address-claim message is sent by CF #1, as per the ISO 11783-5
standard:
- If the NAME of CF #1 has a lower priority than that of CF #2, then
  CF #2 shall send its address-claim message, and thus CF #1 shall send
  the cannot-claim-address message or shall execute the claim procedure
  again with a new address.
- If the NAME of CF #1 has a higher priority than that of CF #2, then
  CF #2 shall send the cannot-claim-address message or shall execute
  the claim procedure with a new address.
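
The two cases above boil down to a comparison of the 64-bit NAMEs, where
the numerically lower NAME has the higher priority. A rough sketch,
assuming the 8 address-claim data bytes are read as a little-endian
64-bit value; the helper names and the second NAME are made up for
illustration, and the equal-NAME case is not handled:

```python
def name_from_payload(data: bytes) -> int:
    """Read the 8 address-claim data bytes as a little-endian 64-bit NAME."""
    if len(data) != 8:
        raise ValueError("address-claim payload must be 8 bytes")
    return int.from_bytes(data, "little")


def keeps_address(own_name: int, other_name: int) -> bool:
    """True if our NAME wins, i.e. it is numerically lower than the other."""
    return own_name < other_name


# NAME of CF #1, taken from the address-claim frames in the trace below
cf1 = name_from_payload(bytes.fromhex("15 76 D1 0B 00 86 00 A0"))
# Hypothetical NAME for CF #2, chosen only so that it loses the comparison
cf2 = cf1 + 1

if keeps_address(cf1, cf2):
    print("CF #1 keeps the address, CF #2 must give it up or pick another")
else:
    print("CF #2 claims the address, CF #1 must give it up or pick another")
```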

The above conflict management works with the current J1939 driver
implementation. However, since the driver always waits 250ms after
sending an address-claim message, CF #1 cannot set the DTC: the DM1
message, which is expected to be sent every second (as per the J1939-73
standard), may not be sent.

Honestly, I don't know which company is doing the ISOBUS compliance
tests on our products or which tool they use, as it was chosen by our
customer. However, they did send us some CAN traces of previously
performed tests, and we noticed that the DM1 message is sent 160ms after
the address-claim message (but it may also be less than that). This is
something that we cannot do because the driver blocks the application
from sending it.

28401.127146 1 18E6FFF0x Tx d 8 FE 26 FF FF FF FF FF FF //Message with other CF's address
28401.167414 1 18EEFFF0x Rx d 8 15 76 D1 0B 00 86 00 A0 //Address Claim - SA = F0
28401.349214 1 18FECAF0x Rx d 8 FF FF C0 08 1F 01 FF FF //DM1
28402.155774 1 18E6FFF0x Tx d 8 FE 26 FF FF FF FF FF FF //Message with other CF's address
28402.169455 1 18EEFFF0x Rx d 8 15 76 D1 0B 00 86 00 A0 //Address Claim - SA = F0
28402.348226 1 18FECAF0x Rx d 8 FF FF C0 08 1F 02 FF FF //DM1
28403.182753 1 18E6FFF0x Tx d 8 FE 26 FF FF FF FF FF FF //Message with other CF's address
28403.188648 1 18EEFFF0x Rx d 8 15 76 D1 0B 00 86 00 A0 //Address Claim - SA = F0
28403.349328 1 18FECAF0x Rx d 8 FF FF C0 08 1F 03 FF FF //DM1
28404.349406 1 18FECAF0x Rx d 8 FF FF C0 08 1F 03 FF FF //DM1
28405.349740 1 18FECAF0x Rx d 8 FF FF C0 08 1F 03 FF FF //DM1
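
For anyone who wants to check the trace, a small sketch that splits the
29-bit identifiers into priority, PGN and SA, and computes the gap
between each address-claim and the DM1 that follows it. The timestamps
are copied from the trace above; decode_id(), gap_us() and the constant
name are only illustrative.

```python
from decimal import Decimal
from typing import NamedTuple

ADDRESS_CLAIM_HOLDOFF_US = 250_000  # the 250ms wait discussed in this thread


class J1939Id(NamedTuple):
    priority: int
    pgn: int
    sa: int


def decode_id(can_id: int) -> J1939Id:
    """Split a 29-bit J1939 identifier into priority, PGN and source address."""
    priority = (can_id >> 26) & 0x7
    pf = (can_id >> 16) & 0xFF
    pgn = (can_id >> 8) & 0x3FFFF
    if pf < 0xF0:
        # PDU1 format: the low byte is a destination address, not part of the PGN
        pgn &= 0x3FF00
    return J1939Id(priority, pgn, can_id & 0xFF)


def gap_us(t_first: str, t_second: str) -> int:
    """Gap between two trace timestamps (seconds, as text) in microseconds."""
    return int((Decimal(t_second) - Decimal(t_first)) * 1_000_000)


claim = decode_id(0x18EEFFF0)
dm1 = decode_id(0x18FECAF0)
print(hex(claim.pgn), hex(dm1.pgn), hex(claim.sa))  # 0xee00 0xfeca 0xf0

# (address-claim timestamp, following DM1 timestamp) pairs from the trace
pairs = [
    ("28401.167414", "28401.349214"),
    ("28402.169455", "28402.348226"),
    ("28403.188648", "28403.349328"),
]
for t_claim, t_dm1 in pairs:
    gap = gap_us(t_claim, t_dm1)
    print(gap, "us, inside the 250ms window:", gap < ADDRESS_CLAIM_HOLDOFF_US)
```

In all three cases the DM1 follows the address-claim by roughly 160 to
182ms, i.e. inside the 250ms window.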

Since the 250ms wait is not explicitly stated, IMHO it should be up to
the user-space implementation to decide how to manage it.

Thank you,
Devid