Weird connection to another NATed site

I have a ZeroTier network that spans two LANs in a pretty standard configuration: the nodes sit behind NATing DSL routers.

I’m trying to establish connectivity from various nodes on Network A (nodeA1, nodeA2) to Network B (nodeB1).

From nodeA1 the connectivity to nodeB1 is seamless, all good (the nodes have a DIRECT connection).
On the other hand, connectivity from nodeA2 to nodeB2 doesn’t work. All I see is that these peers see each other over RELAY, which shouldn’t be a problem per se, but in this case no packets get through.

I’m hosting the network on my own server.

And the issue was that my server was running v1.8.6 while the peers were already on 1.8.10; upgrading the server to the latest version solved the issue.

I solved the issue while writing down the problem…

Hm… I thought I had solved it myself by upgrading every node to the latest version, but apparently this was only a temporary fix: I still see this weird relaying behavior.

Interestingly, on nodeA2 (Network A/Node2) I see the peer nodeB2 periodically disappear and reappear, always with a RELAY connection. So it looks like it keeps trying to establish the peering, but it doesn’t work.
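In case it helps anyone watching for the same thing: below is a minimal sketch that picks the relayed peers out of the text printed by `zerotier-cli peers`. It assumes the 1.8.x column layout (`<ztaddr> <ver> <role> <lat> <link> <lastTX> <lastRX> <path>`), so check the header line on your own version first. The function name `relayed_peers` and the sample addresses are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the addresses of all peers whose link column reads RELAY.
# Assumption: columns are <ztaddr> <ver> <role> <lat> <link> ... (1.8.x layout).
sub relayed_peers {
    my ($text) = @_;
    my @relayed;
    for my $line (split /\r?\n/, $text) {
        my @col = split ' ', $line;
        # Skip the status/header lines and anything that does not start
        # with a 10-hex-digit ZeroTier address.
        next unless @col >= 5 && $col[0] =~ /^[0-9a-f]{10}$/;
        push @relayed, $col[0] if $col[4] eq 'RELAY';
    }
    return @relayed;
}

# Made-up sample input. On a real node the text would come from the CLI,
# e.g. my $text = `zerotier-cli peers`;
my $sample = <<'END';
200 peers
<ztaddr>   <ver>  <role> <lat> <link> <lastTX> <lastRX> <path>
1111111111 1.8.10 LEAF       4 DIRECT 3024     3024     192.0.2.10/9993
2222222222 1.8.10 LEAF      -1 RELAY
END

print "relayed: $_\n" for relayed_peers($sample);
```

Running this in a loop with a timestamp would show whether a peer such as nodeB2 ever switches from RELAY to DIRECT, or just keeps dropping out and coming back relayed.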

It looks like the issue is that ZeroTier on Windows Server 2012 is not that reliable. After implementing a restart script / watchdog that restarts ZeroTier if the connection is lost, it seems to be all OK now.

#!/usr/bin/perl

use strict;
use warnings;
use Net::Ping;

my $timeout = 5;
# replace with the ZeroTier address of another node in the same zt network
my $host = 'another node in the same zt network';

# run this watchdog forever
while (1) {
  # note: ICMP pings usually require administrator rights
  my $p = Net::Ping->new("icmp", $timeout) or die "cannot create Net::Ping object\n";
  # ping host
  if ($p->ping($host)) {
    print "$host is alive\n";
  }
  else {
    print "$host not reachable\n";
    # stop and start zt if ping failed
    system('net stop ZeroTierOneService');
    sleep(20);
    system('net start ZeroTierOneService');
    sleep(60);
    # send two pings to get the connection going again
    $p->ping($host);
    $p->ping($host);
    # wait a bit
    sleep(10);
    $p->close;
    # go immediately to the next check cycle
    next;
  }

  $p->close;
  # wait 10 minutes until next check (only if test was successful)
  sleep(600);
}