Possibly misbehaving my.zerotier.com controllers? Or a connectivity regression in recent zerotier versions?

I’ve seen my networks intermittently show most clients as offline although they are not.
This started in the last few weeks, and each interruption usually lasts more than a day.
This looks like the same issue as the post “Suddenly all members are Offline”.
In my case, the clients connect perfectly to each other through my own self-hosted network controller, but not through the ZeroTier Central controllers e5cd7a9e1c and d3ecf5726d. “zerotier-cli peers” shows the connections to these controllers as relayed. The “last seen” time in ZeroTier Central shows the same offline period for the inaccessible clients. Restarting the individual zerotier-one processes or rebooting the affected machines (Linux arm64 / amd64, ZeroTier version 1.8.7) does not seem to help. Note that the affected machines are all members of multiple ZeroTier networks, some managed via my.zerotier.com and some via a self-hosted controller. Any ideas, anyone?

Update: One of the clients reports the following (note the new, unreleased version number):
d3ecf5726d 1.8.9 LEAF 111 DIRECT 8040 8040 35.222.184.52/29503
e5cd7a9e1c 1.8.9 LEAF 112 DIRECT 15499 4729 34.123.127.218/37225

At the same time, other clients report:
d3ecf5726d - LEAF -1 RELAY
e5cd7a9e1c - LEAF -1 RELAY

d3ecf5726d 1.8.5 LEAF -1 RELAY
e5cd7a9e1c - LEAF -1 RELAY

My guess is that network controller upgrades do not get picked up by individual clients quickly, which seems to be a bug.
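
For anyone wanting to check this across several nodes, here is a minimal Python sketch that parses the “zerotier-cli peers” text shown above and lists which controllers a node reaches only via RELAY. The names parse_peers and relayed_controllers are my own, and the column layout (address, version, role, latency, link) is assumed from the 1.8.x output above; other versions may print different columns.

```python
import re
from typing import List, NamedTuple, Optional, Set

# A ZeroTier node address is 10 lowercase hex digits; anything else
# (status or header lines) is skipped.
ADDRESS = re.compile(r"^[0-9a-f]{10}$")


class Peer(NamedTuple):
    address: str
    version: Optional[str]  # None when the CLI prints "-"
    role: str
    latency_ms: int         # -1 when unknown
    link: str               # "DIRECT" or "RELAY" in the output above


def parse_peers(text: str) -> List[Peer]:
    """Parse peer lines, assuming the 1.8.x column order shown in this thread."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or not ADDRESS.match(fields[0]):
            continue
        address, version, role, latency, link = fields[:5]
        try:
            latency_ms = int(latency)
        except ValueError:
            continue  # not a peer line in the assumed layout
        peers.append(Peer(address, None if version == "-" else version,
                          role, latency_ms, link))
    return peers


def relayed_controllers(peers: List[Peer], controllers: Set[str]) -> List[str]:
    """Return the controller addresses that are only reachable via RELAY."""
    return sorted({p.address for p in peers
                   if p.address in controllers and p.link == "RELAY"})


SAMPLE = """\
d3ecf5726d 1.8.5 LEAF -1 RELAY
e5cd7a9e1c - LEAF -1 RELAY
"""

print(relayed_controllers(parse_peers(SAMPLE), {"d3ecf5726d", "e5cd7a9e1c"}))
# prints ['d3ecf5726d', 'e5cd7a9e1c']
```

Feeding it the output captured on each node makes it easy to see which nodes are stuck on RELAY to the Central controllers.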

If some clients are reporting RELAY, it means they’re not establishing a direct connection, likely due to something in the signal path between the node & controller. Our controllers are available directly on the internet. I’d check your systems to ensure there’s no overzealous firewall or packet inspection system deciding to block zerotier traffic.

Hi Grant, I’ve narrowed the problem down further; it wasn’t a firewall issue. The nodes that did not connect to the ZeroTier Central network controllers were using a moon (custom root server) configuration. That setup worked flawlessly for many months but stopped working recently. I think a recent ZT change no longer allows you to use your own root servers (set up to serve as root servers for a particular network) while also connecting to other root servers (e.g. ZeroTier Central) for connectivity to other ZT networks. Interestingly, I can leave the moon configuration file active on the custom network server itself, which lets it quickly find and connect to the respective client nodes, but not on the client nodes themselves. Anyway, the problem is worked around for now; maybe this information is useful to someone here.
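
In case it helps with the same workaround, here is a small Python sketch for spotting client nodes that still carry a moon configuration. It assumes the moon files live in a moons.d directory under the ZeroTier home directory (/var/lib/zerotier-one on a default Linux install, which may differ on your systems) and end in .moon; the function name find_moon_files is my own.

```python
from pathlib import Path
from typing import List


def find_moon_files(moons_dir: str = "/var/lib/zerotier-one/moons.d") -> List[str]:
    """Return the base names (without ".moon") of moon files in moons_dir.

    The default path is an assumption for a standard Linux install; reading
    it usually requires root privileges.
    """
    directory = Path(moons_dir)
    if not directory.is_dir():
        return []  # no moons.d directory, so no moon configuration here
    return sorted(p.stem for p in directory.glob("*.moon") if p.is_file())
```

Calling find_moon_files() on a client node and getting a non-empty list means a moon is still configured there. Removing it can then be done with “zerotier-cli deorbit” followed by the moon ID, or by deleting the file and restarting zerotier-one.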
