Stress tested zerotier, not so good results

Hello,
I have done a stress test of zerotier with this configuration:

  • two ubuntu machines
  • three opnsense routers
  • three different zerotier networks
  • zerotier A, used as L2, with routes pushed via OSPF
  • zerotier B, with routes added using the zerotier gui
  • zerotier C, without additional routes

I got these results:

  • in zerotier A I get 50% packet loss when pinging or otherwise using zerotier addresses (I have not yet tried to ping hosts behind the routes pushed by OSPF)
  • in zerotier B the linux client added a route to the wrong zerotier network (the route was added via the gui after the client had already been configured with all three networks. Is it possible that there is a bug in your code that pushes new routes to the first zerotier network it finds?)

I cannot find an explanation for the packet loss. Zerotier B runs on the same routers and hosts as zerotier A, so if it were a firewall/NAT problem, B and C should be affected too.

Can you help me?
Thanks,
Mario

I am not able to debug this. It seems that as soon as I add more than one opnsense, it starts losing pings.
I am now trying with only ONE zerotier network but I still have problems.

The linux machines do not seem to lose pings.

Can someone suggest what I should do? This was the final test before going into production. Should I buy a professional subscription now? (Consider that I am an MSP.) Is opnsense supported by the subscription? Should I file a bug?
Thanks,
Mario

Sounds like you’re relaying because of some sort of underlying networking/routing issue. To confirm this I’d first check zerotier-cli peers and make sure the nodes you’re trying to reach are listed as DIRECT and not RELAY.
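
If you have several routers to check, the DIRECT/RELAY test above can be scripted. This is only a minimal sketch: it assumes your zerotier-cli version prints one row per peer containing a 10-hex-digit address and a DIRECT or RELAY token (the layout differs between releases, so look at the raw output first). The function names `find_relayed_peers` and `read_peers` are made up for this example.

```python
import re
import subprocess

# A ZeroTier node address is 10 hexadecimal digits.
ADDRESS_RE = re.compile(r"^[0-9a-f]{10}$")


def find_relayed_peers(peers_output):
    """Return the addresses of peers whose row contains the RELAY token."""
    relayed = []
    for line in peers_output.splitlines():
        fields = line.split()
        # Header and status lines carry no 10-hex-digit address, so they are skipped.
        address = next((f for f in fields if ADDRESS_RE.match(f)), None)
        if address and "RELAY" in fields:
            relayed.append(address)
    return relayed


def read_peers():
    """Run `zerotier-cli peers` (usually needs root) and return its text output."""
    result = subprocess.run(
        ["zerotier-cli", "peers"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Running `find_relayed_peers(read_peers())` on each router and host gives the list of peers that are not reachable directly from that node.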

Just a guess, but opnsense people commonly run into a zerotier-over-zerotier feedback loop.
Here’s a link about it, in case it’s applicable. I’m trying to link to the specific comment that explains the workaround steps: https://github.com/zerotier/ZeroTierOne/issues/779#issuecomment-767198156

Ok I have read all your suggestions and I thank you all. I am trying to apply them.
Only one of the opnsense firewalls is relaying (it is a virtual opnsense behind a physical firewall). I have enabled upnp on the physical router, tested with a linux client, and I can see that zerotier sets a upnp rule on the firewall.
But I am not able to force zerotier inside opnsense to use upnp. Is it disabled?

In one case I was able to forward port 9993 to opnsense (even though the official Zerotier documentation advises against it) and now it is DIRECT. In another case I cannot forward the port, and I can confirm that zerotier in opnsense does not use upnp.

After many tests I have discovered that the problems are caused by the OPNSense active/passive configuration.
Basically zerotier tries to use all the internet connections on the master firewall (in most cases this is ok, but what if I want to use only one of the two WANs?).
But zerotier also passes traffic to the slave firewall (why???), and if the slave firewall has no internet connection (which can happen, because in an active/passive configuration the slave firewall usually has internet access disabled) I start losing 50% of the packets.
I am getting upset about this because OPNSense is one of the few hardware platforms I can use, I badly need it for my startup, and I see that I have no control over it.
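
On the question of using only one of the two WANs: ZeroTier's local.conf documents an `interfacePrefixBlacklist` setting (interface name prefixes it should not use for its own traffic) and a `portMappingEnabled` setting (UPnP/NAT-PMP). Below is a minimal sketch that generates such a file. It is an assumption that the OPNsense build honours these settings and that its plugin does not overwrite local.conf. The interface names and the helper `build_local_conf` are hypothetical, and this is not confirmed to fix the master/slave behaviour.

```python
import json


def build_local_conf(blacklisted_prefixes, port_mapping=True):
    """Build a local.conf document that keeps ZeroTier off the named interfaces."""
    return {
        "settings": {
            # Interface name prefixes ZeroTier should not use for its own traffic.
            "interfacePrefixBlacklist": list(blacklisted_prefixes),
            # Whether ZeroTier should try UPnP/NAT-PMP port mapping.
            "portMappingEnabled": port_mapping,
        }
    }


# Hypothetical names: "igb1" stands in for the second WAN,
# "igb2" for the link between master and slave firewall.
conf = build_local_conf(["igb1", "igb2"])
print(json.dumps(conf, indent=2))
```

The printed JSON would go into local.conf in the ZeroTier home directory, followed by a restart of the service. Check the peers output afterwards to see which paths are actually in use.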