Can’t do direct connection between two nodes

Are those static addresses in your own routers for the networks used by node1 and node2 respectively?

No. They are static addresses assigned by the building network. For their own network, it is just the typical 192.168.* address.

btw, in the following

7dbc***** 1.10.1 LEAF -1 RELAY
8e1b***** 1.10.3 LEAF 201 DIRECT 43397 43395 100.1.1.1/61567

I installed ZeroTier on one of the routers, which is node 8e1b*. You can see that it can have a direct connection. However, node 7dbc*, which is within this router, is in relay mode.
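For anyone following along, here is a minimal sketch for picking the relayed peers out of a listing like the one above. It assumes the column layout shown in this post, with the link type (DIRECT or RELAY) in the fifth field. `relayed_peers` is a made-up helper name, not part of ZeroTier.

```shell
# relayed_peers (hypothetical helper): reads peer lines on stdin and prints
# the address of every peer whose 5th field is RELAY.
# Assumes the column order shown in the listing in this thread.
relayed_peers() {
    awk '$5 == "RELAY" { print $1 }'
}

# Typical use on a node that has ZeroTier installed:
#   zerotier-cli peers | relayed_peers
```

Header lines are skipped naturally because their fifth field is not the literal word RELAY.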

So what internal addresses do you get on your own routers, i.e. WAN and LAN respectively?

Btw regarding 7dbc, do you mean “which is within this router” like a node on your local network?

I don’t quite get it. The router’s internal address is something like 192.168.1.1.

Re: 7dbc, it is connected to router 8e1b.

I just listed the peers again, and now even the router is in relay mode. It seems that it can only occasionally establish a direct connection.

You might think I’m asking a lot of stupid questions but it’s somewhat hard to follow because I’m missing a network topology that explains the different parts of your setup, but anyhow here we go:

  • 8e1b is your main router (btw, what model is it?)
  • 192.168 is your local network?
  • What WAN (internet) IP address do you get on your router?
  • 7dbc is a node on your local network?

I’m just guessing here, but it looks like you ran into a double-NAT problem, since the router is able to get a direct link. Just curious, why do you need ZT on your local devices (like the 7dbc) if the router can manage all the ZT network traffic for you?

reply inline.

Because my current router is slow, it is quite slow when it handles all ZT traffic. Until I get a dedicated routing machine, I’d like to be able to establish a direct connection.

Ok, then there is only one piece missing and that is the building network. If the whole building is sharing 100.1.1.1 as the public ip there must be an intermediate network between you and that address.

Picture this:
A (LAN ) <=> B (Building network/router w NAT) <=> C (Internet)

Where:

  • A: is your LAN (192.168)
  • B: is the building network that connects your router A to internet C via a dedicated building router B using NAT (or possibly CG-NAT)
  • C: is the public IP 100.1.1.1 shared between all users in the building.

What do these commands say when run from the OpenWrt SSH login:

root@OpenWrt:~# ip address
root@OpenWrt:~# ip route 
root@OpenWrt:~# ifconfig
root@OpenWrt:~# ifstatus wan

Maybe I didn’t make it clear. 100.1.1.1 is the static IP that is assigned to my router by the building’s network. The building has another public IP, something like a.b.c.d, that is different from 100.1.1.1.

Let me do this
N1 (some device) <=> A1 (router, with openwrt installed and runs zerotier) <=> B (building router) <=> N0 (internet)
N2 (another device) <=> A2 (router) <=> B (building router)

Now N0 can have a direct connection using ZT with either N1 or N2 without any problem.
N2 can occasionally connect directly to A1 using ZT.
N1 and N2 are always in relay mode.

What I am trying to fix is to have N1 and N2 be in direct connection mode.

Ok, I get it. However, then you actually get your own public IP (100.1.1.1 from Verizon) that can’t be shared by anyone else. What IP do you get from whatismyipaddress.com?

I think this is the last confusion. 100.1.1.1 is static in the sense that it is an internal IP from the building network, but it is not a public IP. On whatismyipaddress.com, all devices will see a.b.c.d, which is the IP shared by the entire building network.

Then you probably have two different issues that need to be addressed.

First I have to ask if you are just using the IP 100.1.1.1 as a hypothetical example? The reason I’m asking is that the IP range actually belongs to a public network and using it as an internal intermediate private network (ie as in the building network) might cause serious side effects.
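To make the point about address ranges concrete, here is a rough sketch that sorts an IPv4 address into the RFC 1918 private ranges, the 100.64.0.0/10 shared range reserved for carrier-grade NAT (RFC 6598), or everything else. `classify_ipv4` is a made-up helper name, and the check is deliberately simple: it ignores loopback, link-local and other special ranges.

```shell
# classify_ipv4 (hypothetical helper): prints rfc1918, cgnat or other.
# Only looks at the first two octets; no input validation is done.
classify_ipv4() {
    echo "$1" | awk -F. '{
        if ($1 == 10 || ($1 == 172 && $2 >= 16 && $2 <= 31) || ($1 == 192 && $2 == 168))
            print "rfc1918"        # 10/8, 172.16/12, 192.168/16
        else if ($1 == 100 && $2 >= 64 && $2 <= 127)
            print "cgnat"          # 100.64.0.0/10 shared address space
        else
            print "other"          # not reserved for private or CGNAT use
    }'
}
```

With this, `classify_ipv4 100.1.1.1` prints `other`, which is why that address looks odd on an internal building network, while `classify_ipv4 100.64.0.1` prints `cgnat`.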

And secondly, because your router WAN address 100.1.1.1 differs from the public IP we are back to the double-NAT problem, ie A (LAN IP 192.168 - NAT1 ) => B (Building subnet 100.1.1.x - NAT2) <=> C (Internet IP a.b.c.d).

IP hypothetical flow:

  1. N1 ZT Node1 = 192.168.1.10
  2. A1 router: gw addr 192.168.1.254 NAT-1 using WAN address 100.1.1.1 to B router at 100.1.1.254
  3. Building router 100.1.1.254 NAT-2 to internet using WAN address a.b.c.d
  4. The other way around to N2 (NAT3/NAT4)

If the path through N1 NAT1/NAT2 and on to N2 NAT3/NAT4 cannot be opened by, for example, hole punching, ZT won’t be able to establish a direct link. If the building’s internal network allows direct communication at IP level within the same subnet (as it ought to), you should be able to communicate directly between 100.1.1.1 and 100.1.1.2 but since this doesn’t work, something is missing. Sometimes it might help to open “helper ports” in the router and point those to the zt node using dst-nat.
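As a sketch of what a “helper port” could look like on an OpenWrt router, the function below adds a dst-nat (DNAT) redirect through `uci`. Everything specific here is an assumption: the function name `add_zt_helper_port` is made up, the zones are assumed to be called `wan` and `lan`, 9993/udp is ZeroTier’s default primary port (it may also use others), and the node address is whatever your ZT node has on the LAN. It only opens the first NAT layer; the building router is out of its reach.

```shell
# UCI defaults to the real OpenWrt tool; set UCI=echo for a dry run
# that only prints what would be configured.
UCI=${UCI:-uci}

# add_zt_helper_port NODE_IP (hypothetical helper): forward UDP 9993
# arriving on the wan zone to NODE_IP on the lan zone.
add_zt_helper_port() {
    $UCI add firewall redirect
    $UCI set "firewall.@redirect[-1].name=zt-helper"
    $UCI set "firewall.@redirect[-1].target=DNAT"
    $UCI set "firewall.@redirect[-1].src=wan"
    $UCI set "firewall.@redirect[-1].dest=lan"
    $UCI set "firewall.@redirect[-1].proto=udp"
    $UCI set "firewall.@redirect[-1].src_dport=9993"
    $UCI set "firewall.@redirect[-1].dest_ip=$1"
    $UCI set "firewall.@redirect[-1].dest_port=9993"
    $UCI commit firewall
}

# Example with the hypothetical node address used in this thread:
#   add_zt_helper_port 192.168.1.10
# then reload the firewall, e.g. /etc/init.d/firewall restart
```

Running it once with `UCI=echo` first is a cheap way to review the settings before touching the router.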

However, it will be difficult to continue this hypothetical discussion since you are not willing to share more concrete details.

Good luck!

Thanks for your patience. Let me draw a diagram and use more realistic IPs.

In the diagram, device n2 can ping router A and with zerotier, they establish a direct connection.

Any device from the internet can connect directly to device N1 and N2 using zerotier.

Router A and router B can see each other (ping)

Now the question is that I can’t establish direct connection using zerotier between N1 and N2.

Now I really understand your dilemma and given what you explained here there really shouldn’t be any problem with a direct link between N1 and N2. Really strange indeed!

My guess is that there is a configuration issue in the router(s) and/or possibly in the nodes as well. What os are you using to run the nodes btw?

Just to rule out any odd firewall rules, or a possible double NAT if you happen to use masquerade/src-nat on the N1/N2 nodes as well, you might temporarily connect (route) your 1 and 100 networks together by bypassing the firewall and adding routes like “route add 192.168.100/24 dest 10.65.1.142” on router A and “route add 192.168.1/24 dest 10.65.1.147” on router B.
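The route shorthand above could be written with iproute2 roughly as follows. This is a sketch, assuming the addresses quoted in this thread and that the routers’ firewalls are set to allow forwarding between the two networks. `add_peer_lan_route` is a made-up wrapper name.

```shell
# IP_CMD defaults to the real iproute2 tool; set IP_CMD=echo for a dry run
# that only prints the arguments instead of changing the routing table.
IP_CMD=${IP_CMD:-ip}

# add_peer_lan_route REMOTE_LAN NEXT_HOP (hypothetical helper):
# reach the other router's LAN via that router's address on the shared network.
add_peer_lan_route() {
    $IP_CMD route add "$1" via "$2"
}

# On router A (LAN 192.168.1.0/24):
#   add_peer_lan_route 192.168.100.0/24 10.65.1.142
# On router B (LAN 192.168.100.0/24):
#   add_peer_lan_route 192.168.1.0/24 10.65.1.147
```

Routes added this way do not survive a reboot, which suits a temporary test like this one.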

If you’re able to reach all nodes on both networks, you can test again and see if ZT can possibly establish a direct link this time.

The two devices that can’t establish direct connections run on OS X.

Thanks for your suggestions. I will test it.

If you run out of ideas with “trial and error”, it might be time to start tracing the network traffic on your LAN and the router using, for example, Wireshark, tcpdump, dtruss or similar tools in order to locate the paths that the ZT nodes choose to take between the local networks.

Another option is to enable debug logging, like in “Constant high CPU usage on Raspberry Pi 4 - #6 by zt-joseph”.

If those are MacBooks you can put them on the same network and see if they connect ZT wise.

Just a quick update. With static routing between the two subnets, devices under these two subnets can now establish a direct connection using ZeroTier (albeit redundant). Still, it is weird that they couldn’t without the static routing. Do you mind giving me some pointers on how to use the aforementioned tools for diagnosis?

Yuhong

Glad to hear you finally had some success!

Since it worked using static routes over the B network, it’s most likely just a configuration issue in your routers that restricts NAT traversal for that particular subnet.

Google “openwrt allow hole punching for zerotier”, “openwrt allow nat traversal for zerotier”, etc., or ask on the OpenWrt forum. Also check out OpenWrt for NAT examples, reflection and reflection zones.

Troubleshooting this type of problem might take some time, but when you finally manage to find the root cause it’s usually pretty obvious why it didn’t work (though harder to get there, obviously). As for tracing network traffic, I usually prefer Wireshark and router logs, and try to follow the packet flow from start point to end point.
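As a starting point for following the packet flow, a capture like the sketch below can be taken on each hop and opened in Wireshark afterwards. The assumptions: ZeroTier’s default primary port 9993/udp (it may also use other ports, so a missing packet here is not conclusive), tcpdump being installed on the router, and interface names that you need to look up yourself. `capture_zt` is a made-up wrapper name.

```shell
# TCPDUMP defaults to the real tool; set TCPDUMP=echo for a dry run
# that only prints the arguments.
TCPDUMP=${TCPDUMP:-tcpdump}

# capture_zt IFACE FILE (hypothetical helper): write UDP port 9993 traffic
# seen on IFACE to FILE, without name resolution (-n).
capture_zt() {
    $TCPDUMP -i "$1" -n -w "$2" udp port 9993
}

# Example, one capture per side of the router (interface names are guesses,
# check yours with "ip address"):
#   capture_zt br-lan /tmp/zt-lan.pcap
#   capture_zt eth0   /tmp/zt-wan.pcap
```

Comparing the LAN-side and WAN-side captures shows whether packets from the other subnet arrive at the router at all and whether they get forwarded on to the node.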

Please feel free to report back when you’ve managed to identify the root cause and how you solved it!
