Hello all!
I ran into a self-inflicted problem with how routing is created by the ZeroTier client. First and foremost, this is a problem I created - I get that. I’m trying to understand if there are controls I’m missing so I can continue to make this work. Also, maybe in doing this I’ve found a bit of a performance bug.
So here’s the deal…
I have two sites (10.50.50.0/24 and 10.50.75.0/24). For a while now, a Linux host in the 10.50.75.0/24 subnet has been configured as the ZT bridge to that subnet. It’s just nice to be able to SSH from any machine into hosts there. Then I decided I wanted to make that connectivity bi-directional, so I added the same routing structure to ZT (managed routes). Keep in mind there are no local network routes on the default gateway at either site, so only the ZT clients ever see 10.50.75.0/24.
The problem started when I added the 10.50.50.0/24 managed route in ZT. I’m primarily on 10.50.50.0/24, and everything was fine until then. But after I added it, any ZT client that was part of this ZT network started having local performance problems. Why? Because now there’s a lower-metric route for 10.50.50.0/24 through the ZT interface.
See example…
Pre-ZT:
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.50.50.5      0.0.0.0         UG    100    0        0 eno1
10.50.50.0      0.0.0.0         255.255.255.0   U     100    0        0 eno1
169.254.0.0     0.0.0.0         255.255.0.0     U     1000   0        0 eno1
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
Post-ZT:
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.50.50.5      0.0.0.0         UG    100    0        0 eno1
10.50.25.0      0.0.0.0         255.255.255.0   U     0      0        0 ztr2qu6anx
10.50.50.0      10.50.25.119    255.255.255.0   UG    0      0        0 ztr2qu6anx
10.50.50.0      0.0.0.0         255.255.255.0   U     100    0        0 eno1
10.50.75.0      10.50.25.141    255.255.255.0   UG    0      0        0 ztr2qu6anx
169.254.0.0     0.0.0.0         255.255.0.0     U     1000   0        0 eno1
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
So it’s pretty obvious now that when I’m locally on 10.50.50.0/24, the preferred route to other hosts on that subnet is via 10.50.25.119 over ZT (metric 0) instead of the directly connected route on eno1 (metric 100). I only realized this when I tried to move a large file to my NAS. File transfers usually run at 900+ Mb/s. This one was maxing out around 2 Mb/s. When I traced the path I immediately saw my mistake.
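To make the route selection concrete, here is a minimal Python sketch of how I understand the choice being made: among routes that match the destination, take the longest prefix, and break ties with the lowest metric. This is a simplified model rather than the kernel's actual lookup code. The table entries are copied from the Post-ZT output above; the NAS address 10.50.50.20 is a made-up example, since I didn't list the real one.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    destination: ipaddress.IPv4Network
    gateway: str   # "0.0.0.0" means directly connected (on-link)
    metric: int
    iface: str


def route(cidr: str, gateway: str, metric: int, iface: str) -> Route:
    return Route(ipaddress.ip_network(cidr), gateway, metric, iface)


# Entries taken from the Post-ZT routing table in this post.
POST_ZT: List[Route] = [
    route("0.0.0.0/0", "10.50.50.5", 100, "eno1"),
    route("10.50.25.0/24", "0.0.0.0", 0, "ztr2qu6anx"),
    route("10.50.50.0/24", "10.50.25.119", 0, "ztr2qu6anx"),
    route("10.50.50.0/24", "0.0.0.0", 100, "eno1"),
    route("10.50.75.0/24", "10.50.25.141", 0, "ztr2qu6anx"),
    route("169.254.0.0/16", "0.0.0.0", 1000, "eno1"),
    route("172.17.0.0/16", "0.0.0.0", 0, "docker0"),
]

# Dropping the ZT interface entries gives back the Pre-ZT table.
PRE_ZT: List[Route] = [r for r in POST_ZT if r.iface != "ztr2qu6anx"]


def select_route(table: List[Route], target: str) -> Optional[Route]:
    """Simplified model: longest prefix wins, then lowest metric."""
    addr = ipaddress.ip_address(target)
    candidates = [r for r in table if addr in r.destination]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (-r.destination.prefixlen, r.metric))


if __name__ == "__main__":
    nas = "10.50.50.20"  # hypothetical NAS address on the local subnet
    for name, table in (("Pre-ZT", PRE_ZT), ("Post-ZT", POST_ZT)):
        chosen = select_route(table, nas)
        print(f"{name}: {nas} via {chosen.gateway} dev {chosen.iface} metric {chosen.metric}")
```

In this model the two 10.50.50.0/24 entries tie on prefix length, so the metric decides, and the ZT entry with metric 0 beats the eno1 entry with metric 100. That matches what I saw when I traced the path.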
Is there any way to control routes so that I only advertise external routes, or some way to control the metric being set? I’d like to keep these routes available. I thought about just removing the overlay routes and having the gateway router handle this, but that doesn’t work well for my laptop. In that situation, if I leave my home network and jump on a hotspot or external network, I’d have to spin up the ZT client (not the end of the world, but I like ZT because it’s transparent and “just works”).
Any thoughts on a better design, or are there controls I should be using that I’m not?
I think it’s probably obvious but my ZT routes look as follows:
TIA!