Has anyone conducted perf tests with the Mikrotik RB4011 and can share results?
I have an RB4011 with ZT and a NUC (N4020), and I cannot get more than 80-100 Mbit/sec. I have pushed the RB4011 to 1800 MHz to be able to do this. Doing a test between an AMD Ryzen 5800X (with the ZT agent installed) and the NUC (N4020) sitting on the other side, I get perf above 300 Mbit/sec.
Tests were conducted using iperf3 from PC to PC (the NUC on the remote end is both running the ZT service and serving as the iperf3 server). No tests were run directly on the RB4011, so as not to affect its performance.
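For anyone who wants to repeat this kind of test and compare numbers, here is a minimal Python sketch that pulls the sender and receiver throughput out of an iperf3 JSON report (`iperf3 -c <server> --json`). The field names `end.sum_sent` and `end.sum_received` are what iperf3 emits for TCP tests as far as I know, so treat them as an assumption and check them against your own output. The helper name `iperf3_mbits` and the sample report are made up for illustration and are not measurements from this thread.

```python
import json


def iperf3_mbits(report_json: str) -> dict:
    """Return sender/receiver throughput in Mbit/s from an iperf3 JSON report.

    Assumes a TCP test report with "end.sum_sent" and "end.sum_received".
    """
    end = json.loads(report_json)["end"]
    return {
        "sent": end["sum_sent"]["bits_per_second"] / 1e6,
        "received": end["sum_received"]["bits_per_second"] / 1e6,
    }


# Trimmed, invented report used only to show the shape of the data.
sample = json.dumps(
    {
        "end": {
            "sum_sent": {"bits_per_second": 92.5e6},
            "sum_received": {"bits_per_second": 90.0e6},
        }
    }
)

print(iperf3_mbits(sample))
```

In practice you would save the real report to a file and pass its contents to the function instead of `sample`.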
While conducting the tests, I see that the RB4011 typically uses only 2 of its 4 cores. That means there are still resources available to scale the perf of the connection. Is it a limitation of ZT itself? A limitation of the ZT implementation from Mikrotik? Or is this simply what to expect given RouterOS and its CPU/core scheduler/thread management?
That might also have an impact, but as stated, the CPU graph showed only 2 cores being used (out of 4).
Apart from ZT usage, there’s naturally Mikrotik traffic (routing, etc) processing happening, which justifies the additional core usage (given ZT is single threaded).
Looks like RB5009 +cAP will be the way to go for me to be able to squeeze more out of my ZT setup.
IIRC, there are other important factors like AES acceleration which is available in the RB5009 and not on the RB4011.
RB5009 is known to be able to drive more than 400 Mbps of traffic via ZeroTier with ease. So, it is not about lust for new gear.
Also, RB4011 vs RB5009 benchmarks are not comparable, given that RB4011 results were achieved using ROS6 and RB5009 using ROS7. The underlying changes are considerable and there has been a lot of debate around those metrics.
ZeroTier can be very fast for what it is: a portable Layer 2 VPN over UDP implemented in userspace. On Linux and *BSD it’s implemented using the tap(4) pseudo-interface. Each tap interface appears as an Ethernet interface to the kernel network stack. Moving Ethernet frames through the tap interface requires one system call per frame. The frames are encrypted/decrypted in userspace and tunneled over UDP sockets, which again require one system call per packet. These constraints limit how efficient the current implementation of ZeroTier can be. On a fast desktop you can push more than 1 Gb/s through it with brute force, but the slower CPU cores used in low-power routers lack the single-thread throughput to keep up with a desktop. The nice thing about ZeroTier is that it supports all common desktop operating systems (Windows, macOS, most Linux distros and even *BSDs) and requires only minimal local configuration (you only have to join the correct set of networks). You don’t have to terminate high-bandwidth VPNs on a network appliance.
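To put rough numbers on the per-frame cost described above, here is a back-of-the-envelope sketch in Python. It assumes 1400-byte frames (an illustrative figure, not something measured here) and counts two system calls per frame, one on the tap side and one on the UDP socket side, which is a lower bound taken from the description above. It ignores encryption cost, fragmentation and protocol overhead, so it only shows how the syscall rate scales with throughput.

```python
def frames_per_second(throughput_bps: float, frame_bytes: int) -> float:
    """Frames per second needed to carry throughput_bps with frames of frame_bytes."""
    return throughput_bps / (8 * frame_bytes)


def syscalls_per_second(
    throughput_bps: float, frame_bytes: int, syscalls_per_frame: int = 2
) -> float:
    """Lower bound on syscalls/s: one tap read/write plus one UDP send/recv per frame."""
    return syscalls_per_frame * frames_per_second(throughput_bps, frame_bytes)


FRAME_BYTES = 1400  # assumed frame size, for illustration only

for mbit in (100, 300, 1000):
    bps = mbit * 1e6
    print(
        f"{mbit:>5} Mbit/s: "
        f"{frames_per_second(bps, FRAME_BYTES):>9.0f} frames/s, "
        f"{syscalls_per_second(bps, FRAME_BYTES):>9.0f} syscalls/s (at least)"
    )
```

All of that per-frame work, plus the crypto, has to fit on a single thread, which is why per-core speed matters more than core count here.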
If you feel comfortable managing network appliances like MikroTik routers, you’ll find that the in-kernel WireGuard implementation available on all platforms since RouterOS v7 is a lot faster and can make use of multiple cores with a single tunnel interface, but you lose the ZeroTier automatic meshing and centralised control plane. On platforms with crypto acceleration, IPsec can be even faster because it can be configured with ciphers and modes that can be offloaded to hardware, but it’s a massive pain in the posterior to deploy and operate, especially across multiple vendors.
I recommend you analyse your options and pick the least painful one satisfying your security and performance requirements.
I personally prefer the ease of setup that ZeroTier provides. I hate the whole WG secret/config management as it is.
Using Mikrotik doesn’t mean I have to make my life miserable and do everything manually and painfully ehehehe
Maybe once WG and Mikrotik make the whole management layer easier, I might reconsider WG.
My gut feeling is that the limitation is mostly due to software encryption. When/if Mikrotik eventually chooses to upgrade ZeroTier to a more current release, we might be able to get much better results using AES acceleration, maybe not as good as the IPsec test results on the RB4011 but hopefully significantly better than the current implementation.
It would be very interesting to test the throughput without encryption, but unfortunately there is no way to configure Trusted Path in ROS (AFAIK).
WireGuard (WG) is a good protocol but perhaps less pleasant to administer at larger scale. That’s when SD-WAN solutions like ZeroTier come into the picture, with easy administration as a unique selling point (USP). “ZeroTier” = “Zero Administration”, at least compared to WG.
However, there is actually an SD-WAN solution that utilizes WG, called Tailscale, but you only get L3 routing, whereas ZeroTier provides both L2 and L3, which is another USP in my opinion.