Sharing my experience setting up ZeroTier on OPNsense and pfSense with OSPF

Just want to share my experience with this great tool with the community.
ZeroTier is a great, very powerful tool and a big help for people working from home nowadays.
PLEASE also note that endpoint protection is very important when using ZT: you don't want your devices to be compromised or hit by ransomware because of any one of your users.

  1. OPNsense: installing ZT and OSPF is straightforward, so I'll skip it.
  2. pfSense: installing ZT needs shell access, but it is still easy.
    Check the ZT version and find the package you want to install here: https://pkg.opnsense.org/
    Then run this command in the shell:
    pkg add https://pkg.opnsense.org/xxxxxxxxxxxxxx/latest/All/zerotier-xxxxx.txz
    Next, install the web interface. Find it on GitHub: pfSense-pkg-zerotier
    Upload it to pfSense; you can use the file upload function in the GUI under Diagnostics > Command Prompt.
    Then run "pkg install /path/pfSense-pkg-zerotier.pkg"
  • Reference (in Chinese): removed, since new users are limited to 2 links per post.

Then join your network (with ZT bridge selected), add an allow-all firewall rule on the interface, and begin testing. After testing, add the rules that suit you.
Be aware that OSPF is its own IP protocol (protocol number 89), neither TCP nor UDP, so a rule that only allows TCP/UDP will not pass it.

Setting up OSPF is easy and straightforward, so I'll skip it.

So basically, installation and setup are pretty easy. But the real world is challenging.
After everything was set up (I set up a 3-node test with OSPF), I ran a ping test. The first 2 minutes were good, but after that packet loss started showing up, and a few minutes later it was back to normal.
I kept monitoring OSPF and found that it flapped every 2-3 minutes. At first I thought it was some OSPF setting, but I was wrong and wasted many hours there, so I don't want you to repeat my steps. I noticed a flood on the ZT interface, so I used packet capture to see what the flood was, and found traffic involving internal subnet IPs flooding the interface. After googling it, I found that some people call this a "software laser", and the solution is to add a blacklist in this file:
/var/db/zerotier-one/local.conf

{
  "physical": {
    "192.168.0.0/16": { "blacklist": true }
  }
}
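
For anyone who has several internal subnets to exclude, here is a minimal sketch of generating that local.conf content with Python. The function name build_local_conf is my own, purely for illustration; only the "physical" / "blacklist" layout comes from the file above. Writing the result to disk is left out on purpose, since on OPNsense the content has to go in through the GUI.

```python
import ipaddress
import json


def build_local_conf(blacklisted_subnets):
    """Return local.conf text that blacklists the given subnets as physical paths.

    build_local_conf is a hypothetical helper, not part of ZeroTier.
    """
    physical = {}
    for subnet in blacklisted_subnets:
        # Raises ValueError on a malformed CIDR, so typos are caught early.
        network = ipaddress.ip_network(subnet)
        physical[str(network)] = {"blacklist": True}
    return json.dumps({"physical": physical}, indent=2)


print(build_local_conf(["192.168.0.0/16"]))
```

The output uses plain straight quotes, which matters because the file has to be valid JSON.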

Reference: links removed.
According to some sources this should have been fixed in 2021, but I still have the problem.
Changing this file works differently on OPNsense and pfSense, and it took me hours to find out.
On pfSense the file does not exist by default, so you can create it yourself.
OPNsense has the file, but you should not edit it directly in the shell: the change works, but it does not survive a reboot. You need to add it through the GUI on OPNsense, because on reboot it reloads whatever is saved in the GUI database.

After that I thought everything was good, so I rebooted all the pfSense and OPNsense boxes.
Then pfSense stopped working...
ZT was started at boot.
OSPF saw the ZT interface for less than a minute, then the interface went down and was removed from OSPF, because ZT was not yet connected. OPNsense does not have this issue because you can lock the ZT interface, I guess.
Even once ZT was up and connected, OSPF did not automatically re-enable the interface; I had to restart the FRR service manually.
I tried many methods to delay the startup of the FRR service (cmdshell, etc.) and wasted some hours.
What I found I needed to do was use the raw config in the pfSense FRR global settings. You need the raw config for this because the GUI has no option for it.
So first copy your running config to the saved config, then add "no shutdown" under the ZT interface.

My config looks like this (only the ZT interface part is shown here):

interface zt3tkicmx12345678
ip ospf network point-to-multipoint
ip ospf area 0.0.0.0
no shutdown

With this, pfSense comes up fine after a reboot, so now both pfSense and OPNsense are good after a reboot.

OSPF up > ZT up > routes propagated > ping test between networks > OSPF routes no longer flap, no flood, no packet loss, all good.

CentOS 7 was also tested: straightforward, no such problems. But I think a GUI firewall is easier for whoever has to maintain it, so pfSense or OPNsense may be easier for them.

Hope this saves you some time.
Cheers and stay safe.

Edit: as a new user I am only allowed 2 links in a post, so some reference links were removed.

Removed link, reference in Chinese: 在pfSense中配置ZeroTier网络 (Configuring a ZeroTier network in pfSense) | 鐵血男兒的BLOG
Removed link about flooding and the blacklist: Packet flooding and high CPU usage · Issue #779 · zerotier/ZeroTierOne · GitHub

Have you used "Prevent interface removal" on the ZeroTier interface? If it works, it is easier than putting "no shutdown" in the raw config.

I have the same problem. When I start OSPF over ZeroTier (using OPNsense):

  1. CPU goes to 100%
  2. serious flapping happens

I tried OSPF; OSPFv3 does not produce any routes, and RIP does not work.
The problem is that I have tried these workarounds:

  • block dport 9993
  • physical blacklist

and they do not work for me. CPU usage may improve, but the problem of 33% ping loss persists.
I also see this behaviour. I have three routers with routes like:

  • 10.0.0.0
  • 10.1.0.0
  • 10.0.1.0
  • 10.129.0.0

and so on, so I put 10.0.0.0/8 under physical on each router. But about 10 seconds after I put this on a router, it stops pinging, so I have to use a more conservative entry like 10.0.0.0/24.

So it seems either I do not understand what physical means, or there are other problems.
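
To make the difference between the two entries concrete, here is a small sketch using Python's standard ipaddress module. It only shows which of the route addresses listed above fall inside each blacklist entry; it says nothing about how ZeroTier itself evaluates the list, and the helper name covered_by is made up for this example.

```python
import ipaddress


def covered_by(blacklist_cidr, addresses):
    """Return the addresses that fall inside the given blacklist subnet."""
    network = ipaddress.ip_network(blacklist_cidr)
    return [a for a in addresses if ipaddress.ip_address(a) in network]


routes = ["10.0.0.0", "10.1.0.0", "10.0.1.0", "10.129.0.0"]

print(covered_by("10.0.0.0/8", routes))   # ['10.0.0.0', '10.1.0.0', '10.0.1.0', '10.129.0.0']
print(covered_by("10.0.0.0/24", routes))  # ['10.0.0.0']
```

So a /24 entry leaves most of those ranges unblacklisted, while a /8 entry excludes every 10.x.x.x address as a physical path, including any that ZeroTier might legitimately need to reach a peer.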

@mgiammarco Unfortunately, BSD doesn’t have a system call that we use on Linux to detect & prevent feedback loops. A user on GitHub has figured out how to use the interface prefix blacklist to prevent the feedback loops that you’re experiencing. Please see this GitHub Issue for more info on how to get things configured correctly. Darkain on that issue is doing the same thing as you: Using OSPF with OPNsense.

Hi mgiammarco, I am not sure about your network topology. I have 4 ZT boxes (2 OPNsense, 1 pfSense, 1 CentOS 7; all work fine now). Here are some suggestions for you:

  1. You do not need to block the port.
  2. Physical blacklist: based on your subnets above (your internal subnets?), you can try something like this

in the OPNsense ZT local.conf settings:

{
  "physical": {
    "10.0.0.0/8": { "blacklist": true }
  }
}

  3. After that, stop ZeroTier and delete all files in the peers folder (rm /var/db/zerotier-one/peers.d/*). This is important, and I forgot to mention it above. Then start ZeroTier and restart FRR.
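
If you have to repeat the cleanup on many machines, the delete step could be scripted. Below is a rough sketch in Python; clear_peer_cache is a name I made up, and the directory is passed in as a parameter rather than hard-coded. It assumes ZeroTier has already been stopped, and stopping and starting the services is deliberately left out because the commands differ per platform.

```python
from pathlib import Path


def clear_peer_cache(peers_dir):
    """Delete every regular file in peers_dir and return the removed file names.

    Hypothetical helper: run it only while ZeroTier is stopped.
    Subdirectories are left untouched.
    """
    removed = []
    for entry in sorted(Path(peers_dir).iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed
```

On the boxes in this thread it would be called as clear_peer_cache("/var/db/zerotier-one/peers.d"), followed by starting ZeroTier and restarting FRR as described above.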

As for "Prevent interface removal": yes, only OPNsense has this. pfSense does not, so I need to use the raw config to add "no shutdown".

Cheers.


@zt-grant, yes, this is very helpful. I had also read it, and that is how I solved it in my test.
Thanks!

Hi,
I have used Darkain's workaround as you suggested.
But NO workaround worked until I deleted all files in the peers.d directory, as suggested by Kelvin_MCIT.
Please note that I had to clean peers.d not only on the three OPNsense firewalls I used for testing, but also on all the Linux machines that connect to these firewalls using ZeroTier.
This solved some strange connectivity problems on those machines too.
I think this is very important: in case of problems, delete peers.d.

Thanks,
Mario

Great, I am glad that helped.

I spoke too soon.
The connection is good now, but I am seeing 100% CPU spikes again.
In a peer/leaf path I see:

1 109.168.33.193/29996 false 1649405167043 1649405167013 1 0
1 10.0.1.1/29994 false 1649405058755 1649405167388

I suppose 10.0.1.1 should not be there.
I have tried the zte prefix exclusion.
I cleaned peers.d again.
Mario
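
One way to spot this kind of leak without reading every line by eye is to check each path address against the subnets you blacklisted. This is only a sketch: it assumes the path lines look like the two pasted above (second field is address/port), and flag_blacklisted_paths is a hypothetical helper, not a ZeroTier tool.

```python
import ipaddress


def flag_blacklisted_paths(path_lines, blacklist_cidrs):
    """Return the address/port fields whose IP lies inside a blacklisted subnet."""
    networks = [ipaddress.ip_network(cidr) for cidr in blacklist_cidrs]
    flagged = []
    for line in path_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # The second column is "address/port", e.g. "10.0.1.1/29994".
        ip_text, _, _port = fields[1].partition("/")
        ip = ipaddress.ip_address(ip_text)
        if any(ip in network for network in networks):
            flagged.append(fields[1])
    return flagged


paths = [
    "1 109.168.33.193/29996 false 1649405167043 1649405167013 1 0",
    "1 10.0.1.1/29994 false 1649405058755 1649405167388",
]
print(flag_blacklisted_paths(paths, ["10.0.0.0/8"]))  # ['10.0.1.1/29994']
```

Any address it prints is a path that the blacklist should have excluded, which suggests the blacklist is not being applied or that a stale peer entry is still cached.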

Oh, it seems like your local.conf blacklist is not working.
I would suggest double-checking local.conf in the OPNsense GUI. You can only modify it through the web GUI, so do not edit the file directly. Then delete the peer files, restart, and test again.

I have also cleaned peers.d on my moon.
Now it seems all ok.
Can you confirm that if you have a moon you need to clean it too?
Thanks,
Mario

Oh, I don't have a moon.

Now I am going crazy.
No matter what I do, after a few days CPU goes to 100% for the zerotier-one process.
I am looking at overview/peers. Is it normal to have, for example, this in paths:

active address expired lastReceive lastSend linkQuality preferred trustedPathId
1 188.164.131.137/29994 false 1649871947603 1649871949032 1 0
1 188.164.131.137/9993 false 1649871947625 1649871947598 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0
1 188.164.131.137/24970 false 1649871932616 1649871949032 0

so many times?
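
For long listings like this one, a quick count per address/port makes the repetition easier to see. A minimal sketch, assuming each line has the same column layout as the listing above (address/port in the second field); duplicate_paths is a made-up helper name.

```python
from collections import Counter


def duplicate_paths(path_lines):
    """Return {address/port: count} for every path that appears more than once."""
    counts = Counter()
    for line in path_lines:
        fields = line.split()
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return {address: n for address, n in counts.items() if n > 1}


sample = [
    "1 188.164.131.137/29994 false 1649871947603 1649871949032 1 0",
    "1 188.164.131.137/9993 false 1649871947625 1649871947598 0",
    "1 188.164.131.137/24970 false 1649871932616 1649871949032 0",
    "1 188.164.131.137/24970 false 1649871932616 1649871949032 0",
]
print(duplicate_paths(sample))  # {'188.164.131.137/24970': 2}
```

Feeding it the full paths listing would show how many times the same address/port pair is repeated.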

ZeroTier starts, but it completely ignores the interfacePrefixBlacklist directive.
How can I check whether the interfacePrefixBlacklist directive was accepted?
Thanks,
Mario

In information/config/physical I get this error:

physical Notice: Undefined index: physical in /usr/local/opnsense/mvc/app/cache/_usr_local_opnsense_mvc_app_views_opnsense_zerotier_overview.volt.php on line 48 false
