FS PBX High Availability (HA) Setup Scripts — Database + File Replication

pbxgeek

Jan 19, 2021
We’ve added documentation and new scripts for High Availability (HA) deployments in FS PBX.
With two FS PBX nodes, you can now run your platform in Primary–Standby mode — both servers stay synchronized, ready for instant cutover if one fails.

Overview

The HA setup uses bi-directional replication for both:
  1. PostgreSQL Database
    A bash automation script configures PostgreSQL logical replication in both directions.
    It sets up publications, subscriptions, replication slots, and peer firewall rules — creating a true master-to-master database link between two servers.
  2. Syncthing File Replication
    Another script installs and configures Syncthing on both servers, pairs them automatically, and shares the core FS PBX directories (recordings, voicemails, sounds, and cache).
    Files stay mirrored in near real time.
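
For anyone curious what the database half boils down to, here is a rough sketch of the SQL each node would run for two-way logical replication. This is not the FS PBX script itself. The node names, the `fspbx` database name and the `replicator` role are placeholders, and it assumes PostgreSQL 16 or newer (for `origin = none`, which stops a change from bouncing back to the node it came from) and a standby that was already seeded from a dump of the primary (hence `copy_data = false`).

```python
def replication_sql(local, peer, peer_host, dbname="fspbx", user="replicator"):
    """Build the SQL one node runs to publish its tables and subscribe to its peer.

    All names here are hypothetical placeholders, not taken from the FS PBX scripts.
    The password is assumed to come from ~/.pgpass rather than the connection string.
    """
    conninfo = f"host={peer_host} port=5432 dbname={dbname} user={user}"
    return [
        # Publish every table in this database to the peer.
        f"CREATE PUBLICATION pub_{local} FOR ALL TABLES;",
        # Subscribe to the peer's publication. Creating the subscription also
        # creates a replication slot on the peer by default.
        f"CREATE SUBSCRIPTION sub_{local}_from_{peer} "
        f"CONNECTION '{conninfo}' "
        f"PUBLICATION pub_{peer} "
        "WITH (copy_data = false, origin = none);",
    ]


if __name__ == "__main__":
    nodes = {"node1": "server1.fspbx.com", "node2": "server2.fspbx.com"}
    for local, peer in (("node1", "node2"), ("node2", "node1")):
        print(f"-- run on {nodes[local]}")
        for stmt in replication_sql(local, peer, nodes[peer]):
            print(stmt)
```

Both publications have to exist before either subscription is created, so in practice you would run the two `CREATE PUBLICATION` statements first and the two `CREATE SUBSCRIPTION` statements afterwards.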

⚙️ Typical Architecture

Role | DNS Record | Example
Primary Node | server1.fspbx.com | Handles live calls and GUI
Standby Node | server2.fspbx.com | Continuously synced
Floating DNS | pbx.fspbx.com | Points to the active node
The floating record can be flipped manually or via a small health probe script (e.g., with Cloudflare or Route 53 API).
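
As a rough idea of what such a probe could look like with Route 53, here is a minimal sketch. The TCP check on port 5060, the 60-second TTL, the zone ID and the example IPs are all assumptions for illustration. The only third-party piece is `boto3`, whose `change_resource_record_sets` call does the actual flip.

```python
import socket


def is_up(host, port=5060, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_target(primary_up, primary_ip, standby_ip):
    """Pick the IP the floating record should point to."""
    return primary_ip if primary_up else standby_ip


def change_batch(record, ip, ttl=60):
    """Build a Route 53 change batch that points an A record at the given IP."""
    return {
        "Comment": "FS PBX floating record flip",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record,
                "Type": "A",
                "TTL": ttl,
                "ResourceRecords": [{"Value": ip}],
            },
        }],
    }


def flip(zone_id, record, ip):
    """Apply the change in Route 53 (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helpers above work without it
    boto3.client("route53").change_resource_record_sets(
        HostedZoneId=zone_id, ChangeBatch=change_batch(record, ip)
    )


if __name__ == "__main__":
    # Hypothetical addresses and zone ID; replace with your own.
    target = choose_target(is_up("server1.fspbx.com"), "192.0.2.10", "192.0.2.20")
    flip("ZEXAMPLE123", "pbx.fspbx.com", target)
```

Run from cron or a systemd timer on a third machine, this would keep `pbx.fspbx.com` pointed at whichever node answers. A real probe would probably want several consecutive failures before flipping, to avoid flapping.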

Requirements

  • SSH key-based access between both nodes
  • Open ports:
    • 5432 (Postgres replication)

Documentation

Step-by-step guides are now live in our documentation.
Each article includes the full automation script and detailed explanations of what it does.

Result

Once configured:
  • Both servers continuously replicate databases + files
  • Failover takes only a DNS change
  • The standby can take over immediately with minimal interruption
Perfect for geo-redundant or mission-critical deployments.


Check out the full HA setup guide in the FS PBX Docs, and feel free to share your experiences or improvements in this thread!
 
This is not the only way to do it, just one of them, and it works well when the DNS TTL is set very low, for example 60 seconds. Phones then re-register to the backup server quickly.

How do you do it? @s2svoip
 
Nice, I was going to have a crack at this myself sometime soon. This is exactly how I have been doing it in FusionPBX for the last 10 years, using the DNS method and Route 53.

I usually leave my TTL at 120 seconds, which works just fine. I have Route 53 health checks on port 5060 of the primary; if they fail, Route 53 switches the IP to the secondary. I have never had a real failure, but I have done switchovers for maintenance simply by stopping FreeSWITCH on the primary.
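
To put rough numbers on how long a DNS-based switchover can take, here is a back-of-the-envelope sketch. It assumes Route 53's standard health check settings (a 30-second request interval and a failure threshold of 3), which may not match the settings actually in use, and it ignores whatever extra caching the phones do on their own.

```python
def worst_case_switchover(ttl, check_interval=30, failure_threshold=3):
    """Rough upper bound, in seconds, before clients resolve the new IP.

    Detection time is the number of consecutive failed checks needed times
    the check interval. After that, resolvers may serve the old record for
    up to one full TTL.
    """
    detection = check_interval * failure_threshold
    return detection + ttl


if __name__ == "__main__":
    for ttl in (60, 120):
        print(f"TTL {ttl}s -> up to about {worst_case_switchover(ttl)}s")
```

With these assumed values, a 120-second TTL gives about 210 seconds worst case and a 60-second TTL about 150 seconds, so the health check timing matters as much as the TTL.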

Failback to the primary is sometimes a pain. Yealinks usually handle it fine and switch over quickly after the primary comes back online.

Ciscos seem particularly awkward: if the server they are registered to is still up, they will not switch, even though DNS points to the primary again. I usually use iptables to temporarily block their source IPs, and this usually does the trick. Actually, thinking about it, stopping FreeSWITCH on the secondary may achieve the same.
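
For the iptables trick, something along these lines would do it. This is only a sketch: the phone IPs are made up, it assumes a plain `iptables` setup with rules in the `INPUT` chain, and it needs to run as root on the node the phones are stuck on.

```python
import subprocess


def block_cmd(ip, delete=False):
    """Build the iptables command that drops (or stops dropping) traffic from ip."""
    action = "-D" if delete else "-I"  # -I inserts the rule, -D deletes it again
    return ["iptables", action, "INPUT", "-s", ip, "-j", "DROP"]


def apply(ips, delete=False, dry_run=True):
    """Print the commands, and run them too unless dry_run is set."""
    for ip in ips:
        cmd = block_cmd(ip, delete=delete)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    stuck_phones = ["198.51.100.7", "198.51.100.8"]  # hypothetical source IPs
    apply(stuck_phones)               # block until the phones re-register elsewhere
    apply(stuck_phones, delete=True)  # then remove the temporary rules
```

It defaults to a dry run so you can check the commands before letting it touch the firewall.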