[Tutorial] Creating a two node FusionPBX cluster the easy way.

Hello,

Just to bring my notes up to date, since I tested DigitalDaz's HA script last year: is an HA script available to members of the FusionPBX website? I'm looking forward to becoming a member within a few days if I can get confirmation that an HA script or installation tutorial is available!

We will almost certainly be deploying FusionPBX, so HA is a must for us if we are to move away from our current software.

Thanks in advance for an answer!
 

markjcrane

Active Member
Staff member
Yes, FusionPBX members get access to videos of previous Admin and Advanced training classes and the documentation that goes with them. That includes instructions for multi-master replication of the database and file system, and many other things to scale and provide a great experience to your users.
 
Hello guys,

I have an issue. After a disaster recovery of the Master, data is written/synced in only one direction, Slave to Master. Changes made in the Master database are not replicated to the Slave. What can be done to restore it? Regards!
 
Nothing at all. With BDR, last write wins, so if they have resynced, the state you have now is the final one.
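To illustrate what "last write wins" means here, a minimal Python sketch follows. It is not BDR's code: the `Change` type and the `resolve` function are hypothetical names, and the sketch assumes conflicts are decided purely by comparing commit timestamps, with ties keeping the local row.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    """One committed version of a row (hypothetical type, for illustration only)."""
    node: str
    committed_at: datetime
    value: str


def resolve(local: Change, remote: Change) -> Change:
    """Keep whichever version was committed later; a tie keeps the local row (assumption)."""
    return remote if remote.committed_at > local.committed_at else local
```

Once both nodes have applied each other's changes, the surviving row is simply the later one, which is why replication itself offers no way back to an earlier state.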
Thanks a lot!
I was able to restore Master and Slave to the state before the Master disaster. But now I am seeing FreeSWITCH data conflicts in the PostgreSQL logs on the Slave. What can be done to restore this FreeSWITCH data sync? Regards!

2019-05-12 21:07:18 UTC [17488-852] [unknown]@freeswitch CONTEXT: apply UPDATE from remote relation public.sip_registrations in commit 8/D1048E90, xid 6593672 commited at 2019-05-12 21:07:18.560747+00 (action #14) from node (6593351633762540988,1,16386)
2019-05-12 21:07:18 UTC [17488-853] [unknown]@freeswitch LOG: CONFLICT: remote UPDATE: could not find existing row. Resolution: skip_change; PKEY:
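When there are many such entries, it can help to see which tables are affected. The following is a rough Python sketch that counts the action and relation named in CONTEXT lines shaped like the one above. The function name and the regular expression are illustrative assumptions based only on this excerpt, not a documented log format.

```python
import re
from collections import Counter

# Matches the CONTEXT line shape shown in the log excerpt above (assumed format).
CONTEXT_RE = re.compile(r"CONTEXT:\s+apply (\w+) from remote relation (\S+)")


def count_conflict_contexts(lines):
    """Count (action, relation) pairs found in CONTEXT lines."""
    counts = Counter()
    for line in lines:
        match = CONTEXT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Feeding it the lines of the PostgreSQL log gives a per-table tally, which shows whether the conflicts are confined to one table or spread across several.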
 

DigitalDaz

Administrator
Staff member
Clearly, you haven't restored exactly, and now you are in a worse position with a conflict. You should never restore over BDR, in my opinion.

If you have a good copy of the data, I would now tear down the BDR cluster, restart from scratch, and then restore your data to one of the nodes.
 
Are there any advantages to using BDR as opposed to just doing a daily backup/restore from primary to secondary? There is a public script for that. It works just fine and keeps things simple. The only gotcha I have found so far is that CDR calls on the secondary will be lost the next time the DB is restored from the primary, so I make backups of those. Not a big deal, because the secondary is strictly a failsafe that will only be used rarely.
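For readers unfamiliar with the approach, here is a minimal Python sketch of a primary-to-secondary dump and restore. It is not the public script mentioned above. The function names are hypothetical, the host, database, and user values are placeholders supplied by the caller, and it assumes `pg_dump` and `pg_restore` are on the PATH with credentials provided through something like `~/.pgpass`.

```python
import subprocess


def dump_command(host, dbname, user, dump_path):
    """pg_dump in custom format (-Fc) so that pg_restore can read the file."""
    return ["pg_dump", "-h", host, "-U", user, "-Fc", "-f", dump_path, dbname]


def restore_command(host, dbname, user, dump_path):
    """pg_restore that drops and recreates the objects contained in the dump."""
    return ["pg_restore", "-h", host, "-U", user, "-d", dbname,
            "--clean", "--if-exists", dump_path]


def sync_primary_to_secondary(primary, secondary, dbname, user, dump_path,
                              run=subprocess.run):
    """Dump the primary, then restore over the secondary; stop on the first error."""
    for cmd in (dump_command(primary, dbname, user, dump_path),
                restore_command(secondary, dbname, user, dump_path)):
        run(cmd, check=True)
```

Because the restore replaces what is on the secondary, anything written only there (such as the CDR records mentioned above) is gone after the next run unless it is saved first.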

I am using PostgreSQL v11. I would prefer to stick with that going forward rather than having to use 9.4.
 

markjcrane

Active Member
Staff member
Yes, the advantage of multi-master is scalability: you can handle more work than a single server can. It also gives instant failover with no loss of information, or very little, depending on latency between the servers. That is in comparison to the backup and restore script that I wrote and shared.

"The only gotcha I have found so far is that CDR calls on the secondary will be lost the next time the DB is restored from the primary."
Think about this more and you will find more issues. The larger you grow the more these issues will grow. Things like call forward and follow me, voicemail greetings and messages, fax, provisioning changes and a lot more depending on features that are used.

To your second point, I'm working on a BDR replacement so we can use the latest PostgreSQL.
 

DigitalDaz

Administrator
Staff member
Unless you understand every single piece of the replication you are using very well, and are confident you can troubleshoot it, you should be replicating only with the methods FusionPBX teaches.
 
"Unless you understand every single piece of the replication you are using very well, and are confident you can troubleshoot it, you should be replicating only with the methods FusionPBX teaches."
You have been saying that your procedure is for failover only, not load balancing where both are used at the same time. Mark is describing a multi-master setup that implies load balancing. So I am a little confused what you are recommending now.

I'm still hoping someone can answer my question of whether backup/restore from primary to secondary will work for a simple failover pair. Not load balancing, just a hot standby. I could add multi-master replication if and when it makes sense. I can get it to switch over within a minute or two using DNS failover, so I think that is good enough for what I have in mind. Not perfect, but relatively simple.
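As a rough back-of-the-envelope check on "a minute or two", the switchover time for DNS failover can be estimated in Python as below. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the health checker is assumed to require a fixed number of consecutive failures, and clients are assumed to honor the record's TTL. Phones that cache longer, or that only re-resolve when they re-register, will take longer.

```python
def worst_case_failover_seconds(check_interval, failures_needed, dns_ttl):
    """Upper-bound estimate: time to detect the outage plus time for cached DNS to expire."""
    detection = check_interval * failures_needed
    return detection + dns_ttl
```

For example, checking every 30 seconds, requiring 2 failed checks, and publishing a 60-second TTL gives `worst_case_failover_seconds(30, 2, 60)`, which evaluates to 120 seconds.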
 
Your backup/restore method will work for basic failover.
Mark's method is better for failover because less data is lost compared to a nightly backup, and frequently changed settings are usually up to date virtually instantly. Depending on the failover method and the phones in use, a clustered system can be set up to allow for basically zero downtime.

The cluster setup allows distributing the load across servers by dividing your customer domains between the nodes of the cluster.
Example: host1.domain.com, host2.domain.com, host3.domain.com
customer1.domain.com on host1.domain.com failover to host2.domain.com
customer2.domain.com on host2.domain.com failover to host3.domain.com
customer3.domain.com on host3.domain.com failover to host1.domain.com
or really any variation of customer domains routing to any host as primary and any other host as the failover.
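The rotation in the example above can be sketched in Python as follows. This is only an illustration of that one layout: `assign_failover_ring` is a hypothetical helper, and the choice of "next host in the list" as the failover is taken from the example rather than being a requirement.

```python
def assign_failover_ring(hosts, customers):
    """Map each customer domain to (primary, failover), rotating through the hosts."""
    if len(hosts) < 2:
        raise ValueError("need at least two hosts for failover")
    plan = {}
    for i, customer in enumerate(customers):
        primary = hosts[i % len(hosts)]
        failover = hosts[(i + 1) % len(hosts)]  # next host in the ring
        plan[customer] = (primary, failover)
    return plan
```

Called with the three hosts and three customer domains from the example, it reproduces the same pairs: customer1 on host1 failing over to host2, customer2 on host2 failing over to host3, and customer3 on host3 failing over to host1.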

As a superadmin, you can administer any tenant from any host that is online.

My personal setup using BDR is 3 servers in a cluster and a 4th one as an alternate backup.
Most tenants connect to server1 with server3 as automatic failover destination via DNS settings.
One high-use tenant connects to server2 with server3 as the automatic failover.
I have backup scripts set up to keep server4 configured with the same data as the cluster, with a week's worth of nightly backup versions.

Servers 1-3 are in various datacenters, while server4 is in our local office. The cluster will handle normal networking or hosting provider issues, while server4 is only used if "all hell breaks loose".
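Keeping a week's worth of nightly backup versions implies some pruning step. A minimal Python sketch of one way to pick what to prune is shown below. It is an assumption about how such retention could work, not the poster's actual scripts, and `backups_to_delete` is a hypothetical name.

```python
from datetime import date, timedelta


def backups_to_delete(backup_dates, today, keep_days=7):
    """Return the dates of backups that fall outside the retention window, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in backup_dates if d <= cutoff)
```

With nightly backups and the default of 7 days, the 7 most recent nights (including today's) are kept and everything older is returned for deletion.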
 

DigitalDaz

Administrator
Staff member
"You have been saying that your procedure is for failover only, not load balancing where both are used at the same time. Mark is describing a multi-master setup that implies load balancing. So I am a little confused what you are recommending now."
I was talking active/passive using pairs. There is little point trying to load balance a pair because if one goes down, the secondary has to support the whole load anyway.
 
Thanks for the insights. How are you doing the automatic failover? You said DNS settings, which could mean a few different things.
 