[Tutorial] Creating a two node FusionPBX cluster the easy way.

Status
Not open for further replies.

DigitalDaz

Administrator
Staff member
Sep 29, 2016
3,038
556
113
Just to add to this, Mark has now made it so that creating a cluster really is a breeze.
 

Jean B

New Member
Hello,

Just to bring my notes up to date, since I tested DigitalDaz's HA script last year: is an HA script available to members of the FusionPBX website? I'm looking forward to becoming a member within a few days if I can get confirmation that an HA script or installation tutorial is available!

We will certainly be deploying FusionPBX, so HA is a must for us if we are to move away from our current software.

Thanks in advance for an answer!
 

markjcrane

Active Member
Staff member
Jul 22, 2018
447
162
43
49
Yes, FusionPBX members get access to videos of previous Admin and Advanced training classes and the documentation that goes with them. That includes instructions for multi-master replication of the database and file system, and many other things that help you scale and provide a great experience to your users.
 

Edson

Member
Aug 1, 2017
59
4
8
46
Hello guys,

I have an issue. After a disaster recovery of the Master, data is synced in only one direction, Slave to Master. Changes made in the Master database are not replicated to the Slave. What can be done to restore it? Regards!
 

DigitalDaz

Nothing at all. With BDR, last write wins, so if they have resynced, the state you have now is the final one.
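
To illustrate the "last write wins" rule mentioned above, here is a minimal Python sketch. It is not BDR's actual code (BDR resolves conflicts inside PostgreSQL); the `RowVersion` class, its fields, and the tie-breaking choice are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RowVersion:
    """One node's version of the same row (hypothetical structure)."""
    node: str
    value: str
    committed_at: datetime


def last_write_wins(local: RowVersion, remote: RowVersion) -> RowVersion:
    """Keep whichever version was committed later.

    On an exact tie the local version is kept; that tie-break is an
    assumption of this sketch, not documented BDR behaviour.
    """
    return remote if remote.committed_at > local.committed_at else local


# Example: the same row was changed on both nodes while they were apart.
on_master = RowVersion("master", "old setting",
                       datetime(2019, 5, 12, 21, 7, 10, tzinfo=timezone.utc))
on_slave = RowVersion("slave", "new setting",
                      datetime(2019, 5, 12, 21, 7, 18, tzinfo=timezone.utc))

winner = last_write_wins(on_master, on_slave)
print(winner.node, winner.value)
```

Once both nodes have applied the later commit, the earlier change is gone on both sides, which is why there is nothing left to "restore" after a resync.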
 

Edson

"Nothing at all. With BDR, last write wins, so if they have resynced, the state you have now is the final one."
Thanks a lot!
I was able to restore the Master and Slave to the state before the Master disaster. But now I am seeing FreeSWITCH data conflicts in the PostgreSQL logs on the Slave. What can be done to restore the FreeSWITCH data sync? Regards!

2019-05-12 21:07:18 UTC [17488-852] [unknown]@freeswitch CONTEXT: apply UPDATE from remote relation public.sip_registrations in commit 8/D1048E90, xid 6593672 commited at 2019-05-12 21:07:18.560747+00 (action #14) from node (6593351633762540988,1,16386)
2019-05-12 21:07:18 UTC [17488-853] [unknown]@freeswitch LOG: CONFLICT: remote UPDATE: could not find existing row. Resolution: skip_change; PKEY:
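
When many entries like these pile up, it can help to count which relations and resolutions appear. The following Python sketch does that for lines in the format quoted above. The regular expressions are written only against these two sample lines, so treat them as an assumption rather than a complete description of the BDR log format; `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns derived from the two sample lines only (assumption).
CONTEXT_RE = re.compile(r"CONTEXT:\s+apply (\w+) from remote relation (\S+)")
CONFLICT_RE = re.compile(r"CONFLICT: remote (\w+): (.*?)\. Resolution: (\w+);")

SAMPLE = [
    "2019-05-12 21:07:18 UTC [17488-852] [unknown]@freeswitch CONTEXT: "
    "apply UPDATE from remote relation public.sip_registrations in commit "
    "8/D1048E90, xid 6593672 commited at 2019-05-12 21:07:18.560747+00 "
    "(action #14) from node (6593351633762540988,1,16386)",
    "2019-05-12 21:07:18 UTC [17488-853] [unknown]@freeswitch LOG: "
    "CONFLICT: remote UPDATE: could not find existing row. "
    "Resolution: skip_change; PKEY:",
]


def summarize(lines):
    """Count (action, relation) pairs and (action, reason, resolution) triples."""
    relations = Counter()
    conflicts = Counter()
    for line in lines:
        ctx = CONTEXT_RE.search(line)
        if ctx:
            relations[(ctx.group(1), ctx.group(2))] += 1
        conflict = CONFLICT_RE.search(line)
        if conflict:
            conflicts[(conflict.group(1), conflict.group(2), conflict.group(3))] += 1
    return relations, conflicts


relations, conflicts = summarize(SAMPLE)
for key, count in relations.items():
    print("relation", key, count)
for key, count in conflicts.items():
    print("conflict", key, count)
```

In real use the lines would be read from the PostgreSQL log file instead of `SAMPLE`; the file location depends on the installation.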
 

DigitalDaz

Clearly, you haven't restored exactly, and now you are in a worse position with a conflict. You should never restore over BDR, in my opinion.

If you have a good copy of the data, I would now tear down the BDR cluster, restart from scratch, and then restore your data to one of the nodes.
 

Edson

Ok, thanks a lot!! I will try to tear down only the BDR cluster for the freeswitch database. Do you have a recommended procedure for doing that?
Regards!
 

smn

Member
Jul 18, 2017
201
20
18
Are there any advantages to using BDR as opposed to just doing a daily backup/restore from primary to secondary? There is a public script for that. It works just fine and keeps things simple. The only gotcha I have found so far is that CDR calls on the secondary will be lost the next time the DB is restored from the primary, so I make backups of those. Not a big deal, because the secondary is strictly a failsafe that will only be used rarely.

I am using PostgreSQL v11. I would prefer to stick with that going forward rather than having to use 9.4.
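
For readers unfamiliar with the approach, a daily primary-to-secondary backup/restore boils down to a `pg_dump` on one side and a `pg_restore` on the other. The Python sketch below only builds and prints those two command lines; it is not the public script mentioned above, and the host names, database name, user, and dump path are placeholders chosen for illustration.

```python
import shlex
from typing import List


def build_sync_commands(primary_host: str, secondary_host: str,
                        db_name: str, db_user: str,
                        dump_file: str) -> List[List[str]]:
    """Return the dump and restore commands for one sync run."""
    # Dump the primary database in pg_dump's custom archive format.
    dump = ["pg_dump", "--format=custom", "--host", primary_host,
            "--username", db_user, "--file", dump_file, db_name]
    # Restore onto the secondary, dropping existing objects first.
    # This overwrites whatever the secondary held, including its CDRs.
    restore = ["pg_restore", "--clean", "--if-exists",
               "--host", secondary_host, "--username", db_user,
               "--dbname", db_name, dump_file]
    return [dump, restore]


# Placeholder values (assumptions, not taken from the thread).
for cmd in build_sync_commands("primary.example.com", "secondary.example.com",
                               "fusionpbx", "fusionpbx", "/tmp/fusionpbx.dump"):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Because the restore replaces the secondary's data wholesale, anything written only on the secondary between runs is lost, which is the CDR gotcha described above.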
 

markjcrane

Yes, the advantage of multi-master is scalability: you can handle more work than a single server can. It also gives you instant failover with little or no loss of information, depending on the latency between the servers, compared to the backup and restore script that I wrote and shared.

"The only gotcha I have found so far is that CDR calls on the secondary will be lost the next time the DB is restored from the primary."
Think about this more and you will find more issues. The larger you grow, the more these issues will grow. They include things like call forward and follow me, voicemail greetings and messages, fax, provisioning changes, and a lot more, depending on the features that are used.

To your second point, I'm working on a BDR replacement so we can use the latest PostgreSQL.
 

DigitalDaz

Unless you understand every single piece of the replication you are using very well and are confident troubleshooting it, you should replicate only using the methods taught by FusionPBX.
 

smn

"Unless you understand every single piece of the replication you are using very well and are confident troubleshooting it, you should replicate only using the methods taught by FusionPBX."

You have been saying that your procedure is for failover only, not load balancing where both are used at the same time. Mark is describing a multi-master setup, which implies load balancing. So I am a little confused about what you are recommending now.

I'm still hoping someone can answer my question of whether a backup/restore from primary to secondary will work for a simple failover pair. Not load balancing, just a hot standby. I could add multi-master replication if/when it makes sense. I can get it to switch over within a minute or two using DNS failover, so I think that is good enough for what I have in mind. Not perfect, but relatively simple.
 

ad5ou

Active Member
Jun 12, 2018
884
195
43
Your backup/restore method will work for basic failover.
Mark's method is better for failover because less data is lost compared to a nightly backup, and frequently changed settings are usually up to date virtually instantly. Depending on the failover method and the phones in use, a clustered system can be set up to allow for basically zero downtime.

The cluster setup allows distributing the load across servers by dividing your customer domains between the nodes of the cluster.
Example: host1.domain.com, host2.domain.com, host3.domain.com
customer1.domain.com on host1.domain.com failover to host2.domain.com
customer2.domain.com on host2.domain.com failover to host3.domain.com
customer3.domain.com on host3.domain.com failover to host1.domain.com
or really any variation of customer domains routing to any host as primary and any other host as the failover.
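
The primary/failover assignment in the example can be modelled as a small lookup table. The Python sketch below is only an illustration of that table: in a real deployment the choice is made by DNS and by the phones, not by code like this, and `ROUTES` and `pick_host` are hypothetical names.

```python
from typing import Dict, Optional, Set, Tuple

# customer domain -> (primary host, failover host), as in the example above
ROUTES: Dict[str, Tuple[str, str]] = {
    "customer1.domain.com": ("host1.domain.com", "host2.domain.com"),
    "customer2.domain.com": ("host2.domain.com", "host3.domain.com"),
    "customer3.domain.com": ("host3.domain.com", "host1.domain.com"),
}


def pick_host(domain: str, online: Set[str]) -> Optional[str]:
    """Return the primary if it is online, else the failover, else None."""
    primary, failover = ROUTES[domain]
    if primary in online:
        return primary
    if failover in online:
        return failover
    return None


# Example: host1 is down, so customer1 moves to host2 while the
# other customers stay on their primaries.
online_hosts = {"host2.domain.com", "host3.domain.com"}
for customer in ROUTES:
    print(customer, "->", pick_host(customer, online_hosts))
```

With three hosts arranged in a ring like this, losing any single host moves only one customer group, and no customer is left without a host.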

A superadmin can administer any tenant from any other online host.

My personal setup using BDR is 3 servers in a cluster and a 4th one as an alternate backup.
Most tenants connect to server1 with server3 as the automatic failover destination via DNS settings.
One high-use tenant connects to server2 with server3 as the automatic failover.
I have backup scripts set up to keep server4 configured with the same data as the cluster, with a week's worth of nightly backup versions.

Servers 1-3 are in various datacenters while server4 is in our local office. The cluster will handle normal networking or hosting provider issues, while server4 is only used if "all hell breaks loose".
 

DigitalDaz

"You have been saying that your procedure is for failover only, not load balancing where both are used at the same time. Mark is describing a multi-master setup, which implies load balancing. So I am a little confused about what you are recommending now."

I was talking about active/passive pairs. There is little point trying to load balance a pair, because if one node goes down, the other has to support the whole load anyway.
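
The arithmetic behind that point can be written out in a few lines. This is a hedged Python sketch with made-up load figures; loads are expressed as a percentage of what one node can handle, and the function names are hypothetical.

```python
def load_after_failover(load_a_pct: int, load_b_pct: int) -> int:
    """Load the surviving node carries once its partner is down."""
    return load_a_pct + load_b_pct


def pair_is_safe(load_a_pct: int, load_b_pct: int,
                 capacity_pct: int = 100) -> bool:
    """True if a single node can carry the combined load of the pair."""
    return load_after_failover(load_a_pct, load_b_pct) <= capacity_pct


# Illustrative numbers only (assumptions, not measurements).
for a, b in [(40, 40), (70, 70)]:
    print(a, b, load_after_failover(a, b), pair_is_safe(a, b))
```

A balanced pair therefore only survives a failure if the combined load already fits on one node, which is the same capacity an active/passive pair needs.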
 

smn

Thanks for the insights. How are you doing the automatic failover? You said DNS settings, which could mean a few different things.
 

Tiensicum

New Member
Sep 26, 2019
7
0
1
56
Vietnam
www.inext.com.vn
Hello - I installed two nodes (FreeSWITCH, FusionPBX) connected to two PostgreSQL-BDR nodes. Because all nodes are deployed on GCP, I am forced to specify the external IP instead of "auto" in bind_server_ip, external_rtp_ip and external_sip_ip. Everything is fine without FreeSWITCH HA configured, but there is a problem when I try to put the two external IP addresses into the Advanced\Variables menu of FusionPBX (note: the FusionPBX data is replicated in the PostgreSQL database). I don't know how to do that. Does anyone know how to fix this problem? Thanks!
 

stephen doan

New Member
Aug 20, 2020
1
0
1
30
Hello guys,

I have an issue (see the picture).
I get a few errors: "invalid profile external, internal, .."
What can be done to fix it?

Regards,
 

Attachments

  • 1599710393240.png (49.2 KB)

rnpsh19

New Member
Jan 27, 2019
23
2
3
43
I was in the same boat, looking for instructions, fiddling and testing with everything I could find. Become a member of FusionPBX and it is one of the first things you can learn from the docs and videos. Once learned, it is ridiculously easy and you can move on to better, more annoying problems. Probably not the answer you are looking for, but it is the fastest solution to your problem.

If that doesn't work, in my experience you can still use this script with 4.4, then upgrade the nodes to current. I did this when I was learning and it worked fine. Then again, if you are worried about clusters for redundancy, then "fine" probably will not cut it.
 