Membership access to Multi-Master cluster documentation

itia

New Member
May 29, 2020
22
2
3
USA
Hi there. Looking at becoming a FusionPBX member, but I'm not yet certain the training videos will fully help me, so I was hoping to lay out what I am trying to do, and get feedback from the crowd and @DigitalDaz on whether the membership videos will actually cover this or not. Committing to $1200 in membership fees while you're still in the research/initial testing phase of a project is a tough sell to your leadership.

And as an aside, I'm getting really excited about FusionPBX/FreeSwitch's abilities compared to Asterisk/FreePBX. For one thing, Asterisk can only have one internet gateway/public IP address, requiring an OpenSIPS/Kamailio proxy or a Sangoma SBC in front to re-write header information (WHHHYYYYY???!!).

The company I work for supports critical infrastructure, so our support department needs as close to 100% uptime as possible. We've used FreePBX in master-slave HA for years, but have still had 5 outages a year (SIP provider outages, fat-finger configuration outages, the passive secondary server not actually being ready to take over as master, logs filling up hard drives, power outages, etc.), forcing me to go back to the drawing board and find a true multi-master, clustered phone system. I'm trying to keep it as simple, as resilient, and as cheap as possible.

I DO NOT need load-balancing or multi-tenancy, and don't want to complicate the system with OpenSIPS or Kamailio proxies. I want to contain the system to a minimum amount of software to learn, and to have support available if we get in trouble. Ideally I'd reduce the software footprint to FusionPBX, a database clustering solution, and SIPcapture for live monitoring, logging and call quality gathering for quick debugging of issues.

Majority of work force is external, some in-office, so a hybrid solution is desirable... one in a cloud provider (either AWS, GCP, Digital Ocean or Vultr), another on our own hardware in a colo datacenter, and another on hardware in-office. In office server would receive a dedicated fiber SIP line... I imagine another SIP Profile would be used by this, since it has its own subnet/gateway, etc... and three other SIP providers that all three FusionPBX nodes share (each verified as not dependent on each other). This configuration allows for cloud provider failure, local power outages, local internet outages... and even if there is a major internet backbone failure, remote workers will have access to the cloud node... or a fourth cloud node could be spun up in a different region. FusionPBX would be set to save configuration in a local database on each node, which would be multi-master replicated between them.

I was quite interested in Galera Clustering of MySQL instances on each of the FusionPBX nodes, but I saw mention on this forum that the FusionPBX developers plan to use PostgreSQL functions, thus making MySQL incompatible with FusionPBX at a later date... Is it true??? If so, it seems like everyone would have to use more complicated or expensive multi-master clustering solutions for PostgreSQL. Has anyone successfully used Bucardo async multi-master replication? Or does it have to be synchronous?

I would also need inter-node calling (an endpoint registered to Node A can call another endpoint registered at Node B).

SO BACK TO THE QUESTION: Have any of you members experienced the Membership advanced training videos on this subject and can confirm that all of this will be covered?

Thanks in advance.
 

Adrian Fretwell

Active Member
Aug 13, 2017
667
160
43
I believe membership is essential if you are using FusionPBX for business with customers that are relying on you. It's not just about training videos, it provides detailed information needed when you perform upgrades etc.

If you are in doubt, ask @markjcrane directly or call the FusionPBX office and have a chat with Misty.
 

ewdpb

Member
Oct 3, 2019
151
18
18
@itia, I do not think you will get an answer to your question. I asked something similar a while ago directly to the office and I was basically told that I had to become a member to find out.
 

Adrian Fretwell

ewdpb said: "@itia, I do not think you will get an answer to your question. I asked something similar a while ago directly to the office and I was basically told that I had to become a member to find out."
Whatever happened to community spirit? Sadly this is how many good projects shoot themselves in the foot.
 

ewdpb

Adrian Fretwell said: "Whatever happened to community spirit? Sadly this is how many good projects shoot themselves in the foot."
I could not agree more. We ended up moving to Audiocodes. I still use FreeSWITCH/FusionPBX for my side projects/education because I like and enjoy them, but we were not successful in getting enough confidence to put them in production.
 

Kenny Riley

Active Member
Nov 1, 2017
224
38
28
33
I am a longtime purple member and can tell you that the database clustering documentation in the member portal is based off of an older version of Postgres BDR. I know Mark and team are working towards another solution for database replication looking into the future, however, this is how it's currently taught and supported.

As far as passing calls between Node A and Node B -- If you are implementing a multi-master cluster then I would imagine you would be implementing a master/slave scenario, so I'm not sure why you would want to pass calls between them if one of your servers is a standby failover system. But to answer your question.. no, this is not covered in any current training or documentation. However, connecting two FusionPBX systems together to pass calls back and forth between them is very doable.

If you're looking for a multi-master cluster, then FusionPBX fits the bill perfectly there. I am using a 2 node cluster myself with about 500 registered endpoints, and it has been rock solid for about 3 years now. My current setup is similar to what you are envisioning: my primary server is a dedicated server hosted with OVH, and the backup server is hosted on physical hardware in our data center.
 

markjcrane

Active Member
Staff member
Jul 22, 2018
281
88
28
46
The reason for teaching the older BDR is that the newer BDR is included in 2ndQuadrant's support plan, which costs a lot more than FusionPBX support. Some people will use the new BDR, but for many it is priced too high. The BDR 1.x version is end of life, but it still works well on Debian 9 and 10. Its life will likely run out entirely with Debian 11. We talk about a lot of things in the advanced training; this is one element.

There are multiple other ways to do replication... You could also use PostgreSQL native primary and standby replication. PostgreSQL has documented how to do that style of replication in their documentation. There are various other ways to do it.

To the original post: I don't recommend Galera Cluster for MySQL, as some things will be broken. FusionPBX development is focused on PostgreSQL because it is time consuming to support multiple databases, and no effort is going into making things work in MySQL, so multiple things are broken if it is used.

If you don't want to learn FusionPBX clustering you can always use FreePBX. Last I checked, they charge 1500.00 per node in the cluster, and you get a limited number of failovers before you have to pay more money.

FusionPBX's next Advanced Training class will show an alternative to BDR.
 

itia

@Kenny Riley thanks for your insight into this. I think Audiocodes will be outside our price range. But yes I’m looking at Multi-Master setup. I’m a member now and am a little miffed the documentation doesn’t cover inter-node calling. I would expect this to work out of the box. I’m loving a lot about the software... but a little worried about access to support.

@markjcrane thanks for the reply and insight into next BDR solution. Is there any chance of answering the question if I go to the next level of membership, can I quickly get an answer to why I can’t do inter-node calling with my cluster (which was set up exactly per the Advanced training videos)?

Also of note, neither documentation nor videos seem to answer the question of how domains should be set up in multi-master scenario. I’m assuming a single domain so you don’t have a domain on each node, and extensions dedicated to one domain... as that would defeat the purpose of a multi-master setup...

I can’t really close my company on upgrading membership and using the software if I can’t get answers to simple questions like this.

thanks.
 

DigitalDaz

Administrator
Staff member
Sep 29, 2016
2,542
421
83
@itia I cannot quite understand the question you are asking regarding "how domains should be set up in multi master scenario"
 

sudoRmRf

New Member
May 31, 2019
6
0
1
29
@itia I have been working on a cluster recently as well, and inter-node calling is a goal of ours too. Watch Advanced Training March 2018 part 1, where @markjcrane explains changing the FreeSWITCH database to use BDR and the pros and cons of doing so.
 

itia

@DigitalDaz Thank you for asking. What I meant is... the Advanced Training basically ends with Mark saying something like "Now this catches a lot of people up, but each node has its own hostname, (f1.fusionpbx.com, f2.fusionpbx.com, f3.fusionpbx.com) and you'll need to log in with admin@f1.fusionpbx.com as that is the only domain you have in Advanced / Domains." But that's all he says about it.

So now I can only divine that you should keep your domain called "f1.fusionpbx.com" or change it to something common amongst the cluster "f.fusionpbx.com". But does that domain name change cause problems?

Or is it recommended that each node has its own domain and set of users/extensions? I can only assume this would be the case since the inter-node calling isn't working by default.
 

markjcrane

Your description seems vague on what you want but I'm going to guess you want round robin across multiple servers for one domain and you are hoping that this will be a piece of cake. It is possible to do it but you will have numerous problems. So many issues that I would not recommend it. I share what I recommend in the class.
 

itia

@markjcrane Thanks. Well, here's a little more data. All inbound calls would hit Node A in the office... and I would make all endpoints register to Node A by default... but let's say internet at the office goes down... endpoints inside the office can still use Node A to call in/out over the dedicated fiber SIP channels. But now we have endpoints external to the office going to Node B... and inbound calls coming through the IVR on Node A, destined for endpoints on Node B, can't reach them.

And what if all inbound calls couldn't reach Node A, so failed over to Node B... And now Node A's internet comes back up... we end up with endpoints splattered between Node A and Node B, so half the endpoints can't call the other half, and some may be mid outbound call, so they can't be de-registered en masse. And even if they were de-registered, what if they decide to stick to Node B instead of Node A?
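The scenario above can be sketched as a small simulation. This is only an illustration under stated assumptions: the node names "A" and "B", the endpoint names, and the reachability rule (Node A on the office LAN, Node B hosted off-site) are all hypothetical, and each phone is assumed to register to the first proxy in its list that it can reach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str
    site: str        # "office" or "remote" (hypothetical labels)
    proxies: tuple   # nodes in the order the phone tries them


def can_reach(site, node, office_internet_up):
    """Assumed rule: Node A is on the office LAN, Node B is off-site."""
    if node == "A":
        return site == "office" or office_internet_up
    if node == "B":
        return site == "remote" or office_internet_up
    return False


def registrations(endpoints, office_internet_up):
    """Map each endpoint to the first proxy it can reach (None if none)."""
    return {
        ep.name: next(
            (n for n in ep.proxies if can_reach(ep.site, n, office_internet_up)),
            None,
        )
        for ep in endpoints
    }


def split_pairs(regs):
    """Endpoint pairs registered to different nodes.

    Without inter-node calling, these pairs cannot call each other.
    """
    names = sorted(regs)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if regs[a] != regs[b]
    ]


ENDPOINTS = [
    Endpoint("desk-101", "office", ("A", "B")),
    Endpoint("desk-102", "office", ("A", "B")),
    Endpoint("home-201", "remote", ("A", "B")),
]

if __name__ == "__main__":
    for internet_up in (True, False):
        regs = registrations(ENDPOINTS, internet_up)
        print(internet_up, regs, split_pairs(regs))
```

With the office internet up, every endpoint lands on Node A and there are no split pairs. With it down, the office phones stay on Node A while the remote phone moves to Node B, which is exactly the split described above.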

We have had some 7 outages in the last 12 months at our office (from internet, power and configuration issues on the PBX)... That is the whole reason I'm looking at having a multi-master with a node in office and two nodes out of office.

But it defeats the purpose of multi-master if inter-node calling doesn't work... I don't even really need a sofia recover of active calls. All that matters is if a dropped caller can start a new call in/out right away on a new node and reach all endpoints on any node.

I know you mentioned in Advanced class 2019 that having FreeSWITCH active calls saved to the BDR cluster can get you into all kinds of trouble (if one node goes down, xlogs fill up fast on the other nodes with active call records that are being created and torn down rapidly, so it is an unnecessary waste). And I agree, it's not really necessary for my situation. But inter-node calling is very important to me.
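For anyone wanting to watch for the xlog build-up mentioned above, here is a minimal sketch of the arithmetic. It assumes the current WAL position and each slot's `restart_lsn` have already been read from PostgreSQL (for example from the `pg_replication_slots` view); the slot names and LSN values below are made-up sample data, and the 1 GiB limit is an arbitrary threshold, not a recommendation.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte offset.

    An LSN is printed as two hexadecimal numbers: the high 32 bits,
    a slash, then the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """WAL that must be kept on disk for a slot stuck at restart_lsn."""
    return lsn_to_bytes(current_lsn) - lsn_to_bytes(restart_lsn)


def slots_over_limit(current_lsn: str, slots: dict, limit_bytes: int) -> list:
    """Names of slots whose retained WAL exceeds limit_bytes."""
    return sorted(
        name
        for name, restart_lsn in slots.items()
        if retained_wal_bytes(current_lsn, restart_lsn) > limit_bytes
    )


if __name__ == "__main__":
    # Hypothetical sample values: node_b has been down for a while.
    sample_slots = {"node_b": "1/0", "node_c": "1/FFFFFF00"}
    print(slots_over_limit("2/0", sample_slots, limit_bytes=1024 ** 3))
```

A slot belonging to a node that is down stops advancing, so its retained WAL keeps growing until the node returns or the slot is dropped.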

Is that a little more clear on why I need inter-node calling?
 

Kenny Riley

Your entire scenario above revolves around the internet going down at the location where the PBX resides.

Here's a simple solution... Co locate it or go rent a dedicated server from a hosting provider. Want another server in a different data center to protect against one data center potentially going down? Get a 2nd one in a different region or with a different provider and configure fail over on the FusionPBX and carrier side of things. That's what we do.

You're trying to architect a solution that avoids the real problem here, which is your bad design. It's not very smart to place a phone system that multiple locations rely on within one of those locations and make them all reliant on the power or internet of that single location. You've essentially created a single point of failure for all of your locations. Not smart at all.
 

itia

@Kenny Riley Why do you think I'm setting up a single point of failure? I'm trying to have three nodes that can work together, with any endpoint connecting to any node and still getting inter-node calling. And every endpoint has a secondary proxy DNS entry that points to the remaining two nodes. It is not recommended to have only two nodes in a BDR cluster.

I also want to keep inter-office communication within the company network, if possible (thus the in-office node). And it would be nice if external phones could reach a cloud node closer to them for lower latency directly to SIP provider. (not vital, but seems it would be possible)

Why is this so hard for everyone to grasp? @sudoRmRf is needing the same setup.

@DigitalDaz Am I still not making sense on the domain setup? I still can't get an answer on this. The Advanced training leaves this incomplete.

@markjcrane I will up my membership level if inter-node calling is possible and just needs support help... but I'm not getting a lot of confidence here. I don't need some carrier class instant failover/recover. I just need endpoints to be able to register to a main node, and have two other nodes to fail over to.

Am I missing something blatantly obvious?
 

Kenny Riley

I understand what you're trying to do. What I'm saying is, you're overcomplicating it. How many locations/phones are we talking about here?

Wouldn't life be a lot easier if you simply co-locate your servers or use hosted servers rather than worry about all of the issues that stem from having them on-site in branch locations with a BDR cluster setup? With your proposed topology, if the internet at the location where node A resides were ever to go down, the phones at that location would still see the PBX online since they're local, and wouldn't fail over to another node, which would cause all sorts of call routing issues. Hosting your servers off-site resolves this issue.

But thanks for the advice on what a recommended BDR setup is coming from someone who is asking questions on how to set one up to begin with. For what it's worth, it's not recommended to host your cluster nodes in office locations either.

What you're asking for (endpoints to register to a main node and have two other nodes to fail over to) is readily available and taught in the advanced training, but it's taught from the perspective of hosting your nodes off-site, either in your data center or with a hosting provider, not in your office.

Regarding your domain setup question, it's not making any sense. If you have domain1.yourcompany.com, the FusionPBX database is replicated to your other nodes with a BDR setup. There is no special "domain setup" required when using BDR -- all of your domains and extensions replicate to your other nodes in real time.
 

itia

@Kenny Riley Thank you for being very up front on this, and really challenging me... Despite my reactions, it is very much appreciated.

Kenny Riley said: "I understand what you're trying to do. What I'm saying is, you're overcomplicating it. How many locations/phones are we talking about here?

"Wouldn't life be a lot easier if you simply co-locate your servers or use hosted servers rather than worry about all of the issues that stem from having them on-site in branch locations with a BDR cluster setup? With your proposed topology, if the internet at the location of where node A resides were to ever go down, the phones at that location will still see the PBX online since they're local and won't fail over to another node which will cause all sorts of call routing issues. Hosting your servers off-site resolves this issue."

Okay, I realize where the breakdown is here. What I failed to mention is that we are in contract for a few more years with a SIP provider delivering over dedicated fiber into the building. It is a blessing and curse... when the internet goes down, the SIP over fiber usually stays solid. This is one big reason to have a node in-office... otherwise yes... after thinking it through, I can see that having a node in the office really complicates failover. You can easily get a "split brain" with some endpoints on each node, and inbound calls only go to one server...

Only concern I have for having all in the Vultr cloud, is that I've seen an entire platform go down (remember the global AWS outage a few years back?)... which an in-office node would totally bypass.

Kenny Riley said: "What you're asking for (endpoints to register to a main node and have two other nodes to fail over to) is readily available and taught in the advanced training.. but it's taught from the perspective of hosting your node's off-site either in your data center, or with a hosting provider, not in your office"

Agreed. What I wasn't accounting for was that if half your endpoints can't reach server1... then server2 probably can't reach server1 either... so inter-node calling is out the window anyway.

I personally haven't been able to get Grandstreams, Yealinks, Bria 5 or any other softphone to properly use NAPTR and SRV records... or even Primary and Secondary SIP servers (for simultaneous registrations)... They all want to handle it differently... so I guess I'm down to using the "proxy1" and "proxy2" settings for "on failure of server1, register to server2". And if I have a third node, it's only there for a manual failover if it's really needed.
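For reference, the order in which a client is supposed to try SRV targets is defined in RFC 2782: lowest priority value first, with a weighted random pick among records that share a priority. Below is a rough sketch of that ordering, not a description of how any particular phone implements it. The hostnames, ports and weights are hypothetical, and the zero-weight handling is simplified compared with the RFC.

```python
import random
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def srv_attempt_order(records, rng=None):
    """Return records in the order a client would try them.

    Lower priority values come first. Within one priority, records are
    drawn one at a time with probability proportional to their weight.
    """
    rng = rng or random.Random()
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            total = sum(r.weight for r in group)
            if total == 0:
                # Simplification: all weights zero, pick uniformly.
                pick = rng.choice(group)
            else:
                threshold = rng.randint(0, total)
                running = 0
                for r in group:
                    running += r.weight
                    if running >= threshold:
                        pick = r
                        break
            ordered.append(pick)
            group.remove(pick)
    return ordered


if __name__ == "__main__":
    # Hypothetical three-node setup: strict primary, secondary, tertiary.
    records = [
        SrvRecord(30, 0, 5060, "f3.example.com"),
        SrvRecord(10, 0, 5060, "f1.example.com"),
        SrvRecord(20, 0, 5060, "f2.example.com"),
    ]
    print([r.target for r in srv_attempt_order(records)])
```

Giving each node its own priority, as in the sample data, yields the same strict "server1, then server2, then server3" behaviour as the proxy1/proxy2 settings, with no randomness involved.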

Kenny Riley said: "Regarding your domain setup question.. it's not making any sense. If you have domain1.yourcompany.com, the FusionPBX database is replicated to your other nodes with a BDR setup.. there is no special "domain setup" required when using BDR -- all of your domains and extensions replicate to your other nodes in real time."

Yes, but are there advantages or disadvantages to leaving your domain as "domain1.yourcompany.com" for every node? Or is it better to choose a domain name that doesn't match any individual node's hostname (domain.yourcompany.com)? Or is there no difference at all? This isn't covered in the Advanced training video. That's all I'm asking.