New HA setup advice from standalone AWS hosted install

Status
Not open for further replies.

s2svoip

Member
Dec 9, 2019
255
7
18
44
I've done a fair amount of reading here and on a few other forums about setting FusionPBX up in an HA configuration, but most of the guides I have found are quite out of date.

I have a good handle on FusionPBX but only a basic knowledge of Linux, so this is where it gets a bit tough for me.

I have a single FusionPBX install running in AWS, and I am looking to set up another instance as an HA failover. Is this possible when you already have a standalone install running?

From what I have read, this would involve three things: a database sync using something like 2ndQuadrant's Postgres replication, a file sync for recordings, voicemails, etc., and a floating external IP that the primary machine owns but the failover can grab if the primary goes down. Is this correct?
  • Has anyone done this in AWS? How could you reassign the external IP to another machine on the fly?
  • Does anyone have any links to resources I could read about setting up the database and file sync?
Any help would be greatly appreciated!
 

ad5ou

Active Member
Jun 12, 2018
884
196
43
High Availability is a fairly advanced subject and there are lots of ways to accomplish various degrees of HA.

Unless you are up for a lot of self-learning, your best bet might be to attend the Advanced FusionPBX training, or at least watch the videos from previous classes. The videos and classes are available to paid fusionpbx.com members. The other option is to pay someone to help you set it all up.

As I said, there are multiple ways to accomplish HA, but the method taught in the Advanced classes primarily focuses on "multi-master" database replication using Postgres BDR (Bi-Directional Replication), file syncing using Syncthing or your choice of file sync software, and options for failover and optimization.

A common approach to failover is to use a "smart DNS" service such as Route 53 to poll the server IP and switch DNS routing to a secondary IP if the service check fails.
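For anyone following along, here is a rough sketch of what the "smart DNS" approach could look like with Route 53 failover records. It only builds the change batch as a plain dictionary. The function name, record name, IP addresses and health check ID are made up for illustration, and the health check itself is assumed to have been created separately.

```python
def failover_change_batch(name, primary_ip, secondary_ip, health_check_id, ttl=60):
    """Build a Route 53 ChangeBatch with PRIMARY and SECONDARY failover A records.

    All argument values are placeholders; nothing here talks to AWS.
    """
    def record(role, ip, extra):
        rrset = {
            "Name": name,
            "Type": "A",
            "SetIdentifier": f"{name}-{role.lower()}",
            "Failover": role,          # "PRIMARY" or "SECONDARY"
            "TTL": ttl,                # keep this short so clients re-query sooner
            "ResourceRecords": [{"Value": ip}],
        }
        rrset.update(extra)
        return {"Action": "UPSERT", "ResourceRecordSet": rrset}

    return {
        "Comment": "Failover pair for the PBX (illustrative)",
        "Changes": [
            # Only the primary needs the health check; Route 53 answers with
            # the secondary record while the primary is reported unhealthy.
            record("PRIMARY", primary_ip, {"HealthCheckId": health_check_id}),
            record("SECONDARY", secondary_ip, {}),
        ],
    }

# Applying it would need boto3 and real IDs, roughly:
#   import boto3
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<your hosted zone id>",
#       ChangeBatch=failover_change_batch(...),
#   )
```

Whether clients actually follow the DNS change still depends on them re-querying DNS and honouring the TTL.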
 

s2svoip

Member
Thanks for the advice. My end goal is to sign up for the membership, as I want to support Mark for everything he has done. The hard part is getting some income first to pay for that!
 

Adrian Fretwell

Well-Known Member
Aug 13, 2017
1,391
365
83
hard part is getting some income first to pay for that!
Classic chicken and egg situation!

The problem I see with "smart DNS" is that there are a good number of endpoints out there that will not re-query DNS until they are restarted. We changed IPs on a SIP proxy once: we ran the old and new IPs in parallel, with DNS pointing only at the new one, and it was nearly a month before all the endpoints switched!

The floating IP method you describe can be achieved within a data centre using the Virtual Router Redundancy Protocol (VRRP). Each machine is assigned its own IP, and a third IP floats between them; it is this IP that the customers connect to. See https://www.keepalived.org/

I agree with @ad5ou about taking the membership or an advanced class. Having said that, you do tend to remember more when you have designed and figured it all out for yourself, it just takes a long time!
 

Adrian Fretwell

Well-Known Member
Which endpoints are these?
I don't know what they all were. This was on our SIP trunk platform, and customers just put their own equipment on the trunk. The registration records are long gone. To be fair, the issue may not always have been the fault of the endpoint: we have seen caching DNS servers that did not obey the TTL in the DNS record or forced their own TTL towards the endpoint. Customers' IT departments can do a lot of odd things with their router and firewall configurations.
 

Adrian Fretwell

Well-Known Member
@DigitalDaz I have just done a nine and a half hour packet capture on a Yealink T21p E2. The phone has two accounts configured, one to our primary SIP server and one to a secondary SIP server (not failover). The servers are in different data centres and on different IP subnets. The TTL on our DNS for the SIP servers is 1800 (30 minutes).

In the packet capture I can see the OPTIONS pings going back and forth, I can see a REGISTER every hour, I can see the phone refreshing its DHCP lease, and I can see it making STUN requests to our STUN server. I can also see it requesting a DNS lookup for uk.pool.ntp.org followed by the actual NTP request. But NOT ONCE has the phone requested a DNS lookup for the two SIP server addresses.

This situation may have been different if one of the SIP servers had become unreachable; maybe I need to try that.

This does demonstrate that if I had changed the DNS for the SIP server the phone would not have switched to the new IP address during that time period.
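To put a number on that, here is a back-of-the-envelope calculation using the figures from the capture above (9.5 hours, TTL of 1800 seconds). It assumes a phone that re-resolves the name each time the TTL expires, which is an idealised model and not a description of any particular firmware; the function name is made up for illustration.

```python
def expected_dns_refreshes(capture_seconds: int, ttl_seconds: int) -> int:
    """Number of times a TTL-honouring client would re-query within the capture window."""
    return capture_seconds // ttl_seconds

capture_seconds = int(9.5 * 3600)   # nine and a half hours = 34200 seconds
ttl_seconds = 1800                  # 30 minute TTL on the SIP server records

print(expected_dns_refreshes(capture_seconds, ttl_seconds))  # prints 19
```

So a client that strictly followed the TTL would have been expected to look up each SIP server name roughly 19 times in that window, against the zero lookups observed.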
 

DigitalDaz

Administrator
Staff member
Sep 29, 2016
3,038
556
113
I think I have seen this too, but the Yealinks have been flawless in failing over when the primary was unreachable. I had a real event a few weeks before Christmas when a network card failed in a server.

What I did have though which surprised me was just a small number of Yealinks that didn't fail back until I blocked their access to the secondary. I didn't dig into it at the time but this leads me to think that this could well be a firmware version issue.
 

s2svoip

Member
Do the Yealinks have the option to set a primary and secondary address for one SIP account? I have not heard of this before.

I am wary of relying on DNS as a failover mechanism, given what Adrian mentions.

The idea of just re-mapping a static IP to a synced instance is appealing because it would be almost instant.

@Adrian Fretwell I would prefer to learn this myself, as it's me that would be supporting it, as you say. Do you have any pointers to resources I can read? It might be a while before I am able to spring for the membership.
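On the re-mapping idea: in AWS the static IP would be an Elastic IP, and moving it to another instance is a single API call. Below is a minimal sketch, assuming boto3 is installed and credentials are configured. The function name and the IDs in the usage comment are placeholders, and deciding when to trigger the move (health checks, fencing the old primary) is not covered here.

```python
def move_elastic_ip(ec2, allocation_id, standby_instance_id):
    """Re-associate an Elastic IP with the standby instance.

    `ec2` is expected to be a boto3 EC2 client (or anything exposing the same
    associate_address method). Returns the new association ID.
    """
    response = ec2.associate_address(
        AllocationId=allocation_id,       # the Elastic IP's allocation ID
        InstanceId=standby_instance_id,   # the instance that should take over
        AllowReassociation=True,          # permit taking it from the current owner
    )
    return response["AssociationId"]

# Usage, with placeholder IDs:
#   import boto3
#   ec2 = boto3.client("ec2")
#   move_elastic_ip(ec2, "eipalloc-xxxxxxxx", "i-xxxxxxxx")
```

Passing the client in as an argument keeps the function easy to try against a stub before pointing it at a real account.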
 

DigitalDaz

Administrator
Staff member
@s2svoip Yes, with new Yealinks you can set both a primary and a secondary server, and you pass the domain in the username. The phones are then registered to both simultaneously, like so:

[Attachment: dualreg.png]
 

DigitalDaz

Administrator
Staff member
It needs to be a newish firmware version. My provider can use DNS SRV and respects the TTL, so that works perfectly too, as does Route 53 DNS failover. I'm actually leaning towards that more now, as it works so well with my provider.
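As an illustration of why DNS SRV helps with failover: each SRV record carries a priority, and a client tries the lowest priority value first, moving to the next one if that target is unreachable. The sketch below orders a set of records that way. The host names are invented, and sorting by weight within a priority is a simplification of the weighted random selection described in RFC 2782.

```python
from typing import List, NamedTuple

class SrvRecord(NamedTuple):
    priority: int   # lower value is tried first
    weight: int     # relative share among records with equal priority
    port: int
    target: str

def failover_order(records: List[SrvRecord]) -> List[SrvRecord]:
    """Order SRV targets by priority, then by descending weight.

    Simplified: a real client picks randomly within a priority level,
    in proportion to weight, rather than strictly by highest weight.
    """
    return sorted(records, key=lambda r: (r.priority, -r.weight))

# Hypothetical records for _sip._udp.example.com
records = [
    SrvRecord(priority=20, weight=10, port=5060, target="pbx2.example.com."),
    SrvRecord(priority=10, weight=10, port=5060, target="pbx1.example.com."),
]

for record in failover_order(records):
    print(record.target)
```

With these records the primary PBX is pbx1, and pbx2 is only contacted when pbx1 cannot be reached.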
 

s2svoip

Member
Interesting, so at a stretch this could be used as a basic form of redundancy if your SIP provider has the external addresses of both your PBXs in the same endpoint group. Nice info, thanks @DigitalDaz.

My end goal would be to have two servers syncing, with a floating external IP for seamless failover, but this might work in the interim: cloning my FusionPBX install and then making changes on both. A real hack job.
 

DigitalDaz

Administrator
Staff member
What do you mean basic? This gives me spot on cross datacenter redundancy.
 

DigitalDaz

Administrator
Staff member
I was referring to my specific case: manually keeping the servers' settings in sync, as I don't know how to set up database replication etc. yet.

I can't give you a step-by-step, but just replicate the fusionpbx database with BDR and keep the needed voicemails, recordings and custom sounds in sync with either csync2 or Syncthing.
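Whichever sync tool is used, it can be useful to check that the two servers really hold the same files. The sketch below does not do any syncing itself; it hashes every file under a directory so that manifests generated on each server can be compared. The function names are made up for illustration, and reading whole files into memory is a shortcut that may not suit very large recordings.

```python
import hashlib
from pathlib import Path
from typing import Dict, List

def manifest(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    base = Path(root)
    result = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(base).as_posix()] = digest
    return result

def drift(local: Dict[str, str], remote: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing on one side or differ in content."""
    all_paths = local.keys() | remote.keys()
    return sorted(p for p in all_paths if local.get(p) != remote.get(p))

# Usage idea: run manifest() on each server against the synced directory,
# ship one result to the other side (for example as JSON), then call drift().
```

An empty list from `drift` means both sides hold the same set of files with identical contents.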
 

DigitalDaz

Administrator
Staff member
A floating IP is only good within a single datacenter unless you have BGP.
 

s2svoip

Member
@DigitalDaz can the secondary SIP server be set somewhere in FusionPBX provisioning (autoP)? I can't seem to see it.

I got it working with a profile and a custom variable in the CFG file. Not sure if that's the best way, but it works. Cheers @Adrian Fretwell for showing me how to do this!
 

Mikey

New Member
Feb 10, 2020
15
1
3
54
  • Does anyone have any links to resources I could read about setting up the database & file sync
Any help would be greatly appreciated!

Use an AWS EFS file system. Mount it on both servers at /var/lib/freeswitch/ and /usr/share/freeswitch/sounds/.
This should cover most everything you need in terms of syncing files.
 

s2svoip

Member
Now that is interesting, I was not even aware of that! I will have to do some reading and testing - cheers
 

InTeleSync

New Member
Feb 9, 2020
11
7
3
www.intelesync.com
Perhaps using an AWS load balancer along with AWS RDS (PostgreSQL) or Aurora could work, which would take care of the failover and replication needed. I haven't done it, but it certainly looks like a path to explore.

[Attachment: 1581383844402.png]
 