Performance tuning FusionPBX


minhtan1581

New Member
Hi all,
My FusionPBX had an audio failure when it reached 700 sessions (350 calls).
CPU was around 30%, RAM around 30%, and I/O was low.
Calls connect, but some audio is lost, which is very uncomfortable.
I think the problem is with UDP packets or the jitter buffer configuration in FreeSWITCH.
What can I do to handle this situation?


 

Adrian Fretwell

Well-Known Member
This sort of issue can be very difficult to track down and may not be caused by a single factor. First of all, are you sure the issue is with the machine running FreeSWITCH and not a bottleneck in a router or a misbehaving network switch somewhere?

If you are convinced the issue is within your machine, there is quite a lot to look at and, believe me, I'm no expert! The following link is a good place to start reading:
https://freeswitch.org/confluence/display/FREESWITCH/Performance+Testing+and+Configurations

When you start digging into it, you find that the Linux network stack is built with many ring buffers: for example, one near the physical layer spooling all the packets coming in, another in the kernel sifting the packets to the correct listening ports, and another on the specific port itself servicing the application. This complexity makes for an interesting time. Beware of changing the default UDP buffer sizes unless you know what you are doing; it can often make things worse!
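
As a read-only starting point, you can inspect the current socket buffer and backlog defaults without changing anything. This is just a sketch; the exact sysctl names below assume a reasonably modern Linux kernel:

Code:
# Inspect (do not change) the current buffer and backlog settings
sysctl net.core.rmem_default net.core.rmem_max    # per-socket receive buffer: default / max (bytes)
sysctl net.core.wmem_default net.core.wmem_max    # per-socket send buffer: default / max (bytes)
sysctl net.core.netdev_max_backlog                # packets queued between the driver and the stack, per CPU
sysctl net.ipv4.udp_mem net.ipv4.udp_rmem_min     # overall UDP memory limits (pages) and per-socket minimum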

I tend to monitor my UDP performance by looking at /proc/net/udp (cat /proc/net/udp). This lists all open UDP sockets and shows the outbound and inbound queues in bytes. The rx_queue and tx_queue columns are the ones I look at; most of the time they will show zero, or at least a fairly low value, if the queues are being serviced properly and there is no bottleneck. Here is some output from one of my boxes this morning; it's not loaded very much as it is the weekend:

Code:
root@a2es-vox-82:~# cat /proc/net/udp
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops           
  284: 29A7C2B2:B23D 08080808:0035 01 00000000:00000000 00:00000000 00000000    33        0 19379 2 ffff8803e3999440 0       
  615: 29A7C2B2:13D3 00000000:0000 07 00000000:00000000 00:00000000 00000000    33        0 19373 2 ffff8803e43b0880 0       
1765: 29A7C2B2:7856 00000000:0000 07 00000000:00001380 00:00000000 00000000    33        0 137157082 2 ffff8803bd8bec00 0   
1766: 29A7C2B2:7857 00000000:0000 07 00000000:00000000 00:00000000 00000000    33        0 137157085 2 ffff8803bd8be880 0   
3371: 29A7C2B2:7E9C 00000000:0000 07 00000000:00000000 00:00000000 00000000    33        0 118966339 2 ffff8803bc348b80 0   
3372: 29A7C2B2:7E9D 00000000:0000 07 00000000:00000000 00:00000000 00000000    33        0 118966342 2 ffff8803bc348800 0   
3838: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 8092 2 ffff8803e3999b40 0       
3850: 29A7C2B2:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 13203 2 ffff8803e49b0180 0       
3850: 01000022:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 13202 2 ffff8803e49b0500 0       
3850: 00000000:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 13196 2 ffff8803e49b0c00 0       
4458: 00000000:02DB 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 8095 2 ffff8803e31991c0 0       
4469: 01000022:02E6 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 16612 2 ffff8800eae22b80 0       
5631: 00000000:E770 00000000:0000 07 00000000:00000000 00:00000000 00000000   106        0 1524 2 ffff8800eafa0bc0 0       
5735: 29A7C2B2:27D8 00000000:0000 07 00000000:00000000 00:00000000 00000000    33        0 19411 2 ffff8800eae27800 85     
6073: 29A7C2B2:692A 00000000:0000 07 00000000:00001380 00:00000000 00000000    33        0 137159817 2 ffff8803c37cb3c0 0   
6746: 01000022:2BCB 00000000:0000 07 00000000:00000000 00:00000000 00000000   108        0 1856 2 ffff8800eafa0840 0

As you can see, most of the queues show zero, with a couple showing 1380. I never see the figure go over 1380, and I'm not sure what the 1380 actually is because, although my RTP packets vary in size, they are all much smaller than 1380 bytes.
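
For what it's worth, I believe those tx_queue/rx_queue counters are printed in hexadecimal, so a reading of 00001380 would actually be 0x1380 = 4992 bytes, which would be consistent with a few RTP packets sitting in the queue. A minimal sketch (assuming bash) to list any sockets with a non-zero receive queue, converted to decimal bytes:

Code:
# The queue counters in /proc/net/udp appear to be hex, so 00001380 is 0x1380 = 4992 bytes
printf '%d\n' 0x1380    # prints 4992

# List sockets with a non-zero receive queue, converted to decimal bytes
tail -n +2 /proc/net/udp | while read -r sl laddr raddr st queues rest; do
    rx=$(( 16#${queues#*:} ))
    [ "$rx" -gt 0 ] && echo "local=$laddr rx_queue=${rx} bytes"
done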

netstat -lunp will show a similar listing. I also periodically look at netstat -su and check the RcvbufErrors.

RcvbufErrors will generally have some value in there, but I assess it as a percentage of the total packets received. I have seen the RcvbufErrors count increase for non-performance-related reasons, such as a device continuing to send RTP after the connection has been torn down.
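
A minimal sketch of that percentage check, assuming the usual Udp: counter layout in /proc/net/snmp (the same counters netstat -su reports):

Code:
# Express RcvbufErrors as a percentage of datagrams received
awk '/^Udp:/ {
    if (!hdr) { for (i = 2; i <= NF; i++) col[$i] = i; hdr = 1; next }
    in_dg = $(col["InDatagrams"]); rb_err = $(col["RcvbufErrors"])
    printf "RcvbufErrors: %d of %d datagrams received (%.4f%%)\n", rb_err, in_dg, (in_dg ? 100 * rb_err / in_dg : 0)
}' /proc/net/snmp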

I'm still trying to figure out the best way to monitor performance, so I welcome any ideas; obviously we need to be able to monitor performance before we can tune it.

There is a huge blog post called Monitoring and Tuning the Linux Networking Stack: Receiving Data. It's heavy going; I include the link below:

https://blog.packagecloud.io/eng/2016/06/22/monitoring-tuning-linux-networking-stack-receiving-data/
 

minhtan1581

New Member
Many thanks for your reply, @Adrian Fretwell!

What you have shared is extremely useful. Optimizing a system like this requires a lot of experience across servers, networks, ... and logical thinking.
I think I need to learn more before I can start to optimize the system.
 