IBM Watson integration

Status
Not open for further replies.

bcmike

Active Member
Jun 7, 2018
326
54
28
53
OK, so after a week of learning Lua and hacking and slashing, I have a working script for IBM Watson speech-to-text voicemail transcription for FusionPBX.

Honestly it's not that hard, but I didn't know how to script in Lua, so it was painful. Anyway, here goes:

First, go get an account and API key from the IBM Watson Speech to Text site here: https://cloud.ibm.com/login

Next, in the FusionPBX GUI go to Advanced => Default Settings and scroll down to the voicemail section.

Enter the following new keys:

Sub: api_key Type: text Value: [your API key] Enabled: true
Sub: json_enabled Type: boolean Value: true Enabled: true
Sub: transcribe_language Type: text Value: en-US Enabled: true
Sub: transcribe_provider Type: text Value: watson Enabled: true
Sub: transcription_server Type: text Value: https://stream.watsonplatform.net/speech-to-text/api/v1/recognize Enabled: true

Now at the top of the page click the "reload" button. Then navigate to Status => SIP Status and click the "Flush Cache" button and the "Reload Xml" button.
Also make sure transcription is set to true in your voicemail box.

Make a backup of /usr/share/freeswitch/scripts/app/voicemail/resources/functions/record_message.lua by renaming it.
Upload the attached file and make sure the ownership and permissions are correct.

You can also just modify the current record_message.lua by inserting the relevant bits into the transcribe function, as below:

Code:
local function transcribe(file_path,settings,start_epoch)
                --transcription variables
                if (os.time() - start_epoch > 2) then
                        local transcribe_provider = settings:get('voicemail', 'transcribe_provider', 'text') or '';
                        transcribe_language = settings:get('voicemail', 'transcribe_language', 'text') or 'en-US';

                        if (debug["info"]) then
                                freeswitch.consoleLog("notice", "[voicemail] transcribe_provider: " .. transcribe_provider .. "\n");
                                freeswitch.consoleLog("notice", "[voicemail] transcribe_language: " .. transcribe_language .. "\n");

                        end
           
                        -- Watson Stuff --

                        if (transcribe_provider == "watson") then
                                local api_key = settings:get('voicemail', 'api_key', 'text') or '';
                                local transcription_server = settings:get('voicemail', 'transcription_server', 'text') or '';
                                if (api_key ~= '') then
                                        transcribe_cmd = [[ curl -X POST -u "apikey:]]..api_key..[[" --header "Content-type: audio/wav" --data-binary @]]..file_path..[[ "]]..transcription_server..[[" ]]
                                        local handle = io.popen(transcribe_cmd);
                                        local transcribe_result = handle:read("*a");
                                        handle:close();
                                        if (debug["info"]) then
                                                freeswitch.consoleLog("notice", "[voicemail] CMD: " .. transcribe_cmd .. "\n");
                                                freeswitch.consoleLog("notice", "[voicemail] RESULT: " .. transcribe_result .. "\n");
                                        end

                                        local transcribe_json = JSON.decode(transcribe_result);
                                        --make sure the response has the expected shape before indexing into it
                                        local results = (type(transcribe_json) == "table") and transcribe_json["results"] or nil;
                                        local alternative = nil;
                                        if (type(results) == "table" and type(results[1]) == "table" and type(results[1]["alternatives"]) == "table") then
                                                alternative = results[1]["alternatives"][1];
                                        end
                                        if (type(alternative) ~= "table" or alternative["transcript"] == nil) then
                                                if (debug["info"]) then
                                                        freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: (null) \n");
                                                end
                                                return '';
                                        end
                                        if (debug["info"]) then
                                                freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: " .. alternative["transcript"] .. "\n");
                                                if (alternative["confidence"] == nil) then
                                                        freeswitch.consoleLog("notice", "[voicemail] CONFIDENCE: (null) \n");
                                                else
                                                        freeswitch.consoleLog("notice", "[voicemail] CONFIDENCE: " .. alternative["confidence"] .. "\n");
                                                end
                                        end

                                        transcription = alternative["transcript"];
                                        confidence = alternative["confidence"];
                                        return transcription;
                                end
                        end
                else
                        if (debug["info"]) then
                                freeswitch.consoleLog("notice", "[voicemail] message too short for transcription.\n");
                        end
                end

                return '';
        end
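
One thing worth hardening in the command above: file_path, the API key, and the server URL are pasted into the shell command as-is, so a path containing a space or a quote would break the curl call. Below is a minimal sketch of one way to quote the arguments, assuming a POSIX shell. shell_quote and build_transcribe_cmd are hypothetical helper names, not part of FusionPBX.

```lua
-- Hypothetical helper: wrap a value in single quotes for a POSIX shell,
-- turning each embedded single quote into '\'' so it cannot end the string.
local function shell_quote(value)
    return "'" .. tostring(value):gsub("'", "'\\''") .. "'"
end

-- Hypothetical helper: build the same curl call as above, with every
-- variable part quoted. The -s flag only silences curl's progress meter.
local function build_transcribe_cmd(api_key, file_path, transcription_server)
    return table.concat({
        "curl -s -X POST",
        "-u", shell_quote("apikey:" .. api_key),
        "--header", shell_quote("Content-Type: audio/wav"),
        "--data-binary", shell_quote("@" .. file_path),
        shell_quote(transcription_server),
    }, " ")
end
```

The returned string can be handed to io.popen the same way transcribe_cmd is in the function above.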

Now we just have to see how much Watson will cost when the rubber hits the road, but so far the accuracy has been very good.
 

Attachments

  • record_message.watson.lua
    24.5 KB · Views: 14

Dan

Member
Jul 23, 2017
69
12
8
34
Any chance I could encourage you to make a pull request with your additions? There is value in having these types of additions in FusionPBX rather than as an out of tree patch :3
 

bcmike

I would absolutely do that if I knew what you were talking about.
 

KonradSC

Active Member
Mar 10, 2017
166
98
28
@Dan Nice Work!! This has been on my To Do list for a while :)

I took the liberty of making some updates to your code.

First, I had to use IBM's narrowband model. It could be related to using PCMU as the codec. That required me to set this in Default Settings...

Sub: transcription_server
type: text
Value: https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?model=en-US_NarrowbandModel

For the Lua portion I added a few error checks for things like Watson not responding, the API key breaking or IBM returning non-JSON data. I also found that if the recording has pauses, Watson breaks the results up into several pieces, so I added a loop to concatenate them all.

Code:
--Watson
                        if (transcribe_provider == "watson") then
                                local api_key = settings:get('voicemail', 'api_key', 'text') or '';
                                local transcription_server = settings:get('voicemail', 'transcription_server', 'text') or '';
                                if (api_key ~= '') then
                                        transcribe_cmd = [[ curl -X POST -u "apikey:]]..api_key..[[" --header "Content-type: audio/wav" --data-binary @]]..file_path..[[ "]]..transcription_server..[[" ]]
                                        local handle = io.popen(transcribe_cmd);
                                        local transcribe_result = handle:read("*a");
                                        handle:close();
                                        if (debug["info"]) then
                                                freeswitch.consoleLog("notice", "[voicemail] CMD: " .. transcribe_cmd .. "\n");
                                                freeswitch.consoleLog("notice", "[voicemail] RESULT: " .. transcribe_result .. "\n");
                                        end
                                        
                                        --Transcribe request can fail
                                        if (transcribe_result == '') then
                                            freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: (null) \n");
                                            return '';
                                        else
                                            --pcall catches the error if Watson returns something that is not valid JSON
                                            status, transcribe_json = pcall(JSON.decode, transcribe_result);
                                            if not status then
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] error decoding watson json\n");
                                                end
                                                return '';
                                            end
                                        end

                                    
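                                    --also require at least one result, so an empty results list does not raise an index error
                                    if (type(transcribe_json) == "table" and type(transcribe_json["results"]) == "table" and transcribe_json["results"][1] ~= nil) then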
                                        --Transcription   
                                            if (transcribe_json["results"][1]["alternatives"][1]["transcript"] ~= nil) then
                                                transcription = '';
                                                --ipairs walks the results in order; pairs does not guarantee ordering
                                                for _, row in ipairs(transcribe_json["results"]) do
                                                    transcription = transcription .. row["alternatives"][1]["transcript"];
                                                end
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: " .. transcription .. "\n");
                                                end
                                            else
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: (null) \n");
                                                end
                                                return '';
                                            end
                                        --Confidence
                                            if (transcribe_json["results"][1]["alternatives"][1]["confidence"]) then
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] CONFIDENCE: " .. transcribe_json["results"][1]["alternatives"][1]["confidence"] .. "\n");
                                                end                                           
                                                confidence = transcribe_json["results"][1]["alternatives"][1]["confidence"];
                                            else
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] CONFIDENCE: (null) \n");
                                                end
                                            end
                                            
                                            return transcription;
                                    else
                                        if (debug["info"]) then
                                            freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: json error \n");
                                        end
                                        return '';   
                                    end
                                end
                        end
--
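
The concatenation loop can also be pulled out into a small standalone function, which makes it easy to try against sample responses without a running FreeSWITCH. This is only a sketch: collect_transcript is a hypothetical name, and it assumes the decoded response has the results / alternatives / transcript shape used in the code above.

```lua
-- Hypothetical helper: join the top alternative of every result, in order.
-- Returns the combined transcript and the confidence of the first usable
-- result. Returns "" and nil when the response has an unexpected shape.
local function collect_transcript(decoded)
    if type(decoded) ~= "table" or type(decoded.results) ~= "table" then
        return "", nil
    end
    local parts = {}
    local confidence = nil
    for _, result in ipairs(decoded.results) do
        local best = nil
        if type(result) == "table" and type(result.alternatives) == "table" then
            best = result.alternatives[1]
        end
        if type(best) == "table" and type(best.transcript) == "string" then
            parts[#parts + 1] = best.transcript
            if confidence == nil then
                confidence = best.confidence
            end
        end
    end
    return table.concat(parts), confidence
end
```

An empty results list, or a response without a results key at all, simply yields an empty string, so the caller can treat it the same way as a message that is too short.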


I'll continue to test this on my end. I can assist you in submitting this code to the FusionPBX project on GitHub, or if you don't care I can submit it for you and give you a shout-out!
 

bcmike

Wow! Awesome!

Yes, I was using G722, so that's probably why the narrowband model was needed for G711.

Yes, please submit it for me. I based it on your Azure script anyway, LOL!
 

KonradSC

I can't take credit for the Azure stuff. I think I just wrote some bug fixes :)

Good to know about the G722 vs G711. Let me see if we can include some codec detection so this doesn't break.
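
A rough sketch of what that detection could look like, not what ended up in FusionPBX. It assumes the codec name is already available as a string (how it is read from the channel is left out), and it treats G722 as wideband, per the posts above, and everything else as narrowband, which is an assumption. watson_model_for and with_model are hypothetical helper names, and the Broadband model name is assumed to follow the same pattern as en-US_NarrowbandModel.

```lua
-- Assumption: only G722 is treated as wideband here; extend as needed.
local WIDEBAND_CODECS = { G722 = true }

-- Hypothetical helper: pick a Watson model name from a codec name.
-- Unknown or missing codecs fall back to the narrowband model.
local function watson_model_for(codec_name, language)
    language = language or "en-US"
    local codec = string.upper(codec_name or "")
    if WIDEBAND_CODECS[codec] then
        return language .. "_BroadbandModel"
    end
    return language .. "_NarrowbandModel"
end

-- Hypothetical helper: append the model to the configured server URL,
-- using "&" when the URL already carries a query string.
local function with_model(base_url, model)
    local separator = base_url:find("?", 1, true) and "&" or "?"
    return base_url .. separator .. "model=" .. model
end
```

With something like this, transcription_server in Default Settings could stay as the plain recognize URL and the model would be added per call.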
 

bcmike

Confirmed. I tried my original script from a G711 phone and it failed.
 

bcmike

Sometimes Watson will return a % character, which the send_email script does not escape properly, causing this:

5c980fc2-3014-4ae0-9e6e-12807ec96998 2019-07-29 17:36:18.530976 [NOTICE] sofia.c:1079 Hangup sofia/LAN/6043432757@10.1.5.115 [CS_EXECUTE] [NORMAL_CLEARING]
5c980fc2-3014-4ae0-9e6e-12807ec96998 2019-07-29 17:36:18.570898 [ERR] mod_lua.cpp:202 ...scripts/app/voicemail/resources/functions/send_email.lua:183: invalid use of % in replacement string
5c980fc2-3014-4ae0-9e6e-12807ec96998 stack traceback:
5c980fc2-3014-4ae0-9e6e-12807ec96998 [C]: in function gsub
5c980fc2-3014-4ae0-9e6e-12807ec96998 ...scripts/app/voicemail/resources/functions/send_email.lua:183: in function send_email
5c980fc2-3014-4ae0-9e6e-12807ec96998 /usr/share/freeswitch/scripts/app/voicemail/index.lua:588: in main chunk
5c980fc2-3014-4ae0-9e6e-12807ec96998 /usr/share/freeswitch/scripts/app.lua:48: in main chunk

It won't crash the voicemail routine, but it won't deliver the email, which is a pain if it's a standalone voicemail box with no MWI.

Looking through the script for a solution.
 

KonradSC

Makes sense. I'll try to replicate it and come up with a workaround. It might be a day or two though.
 

bcmike

KonradSC said:
Makes sense. I'll try to replicate and come up with a work around. It might be a day or two though..

It's hard to replicate as Watson is sending it as some sort of hesitation detection code, see example below:

hi Mike it's Tracy Murphy speaking on awhile hope you're doing well %HESITATION not sure what happened my phone but if you can reset it when you have a moment that would be great and then maybe send me an email %HESITATION

I've seen it happen in 2 messages in the last 48 hours though, so it's a problem.
 

KonradSC

I fixed the error by adding another gsub line in send_email.lua.

We swap out a percent sign with an asterisk.

Code:
                    if (transcription ~= nil) then
                        transcription = transcription:gsub("%%", "*");
                        body = body:gsub("${message_text}", transcription);
                    end
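
If you would rather keep the percent sign than turn it into an asterisk, another option is to escape it, since %% in a gsub replacement string stands for a literal %. A minimal sketch, using the hypothetical helper names escape_replacement and fill_message_text rather than the actual send_email.lua code:

```lua
-- Hypothetical helper: double every % so gsub treats it as a literal
-- character when the text is used as a replacement string.
local function escape_replacement(text)
    -- the outer parentheses drop gsub's second return value (the match count)
    return (text:gsub("%%", "%%%%"))
end

-- Hypothetical helper: substitute the transcription into the email body.
-- "%$" matches a literal dollar sign in the ${message_text} placeholder.
local function fill_message_text(body, transcription)
    if transcription == nil then
        return body
    end
    return (body:gsub("%${message_text}", escape_replacement(transcription)))
end
```

With this approach a transcription such as "%HESITATION" reaches the email unchanged instead of raising the "invalid use of % in replacement string" error.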


Everything else working OK with Watson? If so, I'll submit the PR to source. Thanks for testing ;)

 

bcmike

Lol, I'm not in production, but I put my own phone on it because there's no test like eating your own dog food.

Everything looks OK, I was able to reproduce by going "uhhhh" in the test message.

this is a test *HESITATION this is a test *HESITATION yeah *HESITATION *HESITATION attest

Unless Watson throws more strange characters we should be ok.
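
If the hesitation markers are not wanted in the email at all, they could be stripped before the message is sent. A small sketch, assuming the marker is always the literal token %HESITATION as Watson returned it in the earlier examples, so it would need to run before any percent-to-asterisk swap. strip_hesitations is a hypothetical helper, not part of FusionPBX.

```lua
-- Hypothetical helper: remove Watson's %HESITATION markers and tidy up
-- the whitespace they leave behind.
local function strip_hesitations(text)
    local cleaned = text:gsub("%%HESITATION", "")
    cleaned = cleaned:gsub("%s+", " ")                  -- collapse runs of whitespace
    cleaned = cleaned:gsub("^%s+", ""):gsub("%s+$", "") -- trim both ends
    return cleaned
end
```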
 

bcmike

I haven't upgraded yet and this is a noob question, but I was editing /usr/share/freeswitch/scripts/app/voicemail/resources/functions/record_message.lua. I made a change to my voicemail box in the GUI and all my Watson edits got nuked.

Should I be editing in a different location?
 