IBM Watson integration

Status
Not open for further replies.

bcmike

Active Member
Jun 7, 2018
OK, so after a week of learning Lua and hacking and slashing, I have a working script for IBM Watson speech-to-text voicemail transcription for FusionPBX.

Honestly it's not that hard, but I didn't know how to script in Lua, so it was painful. Anyway, here goes:

First, go get an account and API key from the IBM Watson Speech to Text site here: https://cloud.ibm.com/login

Next, in the FusionPBX GUI go to Advanced => Default Settings and scroll down to the voicemail section.

Enter the following new keys:

Sub: api_key Type: text Value: [your API key] Enabled: true
Sub: json_enabled Type: boolean Value: true Enabled: true
Sub: transcribe_language Type: text Value: en-US Enabled: true
Sub: transcribe_provider Type: text Value: watson Enabled: true
Sub: transcription_server Type: text Value: https://stream.watsonplatform.net/speech-to-text/api/v1/recognize

Now at the top of the page click the "reload" button. Then navigate to Status => SIP Status and click the "Flush Cache" and "Reload Xml" buttons.
Also make sure transcription is set to true in your voicemail box

Make a backup of /usr/share/freeswitch/scripts/app/voicemail/resources/functions/record_message.lua by renaming it.
Upload the attached file and make sure the ownership and permissions are correct.

You can also just modify the current record_message.lua by inserting the relevant bits into the transcribe function, as below:

Code:
local function transcribe(file_path,settings,start_epoch)
                --transcription variables
                if (os.time() - start_epoch > 2) then
                        local transcribe_provider = settings:get('voicemail', 'transcribe_provider', 'text') or '';
                        transcribe_language = settings:get('voicemail', 'transcribe_language', 'text') or 'en-US';

                        if (debug["info"]) then
                                freeswitch.consoleLog("notice", "[voicemail] transcribe_provider: " .. transcribe_provider .. "\n");
                                freeswitch.consoleLog("notice", "[voicemail] transcribe_language: " .. transcribe_language .. "\n");

                        end
           
            -- Watson Stuff --

                        if (transcribe_provider == "watson") then
                                local api_key = settings:get('voicemail', 'api_key', 'text') or '';
                                local transcription_server = settings:get('voicemail', 'transcription_server', 'text') or '';
                                if (api_key ~= '') then
                                        transcribe_cmd = [[ curl -X POST -u "apikey:]]..api_key..[[" --header "Content-type: audio/wav" --data-binary @]]..file_path..[[ "]]..transcription_server..[[" ]]
                                        local handle = io.popen(transcribe_cmd);
                                        local transcribe_result = handle:read("*a");
                                        handle:close();
                                        if (debug["info"]) then
                                                freeswitch.consoleLog("notice", "[voicemail] CMD: " .. transcribe_cmd .. "\n");
                                                freeswitch.consoleLog("notice", "[voicemail] RESULT: " .. transcribe_result .. "\n");
                                        end

                                        local transcribe_json = JSON.decode(transcribe_result);
                                        --stop here if the decode failed or Watson returned no results, so nothing below indexes nil
                                        if (transcribe_json == nil or transcribe_json["results"] == nil or transcribe_json["results"][1] == nil) then
                                                if (debug["info"]) then
                                                        freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: (null) \n");
                                                end
                                                return '';
                                        end

                                        --best alternative of the first result
                                        local alternative = transcribe_json["results"][1]["alternatives"][1];
                                        if (debug["info"]) then
                                                freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: " .. tostring(alternative["transcript"]) .. "\n");
                                                freeswitch.consoleLog("notice", "[voicemail] CONFIDENCE: " .. tostring(alternative["confidence"]) .. "\n");
                                        end

                                        transcription = alternative["transcript"];
                                        confidence = alternative["confidence"];
                                        return transcription;
                                end
                        end
                else
                        if (debug["info"]) then
                                freeswitch.consoleLog("notice", "[voicemail] message too short for transcription.\n");
                        end
                end

                return '';
        end
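
One thing to watch in the curl line above: file_path and the API key are pasted into the shell command as-is, so a path with a space or a quote in it would break the command. Here is a minimal sketch of quoting each argument first. shell_quote and build_watson_cmd are hypothetical helper names, not part of FusionPBX, and the URL in the usage line is a placeholder.

```lua
-- Hypothetical helper (not part of FusionPBX): wrap a value in single quotes
-- for a POSIX shell, escaping any single quotes inside it.
local function shell_quote(value)
    -- the extra parentheses keep only gsub's first return value
    local escaped = (tostring(value):gsub("'", "'\\''"))
    return "'" .. escaped .. "'"
end

-- Hypothetical helper: build the same curl call as the script above,
-- with every argument quoted separately.
local function build_watson_cmd(api_key, file_path, transcription_server)
    return "curl -X POST -u " .. shell_quote("apikey:" .. api_key)
        .. " --header " .. shell_quote("Content-Type: audio/wav")
        .. " --data-binary " .. shell_quote("@" .. file_path)
        .. " " .. shell_quote(transcription_server)
end

-- Placeholder values, for illustration only.
print(build_watson_cmd("KEY", "/tmp/msg 1.wav", "https://example.invalid/recognize"))
```

The resulting string can be handed to io.popen exactly like transcribe_cmd in the script.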

Now we just have to see how much Watson will cost when the rubber hits the road, but so far the accuracy has been very good.
 

Any chance I could encourage you to make a pull request with your additions? There is value in having these types of additions in FusionPBX rather than as an out of tree patch :3
 
@Dan Nice Work!! This has been on my To Do list for a while :)

I took the liberty and made some updates to your code.

First, I had to use IBM's narrowband model. It could be related to using PCMU as the codec. That required me to set this in Default Settings...

Sub: transcription_server
Type: text
Value: https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?model=en-US_NarrowbandModel

For the Lua portion I added a few error checks for things like Watson not responding, the API key breaking, or IBM returning non-JSON data. I also found that if the recording has pauses, Watson will break up the results into separate segments, so I added a loop to concatenate them all.

Code:
--Watson
                        if (transcribe_provider == "watson") then
                                local api_key = settings:get('voicemail', 'api_key', 'text') or '';
                                local transcription_server = settings:get('voicemail', 'transcription_server', 'text') or '';
                                if (api_key ~= '') then
                                        transcribe_cmd = [[ curl -X POST -u "apikey:]]..api_key..[[" --header "Content-type: audio/wav" --data-binary @]]..file_path..[[ "]]..transcription_server..[[" ]]
                                        local handle = io.popen(transcribe_cmd);
                                        local transcribe_result = handle:read("*a");
                                        handle:close();
                                        if (debug["info"]) then
                                                freeswitch.consoleLog("notice", "[voicemail] CMD: " .. transcribe_cmd .. "\n");
                                                freeswitch.consoleLog("notice", "[voicemail] RESULT: " .. transcribe_result .. "\n");
                                        end
                                        
                                    --Transcribe request can fail
                                        if (transcribe_result == '') then
                                            freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: (null) \n");
                                            return ''
                                        else
                                            status, transcribe_json = pcall(JSON.decode, transcribe_result);

                                               if not status then
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] error decoding watson json\n");
                                                end
                                                return '';
                                            end
                                        end

                                    
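                                    if (transcribe_json ~= nil and transcribe_json["results"] ~= nil) then
                                        --Transcription (results[1] is nil when the results array is empty)
                                            if (transcribe_json["results"][1] ~= nil and transcribe_json["results"][1]["alternatives"][1]["transcript"] ~= nil) then
                                                transcription = '';
                                                --ipairs walks the segments in order; pairs does not guarantee order
                                                for key, row in ipairs(transcribe_json["results"]) do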
                                                    transcription = transcription .. row["alternatives"][1]["transcript"];
                                                end
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: " .. transcription .. "\n");
                                                end
                                            else
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: (null) \n");
                                                end
                                                return '';
                                            end
                                        --Confidence
                                            if (transcribe_json["results"][1]["alternatives"][1]["confidence"]) then
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] CONFIDENCE: " .. transcribe_json["results"][1]["alternatives"][1]["confidence"] .. "\n");
                                                end                                           
                                                confidence = transcribe_json["results"][1]["alternatives"][1]["confidence"];
                                            else
                                                if (debug["info"]) then
                                                    freeswitch.consoleLog("notice", "[voicemail] CONFIDENCE: (null) \n");
                                                end
                                            end
                                            
                                            return transcription;
                                    else
                                        if (debug["info"]) then
                                            freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: json error \n");
                                        end
                                        return '';   
                                    end
                                end
                        end
--
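
The concatenation loop can be tried outside FreeSWITCH with a hand-built table. This is only a sketch: join_transcripts is a hypothetical name, the sample values are made up, and the table shape (results, alternatives, transcript) is assumed from the fields the code above reads.

```lua
-- Hypothetical helper: join the best transcript of every result segment, in order.
-- `decoded` is assumed to be the table produced by decoding Watson's JSON reply.
local function join_transcripts(decoded)
    if type(decoded) ~= "table" or type(decoded.results) ~= "table" then
        return ""
    end
    local parts = {}
    -- ipairs visits results[1], results[2], ... in sequence
    for _, result in ipairs(decoded.results) do
        local best = result.alternatives and result.alternatives[1]
        if best and best.transcript then
            parts[#parts + 1] = best.transcript
        end
    end
    return table.concat(parts)
end

-- Made-up sample with two segments, as if the caller paused once.
local sample = {
    results = {
        { alternatives = { { transcript = "hi Mike ", confidence = 0.91 } } },
        { alternatives = { { transcript = "call me back ", confidence = 0.88 } } },
    },
}
print(join_transcripts(sample))
```

A nil table or an empty results array just gives back an empty string, which matches what the script returns on failure.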


I'll continue to test this on my end. I can assist you in submitting this code to the FusionPBX project on GitHub, or if you don't care I can submit it for you and give you a shout-out!
 
Wow! Awesome!

Yes, I was using G722, so that's probably why the narrowband model was needed for G711.

Yes, please submit it for me. I based it on your Azure script anyway, LOL!
 
I can't take credit for the Azure stuff. I think I just wrote some bug fixes :)

Good to know about the G722 vs G711. Let me see if we can include some codec detection so this doesn't break.
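
As a rough idea of what codec-based model selection could look like, here is a sketch. Everything in it is an assumption: watson_model_for and with_model are hypothetical names, the codec name would have to come from somewhere in the call (not shown), and treating every codec other than PCMU/PCMA as broadband is a guess based on the G722 vs G711 experience above.

```lua
-- Assumption: only the 8 kHz G.711 codecs need the narrowband model.
local narrowband_codecs = { PCMU = true, PCMA = true }

-- Hypothetical helper: pick a model name such as en-US_NarrowbandModel.
local function watson_model_for(codec, language)
    local band = narrowband_codecs[string.upper(codec or "")] and "Narrowband" or "Broadband"
    return language .. "_" .. band .. "Model"
end

-- Hypothetical helper: add a model parameter to the configured URL,
-- unless the admin already set one in Default Settings.
local function with_model(transcription_server, codec, language)
    if transcription_server:find("model=", 1, true) then
        return transcription_server
    end
    local separator = transcription_server:find("?", 1, true) and "&" or "?"
    return transcription_server .. separator .. "model=" .. watson_model_for(codec, language)
end

-- Placeholder URL, for illustration only.
print(with_model("https://example.invalid/v1/recognize", "PCMU", "en-US"))
```

Leaving an explicit model= in transcription_server untouched means the Default Settings workaround above keeps working as it does now.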
 
Sometimes Watson will return a % character, which the send_email script does not escape properly, causing this:

5c980fc2-3014-4ae0-9e6e-12807ec96998 2019-07-29 17:36:18.530976 [NOTICE] sofia.c:1079 Hangup sofia/LAN/6043432757@10.1.5.115 [CS_EXECUTE] [NORMAL_CLEARING]
5c980fc2-3014-4ae0-9e6e-12807ec96998 2019-07-29 17:36:18.570898 [ERR] mod_lua.cpp:202 ...scripts/app/voicemail/resources/functions/send_email.lua:183: invalid use of % in replacement string
5c980fc2-3014-4ae0-9e6e-12807ec96998 stack traceback:
5c980fc2-3014-4ae0-9e6e-12807ec96998 [C]: in function gsub
5c980fc2-3014-4ae0-9e6e-12807ec96998 ...scripts/app/voicemail/resources/functions/send_email.lua:183: in function send_email
5c980fc2-3014-4ae0-9e6e-12807ec96998 /usr/share/freeswitch/scripts/app/voicemail/index.lua:588: in main chunk
5c980fc2-3014-4ae0-9e6e-12807ec96998 /usr/share/freeswitch/scripts/app.lua:48: in main chunk

It won't crash the voicemail routine, but it won't deliver the email, which is a pain if it's a standalone voicemail box with no MWI.

Looking through the script for a solution.
 
Makes sense. I'll try to replicate and come up with a workaround. It might be a day or two though.
 

It's hard to replicate as Watson is sending it as some sort of hesitation detection code, see example below:

hi Mike it's Tracy Murphy speaking on awhile hope you're doing well %HESITATION not sure what happened my phone but if you can reset it when you have a moment that would be great and then maybe send me an email %HESITATION

I've seen it happen in 2 messages in the last 48 hours though, so it's a problem.
 
I fixed the error by adding another gsub line in send_email.lua.

We swap out a percent sign with an asterisk.

Code:
                    if (transcription ~= nil) then
                        transcription = transcription:gsub("%%", "*");
                        body = body:gsub("${message_text}", transcription);
                    end
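
If keeping the original % matters, another option is to pass gsub a function instead of a string. Lua only interprets % in the replacement when the replacement is a string, so whatever the function returns is inserted as-is. This is a sketch rather than the change that went into send_email.lua; insert_transcription is a hypothetical name.

```lua
-- Hypothetical helper: put the transcription into the email body without
-- letting gsub interpret "%" sequences such as %HESITATION.
local function insert_transcription(body, transcription)
    if transcription == nil then
        return body
    end
    -- "%$" matches a literal dollar sign; the function's return value is used verbatim
    local result = body:gsub("%${message_text}", function()
        return transcription
    end)
    return result
end

print(insert_transcription("Message: ${message_text}", "this is a test %HESITATION yeah"))
```

With this approach the email would show %HESITATION unchanged instead of *HESITATION.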


Everything else working OK with Watson? If so, I'll submit the PR to source. Thanks for testing ;)

 
Lol, I'm not in production, but I put my own phone on it because there's no test like eating your own dog food.

Everything looks OK, I was able to reproduce by going "uhhhh" in the test message.

this is a test *HESITATION this is a test *HESITATION yeah *HESITATION *HESITATION attest

Unless Watson throws more strange characters we should be ok.
 
I haven't upgraded yet and this is a noob question, but I was editing /usr/share/freeswitch/scripts/app/voicemail/resources/functions/record_message.lua. I made a change to my voicemail box in the GUI and all my Watson edits got nuked.

Should I be editing in a different location?
 