Google Cloud Speech to Text Voicemail Transcription

junction1153

Jul 15, 2020
I'd like to contribute my modifications to the /usr/share/freeswitch/scripts/app/voicemail/resources/functions/record_message.lua file. Hopefully this will help someone in the future.

I am not a heavy programmer and am new to FreeSWITCH and FusionPBX, so don't mind that it is poorly written; it has so far proven to work well.

I prefer Google over IBM because it also predicts punctuation when transcribing speech.

Keep in mind that I modified the Watson script. I know I can clean up the code a lot, but that will wait until I am fully done tweaking my system.

Step 1. Make sure you have a Google Cloud API key with the proper permissions to use the Speech-to-Text engine.

Step 2. Advanced --> Default Settings --> Voicemail.
Add: Category: voicemail
Subcategory: transcribe_provider
Type: text
Value: watson (leave this as watson; Step 3 replaces the Watson command with the Google one)
Enabled: true

Add: Category: voicemail
Subcategory: transcribe_enabled
Type: boolean
Value: true
Enabled: true

Add: Category: voicemail
Subcategory: json_enabled
Type: boolean
Value: true
Enabled: true

Step 3. Edit the record_message.lua file. A few lines under the line if (transcribe_provider == "watson") then, you will see two transcribe_cmd lines. Replace both of those entries with the line below. Be sure to replace API_KEY_HERE with your API key from Google.

transcribe_cmd = [[sox ]]..file_path..[[ ]]..file_path..[[.flac && echo "{ 'config': { 'languageCode': 'en-US', 'enableWordTimeOffsets': false }, 'audio': { 'content': '`base64 -w 0 ]]..file_path..[[.flac`' } }" | curl -X POST -H "Content-Type: application/json" -d @- "https://speech.googleapis.com/v1/speech:recognize?key=API_KEY_HERE" && rm -f ]]..file_path..[[.flac]]
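
Before editing record_message.lua, it can help to try the same pipeline by hand to confirm the API key works. The following is only a minimal sketch of the command above split into two shell functions; the names build_request and transcribe are hypothetical and not part of FusionPBX, and it assumes sox, curl, and GNU base64 (for the -w 0 option) are installed.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON request body for one FLAC file.
# Uses the same config values as the transcribe_cmd line above.
build_request() {
    flac_file="$1"
    printf '{ "config": { "languageCode": "en-US", "enableWordTimeOffsets": false }, "audio": { "content": "%s" } }' \
        "$(base64 -w 0 "$flac_file")"
}

# Hypothetical helper: convert a recording to FLAC, send it to Google,
# and remove the temporary FLAC file afterwards.
# Arguments: path to the recording, API key.
transcribe() {
    recording="$1"
    api_key="$2"
    sox "$recording" "$recording.flac" || return 1
    build_request "$recording.flac" |
        curl -s -X POST -H "Content-Type: application/json" -d @- \
            "https://speech.googleapis.com/v1/speech:recognize?key=$api_key"
    status=$?   # exit status of curl, the last command in the pipeline
    rm -f "$recording.flac"
    return $status
}
```

After sourcing the file, a call such as transcribe /tmp/test.wav API_KEY_HERE (with a test recording of your own) should print Google's JSON response. Unlike the one-line version, this sketch removes the temporary .flac file even when curl fails.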


This will only transcribe up to about one minute of speech, which is Google's limit for this kind of request. There is a way to transcribe longer audio, but it is well beyond my expertise and probably unnecessary for the majority of users.
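
One possible way to stay under that limit, which is not part of the modification above, is to send only the first part of a long voicemail. This is a rough sketch under stated assumptions: the function names exceeds_limit and to_flac_capped are hypothetical, the 59-second cap is an arbitrary safety margin, and it relies on soxi -D (which prints a file's duration in seconds) and the sox trim effect.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when a duration in seconds is greater than
# the limit (second argument, default 60), otherwise exit 1.
exceeds_limit() {
    awk -v d="$1" -v max="${2:-60}" 'BEGIN { exit !(d > max) }'
}

# Hypothetical helper: convert a recording to FLAC, keeping only the
# first 59 seconds when the recording is longer than that.
# Arguments: input recording, output FLAC path.
to_flac_capped() {
    in="$1"
    out="$2"
    duration=$(soxi -D "$in") || return 1
    if exceeds_limit "$duration" 59; then
        sox "$in" "$out" trim 0 59
    else
        sox "$in" "$out"
    fi
}
```

With this approach the transcription of a long message simply stops at the cap, so the email would contain only the beginning of the voicemail.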

Please let me know if you have any questions or comments.