Mod cepstral
From FreeSWITCH Wiki
Install & Configure
- Buy or download a free trial voice from Cepstral. Each voice comes with the library, so the SDK is not needed.
- cd /opt
- wget http://downloads.cepstral.com/cepstral/i386-linux/Cepstral_Allison-8kHz_i386-linux_4.2.1.tar.gz
- tar xvfz Cepstral_Allison-8kHz_i386-linux_4.2.1.tar.gz
- cd Cepstral_Allison-8kHz_i386-linux_4.2.1
- ./install.sh
- Add /opt/swift/lib (if you chose the default install) to /etc/ld.so.conf and run ldconfig
- Define the environment variable SWIFT_HOME to point to the install root, e.g. /opt/swift
- Edit modules.conf and uncomment the line: asr_tts/mod_cepstral
- Build FreeSWITCH
- Enable mod_cepstral in the modules.conf.xml file by uncommenting <load module="mod_cepstral"/>
You can also use a Cepstral voice in a language other than English without editing any files. Here <lang> is the voice's language code, e.g. de or fr. Just add two links in $SWIFT_HOME/lib:
- libceplang_en.so -> libceplang_<lang>.so.4.2
- libceplex_en.so -> libceplex_<lang>.so.4.2
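The two links above can also be created from a script. The following is a minimal sketch, not part of Cepstral or FreeSWITCH: the function name link_language and its parameters are made up for illustration, and the 4.2 version suffix is taken from the link names above, so adjust it to match the files actually present in your lib directory.

```python
import os


def link_language(lib_dir, lang, version="4.2"):
    """Create libceplang_en.so and libceplex_en.so as links to the <lang> libraries.

    Hypothetical helper: lib_dir is expected to be $SWIFT_HOME/lib.
    """
    pairs = []
    for stem in ("libceplang", "libceplex"):
        target = f"{stem}_{lang}.so.{version}"      # e.g. libceplang_de.so.4.2
        link = os.path.join(lib_dir, f"{stem}_en.so")
        # Check everything first so a failure leaves the directory untouched.
        if not os.path.exists(os.path.join(lib_dir, target)):
            raise FileNotFoundError(f"missing {target} in {lib_dir}")
        if os.path.lexists(link):
            raise FileExistsError(f"{link} already exists; move it aside first")
        pairs.append((target, link))

    for target, link in pairs:
        # Relative link, equivalent to running `ln -s target link` inside lib_dir.
        os.symlink(target, link)
    return [link for _, link in pairs]
```

For a German voice with the default install this would be called as link_language("/opt/swift/lib", "de"). The helper refuses to overwrite an existing libceplang_en.so or libceplex_en.so, so any existing file has to be moved aside by hand first.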
Examples
Dialplan
You should now be able to use something similar to the following in your dialplan:
<action application="speak" data="cepstral|david|Please hold while we connect you to the conference"/>
Javascript/Python
session.answer()
session.speak("cepstral","William","Hello from FreeSwitch")
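For context, the two calls above would normally sit inside a script entry point. The sketch below assumes a Python script whose entry point is a function named handler(session, args); that entry point and the use of args to pick a voice are assumptions for illustration, while the answer and speak calls are the ones shown above.

```python
def handler(session, args):
    """Answer the call and speak one line with a Cepstral voice.

    Assumption: the script is invoked with the live session and an
    argument string; here the argument string optionally names the voice.
    """
    voice = (args or "").strip() or "William"   # fall back to the voice used above
    session.answer()
    # Same argument order as the example above: engine, voice, text.
    session.speak("cepstral", voice, "Hello from FreeSWITCH")
```

The voice name must match a Cepstral voice that is actually installed under SWIFT_HOME.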
Gotchas
- Do *not* load mod_cepstral and mod_flite at the same time! (Symbol collision)
- If you don't use the default install dir (/opt/swift) you will need to modify src/mod/asr_tts/mod_cepstral/Makefile
- You must define the environment variable SWIFT_HOME in the shell where you run FreeSWITCH, otherwise you won't hear any audio.
- With a 16 kHz voice and an RTP Packet Size of 0.03 (a Sipura setting), the audio will sound horrible. Workaround: change RTP Packet Size to 0.02 in the Sipura configuration, under the Advanced/SIP section.
- If audio gets cut off at the beginning, try using <break time='1s' /> tags as a workaround.
- The current mod_cepstral (as of 29 December 2007) adds 1s of silence before and after each utterance for you.
- Single quotes (') are stripped from strings that pass through the FreeSWITCH code that splits strings of the form a|b|c into parts. Use double quotes (") instead if, for example, you want to pass something like
<prosody rate="fast">Hello there.</prosody>
to Cepstral.
- If you find that the volume of your TTS is much higher (or lower) than that of the sound files, try adjusting it with the volume attribute of the prosody tag in Cepstral's SSML. For example, this will lower the TTS volume significantly:
<prosody volume='15'>This is pretty softly spoken.</prosody>
The '15' in the above example means 15% of default volume.
- For other SSML tricks check out the examples on Cepstral's support site.
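When the text to speak is assembled in a script, the SSML workarounds from the list above can be wrapped in small helpers. This is a sketch only: the function names are made up for illustration and are not part of FreeSWITCH or Cepstral. The helpers emit double-quoted attributes, in line with the note above that single quotes can be stripped.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def with_volume(text, percent):
    """Wrap plain text in a prosody tag; percent is relative to default volume."""
    return f"<prosody volume={quoteattr(str(percent))}>{escape(text)}</prosody>"


def with_leading_break(ssml, seconds=1):
    """Prefix existing SSML with a pause, for audio that is cut off at the start."""
    time_attr = quoteattr(f"{seconds}s")
    return f"<break time={time_attr} />{ssml}"


def double_quote_attrs(ssml):
    """Rewrite attr='value' as attr="value" so the quotes survive a|b|c splitting."""
    return re.sub(r"(\w+)='([^'\"]*)'", r'\1="\2"', ssml)
```

For example, with_volume("This is pretty softly spoken.", 15) produces the same markup as the volume example above, but with double quotes around 15. Note that with_volume escapes its text argument, so it expects plain text rather than markup, while with_leading_break expects a string that is already SSML.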
Windows Build
In order to compile mod_cepstral.c under Visual Studio C++ you must ensure the Cepstral SDK is installed on your build machine. The SDK is not free. You can, however, obtain an eval copy if you email brian@freeswitch.org with subject line "Cepstral Windows SDK".
Once the SDK is installed you'll need to make sure mod_cepstral is selected to be compiled (it is not on by default). Right click the FreeSWITCH solution in the Solution Explorer in VS and select Configuration Manager. Scroll down until you see mod_cepstral and select the Build flag.
In addition you need to verify the following properties for mod_cepstral.c (right click mod_cepstral from the Solution Explorer on the left hand side and select "properties").
- Additional Include Directories (from C/C++, General): This path should be set to "C:\Program Files\Cepstral\sdk\include"
- Additional Library Directories (from Linker, General): This path should include "C:\Program Files\Cepstral\sdk\lib\windows" and "C:\Program Files\Cepstral\sdk\lib\winnt". Between Cepstral 4.2 and 5.0 these paths changed.
Finally, you'll need to make sure the Cepstral bin path is part of the Windows PATH environment variable, as the Cepstral DLLs are installed in this directory (C:\Program Files\Cepstral\bin). Without this path mod_cepstral.dll will not initialize during FreeSWITCH startup.
FAQ
Can I use a 16 kHz "desktop voice"?
Q: Can I use a 16 kHz "desktop voice" or do I have to use an 8 kHz telephone voice?
A: You can use a 16 kHz voice and FreeSWITCH will resample automatically to 8 kHz as needed. Bear in mind this adds CPU overhead, so an 8 kHz voice is better from a performance perspective.
libswift.so.4: cannot open shared object file: No such file or directory
Try manually adding the /opt/swift/lib directory to /etc/ld.so.conf or /etc/ld.so.conf.d/ and run ldconfig.
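Before starting FreeSWITCH, a quick check of the environment can catch both this error and the missing SWIFT_HOME problem. The following is a rough sketch with a made-up function name, check_swift. It only verifies that SWIFT_HOME is set and that a libswift.so* file exists under its lib directory. It does not prove that the dynamic loader can find the library, which still depends on ld.so.conf and ldconfig.

```python
import glob
import os


def check_swift(environ=os.environ):
    """Return a list of problems with the Cepstral install; empty if none found.

    Hypothetical preflight helper; pass a dict to test it without touching
    the real environment.
    """
    problems = []
    home = environ.get("SWIFT_HOME")
    if not home:
        problems.append("SWIFT_HOME is not set")
        return problems

    lib_dir = os.path.join(home, "lib")
    # Any versioned name counts, e.g. libswift.so.4
    if not glob.glob(os.path.join(lib_dir, "libswift.so*")):
        problems.append(f"no libswift.so* found in {lib_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_swift():
        print("problem:", problem)
```

If this reports no problems but the error persists, confirm that the lib directory is listed in /etc/ld.so.conf or /etc/ld.so.conf.d/ and that ldconfig has been run since the install.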
See Also
FreeSwitch Dependencies, Session speak, Mod openmrcp
