Feb 16, 2017

Text to Speech with eSpeak and Epos

A humanoid robot should be able to talk. So I looked around for some open source speech synthesis software.

(The above video does feature a talking robot (and a multilingual dolphin) but that's where similarities with the following content end.)

eSpeak

Hello world:

espeak 'Hello, world!'

Standard input works too:

espeak <<EOS
A robot may not injure a human being or, through inaction,
allow a human being to come to harm.
EOS

I need the robot to speak Czech too:

espeak -v cs 'Dobrý den!'

Chinese also seems to work, at least to my beginner ear:

espeak -v zh '认识你很高兴'
# The same in pinyin
espeak -v zh 'ren4shi ni3 hen3 gao1xing4'

To put the words into the robot's mouth, we first need to save the sound to a file:

espeak -w dobry-den.wav -v cs 'Dobrý den!'    # 16-bit, mono, 22050 Hz
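
The format noted in that comment can be checked by reading the header of the saved file. This is only a minimal sketch: wav_format is a hypothetical helper (not part of eSpeak), and it assumes the canonical 44-byte PCM WAV header layout and a little-endian machine, since od reads multi-byte integers in host byte order.

```shell
# Hypothetical helper: print channel count, sample rate and bit depth
# of a WAV file. Assumes the canonical 44-byte PCM header and a
# little-endian host (od decodes integers in host byte order).
wav_format() (
    channels=$(od -An -tu2 -j22 -N2 "$1" | tr -d ' \n')  # bytes 22-23
    rate=$(od -An -tu4 -j24 -N4 "$1" | tr -d ' \n')      # bytes 24-27
    bits=$(od -An -tu2 -j34 -N2 "$1" | tr -d ' \n')      # bytes 34-35
    echo "channels=$channels rate=$rate bits=$bits"
)

# Only call espeak when it is installed, so the sketch is safe to paste.
if command -v espeak >/dev/null 2>&1; then
    espeak -w dobry-den.wav -v cs 'Dobrý den!'
    wav_format dobry-den.wav
fi
```

If the file has extra header chunks before the format fields, the fixed offsets will not match and the numbers will be nonsense, so treat the output as a quick sanity check only.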

Now for something less useful for the robot, but a cool diversion: -q tells eSpeak to stay quiet, and --ipa makes it transcribe the text into the International Phonetic Alphabet.

espeak -q --ipa 'All human beings are born free and equal
  in dignity and rights. They are endowed with reason and conscience
  and should act towards one another in a spirit of brotherhood.'

ˈɔːl hjˈuːmən bˈiːɪŋz ɑː bˈɔːn fɹˈiː and ˈiːkwəl ɪn dˈɪɡnɪti and ɹˈaɪts

ðeɪ ɑːɹ ɛndˈaʊd wɪð ɹˈiːzən and kˈɒnʃəns and ʃˌʊd ˈakt tʊwˈɔːdz wˈɒn ɐnˈʌðəɹ ɪn ɐ spˈɪɹɪt ɒv bɹˈʌðəhˌʊd

And it also works for Czech:

espeak -q -v cs --ipa 'Všichni lidé rodí se svobodní a sobě rovní
  co do důstojnosti a práv. Jsou nadáni rozumem a svědomím
  a mají spolu jednat v duchu bratrství.'

fʃˈixɲi lˈideː rˈoɟiː se svˈobodɲiː a sˈobje rˈovɲiː tsˈo do dˈuːstojnˌosci a prˈaːv

jsoʊ nˈadaːɲi rˈozumem a svjˈedomiːm a mˌajiː spˈolu jˈednat v dˈuxu brˈatr̩stviː

epos

The problem with eSpeak is that it sounds quite robotic. I remembered that for Czech, the epos system was much better, not least because better-quality voices are available for download.

I installed epos (here as an openSUSE RPM) and downloaded the high-quality voices, epos-tdp.tgz, then unpacked them to the right place:

cd /usr/share/epos/inv
sudo tar xvf .../epos-tdp.tgz

At first I got no sound, but strace showed a problem with /dev/dsp, and a bit of searching revealed that I must run eposd with a dsp wrapper:

padsp eposd $OPTIONS
# e.g.
padsp eposd --voice machac
padsp eposd --voice violka

Another quirk is that epos wants the input in ISO Latin 2, so I used iconv:

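# read -r keeps backslashes in the input intact; the charset names are spelled out in full
while read -r S; do say-epos $(echo "$S" | iconv -f UTF-8 -t ISO-8859-2); done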
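
To see what the conversion actually does to the bytes, here is a small sketch, assuming iconv and od are available. The helper names to_latin2 and hex_bytes are made up for illustration, and the charset names are written out in full instead of the short aliases.

```shell
# Hypothetical helper: convert UTF-8 text on stdin to ISO 8859-2 (Latin 2).
to_latin2() {
    iconv -f UTF-8 -t ISO-8859-2
}

# Hypothetical helper: print stdin as space-separated hex bytes on one line.
hex_bytes() {
    # Unquoted on purpose: word splitting collapses od's padding and line breaks.
    echo $(od -An -v -tx1)
}

# In UTF-8 the letter ý takes two bytes (c3 bd) ...
printf 'Dobrý den' | hex_bytes
# ... while in Latin 2 it is the single byte fd.
printf 'Dobrý den' | to_latin2 | hex_bytes
```

The second line of output is one byte shorter than the first, which is why feeding UTF-8 straight to a program that expects Latin 2 garbles every accented letter.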

To save the sound to a file, use -w to write to the fixed file name ./said.wav, or -o to write to stdout:

say-epos -w Ahoj
say-epos -o Ahoj > ahoj.wav

Other systems?

The thing that reminded me of epos was this summary written by a small Czech phone operator.

Have you tried text-to-speech software? Which one sounds the best?
