[OSM-talk] pronunciation tag

Robert Vollmert rvollmert-lists at gmx.net
Tue Jun 24 10:02:18 BST 2008


Disclaimer: this is based on a little web research; I have no
particular knowledge of linguistics or speech synthesis.

On Jun 24, 2008, at 03:54, SteveC wrote:
> On 23 Jun 2008, at 18:52, Lauri Hahne wrote:
>
>> I think some standard form should be used if we ever want to do
>> something like this. Although IPA is the official standard, it isn't
>> very computer or user friendly. Therefore I think something like
>> SAMPA, MRPA or X-SAMPA should be used. These are used to some extent
>> among linguists and are all based on ASCII. These would also relieve
>> the pain of trying to figure out what something would be in phonetic
>> pseudo-English.
>
> can you summarise these with examples?


supercalifragilisticexpialidocious
IPA: /ˌsuːpɚˌkælɪˌfrædʒəlˌɪstɪkˌɛkspiːˌælɪˈdoʊʃəs/
CXS: /"su:p@`"k&lI"fr&dZ@l"IstIk"Ekspi:"&lI'doUS@s/

CXS is basically X-SAMPA, which is basically an ASCII encoding of IPA.
Since we do Unicode, I'd think we should rather go with IPA. See
http://www.theiling.de/ipa/ for an online converter.
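
To make the "ASCII encoding of IPA" point concrete, here is a rough Python sketch that turns the CXS string for the example word into IPA. The table is an assumption on my part: it only covers the symbols that occur in this one word, and the pairs are read off the IPA and CXS lines in this message rather than taken from a full X-SAMPA or CXS specification. The names `CXS_TO_IPA` and `cxs_to_ipa` are made up for illustration.

```python
# Partial CXS -> IPA table covering only the symbols in the example word.
# The pairs are read off the IPA/CXS lines in this message, not a full spec.
CXS_TO_IPA = {
    "@`": "ɚ",   # r-coloured schwa; the only two-character key here
    "@": "ə",
    "&": "æ",
    "I": "ɪ",
    "E": "ɛ",
    "U": "ʊ",
    "S": "ʃ",
    "Z": "ʒ",
    ":": "ː",    # length mark
    '"': "ˌ",    # stress marks mapped as in the example word
    "'": "ˈ",
}


def cxs_to_ipa(text):
    """Convert a CXS string to IPA, trying longer keys before shorter ones."""
    keys = sorted(CXS_TO_IPA, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(CXS_TO_IPA[key])
                i += len(key)
                break
        else:
            # Plain letters such as "s" or "k" are the same in both notations.
            out.append(text[i])
            i += 1
    return "".join(out)


word = "\"su:p@`\"k&lI\"fr&dZ@l\"IstIk\"Ekspi:\"&lI'doUS@s"
print("/" + cxs_to_ipa(word) + "/")
```

With this table the output should match the IPA line for the example word. Matching longer keys first is what keeps "@`" from being read as "@" followed by a stray backtick; a real converter would need the complete symbol table and the same longest-match rule.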

I didn't find any speech synthesis package that does IPA directly,
though. Festival's "Sable" markup language
(http://www.cstr.ed.ac.uk/projects/festival/manual/festival_10.html#SEC33)
provides for IPA, though Festival doesn't implement this. It does
allow e.g.

<PRON SUB="toe maa toe">tomato</PRON>.

A possible alternative is the free-as-in-beer mbrola
(http://tcts.fpms.ac.be/synthesis/mbrola/). It's a speech synthesis
backend based on diphones (units running from the middle of one phone
to the middle of the next). Its input format appears to be SAMPA plus
additional data. There's still some language dependency in there,
though. Espeak (http://espeak.sourceforge.net/) can target mbrola;
perhaps IPA could be added as a language?
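
For a feel of what "SAMPA plus additional data" means, here is a small Python sketch that writes mbrola-style input lines. As far as I can tell, each line of a .pho file is a phoneme name, a duration in milliseconds, and optionally pairs of (position in percent, pitch in Hz). The helper name `pho_line`, the phoneme symbols and all the timings below are made up for illustration; the symbols a real voice accepts depend on its database, which is the language dependency mentioned above.

```python
def pho_line(phoneme, duration_ms, pitch_points=()):
    """Format one line of mbrola-style input.

    pitch_points is a sequence of (position_percent, pitch_hz) pairs.
    """
    fields = [phoneme, str(duration_ms)]
    for position_pct, pitch_hz in pitch_points:
        fields += [str(position_pct), str(pitch_hz)]
    return " ".join(fields)


# Hypothetical segments for "tomato"; symbols and numbers are illustrative
# only and have not been checked against any particular voice database.
segments = [
    ("_", 100, []),              # leading silence
    ("t", 80, []),
    ("@", 60, [(50, 120)]),
    ("m", 70, []),
    ("A:", 160, [(20, 130), (80, 110)]),
    ("t", 80, []),
    ("@U", 180, [(90, 95)]),
    ("_", 100, []),              # trailing silence
]

print("\n".join(pho_line(*segment) for segment in segments))
```

The durations and pitch points are exactly the "additional data": a plain IPA or SAMPA tag value would not carry them, so whatever front end sits before mbrola has to supply them.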

Cheers
Robert




