GitHub - phwhite/texttospeech: FreePBX Text-To-Speech Module (Cepstral Swift, eSpeak, fLite, Google-TTS, Microsoft-TTS)

############################################################################
TextToSpeech Module for FreePBX, v2.0.0.0
############################################################################

Original Author: _xo_ - Xavier Ourciere (Orig Release: 2006-07-06)
Previous Modder: JakFrost (Last Updated on 2009-11-25, v1.3.1.3)

Current Maintainer: Paul White ([email protected])

Contributed Modules Documentation:
    http://www.freepbx.org/support/documentation/module-documentation/third-party-unsupported-modules

Contributed Modules Direct Mirror:
    http://mirror.freepbx.org/modules/release/contributed_modules/


============================================================================
Description
============================================================================

This module lets you create and manage Text To Speech entries within
FreePBX.  The two most common uses are as custom system recordings, created
in the Asterisk directory "sounds/texttospeech/", and as interim
destinations, where the text is spoken and the caller is then forwarded to
another destination or returned to an IVR.

The text itself can be statically configured, or fetched from several
different sources such as a text file, a shell command, or a URL.  It can be
configured to synthesize the text once (static mode), or fetch the text
and re-synthesize each time the text-to-speech entry is used (dynamic mode).

The text can be synthesized using one of several different engines, including
Google TTS, Microsoft TTS, Cepstral SWIFT, eSpeak, FLite, and Text2Wave.  

Newer versions of Cepstral SWIFT do not include the save-to-file license by
default.  For those versions, the Cepstral SWIFT engine provides a dynamic
option which uses the Asterisk app_swift application each time the
text-to-speech entry is used.


============================================================================
Text-To-Speech Entry Configuration Documentation
============================================================================

Settings
--------

Name
	The Name of the Text To Speech entry to help you identify
	it.  It can only be composed of alpha-numeric characters
	(a-z A-Z 0-9), the underscore (_) and dash (-) but no
	spaces or other characters.

Source
	Text Source Module to use

Engine
	Engine Module to use

Allow Playback Control
	Allows user to control playback (Pause, FF, Rew)
	NOTE: When enabled, dynamic capabilities are not available.

	When Checked...
		Abort			Key used to abort playback (optional)
		Rewind			Key used to rewind playback
		Pause			Key used to pause playback
		Forward			Key used to fast-forward playback
		Time			Amount of time in ms to Rewind or Fast-forward


Allow Aborting Playback
	Allows user to press any key to abort playback
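
The rules above (allowed characters in the Name, one key per playback
control, a time value in ms) can be checked with a short script before an
entry is saved.  The following Python sketch is illustrative only and is not
the module's own code.  The function name validate_entry, the restriction of
keys to the standard DTMF set (0-9, * and #), and the requirement that the
control keys be distinct are assumptions, not something this README states.

```python
import re

# Allowed Name characters per the README: a-z A-Z 0-9, underscore and dash.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")

# Assumption: playback control keys are single keys on a telephone keypad.
DTMF_KEYS = set("0123456789*#")


def validate_entry(name, keys=None, skip_ms=None):
    """Return a list of problems; an empty list means the settings look valid.

    `keys` maps a role (e.g. "rewind") to the key assigned to it.
    `skip_ms` is the rewind/fast-forward time in milliseconds.
    """
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name may only contain a-z, A-Z, 0-9, '_' and '-'")

    keys = keys or {}
    for role, key in keys.items():
        if key not in DTMF_KEYS:
            problems.append(f"{role} key must be a single DTMF key, got {key!r}")

    # Assumption: two controls sharing one key would be ambiguous.
    used = list(keys.values())
    if len(used) != len(set(used)):
        problems.append("playback control keys must be distinct")

    if skip_ms is not None and skip_ms <= 0:
        problems.append("time must be a positive number of milliseconds")
    return problems
```

For example, validate_entry("office_hours-1") returns an empty list, while
validate_entry("office hours") reports the space in the name.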


Source 'Text' Settings
----------------------

Text
	The text that you wish to be spoken


Source 'Command' Settings
-------------------------

Dynamic
	When enabled, the command will be run each time the text-to-speech
	entry is used.  When not enabled, the command is run once during
	adding/modifying the entry, and the output is cached.

	When checked...
		Destination On Fail			Destination if command fails

Command
	The path, command and arguments
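
The difference between the cached (static) and Dynamic behaviour of a command
source can be sketched in Python as below.  This is a hypothetical model, not
the module's implementation: the class name CommandSource, the timeout
parameter, and the use of None to signal "go to Destination On Fail" are all
assumptions made for illustration.

```python
import subprocess
import sys


class CommandSource:
    """Hypothetical model of a 'Command' text source (not the module's code)."""

    def __init__(self, command, dynamic=False, timeout=10):
        self.command = command    # argument list: path, command and arguments
        self.dynamic = dynamic
        self.timeout = timeout    # assumption: the README lists no command timeout
        self._cached = None
        if not dynamic:
            # Static mode: run once when the entry is added or modified,
            # then keep the output.
            self._cached = self._run()

    def _run(self):
        result = subprocess.run(
            self.command,
            capture_output=True,
            text=True,
            timeout=self.timeout,
            check=True,           # a non-zero exit status raises CalledProcessError
        )
        return result.stdout.strip()

    def get_text(self):
        """Return the text to speak, or None if a dynamic command failed."""
        if not self.dynamic:
            return self._cached
        try:
            # Dynamic mode: run the command again on every use.
            return self._run()
        except (OSError, subprocess.SubprocessError):
            # The caller would route the call to "Destination On Fail".
            return None


if __name__ == "__main__":
    # Use the running Python interpreter as a portable stand-in command.
    source = CommandSource([sys.executable, "-c", "print('Good morning')"])
    print(source.get_text())
```

In static mode a failing command raises at save time, so the problem shows up
while the entry is being edited rather than during a call.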


Source 'File' Settings
----------------------

Dynamic
	When enabled, the file will be read each time the text-to-speech
	entry is used.  When not enabled, the file is read once during
	adding/modifying the entry, and is cached.

	When checked...
		Destination On Fail			Destination if file doesn't exist

Filename
	The full path and filename of file


Source 'URL' Settings
---------------------

Dynamic
	When enabled, the URL will be fetched each time the text-to-speech
	entry is used.  When not enabled, the URL is fetched once during
	adding/modifying the entry, and is cached.

	When checked...
		Timeout						Number of seconds to wait for URL
		Destination On Fail			Destination if unable to fetch URL

URL
	The URL (beginning with http://, https://, or ftp://)

Strip Tags
	When checked, all HTML tags returned by the URL are stripped
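
The URL settings above (accepted schemes, Timeout, Strip Tags, and a fallback
when the fetch fails) can be sketched in Python using only the standard
library.  The function names is_supported_url, strip_tags and fetch_text are
hypothetical, the response is assumed to be UTF-8, and the module itself may
strip tags differently.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
from urllib.request import urlopen

# Schemes the README says a URL must begin with.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def is_supported_url(url):
    """True if the URL uses http, https or ftp and names a host."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


class _TextOnly(HTMLParser):
    """Collects text between tags and drops the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Note: text inside <script> and <style> elements is kept as well.
        self.parts.append(data)


def strip_tags(html):
    """Remove HTML tags and collapse runs of whitespace."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def fetch_text(url, timeout=5, strip=True):
    """Return the text at `url`, or None if it cannot be fetched.

    A None result is where a caller would use "Destination On Fail".
    """
    if not is_supported_url(url):
        return None
    try:
        with urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return None
    return strip_tags(body) if strip else body
```

Stripping tags before synthesis keeps the engine from reading markup aloud
when the URL returns an HTML page rather than plain text.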


Engine 'Google Translation' Settings
------------------------------------

Language
	The language in which the text should be spoken

Speed Factor
	Speed at which the synthesized text should be played back


Engine 'Microsoft Translation' Settings
---------------------------------------

Language
	The language in which the text should be spoken

Client ID
	Your Client ID that was registered with Microsoft (see end of README
	for more information)

Client Secret
	Your Client Secret that was registered with Microsoft (see end of README
	for more information)

Make Default
	When checked, the Client ID and Client Secret provided are saved in
	the /etc/asterisk/microsoft-tts.conf file to be used as default values
	when using the Microsoft engine.


Engine 'Cepstral Swift' Settings
--------------------------------

Dynamic
	When enabled, the text will be synthesized using the Asterisk app_swift
	application, each time the text-to-speech entry is used.  This does not
	require generating a sound file, thus the additional save-to-file license
	is not required from Cepstral.

Voice
	The voice to use.  Note that this can be overridden if the SSML markup
	language is used in the source text.

Arguments
	Arguments to be passed to the swift command line utility.  Not applicable
	if dynamic is enabled.


Engine 'eSpeak' Settings
------------------------

Voice
	Voice to use.

Arguments
	Arguments to be passed to the espeak command line utility.


Engine 'Festival Flite' Settings
--------------------------------

Arguments
	Arguments to be passed to the flite command line utility.


Engine 'Festival Text2Wave' Settings
------------------------------------

Arguments
	Arguments to be passed to the text2wave command line utility.



============================================================================
Using Microsoft Translation Text-To-Speech Engine
============================================================================

The Microsoft Translation engine takes advantage of Microsoft's Translation
API.  To gain access to this API, you must first subscribe to the Translation
API service through Microsoft's Azure Marketplace.  They do provide a free
subscription which includes up to 2 million characters a month.

If you don't already have a subscription or client ID/secret, please follow
the steps below.  You can find more detailed information by visiting:
	http://msdn.microsoft.com/en-us/library/hh454950.aspx


Step 1
------

Subscribe to the Microsoft Translator API on Azure Marketplace.

Go to this URL:
	https://datamarket.azure.com/dataset/1899a118-d202-492c-aa16-ba21c33c06cb

You will then need to sign in using your Microsoft Live account, or
register a new one.

Once you've chosen a subscription (such as the 2,000,000 chars/month), you
can move on to step 2.


Step 2
------

You now need to create an application registration to make use of the
Microsoft Translator API subscription.

Go to this URL and click on the green 'REGISTER' button:
	https://datamarket.azure.com/developer/applications/

Fill out the form as follows:

	Client ID
		Pick a unique ID here, example: "<your name>_pbx_mstts"

	Name
		Choose any name, example: "<your name> MS-TTS Asterisk App"

	Client Secret
		You can use the randomized one already filled in, or
		create your own

	Redirect URI
		This isn't used for the translator API, however the Azure site
		requires it to be filled in.  I suggest using a value such as
		"http://localhost/mstts_oauth_response.aspx"

Before you hit the 'CREATE' button, you need to write down and save
the Client ID and the Client Secret.  These are the two values
you will need to provide in the text-to-speech configuration.
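
To show how the Client ID and Client Secret are typically used, the Python
sketch below builds (but does not send) an OAuth client-credentials token
request.  The token URL and scope are assumptions taken from the legacy Azure
DataMarket documentation rather than from this README, and may no longer be
live.  The function name build_token_request is hypothetical and this is not
the module's own code.

```python
from urllib.parse import urlencode

# Assumed values from the legacy Azure DataMarket token documentation.
TOKEN_URL = "https://datamarket.accesscontrol.windows.net/v2/OAuth2-13"
SCOPE = "http://api.microsofttranslator.com"


def build_token_request(client_id, client_secret):
    """Return (url, body) for a client-credentials token request.

    The body is form-encoded, ready to be sent as an HTTP POST.
    """
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": SCOPE,
        }
    )
    return TOKEN_URL, body.encode("ascii")


if __name__ == "__main__":
    # Placeholder credentials; substitute the values saved in Step 2.
    url, body = build_token_request("yourname_pbx_mstts", "example-secret")
    print(url)
    print(body.decode("ascii"))
```

urlencode percent-encodes the secret, so characters such as "+" or "/" in a
generated Client Secret are sent correctly.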