Android; A really easy tutorial on how to use Text To Speech (TTS) and how you can enter text and have it spoken

Hello readers, time for another little android post of mine. I’ve been working hard recently on some voice recognition and TTS work, so figured I’d post up some tutorials for people to see how easy it is. This is also my first post where I host content on Github so please bear with me!

TTS is actually incredibly easy, take a look at this first example activity :


/**
 * Author : James Elsey
 * Date : 26/Feb/2011
 * Title : TextToSpeechDemo
 * URL : Http://www.JamesElsey.co.uk
 */
package com.jameselsey.demo.texttospeechdemo;

import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.tts.TextToSpeech;

/**
 * This class demonstrates checking for a TTS engine, and if one is
 * available it will spit out some speak.
 */
public class Main extends Activity implements TextToSpeech.OnInitListener
{
    private TextToSpeech mTts;
    // This code can be any value you want, its just a checksum.
    private static final int MY_DATA_CHECK_CODE = 1234;

    /**
     * Called when the activity is first created.
     */
    @Override
    public void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        // Fire off an intent to check if a TTS engine is installed
        Intent checkIntent = new Intent();
        checkIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
        startActivityForResult(checkIntent, MY_DATA_CHECK_CODE);

    }

    /**
     * Executed when a new TTS is instantiated. Some static text is spoken via TTS here.
     * @param i
     */
    public void onInit(int i)
    {
        mTts.speak("Hello folks, welcome to my little demo on Text To Speech.",
                TextToSpeech.QUEUE_FLUSH,  // Drop all pending entries in the playback queue.
                null);
    }


    /**
     * This is the callback from the TTS engine check, if a TTS is installed we
     * create a new TTS instance (which in turn calls onInit), if not then we will
     * create an intent to go off and install a TTS engine
     * @param requestCode int Request code returned from the check for TTS engine.
     * @param resultCode int Result code returned from the check for TTS engine.
     * @param data Intent Intent returned from the TTS check.
     */
    public void onActivityResult(int requestCode, int resultCode, Intent data)
    {
        if (requestCode == MY_DATA_CHECK_CODE)
        {
            if (resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS)
            {
                // success, create the TTS instance
                mTts = new TextToSpeech(this, this);
            }
            else
            {
                // missing data, install it
                Intent installIntent = new Intent();
                installIntent.setAction(
                        TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
                startActivity(installIntent);
            }
        }
    }

    /**
     * Be kind, once you've finished with the TTS engine, shut it down so other
     * applications can use it without us interfering with it :)
     */
    @Override
    public void onDestroy()
    {
        // Don't forget to shutdown!
        if (mTts != null)
        {
            mTts.stop();
            mTts.shutdown();
        }
        super.onDestroy();
    }
}

Its really simple, what we do in the onCreate method is to fire off an intent to check if there is a valid TTS engine installed on the device. Think of it this way, if the device doesn’t have an engine which it can use for converting text to speech, how can this work? We use this check to see if we need to direct the user to download one.

The callback on this check is the onActivityResult, which will be invoked after the check has complete. In this method we first check the MY_DATA_CHECK_CODE, this value can be anything you like, I’ve just chosen 1234. This is basically a checksum to check that the intent we receive in the onActivityResult is the one we’re expecting.

After that we check the result code, if it is CHECK_VOICE_DATA_PASS that means there is a valid TTS engine available, so we can create a TextToSpeech instance. If we don’t get a CHECK_VOICE_DATA_PASS then we assume there is either no TTS engine, or the one we have doesn’t work properly, so we create another intent that directs the user off to install one.

If the above was a success, then by creating a new TextToSpeech object we’ll invoke the onInit method, this is quite simple all it does is call the speak method on the TextToSpeech object and passes it a static String to speak.

The QUEUE_FLUSH is a nifty little feature which flushes out anything else the engine is trying to speak.

Checkout the branch I’ve created on Github, you can download the complete source code for this and have a play.

Thats all nice and dandy, but having a static String to be spoken is a little boring, lets have a look at how we can modify this so that you can have a text box, type in some text and hit a button to have it spoken. Have a look at the following Activity, its similar to the above yet improved to do dynamic speech.

/**
* Author : James Elsey
* Date : 26/Feb/2011
* Title : TextToSpeechDemo
* URL : Http://www.JamesElsey.co.uk
*/
package com.jameselsey.demo.texttospeechdemo;

import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.tts.TextToSpeech;
import android.view.View;
import android.widget.EditText;

/**
* This class demonstrates checking for a TTS engine, and if one is
* available it will spit out some speak based on what is in the
* text field.
*/
public class Main extends Activity implements TextToSpeech.OnInitListener
{
private TextToSpeech mTts;
// This code can be any value you want, its just a checksum.
private static final int MY_DATA_CHECK_CODE = 1234;

/**
* Called when the activity is first created.
*/
@Override
public void onCreate(Bundle savedInstanceState)
{
super.onCreate(savedInstanceState);
setContentView(R.layout.main);

// Fire off an intent to check if a TTS engine is installed
Intent checkIntent = new Intent();
checkIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
startActivityForResult(checkIntent, MY_DATA_CHECK_CODE);

}

/**
* This method is bound to the speak button so is invoked anytime that is clicked.
* @param v View default view.
*/
public void speakClicked(View v)
{
// grab the contents of the text box.
EditText editText = (EditText) findViewById(R.id.inputText);

mTts.speak(editText.getText().toString(),
TextToSpeech.QUEUE_FLUSH, // Drop all pending entries in the playback queue.
null);
}
/**
* Executed when a new TTS is instantiated. We don’t do anything here since
* our speech is now determine by the button click
* @param i
*/
public void onInit(int i)
{

}

/**
* This is the callback from the TTS engine check, if a TTS is installed we
* create a new TTS instance (which in turn calls onInit), if not then we will
* create an intent to go off and install a TTS engine
* @param requestCode int Request code returned from the check for TTS engine.
* @param resultCode int Result code returned from the check for TTS engine.
* @param data Intent Intent returned from the TTS check.
*/
public void onActivityResult(int requestCode, int resultCode, Intent data)
{
if (requestCode == MY_DATA_CHECK_CODE)
{
if (resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS)
{
// success, create the TTS instance
mTts = new TextToSpeech(this, this);
}
else
{
// missing data, install it
Intent installIntent = new Intent();
installIntent.setAction(
TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
startActivity(installIntent);
}
}
}

/**
* Be kind, once you’ve finished with the TTS engine, shut it down so other
* applications can use it without us interfering with it :)
*/
@Override
public void onDestroy()
{
// Don’t forget to shutdown!
if (mTts != null)
{
mTts.stop();
mTts.shutdown();
}
super.onDestroy();
}
}

The main changes here is removing the code from the onInit method, we do this because when we create the TTS object we don’t actually know at that point what text we want spoken, so we leave it blank.

Another change is the introduction of the speakClicked method, there is a button which is bound to this method, so when that button is clicked this method is invoked (I wrote another tutorial on that, please check it out).

So when the button is clicked, we’ll take whatever text is in the EditText and have that spoken, simple as that!

I’ve put all this in a separate branch on my Github so feel free to check it out.

Lastly, don’t forget to stop and shutdown the TTS object when you’re done, you don’t want it causing issues for other applicaitons that may want to use the TTS engine.

Further reading

Android; How to implement voice recognition, a nice easy tutorial…

I’ve been searching for a simple tutorial on using voice recognition in Android but haven’t had much luck. The Google official documentation provide an example of the activity, but don’t speculate anything more than that, so you’re kind of on your own a little.

Luckily I’ve gone through some of that pain, and should make this easy for you, post up a comment below if you think I can improve this.

I’d suggest that you create a blank project for this, get the basics, then think about merging VR into your existing applications. I’d also suggest that you copy the code below exactly as it appears, once you have that working you can begin to tweak it.

With your blank project setup, you should have an AndroidManifest file like the following

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
      package="com.jameselsey"
      android:versionCode="1"
      android:versionName="1.0">
    <application android:label="VoiceRecognitionDemo" android:icon="@drawable/icon"
            android:debuggable="true">
        <activity android:name=".VoiceRecognitionDemo"
                  android:label="@string/app_name">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />
                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>
</manifest> 

And have the following in res/layout/voice_recog.xml :

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent"
    android:orientation="vertical">

    <TextView
        android:layout_width="fill_parent"
        android:layout_height="wrap_content"
        android:paddingBottom="4dip"
        android:text="Click the button and start speaking" />

    <Button android:id="@+id/speakButton"
        android:layout_width="fill_parent"
        android:onClick="speakButtonClicked"
        android:layout_height="wrap_content"
        android:text="Click Me!" />

    <ListView android:id="@+id/list"
        android:layout_width="fill_parent"
        android:layout_height="0dip"
        android:layout_weight="1" />

 </LinearLayout>

And finally, this in your res/layout/main.xml :

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="fill_parent"
    android:layout_height="fill_parent"
    >
<TextView  
    android:layout_width="fill_parent" 
    android:layout_height="wrap_content" 
    android:text="VoiceRecognition Demo!"
    />
</LinearLayout>

So thats your layout and configuration sorted. It will provide us with a button to start the voice recognition, and a list to present any words which the voice recognition service thought it heard. Lets now step through the actual activity and see how this works.

You should copy this into your activity :

package com.jameselsey;

import android.app.Activity;
import android.os.Bundle;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.content.pm.ResolveInfo;
import android.speech.RecognizerIntent;
import android.view.View;
import android.widget.ArrayAdapter;
import android.widget.Button;
import android.widget.ListView;
import java.util.ArrayList;
import java.util.List;

/**
 * A very simple application to handle Voice Recognition intents
 * and display the results
 */
public class VoiceRecognitionDemo extends Activity
{

    private static final int REQUEST_CODE = 1234;
    private ListView wordsList;

    /**
     * Called with the activity is first created.
     */
    @Override
    public void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.voice_recog);

        Button speakButton = (Button) findViewById(R.id.speakButton);

        wordsList = (ListView) findViewById(R.id.list);

        // Disable button if no recognition service is present
        PackageManager pm = getPackageManager();
        List<ResolveInfo> activities = pm.queryIntentActivities(
                new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH), 0);
        if (activities.size() == 0)
        {
            speakButton.setEnabled(false);
            speakButton.setText("Recognizer not present");
        }
    }

    /**
     * Handle the action of the button being clicked
     */
    public void speakButtonClicked(View v)
    {
        startVoiceRecognitionActivity();
    }

    /**
     * Fire an intent to start the voice recognition activity.
     */
    private void startVoiceRecognitionActivity()
    {
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Voice recognition Demo...");
        startActivityForResult(intent, REQUEST_CODE);
    }

    /**
     * Handle the results from the voice recognition activity.
     */
    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data)
    {
        if (requestCode == REQUEST_CODE && resultCode == RESULT_OK)
        {
            // Populate the wordsList with the String values the recognition engine thought it heard
            ArrayList<String> matches = data.getStringArrayListExtra(
                    RecognizerIntent.EXTRA_RESULTS);
            wordsList.setAdapter(new ArrayAdapter<String>(this, android.R.layout.simple_list_item_1,
                    matches));
        }
        super.onActivityResult(requestCode, resultCode, data);
    }
}

Breakdown of what the activity does :

Declares a request code, this is basically a checksum that we use to confirm the response when we call out to the voice recognition engine, this value could be anything you want. We also declare a ListView which will hold any words the recognition engine thought it heard.

The onCreate method does the usual initialisation when the activity is first created. This method also queries the packageManager to check if there are any packages installed that can handle intents for ACTION_RECOGNIZE_SPEECH. The reason we do this is to check we have a package installed that can do the translation, and if not we will disable the button.

The speakButtonClicked is bound to the button, so this method is invoked when the button is clicked. I wrote another tutorial on button binding so have a look at that if you don’t understand this method.

The startVoiceRecognitionActivity invokes an activity that can handle the voice recognition, setting the language mode to free form (as opposed to web form)

The onActivityResult is the callback from the above invocation, it first checks to see that the request code matches the one that was passed in, and ensures that the result is OK and not an error.

Next, the results are pulled out of the intent and set into the ListView to be displayed on the screen.

Notes on debugging :

You won’t have a great deal of luck running this on the emulator, there may be ways of using a PC microphone to direct audio input into the emulator, but that doesn’t sound like a trivial task. Your best bet is to generate the APK and transfer it over to your device (I’m running an Orange San Francisco / ZTE Blade).

If you experience any crashing errors (I did), then connect your device via USB, enable debugging, and then run the following command from a console window

<android-home>/platform-tools/adb -d logcat

What this does, is invoke the android device bridge, the -d switch specifies to run this against the physical device, and the logcat tells the ADB to print out any device logging to the console.

This means anything you do on the device will be logged into your console window, it helped me find a few null pointer issues.

That’s pretty much it, its quite a simple activity. Have a play with it and if you have any comments please let me know. You may have mixed results with what the recognition thinks you have said, I’ve had some odd surprises, let me know!

Happy coding!

Further Reading :

  1. ADB Commands
  2. Official documentation