Speech Recognition in ROS / Linux


Speech recognition in ROS/Linux has traditionally been done using projects like CMU-Sphinx or Julius. But they lack an adequate vocabulary and are not stable, so reliable speech recognition was confined to Windows/Mac users. Initially I was using a Windows virtual machine inside Ubuntu to do the speech processing, even though it was quite resource-consuming.

A good alternative is to use the speech recognition built into Chrome by Google. The speech samples are sent to Google’s servers for processing, and they return the recognized speech and a confidence value. It is quite easy to use, and it has the advantage of being speaker-independent. The only disadvantage is the delay in detection: it normally takes about 3 seconds for the speech to be recognized. The whole thing can be driven from a simple Python script.
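As a rough illustration of handling a reply that carries recognized speech plus a confidence value, here is a minimal sketch that picks the most confident result. The JSON layout (a `hypotheses` list with `utterance` and `confidence` fields), the sample string, and the function name `best_hypothesis` are all assumptions made for this example, not a documented format; adjust them to whatever the server actually returns.

```python
import json


def best_hypothesis(response_text):
    """Return (utterance, confidence) for the most confident result, or None.

    Assumes a hypothetical reply shaped like:
    {"hypotheses": [{"utterance": "...", "confidence": 0.9}, ...]}
    """
    data = json.loads(response_text)
    hypotheses = data.get("hypotheses", [])
    if not hypotheses:
        # Nothing was recognized
        return None
    # Treat a missing confidence as 0.0 so it never wins over a scored result
    top = max(hypotheses, key=lambda h: h.get("confidence", 0.0))
    return top["utterance"], top.get("confidence", 0.0)


# Made-up sample reply, used only to exercise the function
sample = '{"hypotheses": [{"utterance": "move forward", "confidence": 0.92}]}'
print(best_hypothesis(sample))
```

Returning None for an empty result keeps the "nothing recognized" case separate from a low-confidence match, so the caller can decide how to handle each.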

I have also created a ROS package for speech recognition. It can be run by checking out the GitHub repo and running 'rosrun gspeech gspeech.py'. It will publish two topics: /speech and /confidence. The first one is the detected speech, while the second is the confidence level of the detection.
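A node that consumes these two topics might look like the sketch below. Several things here are assumptions rather than facts about the package: the message types (`std_msgs/String` for /speech, `std_msgs/Float32` for /confidence), a confidence scale of 0 to 1, and that each /speech message is followed by its matching /confidence message. The names `SpeechFilter`, `run_node`, and `speech_listener` are made up for this example. Check the real types with 'rostopic info /speech' before relying on it.

```python
class SpeechFilter(object):
    """Pairs each recognized phrase with its confidence and drops weak results."""

    def __init__(self, min_confidence=0.5):
        self.min_confidence = min_confidence  # assumed scale: 0.0 to 1.0
        self.last_text = None
        self.accepted = []

    def on_speech(self, text):
        # Remember the phrase until its confidence value arrives
        self.last_text = text

    def on_confidence(self, confidence):
        """Return True if the pending phrase was accepted."""
        if self.last_text is None:
            return False
        text, self.last_text = self.last_text, None
        if confidence >= self.min_confidence:
            self.accepted.append(text)
            return True
        return False


def run_node():
    # ROS imports live here so the filter above can be used without ROS
    import rospy
    from std_msgs.msg import String, Float32  # assumed message types

    speech_filter = SpeechFilter(min_confidence=0.5)

    def handle_speech(msg):
        speech_filter.on_speech(msg.data)

    def handle_confidence(msg):
        if speech_filter.on_confidence(msg.data):
            rospy.loginfo("Heard: %s", speech_filter.accepted[-1])

    rospy.init_node("speech_listener")
    rospy.Subscriber("/speech", String, handle_speech)
    rospy.Subscriber("/confidence", Float32, handle_confidence)
    rospy.spin()

# Inside a ROS environment, start the listener with: run_node()
```

Keeping the pairing logic in a plain class means the threshold behaviour can be checked without a running ROS master.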
