New skills for Misty_voice/speech


Hi everybody, I was thinking how cool it would be for Misty to have some linguistics skills (possibly voice, similar to Alexa skills or something like that) with a feature to add different voices (female/male).
I tried adding different greetings to Misty yesterday.
Is anybody interested in joining efforts and looking into some chatbot/Natural Language Processing algorithms/frameworks with me?
I can do Python/chatbots.


This would be cool. And great to meet you last night!


Thank you! Likewise :slight_smile:


I am interested. Shall we create a GitHub repo and begin to collect references on its wiki?

My handle on GitHub is slivingston.


That’s great! I just followed you on GitHub. I am at a junior level in JavaScript and Python, but have more than 20 years of linguistics experience + a PhD in translation studies. LOVE both: human and (now also) computer languages! :slight_smile:


Hi Olena,

I sat across the table from you on that Wednesday evening.
I’m writing a book which has a few chapters about doing speech recognition and text-to-speech (TTS) on inexpensive hardware such as the Raspberry Pi. I’ve got some code written in Python that uses Google Cloud for speech recognition and TTS. Various other speech APIs are also available. The current Misty local API does not seem to support speech yet.
What do you have in mind?

  • Mike Seiler, MSEE


Yes, Mike, I remember! Excited to see your book!
I know what you mean. It looks like this is coming in the future: “Voice Integration – You can bring a voice to Misty by integrating with your choice of third party NLP or TTS provider. (2019 Pre-built NLP Integration w/Alexa)”.
I thought of maybe writing an Alexa skill, just for Misty?
Do you have other ideas?


I was there with you the other night too, and I am interested in learning more about your plans. I know some Python and used the Snack Toolkit a long time ago. I am sure things have progressed since then.


Awesome! That would be great! I will create a repo in the next couple of days, and then we can all work on it.


In terms of basic architecture, we can choose at least one of the following:

  1. entirely within Misty skills (so, running onboard, internal to Misty robot)
  2. some combination of Misty skills and a backpack Raspberry Pi or Arduino
  3. JS skill making HTTP calls to some remote provider that we make
  4. same as previous, but with calls to a paid or otherwise opaque service like Google Cloud Speech-to-Text.

The main disadvantage of #3 and #4 is the dependence on Internet connection quality.

Besides the Snack Toolkit (referenced by @baghaii), some other software that might be useful without depending on opaque services:

Also, some research results and tools for NLP are listed at


I lean more towards #1 or #2. But having remote option #3 is great too (I could build a front-end, if we were to host it on a page, possibly).
As for topic, I thought of teaching Misty greetings in different foreign languages.
Once we decide, let’s start a repo.


how about we try #1 first because it requires the least additional materials besides an off-the-shelf Misty robot?

for the first skill, how about the following?
Misty waits to hear the name of a natural language, spoken in that language. if it hears and recognizes the name, it responds with a greeting in the requested language. otherwise, it displays a question mark on its screen. for example:


  1. hears “English”, responds “hello”
  2. hears “español”, responds “hola”


Yes, that sounds like a plan


cool. how about we call the project Misty speech library (abbreviated: MSL), or to avoid trademark conflict, Speech Skills Library (abbreviated: Speech SL)? the latter has the advantage that it can be a pattern for naming other coordinated collections of Misty skills, e.g., “Telepresence SL” for skills that facilitate using Misty robots for telepresence.

addendum: as an example about naming, SciPy has the pattern of separate packages being named with the prefix scikit-, as described on the SciKits “about” page. scikit-learn is a well-known example of one of these.


I started a repository “Misty speech library” here:

I coded a simple conversational bot using Python with regex and random. This is just a start, of course. Check it out.
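For readers who have not opened the repository, a bot of this general kind (regular expressions to match input, `random` to vary replies) might look something like the sketch below. This is not the code from the repo; the rules, replies, and the names `RULES`, `FALLBACK`, and `reply` are invented for illustration.

```python
import random
import re

# Each rule pairs a compiled pattern with candidate replies.
# "{0}" in a reply is filled with the first captured group, if any.
RULES = [
    (re.compile(r"\bmy name is (\w+)", re.IGNORECASE),
     ["Nice to meet you, {0}!"]),
    (re.compile(r"\b(?:hi|hello|hey)\b", re.IGNORECASE),
     ["Hello!", "Hi there!"]),
    (re.compile(r"\b(?:bye|goodbye)\b", re.IGNORECASE),
     ["Goodbye!", "See you later!"]),
]

FALLBACK = ["Sorry, I did not understand that."]


def reply(message: str) -> str:
    """Return a randomly chosen reply from the first rule that matches,
    or a fallback reply when no rule matches."""
    for pattern, responses in RULES:
        match = pattern.search(message)
        if match:
            return random.choice(responses).format(*match.groups())
    return random.choice(FALLBACK)


if __name__ == "__main__":
    for text in ["Hello Misty", "My name is Ada", "What is the weather?"]:
        print(text, "->", reply(text))
```

Rules are checked in order, so the more specific pattern (the name capture) is listed before the generic greeting.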

Please add more stuff. We can add license info (there was info from Misty team), Wiki, code, libraries, and of course add voice features!


This is great to work with:

Just saw


Thank you so much! That’s very helpful. I was just thinking about how to keep the momentum and keep developing this.


thanks for creating the repo. to keep the momentum going, here are pull requests (PRs) that I can start:

  1. documentation, including notes from earlier messages in this thread (e.g., New skills for Misty_voice/speech)
  2. a minimal skill to play an audio recording from a file onboard the robot
  3. a minimal skill to record audio and save it somewhere

after doing the above, we might be ready to organize the repo a little more in terms of subdirectories and code style.

what do you think?


Yes, absolutely, thank you.
Regarding #2: I already saved two audio files on Misty at the last event, and I can record more.


today I learned about another NLP project that might be useful here: spaCy (

they also have a “universe” page that lists “resources developed with or for spaCy”,
some of which might be useful with Misty robots, e.g., mordecai (openeventdata/mordecai on GitHub, full-text geoparsing as a Python library), which can

Extract the place names from a piece of text, resolve them to the correct place, and return their coordinates and structured geographic information.