Hey Olena and Scott,
I’m one of the developers here at Misty and just stumbled across this conversation. I wanted to pop in and address a few things, since this is a portion of the system I’m actively working on. I can tell you that we do plan on having quite a few things available on Misty’s local APIs:
- Wake Word - likely “Hey Misty” (unless you guys have other ideas?)
- Source Localization - this will tell you the direction of the predominant speaker, relative to Misty
- Command Capture - after hearing the wake word, Misty will save the voice command to a wav file and notify your skill when it’s ready (so you can upload it to whatever service you want)
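To make the Command Capture flow concrete, here’s a rough sketch of the event sequence a skill might see. To be clear, none of these names (`EventBus`, `CommandCaptured`, `wav_path`, `source_angle_deg`) are real Misty APIs yet; this is just a stand-in model of “wake word heard, audio saved, skill notified”:

```python
# Hypothetical sketch of the wake-word -> command-capture flow described above.
# These names are NOT real Misty APIs; this only models the event sequence a
# skill might see: wake word detected, command saved to a wav, skill notified.

from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EventBus:
    """Minimal pub/sub stand-in for the robot's event system."""
    handlers: Dict[str, List[Callable]] = field(default_factory=dict)

    def on(self, event: str, handler: Callable) -> None:
        self.handlers.setdefault(event, []).append(handler)

    def emit(self, event: str, payload: dict) -> None:
        for handler in self.handlers.get(event, []):
            handler(payload)


captured = []

def on_command_ready(payload: dict) -> None:
    # In a real skill, this is where you'd upload payload["wav_path"]
    # to whatever speech service you prefer.
    captured.append(payload["wav_path"])


bus = EventBus()
bus.on("CommandCaptured", on_command_ready)

# Simulate the robot: wake word heard, command recorded, skill notified.
bus.emit("CommandCaptured", {"wav_path": "/data/audio/command_001.wav",
                             "source_angle_deg": 45})
```

The point is the hand-off: the robot owns detection and recording, and your skill only gets a notification with a path to the finished wav.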
I would also like to integrate Command Capture (the third item above) at a lower level in the robot so that command/response latency is less perceptible. Right now, when a skill makes an external request, the audio data gets routed through several parts of our system rather inefficiently, which adds considerable latency to remote command/response skills.
If you guys have other ideas on what you would like to see on Misty, let me know either here on the forums or DM me on the community site. Eventually I’d like to have some sort of speech-to-intent service available on Misty that you could configure, or maybe even one where you could replace the implementation entirely (high hopes for modularity). We’re still working on tuning the microphones and speakers, but once that’s complete, the fun really begins.
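As a sketch of what a replaceable speech-to-intent layer could look like: define a small resolver interface, ship a trivial default, and let skill authors swap in their own implementation (say, one backed by a cloud NLU service). Again, this is my own illustration, not anything shipping on Misty; the `IntentResolver` protocol and `KeywordResolver` class are made-up names:

```python
# Hypothetical sketch of a swappable speech-to-intent layer, as described
# above. Nothing here is a real Misty API; the idea is that any object with
# a resolve() method could replace the default matcher.

from typing import Dict, List, Optional, Protocol


class IntentResolver(Protocol):
    """Anything with resolve() can serve as the intent layer."""

    def resolve(self, transcript: str) -> Optional[str]:
        ...


class KeywordResolver:
    """Trivial default implementation: first keyword hit wins."""

    def __init__(self, intents: Dict[str, List[str]]) -> None:
        self.intents = intents

    def resolve(self, transcript: str) -> Optional[str]:
        text = transcript.lower()
        for intent, keywords in self.intents.items():
            if any(k in text for k in keywords):
                return intent
        return None  # no intent recognized


resolver: IntentResolver = KeywordResolver({
    "drive_forward": ["go forward", "move ahead"],
    "stop": ["stop", "halt"],
})

print(resolver.resolve("Hey Misty, go forward a bit"))  # drive_forward
```

Because the skill only depends on the `resolve()` shape, a configured keyword table and a full cloud-backed replacement would be interchangeable from the skill’s point of view.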
If you have a few minutes, your votes on our Robot Roadmap really do count; that’s how we prioritize which features we work on.