Hey @kywelch17, great question. This is definitely doable, and there are a number of possible approaches. It’s just a matter of how you prefer to implement things.
Without knowing your requirements or setup, I think you’ll probably want to consider using a third-party natural language processing service to handle most of the voice interactions. Misty can capture speech, send the audio file to an API, and process the response from whatever service you choose. That response is one way to get the text of the user’s utterance back into your skill. You can then do whatever you want with that data. For example, you can save it to the long-term storage that’s associated with that particular skill (the command for this is
And that’s just one possibility – don’t let this suggestion limit your thinking! Feel free to ask more questions or share more information about your use case here.
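If it helps to picture the capture, send, and process flow I described, here’s a rough Python sketch of the "send the audio to an API and pull the text out of the response" part. To be clear, everything specific in it is an assumption on my part: the endpoint URL, the request fields (`language`, `audio_base64`), and the `transcript` response field are all made up for illustration, and none of it is Misty’s API or any particular provider’s. You’d swap in whatever your chosen service actually expects.

```python
import base64
import json
import urllib.request
from typing import Callable

# Hypothetical endpoint: replace with your speech-to-text provider's real URL.
STT_URL = "https://stt.example.com/v1/recognize"


def build_request(audio: bytes, language: str = "en-US") -> dict:
    """Package raw audio bytes as a JSON-friendly payload (assumed schema)."""
    return {
        "language": language,
        "audio_base64": base64.b64encode(audio).decode("ascii"),
    }


def extract_transcript(response_body: str) -> str:
    """Return the utterance text from a response shaped like
    {"transcript": "..."} (assumed), or "" if it can't be found."""
    try:
        data = json.loads(response_body)
    except json.JSONDecodeError:
        return ""
    if not isinstance(data, dict):
        return ""
    transcript = data.get("transcript", "")
    return transcript.strip() if isinstance(transcript, str) else ""


def http_post(url: str, payload: dict) -> str:
    """POST a JSON payload and return the response body as text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def transcribe(audio: bytes, post: Callable[[str, dict], str] = http_post) -> str:
    """Send audio through `post` and return the recognized text."""
    return extract_transcript(post(STT_URL, build_request(audio)))


if __name__ == "__main__":
    # Stand-in for a real service so the sketch runs without a network call.
    def fake_post(url: str, payload: dict) -> str:
        return json.dumps({"transcript": "turn on the lights"})

    print(transcribe(b"fake audio bytes", post=fake_post))
```

Passing the `post` function in as a parameter is just a convenience so you can try the parsing logic with a fake response before wiring up a real service.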