A working project that transcribes audio recordings to text and then generates new stock background video based on the content of the recording. It also supports real-time generation of stock footage while you speak. If you clone the repository, supply your own API key.
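
As a rough illustration of the step between transcription and footage generation, the sketch below turns a transcript into a few keywords that could seed a stock-footage query. This is an assumption about one possible approach, not a description of how this project does it: the `extract_keywords` helper, the stop-word list, and the sample transcript are all hypothetical, and no speech-to-text or video API is called.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; a real project
# would likely use a fuller list or a proper NLP library).
STOP_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "on", "at",
    "is", "it", "we", "was", "were", "then", "as", "with",
}


def extract_keywords(transcript: str, top_n: int = 3) -> list[str]:
    """Return the most frequent non-stop-words in a transcript.

    Hypothetical helper: the result could be used as search terms
    or prompt text for a stock-footage service.
    """
    # Lowercase the text and split it into alphabetic tokens.
    words = re.findall(r"[a-z']+", transcript.lower())
    # Count only tokens that are not stop words and are longer than two letters.
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    # Stand-in for text produced by the transcription step.
    sample = "We walked along the beach at sunset, and the beach was empty."
    print(extract_keywords(sample, top_n=2))
```

For real-time use, the same function could be called on each new chunk of transcribed speech, so the footage query follows what is currently being said.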