It was early 2021. I had failed miserably with a Spanish-language podcast I released a year earlier, where I aimed to be unapologetically sarcastic and palpably dishonest while talking about current events. Covid and the lockdowns made it difficult to be ironic or humorous about anything. Humanity was evidently pissed and paranoid. I don’t blame them, but I do remember warning people about imminent inflation because of all the money that rained down upon us, as if everyone had won the lottery at the cost of our freedom and mental health. That podcast didn’t get listeners, and I decided to shut it down. I do not regret deleting every microsecond of its content.
Fiction Madness premiered two years earlier, in 2019, with Harold: The Man Who Invented God, followed by I Envy The Living. Both episodes were edited on an iPhone 6 using nothing but two apps, one of them being Hokusai, which I still use to cut and put together audio tracks. When President Trump blessed my life with the first $1,200 stimulus check, I realized the opportunity bestowed upon me to upgrade my editing gadgetry. I invested most of that money in a new tablet, a wireless keyboard, a studio microphone, and an upgraded version of Hokusai.
After quitting the sarcastic Spanish podcast, I wanted to return to Fiction Madness. For some reason, I had thought FM had no chance of becoming anything. I was wrong. Over time, it gained listeners without any promotion whatsoever. Also, telling stories is my thing. The pleasure I draw from the process is beyond therapeutic. After resuming production on the podcast, I was faced with a professional dilemma.
I needed voices for my characters. Unfortunately, I’m not affluent enough to pay actors and have them provide authentic voices. The best ones are expensive, as they should be. Actors employ sheer strength and spirit in their performance. It is about projecting real emotions, which goes beyond merely reading from a script. My solution, at least a temporary one, meant going down the cheap path. I spent a few months experimenting with several AI-generated voice platforms, but only one delivered correct pronunciations from typed text. (*Watch the video at the top)
Talking about the process, or explaining in greater detail how it is done, would be an absolute bore. My best advice, if you wish to delve into this innovative avenue, is to take your time experimenting with it. Test as many voices as you can, and apply different pitches, emphasis, speeds, and all the other features the platform provides. Don’t expect these platforms to deliver voices exactly as you hear them in your head. It won’t happen. You will have to work on them. Audio editing also helps give these voices proper pacing.
Achieving near-perfect-sounding voices will be painfully frustrating at first. You may not be fully satisfied with the finished product, which is perfectly fine, but since Silicon Valley has made it its life goal to plunge voice actors into starvation, I wouldn’t be surprised if this technology improves significantly in the next few years.
Hopefully, I won’t have to rely forever on artificial intelligence for this particular necessity, but it has been a great tool thus far. I use my own voice to deliver the narration. The character voices themselves are almost secondary next to the mix of sound effects, which helps illustrate settings, traits, situations, and entertaining moments. If my story scripts were more theatrical and dialog-heavy, then AI-generated voices would be pointless, unless my characters were bots with their own unique personalities.