My AI voice clone project for a client went totally off the rails last Tuesday
I was in my home office in Austin, finishing a custom voice model for a local author's audiobook. I used ElevenLabs to clone his voice from a 30-minute sample. The training seemed fine, but when I ran the first test script, the AI started adding these weird, aggressive coughs and throat-clearing sounds between sentences. It was nothing like the clean sample. I spent three hours checking the audio files, adjusting the training settings, and even re-uploading everything. The problem turned out to be a single 2-second clip in the training data where the author took a sip of water and cleared his throat. The model latched onto that tiny noise and amplified it. I had to manually edit that clip out and retrain the whole model from scratch, which took another 6 hours. Has anyone else had a voice clone pick up on a random audio artifact and run with it? What's your fix?