Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio
On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything, and do it in a way that attempts to preserve the speaker’s emotional tone.