Google introduced “Help Me Write” to Gmail back in June of 2023, and now Gmail Voice Compose seeks to build on that AI-powered convenience. AI tools have been making their way into all sorts of services and industries, and for good reason. Long gone are the days of wonky voice recognition and misbehaving mobile assistants. AI-powered voice recognition works wonders, naturally, as this was one of the first proposed uses of Large Language Models. Gmail’s upcoming feature will let you take advantage of this to craft professional emails just by speaking into your phone.
Gmail Voice Compose works just like “Help Me Write”
The “Help Me Write” feature in the Gmail apps on Android and iOS lets users draft emails from snippets of text. Users write the main points of their email, and AI does the rest. “Help Me Write” not only saves time, it also drafts a complete and professional email from just a few lines of informal text. Voice Compose seems to work the same way, but by listening to a user speak instead of having them write.
As discovered by TheSpAndroid, version 2023.12.31.599526178 of the Gmail Android app contains a new feature that can be enabled by toggling a flag value. This feature lets you tap a microphone button and record yourself while composing an email. When you’re done recording, hitting “Create” will prompt AI to create your email from your recording. Though very similar to the “Help Me Write” feature, Voice Compose should be faster and more convenient.
How does AI speech-to-text work?
AI speech-to-text is much more accurate at deciphering what someone says than older speech recognition models. The jump in capability was almost jarring, but it makes sense once you understand how LLMs work. Large Language Models, in very simplified terms, can be thought of as “word predictors”. That is, they predict which word should come after the last one.
This definition does a disservice to the complex neural networks behind LLMs, but it does help explain how their speech recognition works. Older speech recognition models used to parse each sound and try to figure out what word it was. This would very often lead to completely nonsensical sentences. AI speech recognition doesn’t just listen to and parse each individual word. It compares each word to everything that’s been said before and guesses what is most likely to have just been said. This is why AI speech recognition is miles ahead of older models; it has some intelligence behind it.
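To make the difference concrete, here is a minimal sketch in Python. It is not how Gmail or any real speech recognizer is built: a tiny word-pair counter stands in for the language model, and the sample sentences, candidate words, and sound scores are all made up for illustration. The first decoder picks whichever word sounds closest at each step, while the second also asks which word is most likely to follow the previous one.

```python
from collections import Counter, defaultdict

# Hypothetical training sentences standing in for a language model's knowledge.
CORPUS = [
    "I sent the mail",
    "I sent an email",
    "she read the mail",
]

# Hypothetical recognizer output: for each spoken word, a list of
# (candidate word, how closely it matched the sound) pairs.
CANDIDATES = [
    [("eye", 0.55), ("I", 0.45)],
    [("scent", 0.40), ("sent", 0.35), ("cent", 0.25)],
    [("the", 1.00)],
    [("male", 0.52), ("mail", 0.48)],
]


def train_bigrams(sentences):
    """Count which word follows which; returns (counts, vocabulary size)."""
    counts = defaultdict(Counter)
    vocab = set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()  # <s> marks the sentence start
        vocab.update(words[1:])
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts, len(vocab)


def context_score(model, prev, word):
    """Add-one smoothed estimate of how likely `word` is to follow `prev`."""
    counts, vocab_size = model
    following = counts.get(prev, Counter())
    return (following[word] + 1) / (sum(following.values()) + vocab_size)


def decode_by_sound(slots):
    """Older approach in this sketch: keep the best-sounding word per slot."""
    return " ".join(max(slot, key=lambda c: c[1])[0] for slot in slots)


def decode_with_context(slots, model):
    """Weigh each candidate's sound score by how well it follows the last word."""
    prev, chosen = "<s>", []
    for slot in slots:
        best = max(slot, key=lambda c: c[1] * context_score(model, prev, c[0]))
        chosen.append(best[0])
        prev = best[0]
    return " ".join(chosen)


if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    print(decode_by_sound(CANDIDATES))             # eye scent the male
    print(decode_with_context(CANDIDATES, model))  # I sent the mail
```

Even with only three example sentences, the context-aware decoder recovers a sensible sentence from the same sounds. A real system replaces the word-pair counts with a neural network that looks at far more than the previous word.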
Gmail Voice Compose should be rolling out pretty soon, as it has reportedly been in the works since October of last year. If it works as well as other AI speech-to-text models do, it will be a massive step forward in convenience for those who use the Gmail app.
2024-01-23 15:06:43