How does Siri recognize your voice?

Many of us were mesmerized by JARVIS, Tony Stark’s virtual assistant, in the 2008 Marvel movie Iron Man.

JARVIS was originally designed to serve as a GUI. An AI system was later implemented to oversee operations and provide global security.

JARVIS showed us what is possible with speech recognition technology. Even though we have yet to get there, advances in the field are already being used in many different environments and tools.

Hands-free control of mobile phones, speakers and even vehicles is now possible using voice recognition technologies.

It is a breakthrough that people have been thinking about and trying to achieve for decades. To put it plainly, the goal is to improve convenience and break the language barrier all over the world.

The widespread use of deep learning techniques has made automatic speech recognition (ASR) systems much more accurate in recent years. Most of that improvement, however, has come in general speech recognition; names of local points of interest (POIs) remain harder to recognize. Geolocation-based language models (Geo-LMs) are custom language models that take a user’s geographic location into account. These models allow Siri to better estimate the word sequence the user intended by using not only the information from the acoustic model and a general LM, as in standard ASR, but also information about the POIs in the user’s vicinity.
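
The idea of combining a general LM with a location-specific one can be illustrated with a minimal Python sketch. It is not Apple’s implementation: it swaps in simple linear interpolation of two toy unigram models to rescore candidate transcriptions. Every name, probability, and score below (GENERAL_LM, GEO_LMS, rescore, the "boston" region) is a made-up assumption for illustration.

```python
import math

FLOOR = 1e-6  # probability given to words a model has never seen (assumption)

# Toy unigram models with invented probabilities.
GENERAL_LM = {"navigate": 0.01, "to": 0.05, "hall": 0.001, "funnel": 0.0005}
GEO_LMS = {"boston": {"faneuil": 0.02, "hall": 0.02}}

# Candidate transcriptions with invented acoustic log-likelihoods.
# The acoustic model alone slightly prefers the wrong one.
HYPOTHESES = [
    ("navigate to funnel hall", -10.0),
    ("navigate to faneuil hall", -10.5),
]


def sentence_logprob(words, general_lm, geo_lm, geo_weight):
    """Log-probability of a word list under the interpolated LM."""
    total = 0.0
    for word in words:
        p_general = general_lm.get(word, FLOOR)
        p_geo = geo_lm.get(word, FLOOR)
        total += math.log((1.0 - geo_weight) * p_general + geo_weight * p_geo)
    return total


def rescore(hypotheses, general_lm, geo_lm, geo_weight=0.5, lm_scale=1.0):
    """Return the hypothesis with the best combined acoustic + LM score."""
    scored = []
    for text, acoustic_logp in hypotheses:
        lm_logp = sentence_logprob(text.split(), general_lm, geo_lm, geo_weight)
        scored.append((acoustic_logp + lm_scale * lm_logp, text))
    return max(scored)[1]


# General LM only: the local place name loses.
print(rescore(HYPOTHESES, GENERAL_LM, {}, geo_weight=0.0))
# General LM plus the LM for the user's region: the place name wins.
print(rescore(HYPOTHESES, GENERAL_LM, GEO_LMS["boston"]))
```

With the geo weight set to zero the sketch picks "navigate to funnel hall"; with the region’s model mixed in it picks "navigate to faneuil hall", because the local model gives the place name a much higher probability than the general one does.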

Why is Siri widely used as a translator?

Digital assistants and translation software work, but how do they come to know multiple languages? Machine learning is important, but the first step is analog.
Alex Acero, Apple’s head of voice, says the process starts with people reading passages aloud in many languages and accents. The readings are transcribed by hand, which gives the system exact text to compare against the speech signals. From these pairs, Siri builds a language model that it uses to predict new words and phrases.
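
To make “builds a language model and predicts new words” more concrete, here is a minimal Python sketch of a bigram model trained on a handful of transcripts. It is an illustrative assumption, not Siri’s actual model: the sample sentences and the function names train_bigram_model and predict_next are invented for this example, and production systems use far larger data and neural models.

```python
from collections import Counter, defaultdict

# Invented stand-ins for hand-transcribed readings.
TRANSCRIPTS = [
    "what is the weather today",
    "what is the time",
    "set the timer",
    "what is the weather tomorrow",
]


def train_bigram_model(transcripts):
    """Count, for every word, which words follow it in the transcripts."""
    counts = defaultdict(Counter)
    for sentence in transcripts:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for previous, following in zip(words, words[1:]):
            counts[previous][following] += 1
    return counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


model = train_bigram_model(TRANSCRIPTS)
print(predict_next(model, "the"))     # weather
print(predict_next(model, "what"))    # is
print(predict_next(model, "banana"))  # None
```

The model has only seen four sentences, so it can only guess followers for words it has already encountered. That limitation is one reason more real-world speech data helps.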

Once this model is ready, Apple deploys “dictation mode”, a feature that converts speech to text, in the new language. When consumers use this service, Apple clips and anonymizes some of their phrases and feeds them back into Siri. This helps Siri learn the new language and distinguish voice from noise.

Siri is then released in the new language with limited capabilities. According to Acero, Apple launches Siri with answers to only the most common questions and updates it bi-monthly, using anonymized data to quietly expand its capabilities. Siri speaks 21 languages in 36 countries.

Applications of speech recognition and language translation systems powered by artificial intelligence (AI) can have far-reaching positive effects in areas as diverse as government, healthcare, education, agriculture, retail, e-commerce and finance. Text-to-speech services allow text to be converted into an artificial voice that sounds very close to human speech and can be tailored to a specific service or brand.

Technology has advanced to the point where it is now possible to have a multilingual email address and to enter text in different languages for online searches and translations. Deep neural networks (DNNs) are used to build language models that translate complex languages and account for linguistic subtleties such as gender, politeness, and word type.

How can Siri help you translate languages with the “Translate” app?

Apple’s iOS lets you translate text, speech, and conversations between any of the languages supported by the Translate app. You can also download languages to your device so that translation works even without an internet connection.

Before attempting to translate text, speech, or a conversation, make sure you’ve chosen the languages you want to translate between.

Tap the down arrows next to the two languages to select the languages you want to translate between.

To start, select one of the languages and type, or use the microphone to speak.

Next to the language you’re translating from, you’ll see an icon that says “Input language.”

To change the input language, tap the other language.

Text or speech translation.
Tap Translate, then choose the languages you want to translate between.

Translate text: Tap Enter Text, type or paste a sentence, then tap Go.

Translate your voice: Tap the “Listen” button and say something.

Note that the “Input Language” button indicates which language you are translating from. To change the input language, tap the other language.

When the translation appears, do one of the following:

Tap the Play button to hear the translation.

Tip: To change the playback speed, press and hold the Play button.

Where are your translations stored?

Save a translation as a favorite: Tap the Favorite button.

Find out what a word means: Tap the Dictionary button, then tap a word to see its definition.

Tip: When you enter full screen in the Translate app, swipe to see your recent history.

Natural Language Processing (NLP) integrates statistical, machine learning, and deep learning models with computational linguistics (the modeling of human language using rules). Combined, these tools allow computers to “understand” the whole meaning of human language, including the intent and sentiment of the speaker or writer, in the form of text or audio data.
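
As a rough illustration of how rules and statistical models can work together, the Python sketch below pairs a rule-based intent detector with a tiny naive Bayes sentiment classifier. Everything in it is a simplifying assumption made for this example: the intent rules, the four training sentences, the equal class priors, and the function names detect_intent, train_sentiment, and sentiment. Real assistants rely on much larger models and datasets.

```python
import math
import re
from collections import Counter

# Rule-based part: hand-written patterns mapped to intents (invented rules).
INTENT_RULES = [
    (re.compile(r"\btranslate\b"), "translate"),
    (re.compile(r"\b(weather|forecast)\b"), "weather"),
]

# Statistical part: a few invented labeled sentences.
EXAMPLES = [
    ("i love this translation", "pos"),
    ("great and helpful answer", "pos"),
    ("this is wrong and useless", "neg"),
    ("i hate this bad answer", "neg"),
]


def detect_intent(text):
    """Return the intent of the first matching rule, else 'unknown'."""
    lowered = text.lower()
    for pattern, intent in INTENT_RULES:
        if pattern.search(lowered):
            return intent
    return "unknown"


def train_sentiment(examples):
    """Count how often each word appears under each label."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    for text, label in examples:
        word_counts[label].update(text.lower().split())
    return word_counts


def sentiment(text, word_counts):
    """Naive Bayes with add-one smoothing and equal class priors."""
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    scores = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        scores[label] = sum(
            math.log((counts[word] + 1) / (total + len(vocab)))
            for word in text.lower().split()
        )
    return max(scores, key=scores.get)


counts = train_sentiment(EXAMPLES)
request = "Please translate this great answer"
print(detect_intent(request), sentiment(request, counts))  # translate pos
```

The rules capture what the speaker wants, while the learned word statistics estimate how the speaker feels. The two are separate signals drawn from the same sentence.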
