
AI Dubbing 101: Terminology

Author: Slava, Chief Sound Designer


Speech is more than a collection of words in a sentence. People perceive words in the context of sentences and situations, so working out the meaning comes naturally to us. A machine has to be taught to do the same.


To teach AI to handle speech, we use systems for natural language processing, understanding, and generation.





NLP (Natural Language Processing) is the technology responsible for these tasks. Its components include NLU, NLG, and NER.


NLU (Natural Language Understanding) deals with the meaning of what is being said. The algorithm analyzes the syntax of the sentence and establishes relationships between words and phrases to determine the context of the utterance.


An important component of NLU is NER (Named-Entity Recognition): picking out of the speech the semantic "parameters" that matter for a specific task. The algorithm takes the text of the user's utterance and extracts the named entities in it: names, addresses, numbers, and other objects.
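
To make the idea concrete, here is a minimal sketch of entity extraction in Python. It is only an illustration: the two regular-expression patterns, the label names, and the `extract_entities` helper are assumptions made for this example, and real NER systems rely on trained models rather than hand-written rules.

```python
import re

# Toy patterns chosen for illustration only (an assumption of this sketch).
# Production NER uses trained statistical or neural models instead.
PATTERNS = {
    "NAME": re.compile(r"\b(?:Mr|Ms|Dr)\. [A-Z][a-z]+\b"),
    "NUMBER": re.compile(r"\b\d+(?:\.\d+)?\b"),
}


def extract_entities(text):
    """Return (label, value) pairs in the order they appear in the text."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    found.sort()  # order by position in the utterance
    return [(label, value) for _, label, value in found]


if __name__ == "__main__":
    print(extract_entities("Dr. Smith ordered 3 tickets for 12.50"))
```

The result is a list of labeled fragments that a later stage can use, for example to fill in the details of a request.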


NLG (Natural Language Generation) is the formation of a response based on the recognized text. Early systems relied on generation templates. Modern systems increasingly use hidden Markov models and neural networks, so the AI learns to decide for itself how the text should look.
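
The template approach mentioned above can be sketched in a few lines of Python. The intent names, the template wording, and the `generate_response` helper are all hypothetical, invented for this example. The sketch covers only the older template method, not the neural approach.

```python
# Hypothetical templates keyed by intent (names and wording are assumptions).
TEMPLATES = {
    "confirm_booking": "Your booking for {name} on {date} is confirmed.",
    "ask_date": "Which date would you like, {name}?",
}

FALLBACK = "Sorry, I did not understand that."


def generate_response(intent, slots):
    """Fill the template for the given intent with the extracted slot values."""
    template = TEMPLATES.get(intent)
    if template is None:
        return FALLBACK
    # Raises KeyError if a slot required by the template is missing.
    return template.format(**slots)


if __name__ == "__main__":
    print(generate_response("confirm_booking", {"name": "Anna", "date": "May 5"}))
```

Templates are predictable but rigid: every new kind of reply needs a new hand-written entry, which is one reason learned models have been replacing them.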


Technically, NLU and NLG are components of NLP. Working together, they help create an experience close to communicating with a real person.
