TALLINN – At the 4th Language Technology Conference of the Institute of the Estonian Language, new technological breakthroughs in teaching Estonian to computers were introduced along with plans for putting them to use, including a plan to develop an Estonian-speaking virtual assistant through which anyone can interact with the state.
"The quality of Estonian machine translation has become unbelievably good in recent years," Tanel Alumae, associate professor of language technology at Tallinn University of Technology, said in his presentation. He added that Estonian-English machine translation can already be taken as the basis for next-level language technology systems.
Martin Eessalu, head of research infrastructure at the Estonian Ministry of Education and Research, made the same point.
"Today, Estonian language technology applications are of a completely different quality than five years ago, and accordingly, people's will to use them is also great," Eessalu said.
Kristiina Vilbaste and Veronika Mugra from the Information System Authority of Estonia presented Burokratt, a plan to create a virtual assistant for communicating with the state, similar to Apple's Siri or Amazon's Alexa, by bringing together several new language technologies. According to the presentation, the number of areas covered by Burokratt continues to grow; today it can be used for text chat, and in two years it will support voice communication as well.
"The goal is that, for example, in order to receive grants or apply for documents, people do not have to look for forms on the websites of various institutions or write emails. The question can be put to Burokratt in simple Estonian, and in an instant the artificial intelligence will find an answer or perform a simpler action for the person on the latter's command," said Kadri Vare, head of the language technology competence center at the Institute of the Estonian Language.
Teven Le Scao, a research engineer at Hugging Face, a natural language processing company, introduced new, highly efficient machine translation models that no longer need the same texts in different languages for training, only the largest possible amount of text. The researcher pointed out that the weakness of smaller languages is the limited amount of text available on the internet, and that the large text collections held by national institutions should be used to teach Estonian to computers. The National Library is one such institution, according to Le Scao.
Arvi Tavast, director of the Institute of the Estonian Language, pointed out that not only have the capabilities of Estonian language technology grown, but the pace of technological development has also increased dramatically.
"This time, I still wrote my own presentation, it wasn't done by the machine, but we'll see what the options are a year from now," Tavast said.
The Language Technology Conference, held for the sixth time, brought together researchers, entrepreneurs and experts from several countries to present the latest advances in language technology and how they can be put to work for the benefit of the Estonian language.