
Data is becoming the gold, oil and silicon of the future, the great source of wealth foreseen for the world to come. Data, in industrial quantities, is what has flowed for four years between the National Library of Spain (BNE) and the Barcelona Supercomputing Center (BSC), the national supercomputing center, within the marIA program. That is not a typo: the I and the A of the name are capitalized in a typographic play on IA, the Spanish acronym for artificial intelligence. The marIA program needs data to teach AI to speak Spanish, for anything from Word's proofreader to a company's automated telephone service.

Yesterday was an important day for those responsible for marIA. BSC engineers traveled to Madrid and presented their work in public together with the BNE librarians. The Secretary of State for Digitization and Artificial Intelligence, Carme Artigas, chaired the event and announced a state investment of 30 million euros in the Natural Language Plan, which includes research from several universities, the Royal Spanish Academy and, in a prominent place, marIA. Sources from the Ministry of Economy have explained that the program, financed by the EU, does not yet have an implementation schedule.

A lot of money for what, exactly? "MarIA is a set of resources, essentially language models and the data to train those models, which serve as the basic infrastructure so that Spanish can be incorporated into any AI application that involves language: Siri, Alexa, machine translation programs, text transcription... We have generated a basic resource for researchers to use in artificial intelligence applications," explains Marta Villegas, head of the project in Barcelona.

Her job, therefore, is to create a network of millions of word relationships that, once processed by computers, allows machines to understand Spanish and to imitate it.

Artificial intelligence, like many humans, learns languages by listening and reading, by building connections, by imitating by ear.

"There are two difficulties in a project like this. The first is finding enough data. These models are trained with deep neural networks that feed on big data. And the second is having the computational resources, sufficient computing power," explains Villegas.

And that is where the National Library comes into the project, the great provider of information with which to feed the BSC computers.

"The National Library has taken care of the written heritage of the Spanish language since its foundation. In 2009, we began to do the same with written Spanish on the internet because we realized that, otherwise, there would be a digital dark age, without sources," explains Mar Pérez Morillo, director of the Digital Processes and Services Division of the BNE. "Our work is the same as always; it has not changed because of marIA. All we do is send the data we generate to the BSC so that their machines can train on it."

That data includes announcements, first communion reminders, memes... any source that captures the form of a language at a specific time. "As a professional, I find it an impressive and very promising project. Suddenly, we see that the great heritage we have created can be used to generate research and knowledge," says Pérez Morillo.

What about the practical applications? "MarIA is useful in any artificial intelligence application that uses language: machine translation, text transcription and classification, correction, voice applications, conversational systems, summarization... These are applications that we use every day without realizing it," explains Marta Villegas. "Academic use, for example, is extremely interesting. We can now improve the interpretation of large masses of natural language," adds Pérez Morillo.

From there, on to fantasy. When cars drive themselves, will we be able to tell them, "We are going to my mother's house, but by the long road; there is no rush and it is prettier"? Will we be able to say it in Spanish? In Catalan, in Galician or in Basque? It is reasonable to think so.

The marIA approach covers all the languages of the State and provides for free public release at each phase of the work, so that researchers can use the results in their applications.

At the moment, only English and Mandarin are more advanced than Spanish.
