
The LINGUATEC IA project is underway

LINGUATEC IA, a project to advance the digitalisation of Aragonese, Catalan, Basque and Occitan through artificial intelligence

2024 | January 29


  • This cross-border European project has laid the groundwork for developing artificial intelligence knowledge applicable to low-resource languages on both sides of the Pyrenees.
  • Co-financed by the ERDF (FEDER) through the POCTEFA Programme 2021-2027 and led by Elhuyar (through its Orai artificial intelligence centre), it aims to apply new generative language models to these languages.



Language processing is a powerful tool for low-resource language communities: it helps to revitalise a language and to promote its use effectively. It is vitally important for these languages to ride the wave of artificial intelligence so that they are not left behind. The quality now achieved in natural language processing is not within reach of every language, and collaboration is essential to develop new language resources and tools. Greater effort in innovation is needed, with investment in applied artificial intelligence research for natural language processing.

The objective of the European project EFA 104/01-LINGUATEC IA (Artificial Intelligence), co-financed by the European Regional Development Fund through the first call of INTERREG POCTEFA 2021-2027, is to develop artificial intelligence knowledge on new generative language models applicable to low-resource languages, and to use that knowledge to advance the digitalisation of Aragonese, Catalan, Basque and Occitan.

The consortium of this cross-border project is led by Elhuyar (through its artificial intelligence centre Orai) and includes Lo Congrés Permanent de la Lenga Occitana, HITZ zentroa (UPV/EHU), Jean Jaurès University of Toulouse, University of Perpignan, IKER-CNRS [...] of Lleida. The consortium consists of “high-level entities that make up a scientific community around the six languages of the Pyrenees, with the aim of recovering and revitalizing them,” says Josu Aztiria, coordinator of the LINGUATEC project. This project “contributes to the social and cultural articulation of the cross-border territory, reinforcing a key element of local culture: its languages,” he adds.

The entities participating in the project already work in different areas of language processing, such as the development of new algorithms and neural architectures adapted to settings with limited computational and linguistic resources. Likewise, “we want to improve the transcription, neural machine translation and speech synthesis systems for Basque, Catalan, Occitan and Aragonese and their dialectal variants, in combination with French and Spanish,” says Aztiria, as well as “to develop a multilingual platform for automatic subtitling and dubbing.” In addition, “we plan to create an online platform or repository with all the resources, technologies and applications we develop for the languages of the Pyrenees,” he adds.

The entities that make up this project believe that their work will be of great help “both for the research and professional community that works in the field of languages and their digitisation, and for public and private entities that will be able to improve their services and make them accessible in different languages,” and they take pride in “providing citizens with useful resources and tools that help them communicate more easily in a multilingual environment.”

The LINGUATEC IA project is not starting from scratch. It takes up the baton from LINGUATEC, a previous project co-financed with POCTEFA funds and now completed. After three years of progress and a high level of development in that project, the partner entities took a strategic step and consolidated a network of excellence in artificial intelligence for building a cross-border linguistic infrastructure.

Project co-financed by the European Regional Development Fund (ERDF)