DABILENA, the portal for analyzing the Basque used on the Internet
- The website for consulting real texts as if they were dictionaries.
Elhuyar has developed the Dabilena corpus portal, available to everyone at dabilena.elhuyar.eus. The portal lets us search a large share of the texts written in Basque on the Internet in recent years, that is, search the actual use of the Basque language.
The Dabilena website has three main sections: ‘Non erabili da?’ (Where has it been used?), ‘Nola itzuli da?’ (How has it been translated?) and ‘Zer hitzekin konbinatzen da?’ (Which words does it combine with?).
In the section ‘Non erabili da?’, we can see on which website and in what context the word we are looking for has appeared.
In the section ‘Nola itzuli da?’, we can consult the bilingual corpus. Searches can be made in either Basque or Spanish, and Dabilena will show examples in both languages. It first lists the translations of the word consulted and, below that, examples in the two languages in context.
The section ‘Zer hitzekin konbinatzen da?’ shows which words a given word is used with. For example, we can check which words the noun aurrerapen most often appears with.
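The idea behind this kind of combination search can be illustrated with a minimal sketch. It is not Dabilena's implementation: the function name `collocates`, the window size and the toy sentences are all assumptions made for illustration, and a real system for Basque would most likely work on lemmas rather than raw word forms.

```python
from collections import Counter


def collocates(sentences, target, window=2):
    """Count the words that appear within `window` tokens of `target`.

    Hypothetical helper: plain whitespace tokenization, no lemmatization.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Toy sentences invented for this example, not taken from the Dabilena corpus.
toy_corpus = [
    "aurrerapen handia egin dute",
    "aurrerapen handia izan da aurten",
    "aurrerapen teknologikoa ekarri du",
]

result = collocates(toy_corpus, "aurrerapen")
print(result.most_common(1))
```

In this toy corpus the most frequent neighbour of aurrerapen is handia, which appears next to it in two of the three sentences.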
Comparative searches are also possible, that is, the Dabilena portal can compare the use of two variants. We can see, for example, which of these two words is more present on the Internet: boluntario or bolondres.
We cannot fail to mention another very useful tool that we have included in the portal. Through its ‘Corpus gehiago’ section, Dabilena makes it possible to consult several corpora available on the Internet simultaneously.
Dabilena is the perfect tool for language professionals such as translators, linguists or Basque teachers, as well as for those who create linguistic resources and develop linguistic technologies for these professionals.
This portal aims to automatically collect the necessary texts for the analysis of the Basque used on the Internet (web corpus) and, once processed with linguistic technology tools, make them available to users for consultation.
Text corpora are today an indispensable tool for several areas and activities related to language, such as lexicography, translation, language teaching, linguistics and the development of language technologies.
Nine years have passed since Elhuyar began building a web corpus in Basque, and since then it has continued to develop technologies for collecting and exploiting corpora. In addition, the volume of Basque text on the Internet has grown significantly, and those texts are increasingly diverse. Accordingly, Dabilena has incorporated larger corpora, new functionalities and innovative consultation systems.
Collaborators and sponsors
In conclusion, we want to mention the collaborators who have accompanied us in this process. We would like to highlight, on the one hand, the work carried out by the IXA Group in processing the corpus and, on the other, the sponsors that have made this work possible: Laboral Kutxa and the Provincial Council of Gipuzkoa.