We have created the VIGICOVID system

[A system for automatically extracting information from scientific articles on COVID-19].

2022 | March 24

VIGICOVID is a system for finding answers, through questions in natural language, in the avalanche of information on COVID-19 and SARS-CoV-2.

Researchers from the UPV/EHU, the UNED and Elhuyar have created the VIGICOVID system, thanks to the CRUE's Supera COVID-19 Fund. The system responds to the need to find answers in the avalanche of information generated by all the research on the pandemic being conducted worldwide. Using artificial intelligence, the system takes a question in natural language and presents, in an orderly manner, the answers it finds in a set of scientific articles.

The global biomedical research community is making a great effort to generate knowledge about COVID-19 and SARS-CoV-2. This effort results in an enormous and very rapid production of scientific publications, which makes it difficult to consult and analyze all this information. It is therefore necessary to provide experts and decision-makers with information systems that enable them to acquire the knowledge they need.

This is precisely what researchers at the HiTZ Center of the UPV/EHU, the NLP & IR Group of the UNED and the language-focused Artificial Intelligence Unit of Elhuyar have investigated in the VIGICOVID project, thanks to the Supera COVID-19 Fund granted by the CRUE. In the study, coordinated by the UNED research group, they have created a prototype that extracts information, through questions and answers in natural language, from an updated set of scientific articles published by the global research community on COVID-19 and SARS-CoV-2.

“The information search paradigm is changing thanks to artificial intelligence,” says Eneko Agirre, director of the HiTZ Center at UPV/EHU. “Until now, to search for information online, one would enter a query and then have to look for the answer in the documents the system returned. Under the new paradigm, however, systems that give the answer directly, without the need to read the whole document, are increasingly widespread.”

In this system, “the user does not request information through keywords, but directly asks a question,” explains Elhuyar researcher Xabier Saralegi. The system seeks answers to this question in two phases: “First, it retrieves the documents that may contain the answer to the question asked, using a technology that combines keywords and direct questions. For that we have researched neural architectures,” adds Dr. Saralegi. They have used deep neural architectures trained on examples: “That means the search models and the answer models are trained through deep machine learning.”
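The first phase described above, retrieval that combines keyword matching with a learned representation of the question, can be illustrated with a minimal sketch. This is not the VIGICOVID code: the function names, the weight `alpha` and the sample documents are hypothetical, and the character-trigram `embed` function is only a toy stand-in for the trained neural encoder the researchers mention.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(question, document):
    """Fraction of distinct question terms that also occur in the document."""
    q_terms = set(tokenize(question))
    d_terms = set(tokenize(document))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def embed(text):
    """Toy stand-in (assumption) for a trained neural encoder: trigram counts."""
    s = " ".join(tokenize(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, top_k=2, alpha=0.5):
    """Rank documents by a weighted mix of keyword and vector similarity."""
    q_vec = embed(question)
    scored = []
    for doc in documents:
        score = alpha * keyword_score(question, doc) + (1 - alpha) * cosine(
            q_vec, embed(doc)
        )
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


# Hypothetical mini-collection, for illustration only.
articles = [
    "SARS-CoV-2 is transmitted mainly through respiratory droplets.",
    "Deep neural networks are trained on examples.",
    "Vaccines reduce the severity of COVID-19.",
]
candidates = retrieve("How is SARS-CoV-2 transmitted?", articles)
```

In a real system the two scores would come from a keyword index and from a neural model trained on question and document pairs; the weighted sum is just one simple way to combine them.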

Once the set of documents has been retrieved, they are processed again by a question-answering system to obtain concrete answers: “We have built the engine that answers the questions; given a question and a document, the engine is able to detect whether or not the answer is in the document, and if so, it says exactly where it is located,” explains Dr. Agirre.
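The behaviour Dr. Agirre describes, deciding whether a document contains the answer and, if it does, pointing to its exact location, can be sketched as follows. This is an illustrative toy, not the project's engine: a trained neural reader would predict the answer span, whereas this sketch simply picks the sentence that shares the most content words with the question. The names `answer_span`, `STOPWORDS` and the threshold `min_overlap` are assumptions made for the example.

```python
import re

# Small hypothetical stopword list; a real system would not rely on one.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "of", "in", "on", "to",
    "how", "what", "which", "who", "when", "where", "why", "does", "do",
    "by", "for", "and",
}


def content_terms(text):
    """Distinct lowercase tokens of the text, without stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def answer_span(question, document, min_overlap=0.5):
    """Return (start, end, text) of the best-matching sentence, or None.

    None means "no answer found in this document": no sentence shares at
    least `min_overlap` of the question's content terms.
    """
    q_terms = content_terms(question)
    if not q_terms:
        return None
    best, best_score = None, 0.0
    # Naive sentence split on ., ! and ? (keeps character offsets).
    for match in re.finditer(r"[^.!?]+[.!?]?", document):
        sentence = match.group()
        score = len(q_terms & content_terms(sentence)) / len(q_terms)
        if score > best_score:
            best, best_score = match, score
    if best is None or best_score < min_overlap:
        return None
    text = best.group()
    # Trim surrounding whitespace so the offsets point at the sentence itself.
    start = best.start() + (len(text) - len(text.lstrip()))
    end = best.start() + len(text.rstrip())
    return start, end, document[start:end]
```

Called with a question and a document, the function returns the character offsets of the chosen sentence together with its text, or `None` when the document does not appear to contain an answer.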

An easily marketable prototype

The researchers are satisfied with the research results: “Of the techniques and evaluations we have analyzed in our experiments, we have taken to the prototype those that produced the best results,” says Elhuyar’s researcher. They have established a solid technological base, and they have published several scientific articles on it. “We have achieved another way of searching for situations in which information is needed urgently, one that makes information easier to consume. At the research level we have shown that the proposed technology works, and that the system delivers good results,” says Agirre.

“Our result is a prototype from a basic research project. It is not a commercial product,” Saralegi points out. But prototypes of this kind can be adapted easily and quickly so that they can be commercialized and made available to society. The researchers stress that artificial intelligence will make it possible to have increasingly powerful tools for working with large databases. “We are making very rapid progress in this area. Moreover, everything that is researched reaches the market easily,” concludes the UPV/EHU researcher.

Bibliographic reference

Arantxa Otegi, Iñaki San Vicente, Xabier Saralegi, Anselmo Peñas, Borja Lozano, Eneko Agirre
Information retrieval and question answering: A case study on COVID-19 scientific

PHOTO: From the 123RF image bank.