Kontekstualizacija pri raziskovalnem delu
Conradi, Matic (Author), Stankovski, Vlado (Mentor)

PDF - Presentation file (7.07 MB)
MD5: F8C0A75099201E2FAA2B85CAE0AFF5A9

Abstract
The exponential growth in the number of scientific publications makes it harder for researchers to make informed decisions and creates a bottleneck in automating the research process, since decision support systems need structured, machine-readable data to operate. This work addresses that challenge by developing a comprehensive system for automated contextualization. We implement a robust pipeline that covers the systematic acquisition and processing of tens of thousands of scientific articles from the Papers with Code repository. We present a comparative analysis of advanced retrieval mechanisms based on sparse and dense vector embeddings. Based on the relevant documents retrieved, large language models perform generative extraction of key information into structured quadruples of the form (task, metric, value, dataset). The evaluation shows that the retrieval mechanism with dense semantic embeddings outperforms the other approaches by a statistically significant margin. On the data extraction task, the system achieves a high F1 score of 0.969. The result of the work is a functional prototype, accessible through an API, that provides structured context from a natural-language query and is directly usable in automated decision support systems.
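To illustrate the difference between the two retrieval families named in the abstract, here is a minimal sketch rather than the thesis implementation. It assumes a plain bag-of-words count vector as the sparse representation and small hand-written vectors standing in for dense embeddings; the function names, the example texts, and the numbers are all hypothetical, and a real system would obtain dense vectors from an embedding model.

```python
import math
from collections import Counter


def sparse_vector(text):
    """Bag-of-words term counts: a sparse representation keyed by token."""
    return Counter(text.lower().split())


def sparse_cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def dense_cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Hypothetical query and document text, for illustration only.
query = "image classification accuracy"
doc = "accuracy of image recognition models"

# Sparse matching rewards only exact token overlap ("image", "accuracy").
sparse_score = sparse_cosine(sparse_vector(query), sparse_vector(doc))

# Hypothetical dense vectors; a real embedding model would produce these.
query_embedding = [0.9, 0.1, 0.4]
doc_embedding = [0.8, 0.2, 0.5]
dense_score = dense_cosine(query_embedding, doc_embedding)
```

The sparse score depends only on shared tokens, so "classification" and "recognition" contribute nothing to it, whereas a dense embedding can place related terms close together.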

Language:Slovenian
Keywords:information extraction, information retrieval, large language models, decision support, scientific literature
Work type:Master's thesis/paper
Typology:2.09 - Master's Thesis
Organization:FRI - Faculty of Computer and Information Science
Year:2025
PID:20.500.12556/RUL-175430
COBISS.SI-ID:255976451
Publication date in RUL:27.10.2025

Secondary language

Language:English
Title:Contextualization in research work
Abstract:
The exponential growth of scientific publications complicates informed decision-making for researchers and presents a bottleneck for automating the research process, because decision support systems require structured, machine-readable data. This work addresses the challenge by developing a comprehensive system for automated contextualization. We implement a robust pipeline that covers the systematic acquisition and processing of tens of thousands of scientific articles from the Papers with Code repository. We present a comparative analysis of advanced retrieval mechanisms based on sparse and dense vector embeddings. Using the most relevant retrieved documents, a large language model performs generative extraction of key information into structured quadruples of the form (task, metric, value, dataset). The evaluation demonstrates that the retrieval mechanism based on dense semantic embeddings outperforms the other approaches by a statistically significant margin. The system achieves a high F1 score of 0.969 on the final information extraction task. The result of this work is a functional prototype, accessible via an API, which provides structured context from a natural language query, making it directly usable in automated decision support systems.
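The quadruple format and the F1 measure mentioned in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the evaluation code of the thesis: the abstract does not say how extracted quadruples are matched against the reference, so exact matching of all four fields is assumed here, and the `Quadruple` type, the `f1_score` helper, and the example values are hypothetical.

```python
from typing import NamedTuple


class Quadruple(NamedTuple):
    """One extracted result: (task, metric, value, dataset)."""
    task: str
    metric: str
    value: float
    dataset: str


def f1_score(predicted, gold):
    """Set-based F1, assuming a quadruple counts only on an exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and extracted quadruples, for illustration only.
gold = [
    Quadruple("image classification", "accuracy", 0.91, "ExampleSet-A"),
    Quadruple("question answering", "F1", 0.85, "ExampleSet-B"),
]
predicted = [
    Quadruple("image classification", "accuracy", 0.91, "ExampleSet-A"),
    Quadruple("question answering", "F1", 0.80, "ExampleSet-B"),  # wrong value
]

score = f1_score(predicted, gold)  # one of two correct on each side
```

Under exact matching, a single wrong field (here the value) makes the whole quadruple count as both a false positive and a false negative; a more lenient scheme, such as per-field scoring, would give different numbers.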

Keywords:information extraction, information retrieval, large language models, decision support, scientific literature
