izpis_h1_title_alt

S poizvedovanjem obogatene tehnike generiranja pravnih besedil
ID Mušič, Rok (Author), ID Žitnik, Slavko (Mentor) More about this mentor... This link opens in a new window

.pdfPDF - Presentation file, Download (2,39 MB)
MD5: 5A0C11B27A1E180FC2356CA842D46859

Abstract
Slovenska zakonodaja je obsežna in pravni delavci porabijo veliko časa vsak dan za iskanje ustrezne literature. V ta namen smo raziskali uspešnost velikih jezikovnih modelov (VJM) kot pravnih asistentov. VJM-ji so uspešni v številnih nalogah, a zahtevna domenska vprašanja so ena izmed njihovih večjih pomanjkljivosti; pogosto pride do halucinacij. S poizvedovanjem obogateno generiranje besedil (RAG) je tehnika, ki zaobide pomanjkanje domenskega znanja VJM-jev tako, da na podlagi vprašanja v zakonodaji najde vsebino, s katero lahko pravilno odgovori na vprašanje. Z najdenim znanjem VJM pravilno odgovori in ne halucinira. Raziskali in implementirali smo več različnih tehnik RAG. Vse metode smo preizkusili na ročno izdelani testni množici, ki vsebuje 4 testne scenarije, s katerimi preverimo, kako uspešne so metode v različnih situacijah. Naprednejše različice RAG-a, napredni in modularen RAG, kažejo dobro uspešnost pri direktnih vprašanjih, a nižjo uspešnost za bolj splošna vprašanja kot so npr. dejanski primeri.

Language:Slovenian
Keywords:velik jezikovni model, s poizvedovanjem obogateno generiranje besedil, obdelava naravnega jezika
Work type:Bachelor thesis/paper
Organization:FRI - Faculty of Computer and Information Science
Year:2024
PID:20.500.12556/RUL-161742 This link opens in a new window
Publication date in RUL:13.09.2024
Views:64
Downloads:78
Metadata:XML RDF-CHPDL DC-XML DC-RDF
:
Copy citation
Share:Bookmark and Share

Secondary language

Language:English
Title:Retrieval-augmented generation of law texts
Abstract:
Slovenian legislation is extensive, causing legal professionals to spend a significant amount of time each day searching for relevant literature. To address this, we explored the effectiveness of large language models (LLMs) as legal assistants. LLMs have been successful in various tasks, but handling complex domain-specific questions remains one of their major weaknesses; often producing hallucinations. Retrieval-Augmented Generation (RAG) is a technique that bypasses the lack of domain knowledge in LLMs by retrieving content from legislation based on the question, allowing for accurate responses. With the retrieved knowledge, the LLM can correctly answer the question without hallucinating. We explored and implemented several different RAG techniques. All methods were tested on a manually crafted test set containing four test scenarios to evaluate how successful the methods are in various situations. More advanced versions of RAG, such as advanced and modular RAG, show good performance in direct questions but lower success in more general questions, such as real-world examples.

Keywords:large language model, retrieval augmented generation, natural language processing

Similar documents

Similar works from RUL:
Similar works from other Slovenian collections:

Back