Details

Razvoj klepetalnika za diagnozo bolezni
Pintar, Žiga (Author), Bajec, Marko (Mentor)

Abstract
As the population ages, qualified primary-care health workers are retiring, which in turn overloads the remaining staff. Our master's thesis aimed to reduce visits by non-urgent patients to public health institutions. We created the chatbot ZdravBOT, which provides an approximate diagnosis of 24 medical conditions based on the user's clinical picture. Because medical datasets are difficult to access, we decided to translate one of the publicly available English datasets into Slovenian. The resulting dataset, which contained about 88 thousand sentences, was used to train our own BERT model, which was based on the SloBERT model. We then used the resulting model in a Rasa chatbot, which gathered all the necessary information from the user over the course of a conversation and tried to find an approximate diagnosis with a sufficiently high level of confidence. The search for a diagnosis was based on the cosine similarity algorithm between the patient's clinical picture and the known diseases. For the graphical interface, we created an Android mobile application that connected to the aforementioned Rasa chatbot. The results of training the BERT model showed that translated datasets alone may not be sufficient, since the model overfitted the data during training. Despite these problems, our BERT model successfully recognized, on average, one to two symptoms from the user's opening message. During testing we made 87 diagnoses, of which 62 % were correct. We most often erred when diagnosing groups of diseases that shared many related symptoms, which was most common for pulmonary and infectious diseases. We also observed that the cosine similarity algorithm is not the optimal way to match medical conditions, because, owing to the differing numbers of symptoms, it favours diseases with fewer of them.

Language: Slovenian
Keywords: chatbot, RASA, BERT, natural language processing
Work type: Master's thesis/paper
Organization: FRI - Faculty of Computer and Information Science
Year: 2025
PID: 20.500.12556/RUL-177761
Publication date in RUL: 06.01.2026

Secondary language

Language: English
Title: Development of a chatbot for disease diagnosis
Abstract:
As our society ages, a growing number of healthcare personnel are retiring, which in turn increases the workload of the remaining staff. Our master's thesis aimed to reduce non-emergency patient walk-ins at public health facilities. To that end, we created a chatbot named ZdravBOT, which can provide a rough diagnosis for 24 diseases based on the user's current symptoms. Since medical datasets are hard to obtain, especially in less widely spoken languages such as Slovene, we decided to translate one of the publicly available English collections into Slovenian. The resulting dataset, which contained about 88 thousand sentences, was used to train our BERT model, which was based on the SloBERT model. We then used the created model inside a Rasa chatbot, which gathered all the necessary information from the user and then tried to find an approximate diagnosis with a sufficiently high level of confidence. The diagnosis was made using the cosine similarity algorithm between the user's symptoms and the 24 known diseases. For the user interface, we created an Android application that connected to the previously mentioned Rasa chatbot. During the BERT training phase we found that the translated dataset alone might not be enough, as the model appeared to overfit to the data provided. Despite these issues, our BERT model was able to identify, on average, one to two symptoms from the user's opening message. During testing we made 87 diagnoses, of which 62 % were correct. In most cases the incorrect diagnosis was chosen because many symptoms overlapped between diseases in the same group. This was most evident for pulmonary and infectious diseases. We also noticed that the cosine similarity algorithm is not the best option for matching diseases with the user's symptoms, because the diseases differ in their number of symptoms and the algorithm favours those with fewer symptoms.

Keywords: chatbot, RASA, BERT, natural language processing
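
The abstract notes that cosine similarity favours diseases with fewer symptoms. The sketch below is one way to see why, assuming symptoms are encoded as binary vectors; the record does not say how the thesis encodes them. The symptom list, the two disease profiles and the helper names (`to_vector`, `rank_diseases`) are all hypothetical and are not taken from the thesis or its dataset.

```python
import math

# Hypothetical symptom vocabulary (illustration only, not from the thesis).
SYMPTOMS = ["fever", "cough", "headache", "fatigue",
            "sore_throat", "shortness_of_breath"]

# Hypothetical disease profiles: one short, one long.
DISEASES = {
    "disease_a": {"fever", "cough"},
    "disease_b": {"fever", "cough", "headache", "fatigue",
                  "sore_throat", "shortness_of_breath"},
}


def to_vector(present):
    """Encode a set of symptoms as a binary vector over SYMPTOMS."""
    return [1.0 if s in present else 0.0 for s in SYMPTOMS]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty symptom set matches nothing
    return dot / (norm_a * norm_b)


def rank_diseases(user_symptoms):
    """Return (disease, score) pairs sorted by descending similarity."""
    user_vec = to_vector(user_symptoms)
    scores = {name: cosine_similarity(user_vec, to_vector(profile))
              for name, profile in DISEASES.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Both profiles contain every reported symptom, yet the shorter one wins.
ranking = rank_diseases({"fever", "cough"})
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

For binary vectors the score reduces to the number of shared symptoms divided by the square root of the product of the two set sizes. In this toy case both diseases share two symptoms with the user, so the profile with two symptoms scores close to 1.0 while the profile with six scores about 0.58, even though it also covers everything the user reported.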
