Tvorjenje besedil z uporabo skritega markovskega modela (Text Generation using Hidden Markov Model)

FILEJ, MIHA

Repository of the University of Ljubljana

Details

FILEJ, MIHA (Author), Brodnik, Andrej (Mentor)

PDF - Presentation file (1.25 MB)
MD5: 1E2D7BDEC2634BFCCFB9197B29C4EE2E
PID: 20.500.12556/rul/d5202936-452d-4e71-8724-5cffed2d9659

Abstract

The field of natural language generation (NLG) deals with producing natural-sounding text. The goal of this diploma thesis is to determine to what extent the complex rules of natural language generation can be imitated by statistical systems, specifically hidden Markov models. The thesis presents the theoretical foundation needed to define hidden Markov models and describes their use in text generation. It also reviews existing tools for working with hidden Markov models, compares them with one another, and assesses their suitability for text generation. The implementation of a library for working with hidden Markov models in the Elixir programming language is described. Two of the reviewed tools and the implemented library are used to generate text based on a corpus of written Slovenian. A criterion for comparing generated texts is chosen and used to compare the models with one another and to compare the generated texts with the corpus.

Language:	Slovenian
Keywords:	natural language generation, hidden Markov models, Baum-Welch algorithm, Forward-Backward algorithm, EM algorithm, Elixir, Erlang/OTP
Work type:	Undergraduate thesis
Organization:	FRI - Faculty of Computer and Information Science
Year:	2016
PID:	20.500.12556/RUL-85641
Publication date in RUL:	19.09.2016
Views:	2945
Downloads:	509

Secondary language

Language:	English
Title:	Text Generation using Hidden Markov Model
Abstract:	Natural language generation (NLG) is the task of producing text that feels natural to the reader. The goal of this diploma thesis is to study the extent to which natural language generation can be achieved using statistical models, specifically hidden Markov models. The thesis covers the probability and information theory needed to define hidden Markov models and describes how such models can be used for text generation. Available tools for working with hidden Markov models are reviewed, compared, and assessed for their suitability for generating text. A library for hidden Markov models is implemented in Elixir. Two of the reviewed tools and the implemented library are used to generate text from a corpus of written Slovenian. A criterion for comparing generated texts is chosen and used to compare the models with one another and to compare the generated texts with the corpus.
Keywords:	natural language generation, hidden Markov models, Baum-Welch algorithm, Forward-Backward algorithm, expectation–maximization algorithm, Elixir, Erlang/OTP
