Uporaba globokega učenja za pretvorbo besedila v govor
Končar, Luka (Author), Bosnić, Zoran (Mentor)

PDF - Presentation file (744.02 KB)
MD5: B70DC2933164D98895E8903173075370

Abstract
Text-to-speech conversion is useful in many fields. With deep learning, any person's voice can be used for the conversion, provided a few minutes of recordings of their speech are available. Converting recordings into a dataset for model training is time-consuming, so we built software that simplifies this process. We then built models using an implementation of Tacotron and two vocoders: Griffin-Lim and WaveRNN. Finally, we compared the two vocoders and found that Griffin-Lim synthesizes speech much faster than WaveRNN, but the speech quality is substantially worse.

Language: Slovenian
Keywords: deep learning, text-to-speech
Work type: Bachelor thesis/paper
Typology: 2.11 - Undergraduate Thesis
Organization: FRI - Faculty of Computer and Information Science
Year: 2022
PID: 20.500.12556/RUL-135583
COBISS.SI-ID: 102623747
Publication date in RUL: 21.03.2022

Secondary language

Language: English
Title: Deep learning for text-to-speech
Abstract:
Text-to-speech (TTS) is useful in a variety of areas. With deep learning, any person's voice can be used for TTS, provided a few minutes of recordings of their speech are available. Converting the recordings into a dataset suitable for model training is time-consuming, so we created software that makes this process easier. We then created models using an implementation of Tacotron and two vocoders: Griffin-Lim and WaveRNN. Finally, we compared the two vocoders and found that Griffin-Lim synthesizes speech much faster than WaveRNN, but the resulting speech quality is significantly worse.

Keywords: deep learning, text-to-speech
