This thesis addresses the text classification problem of distinguishing manually written articles from automatically generated ones. We tested various convolutional and recurrent deep neural network architectures together with various text representations. The models were evaluated on a dataset of manually written and automatically generated articles about weight loss. The best results were achieved by a model combining the BLSTM (bidirectional LSTM) architecture with word2vec word embeddings. On the test dataset, this model achieved 96.71% classification accuracy on manually written articles, 100% classification accuracy on articles generated with the lower-quality ("bad") generation model, and 97.41% classification accuracy on articles generated with the higher-quality ("good") generation model.