SUPERFORMER: continual learning superposition method for text classification
Zeman, Marko (Author), Faganeli Pucer, Jana (Author), Kononenko, Igor (Author), Bosnić, Zoran (Author)
PDF - Presentation file (2.63 MB), MD5: DFB7BF84CED0CBD6F965252450E4D560
URL - Source: https://www.sciencedirect.com/science/article/pii/S0893608023000527
Abstract
One of the biggest challenges in continual learning is the tendency of machine learning models to forget previously learned information over time. Existing approaches that address this issue often consume large amounts of additional memory and apply forgetting-mitigation mechanisms that substantially prolong training. We therefore propose SUPERFORMER, a novel method that alleviates forgetting while adding negligible memory and time overhead. We tackle the continual learning challenge in a scenario where different tasks are learned in sequential order. We compare our method against several prominent continual learning methods (EWC, SI, MAS, GEM, PSP, etc.) on a set of text classification tasks. Our method achieves the best average performance in terms of AUROC and AUPRC (0.7% and 0.9% gain on average, respectively) and the lowest training time among all compared methods; on average, it reduces total training time by a factor of 5.4-8.5 relative to similarly performing methods. In terms of additional memory, it is on par with the most memory-efficient approaches.
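The full paper sits behind the source URL above, but the superposition idea the abstract (and the PSP baseline) refers to can be sketched: a single shared weight matrix is bound with a fixed random context vector per task, so several task-specific models coexist in one parameter set at almost no extra memory cost. The following is a minimal illustrative sketch in PyTorch, not the authors' SUPERFORMER implementation; the class name SuperposedLinear, the binary ±1 contexts, and the layer shapes are assumptions made for illustration only.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SuperposedLinear(nn.Module):
    # Linear layer shared across tasks via parameter superposition:
    # each task binds the shared weight matrix with a fixed random
    # context vector, which is equivalent to using a task-specific
    # weight matrix W @ diag(c_task). Only the context vectors
    # (one per task, not trained) are stored in addition to W.
    def __init__(self, in_features: int, out_features: int, num_tasks: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Random binary (+1/-1) contexts; an assumption, PSP also
        # allows complex or rotation contexts.
        contexts = torch.randint(0, 2, (num_tasks, in_features)).float() * 2 - 1
        self.register_buffer("contexts", contexts)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Binding the input with the task's context selects that
        # task's effective weights from the superposition.
        return F.linear(x * self.contexts[task_id], self.weight, self.bias)

# Hypothetical usage: one shared layer serving three sequential tasks.
layer = SuperposedLinear(in_features=768, out_features=768, num_tasks=3)
x = torch.randn(4, 768)
y = layer(x, task_id=1)  # shape (4, 768)

Because the contexts are (near-)orthogonal in high dimensions, gradient updates made under one task's binding interfere little with the others, which is what keeps the additional memory and training time negligible.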
Language:
English
Keywords:
deep learning, continual learning, superposition, transformers
Work type:
Article
Typology:
1.01 - Original Scientific Article
Organization:
FRI - Faculty of Computer and Information Science
Publication status:
Published
Publication version:
Version of Record
Year:
2023
Number of pages:
pp. 418-436
Numbering:
Vol. 161
PID:
20.500.12556/RUL-144861
UDC:
004.8
ISSN on article:
0893-6080
DOI:
10.1016/j.neunet.2023.01.040
COBISS.SI-ID:
141099267
Publication date in RUL:
17.03.2023
Views:
762
Downloads:
143
Record is a part of a journal
Title:
Neural Networks
Shortened title:
Neural Netw.
Publisher:
Elsevier
ISSN:
0893-6080
COBISS.SI-ID:
26011904
Licences
License:
CC BY 4.0, Creative Commons Attribution 4.0 International
Link:
http://creativecommons.org/licenses/by/4.0/
Description:
This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.
Secondary language
Language:
Slovenian
Keywords:
globoko učenje (deep learning), nenehno učenje (continual learning), superpozicija (superposition), transformerji (transformers)
Projects
Funder:
ARRS - Slovenian Research Agency
Project number:
P2-0209
Name:
Umetna inteligenca in inteligentni sistemi (Artificial Intelligence and Intelligent Systems)
Funder:
ARRS - Slovenian Research Agency
Funding programme:
Young researchers