Repository of the University of Ljubljana
Recent advances in automatic term extraction : a comprehensive survey
Tran, Hanh Thi Hong (Author)
Martinc, Matej (Author)
Caporusso, Jaya (Author)
Delaunay, Julien (Author)
Doucet, Antoine (Author)
Pollak, Senja (Author)
PDF - Presentation file (1.85 MB)
MD5: 09D320DE02A903CF6142EFBF5620790C
Source URL: https://dl.acm.org/doi/10.1145/3787584
Abstract
Automatic terminology or term extraction (ATE) is a Natural Language Processing (NLP) task intended to automatically identify specialized terms present in domain-specific corpora. As units of knowledge in a specific field of expertise, extracted terms are not only beneficial for several terminographical tasks, but also support and improve several complex downstream tasks, e.g., information retrieval, machine translation, topic detection, and sentiment analysis. ATE systems and datasets annotated for the task at hand have been studied and developed for decades, but more recent approaches have increasingly involved novel neural systems. Despite a large amount of new research on ATE tasks, systematic survey studies covering novel neural approaches are lacking, especially when it comes to the usage of large-scale language models (LLMs). We present a comprehensive survey of neural approaches to ATE, focusing on transformer-based neural models and the recent generative approaches based on LLMs. The study also compares these systems and previous ML-based approaches, which employed feature engineering and non-neural supervised learning algorithms.
Language:
English
Keywords:
computing methodologies, natural language processing, neural networks, language resources, language models, transformers, automatic term extraction, ATE, low-resourced languages, monolingual, multilingual, deep learning, zero-shot, few-shot, transfer learning, prompt engineering, large-scale language models, LLMs
Work type:
Article
Typology:
1.01 - Original Scientific Article
Organization:
FRI - Faculty of Computer and Information Science
Publication status:
Published
Publication version:
Version of Record
Year:
2026
Number of pages:
34 pp.
Numbering:
Vol. , no.
PID:
20.500.12556/RUL-179419
UDC:
004.89:81'322
ISSN on article:
0360-0300
DOI:
10.1145/3787584
COBISS.SI-ID:
268167171
Publication date in RUL:
13.02.2026
Views:
66
Downloads:
7
Record is a part of a journal
Title:
ACM computing surveys
Shortened title:
ACM comput. surv.
Publisher:
Association for Computing Machinery
ISSN:
0360-0300
COBISS.SI-ID:
24841216
Licences
License:
CC BY 4.0, Creative Commons Attribution 4.0 International
Link:
http://creativecommons.org/licenses/by/4.0/
Description:
This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.
Secondary language
Language:
Slovenian
Keywords:
računalniške metodologije, obdelava naravnega jezika, nevronske mreže, jezikovni viri, jezikovni modeli, transformatorji, avtomatsko pridobivanje izrazov, ATE, jeziki z malo viri, enojezični, večjezični, globoko učenje, učenje z ničelnim številom poskusov, učenje z malo poskusi, prenosno učenje, hitro inženirstvo, jezikovni modeli velikega obsega
Projects
Funder:
ARIS - Slovenian Research and Innovation Agency
Project number:
P2-0103-2022
Name:
Tehnologije znanja (Knowledge Technologies)
Funder:
EC - European Commission
Project number:
101186647
Name:
Centre of Excellence in Artificial Intelligence for Digital Humanities
Acronym:
AI4DH
Funder:
ARIS - Slovenian Research and Innovation Agency
Project number:
J5-50169-2023
Name:
Jezikovna dostopnost pravic socialnega varstva v Sloveniji (Linguistic accessibility of social assistance rights in Slovenia)
Funder:
ARIS - Slovenian Research and Innovation Agency
Project number:
J6-3131-2021
Name:
Kombinatorika besedotvornih obrazil v slovenščini (Combinatorics of word-formation affixes in Slovenian)