Weighted cluster ensemble based on partition relevance analysis with reduction step
Ilc, Nejc (Author)

PDF - Presentation file (3.21 MB)
MD5: 384C827C93012CBDECF842A9DE5AFAD1
URL - Source URL: https://ieeexplore.ieee.org/document/9119391

Abstract
Over the last decade, the advent of the cluster ensemble framework has enabled more accurate and robust data analysis than traditional single clustering algorithms. The improved clustering of microarray data has had a particularly strong impact in the fields of genomics and medicine. However, when we bring several ensemble members together to form a consensus, low-quality data partitions can seriously compromise the final solution. One way to overcome this problem is the weighted cluster ensemble approach based on Partition Relevance Analysis (PRA), which uses internal cluster validity indices to evaluate and weight the ensemble members before the fusion. Unfortunately, the selection of appropriate validation indices for given data is far from trivial. In this paper, we propose an additional step in PRA that reduces the size of the committee of cluster validation indices. It does so by eliminating redundant and noisy indices using data dimensionality reduction methods. Our extension works in an unsupervised way, minimizing the amount of user intervention and required expert knowledge. We adapted three conventional consensus functions based on the principle of evidence accumulation to work with PRA weights. We demonstrate the advantages of the proposed reduction step of PRA based on extensive experiments with 25 gene expression and 15 non-genetic real-world datasets, where we compared 15 consensus functions. The source code is available at https://github.com/nejci/PRAr.
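The weighted evidence accumulation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above); the function name `weighted_coassociation`, the toy partitions, and the weight values are all assumptions made for illustration. In PRA the weights would be derived from internal cluster validity indices, whereas here they are simply given as numbers.

```python
import numpy as np

def weighted_coassociation(partitions, weights):
    """Accumulate pairwise co-membership evidence, scaled by partition weights.

    Hypothetical helper: each partition votes for every pair of samples it
    places in the same cluster, and the vote counts as much as its weight.
    """
    partitions = np.asarray(partitions)        # shape (n_partitions, n_samples)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalize so entries lie in [0, 1]
    n_samples = partitions.shape[1]
    coassoc = np.zeros((n_samples, n_samples))
    for labels, w in zip(partitions, weights):
        # Boolean matrix: True where two samples share a cluster label.
        same_cluster = labels[:, None] == labels[None, :]
        coassoc += w * same_cluster
    return coassoc

# Toy ensemble of three partitions over four samples (assumed data).
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
# Assumed weights; the second, low-weight partition barely affects the result.
weights = [0.5, 0.1, 0.4]

C = weighted_coassociation(partitions, weights)
print(np.round(C, 2))
```

A consensus partition could then be obtained by running, for example, hierarchical clustering on `1 - C` as a dissimilarity matrix. Cluster label values are arbitrary per partition, which is why the first and third partitions above count as the same grouping.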

Language:English
Keywords:cluster analysis, cluster validity index, dimensionality reduction, ensemble learning, feature extraction, feature selection, gene expression, weighted ensemble
Work type:Article
Typology:1.01 - Original Scientific Article
Organization:FRI - Faculty of Computer and Information Science
Publication status:Published
Publication version:Version of Record
Year:2020
Number of pages:pp. 113720-113736
Numbering:Vol. 8
PID:20.500.12556/RUL-125568
UDC:004
ISSN on article:2169-3536
DOI:10.1109/ACCESS.2020.3003046
COBISS.SI-ID:20600067
Publication date in RUL:25.03.2021
Views:1132
Downloads:302

Record is a part of a journal

Title:IEEE access
Publisher:Institute of Electrical and Electronics Engineers
ISSN:2169-3536
COBISS.SI-ID:519839513

Licences

License:CC BY 4.0, Creative Commons Attribution 4.0 International
Link:http://creativecommons.org/licenses/by/4.0/
Description:This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.
Licensing start date:25.03.2021

Secondary language

Language:Slovenian
Keywords:analiza gruč, kazalci kvalitete razvrstitve, zmanjševanje razsežnosti podatkov, učenje z ansamblom, izločanje značilnic, izbira značilnic, genska izraženost, utežen ansambel

Projects

Funder:ARRS - Slovenian Research Agency
Project number:P2-0241
Name:Sinergetika kompleksnih sistemov in procesov
