Weighted cluster ensemble based on partition relevance analysis with reduction step

Ilc, Nejc (Author)

	PDF - Presentation file, download (3.21 MB) MD5: 384C827C93012CBDECF842A9DE5AFAD1
	URL - Source URL, to access visit https://ieeexplore.ieee.org/document/9119391

Abstract

Over the last decade, the advent of the cluster ensemble framework has enabled more accurate and robust data analysis than traditional single clustering algorithms. The improved clustering of microarray data has had a particularly strong impact in the fields of genomics and medicine. However, when we bring several ensemble members together to form a consensus, low-quality data partitions can seriously compromise the final solution. One way to overcome this problem is the weighted cluster ensemble approach based on Partition Relevance Analysis (PRA), which uses internal cluster validity indices to evaluate and weight the ensemble members before the fusion. Unfortunately, the selection of appropriate validation indices for given data is far from trivial. In this paper, we propose an additional step in PRA that reduces the size of the committee of cluster validation indices. It does so by eliminating redundant and noisy indices using data dimensionality reduction methods. Our extension works in an unsupervised way, minimizing the amount of user intervention and required expert knowledge. We adapted three conventional consensus functions based on the principle of evidence accumulation to work with PRA weights. We demonstrate the advantages of the proposed reduction step of PRA based on extensive experiments with 25 gene expression and 15 non-genetic real-world datasets, where we compared 15 consensus functions. The source code is available at https://github.com/nejci/PRAr.
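The weighting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the code from https://github.com/nejci/PRAr and does not reproduce the paper's consensus functions or its reduction step. All names (pra_weights, weighted_coassociation, consensus) are hypothetical, and several choices are assumptions made only for illustration: higher index scores mean better partitions, each index is min-max scaled across partitions and the scaled scores are averaged into one weight, and the consensus is obtained by linking every pair of points whose weighted co-association exceeds a threshold (equivalent to cutting a single-linkage dendrogram at that similarity).

```python
from itertools import combinations


def pra_weights(index_scores):
    """One weight per partition from validity-index scores (assumed: higher is better).

    index_scores[p][k] is the score of partition p under index k. Each index is
    min-max scaled across partitions, then the scaled scores are averaged.
    """
    n_parts, n_idx = len(index_scores), len(index_scores[0])
    scaled = [[0.0] * n_idx for _ in range(n_parts)]
    for k in range(n_idx):
        col = [row[k] for row in index_scores]
        lo, hi = min(col), max(col)
        for p in range(n_parts):
            # A constant index cannot rank partitions, so it treats them equally.
            scaled[p][k] = 1.0 if hi == lo else (col[p] - lo) / (hi - lo)
    return [sum(row) / n_idx for row in scaled]


def weighted_coassociation(partitions, weights):
    """co[i][j] is the weight share of partitions that put points i and j together."""
    n, total = len(partitions[0]), sum(weights)
    co = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += w / total
    return co


def consensus(co, threshold=0.5):
    """Connected components of the graph with edges where co[i][j] > threshold."""
    parent = list(range(len(co)))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(co)), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)
    roots = {}  # root -> consensus label, numbered in order of first appearance
    return [roots.setdefault(find(i), len(roots)) for i in range(len(co))]


# Toy ensemble: one plausible partition and two identical low-quality ones.
partitions = [[0, 0, 0, 1, 1, 1], [0, 1, 0, 1, 0, 1], [0, 1, 0, 1, 0, 1]]
# Made-up scores of each partition under two hypothetical validity indices.
index_scores = [[0.9, 0.8], [0.2, 0.3], [0.1, 0.2]]

weighted = consensus(weighted_coassociation(partitions, pra_weights(index_scores)))
unweighted = consensus(weighted_coassociation(partitions, [1.0, 1.0, 1.0]))
print(weighted)    # [0, 0, 0, 1, 1, 1]
print(unweighted)  # [0, 1, 0, 1, 0, 1]
```

In this toy case the two low-quality partitions outvote the good one when all members count equally, while the index-based weights let the better partition dominate the consensus.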

Language:	English
Keywords:	cluster analysis, cluster validity index, dimensionality reduction, ensemble learning, feature extraction, feature selection, gene expression, weighted ensemble
Material type:	Journal article
Typology:	1.01 - Original scientific article
Organization:	FRI - Faculty of Computer and Information Science
Publication status:	Published
Publication version:	Published version
Year of publication:	2020
Pages:	pp. 113720-113736
Numbering:	Vol. 8
PID:	20.500.12556/RUL-125568
UDC:	004
ISSN on article:	2169-3536
DOI:	10.1109/ACCESS.2020.3003046
COBISS.SI-ID:	20600067
Publication date in RUL:	25.03.2021
Views:	1130
Downloads:	302

The material is part of a journal

Title:	IEEE Access
Publisher:	Institute of Electrical and Electronics Engineers
ISSN:	2169-3536
COBISS.SI-ID:	519839513

Licenses

License:	CC BY 4.0, Creative Commons Attribution 4.0 International

Link:	http://creativecommons.org/licenses/by/4.0/deed.sl
Description:	This is the standard Creative Commons license that gives users the most freedom to reuse the work, provided they credit the author.
Licensing start date:	25.03.2021

Secondary language

Language:	Slovenian
Keywords:	analiza gruč, kazalci kvalitete razvrstitve, zmanjševanje razsežnosti podatkov, učenje z ansamblom, izločanje značilnic, izbira značilnic, genska izraženost, utežen ansambel

Projects

Funder:	ARRS - Slovenian Research Agency (Agencija za raziskovalno dejavnost Republike Slovenije)
Project number:	P2-0241
Title:	Synergetics of complex systems and processes
