DPDRC, a novel machine learning method about the decision process for dimensionality reduction before clustering

Dessureault, J.-S. and Massicotte, D. (2022). DPDRC, a novel machine learning method about the decision process for dimensionality reduction before clustering. AI, 3 (1). pp. 1-21. ISSN 2673-2688 DOI 10.3390/ai3010001

Abstract

This paper examines the critical decision process of reducing the dimensionality of a dataset before applying a clustering algorithm. Choosing between feature extraction and feature selection is a recurring challenge. Evaluating the importance of features is not straightforward, because the most popular methods for doing so are usually intended for supervised learning. This paper proposes a novel method called “Decision Process for Dimensionality Reduction before Clustering” (DPDRC). It chooses the best dimensionality reduction method (selection or extraction) according to the data scientist’s parameters and the profile of the data, with the aim of applying a clustering process at the end. It uses a Feature Ranking Process Based on Silhouette Decomposition (FRSD) algorithm, a Principal Component Analysis (PCA) algorithm, and a K-means algorithm along with its metric, the Silhouette Index (SI). The paper presents five scenarios based on different parameters. It also discusses the impacts, advantages, and disadvantages of each choice that can be made in this unsupervised learning process.
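
To make the Silhouette Index (SI) mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the DPDRC or FRSD algorithm from the paper. It only computes the standard mean silhouette coefficient for a fixed cluster labelling and compares the score with and without one feature. The function names `silhouette_index` and `keep_features`, the toy data, and the fixed labels are all assumptions made for illustration.

```python
from math import dist
from statistics import mean


def silhouette_index(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the distance from p to itself is 0, so dividing by len - 1 is correct)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(p, q) for q in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


def keep_features(points, indices):
    """Hypothetical helper: keep only the feature columns listed in `indices`."""
    return [tuple(p[i] for i in indices) for p in points]


# Toy data (assumed): features 0 and 1 separate the two clusters,
# feature 2 is noise that is unrelated to the labels.
points = [
    (0.0, 0.0, 5.0),
    (0.0, 1.0, -5.0),
    (10.0, 0.0, -5.0),
    (10.0, 1.0, 5.0),
]
labels = [0, 0, 1, 1]

si_all = silhouette_index(points, labels)
si_selected = silhouette_index(keep_features(points, [0, 1]), labels)
print(f"SI with all features:     {si_all:.3f}")
print(f"SI without noisy feature: {si_selected:.3f}")
```

In this toy case, dropping the noisy feature raises the SI, which is the kind of signal a silhouette-based feature ranking can exploit. In a real pipeline the labels would come from a clustering run (for example K-means) on each candidate feature set rather than being fixed in advance.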

Document type: Article
Free keywords: DPDRC algorithm; Feature extraction; Feature selection; FRSD algorithm; PCA algorithm; K-means algorithm; Silhouette index
Date deposited: 09 May 2022 20:14
Last modified: 09 May 2022 20:14
Deposited document version: Publisher's official version
URI: https://depot-e.uqtr.ca/id/eprint/10121
