A Density-Weighted Information Gain Tree for Clustering Mixed-Type Data

Research output: Conference contribution (peer-reviewed)

Abstract

Clustering mixed-type data, which includes both continuous and categorical features, presents significant challenges due to the distinct nature of these data types. Many traditional distance-based and density-based methods struggle with mixed-type data because they are not designed to handle continuous and categorical features simultaneously. To address these limitations, we propose the Density-Weighted Information Gain (DWIG) Tree algorithm, which effectively manages mixed datasets by integrating continuous and categorical features through a recursive partitioning strategy. The DWIG Tree maximizes information gain while accounting for local density variations, resulting in more accurate and interpretable clustering outcomes. Experiments on both synthetic and real-world datasets demonstrate that the DWIG Tree outperforms K-Prototypes, highlighting its superior capability to handle mixed-type data and capture natural groupings more accurately.

Original language: English
Title of host publication: Proceedings - 2024 7th International Conference on Data Science and Information Technology, DSIT 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350384093
DOI
Publication status: Published - 2024
Event: 7th International Conference on Data Science and Information Technology, DSIT 2024 - Nanjing, China
Duration: 20 Dec 2024 to 22 Dec 2024

Publication series

Name: Proceedings - 2024 7th International Conference on Data Science and Information Technology, DSIT 2024

Conference

Conference: 7th International Conference on Data Science and Information Technology, DSIT 2024
Country/Territory: China
City: Nanjing
Period: 20/12/24 to 22/12/24
