TY - JOUR
T1 - An approach based on clustering for detecting differentially expressed genes in microarray data analysis
AU - Ando, Yuki
AU - Shimokawa, Asanao
N1 - Publisher Copyright:
© (2024), (Korean Statistical Society). All rights reserved.
PY - 2024
Y1 - 2024
AB - To identify differentially expressed genes (DEGs), researchers typically apply a test to each gene separately. However, microarray data are often characterized by high dimensionality and a small sample size, which reduces analytical power and increases the number of tests. We therefore propose a clustering-based method in which genes with similar expression patterns are clustered and a test is conducted for each cluster. This increases the sample size for each test and reduces the number of tests. Because independence between samples cannot be assumed when genes are related, the proposed method uses a nonparametric permutation test. We compared the accuracy of the proposed method with that of conventional methods. In the simulations, each method was applied to data generated under positive correlation between genes, and the area under the curve, power, and type I error were calculated. The results show that the proposed method outperforms the conventional methods in all simulated conditions. We also found that when independence between samples cannot be assumed, the nonparametric permutation test controls the type I error better than the t-test.
KW - DEGs
KW - microarray data
KW - permutation test
KW - two group comparison
UR - https://www.scopus.com/pages/publications/85206341827
U2 - 10.29220/CSAM.2024.31.5.571
DO - 10.29220/CSAM.2024.31.5.571
M3 - Article
AN - SCOPUS:85206341827
SN - 2287-7843
VL - 31
SP - 571
EP - 584
JO - Communications for Statistical Applications and Methods
JF - Communications for Statistical Applications and Methods
IS - 5
ER -