pius_majuto

Pius Albert Albert

Recently Published

Unsupervised Clustering of Breast Cancer Gene Expression Data Using R
This report presents an exploratory analysis of the breastCancerVDX dataset. I applied unsupervised clustering algorithms, including hierarchical clustering and k-means, to identify natural groupings in the gene expression data. The clValid package is used to evaluate cluster quality and stability, and the clusters are visualized with Principal Component Analysis (PCA). The key finding is that the resulting clusters correspond strongly to the known Estrogen Receptor (ER) status, with an accuracy of over 85%, which shows that clustering can uncover meaningful biological signatures.
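
The summary does not say how the agreement between clusters and ER status was scored. One common way to get an "accuracy" from unsupervised clusters is to give each cluster the majority label of its members and count the samples that match. The sketch below illustrates that idea in plain Python rather than R. The function name `cluster_label_accuracy` and the toy data are hypothetical and are not taken from the report or from breastCancerVDX.

```python
from collections import Counter


def cluster_label_accuracy(clusters, labels):
    """Fraction of samples whose label equals the majority label of their cluster.

    This is a purity-style score: each cluster is mapped to its most common
    label, so two clusters may map to the same label.
    """
    if len(clusters) != len(labels):
        raise ValueError("clusters and labels must have the same length")
    if not clusters:
        raise ValueError("at least one sample is required")

    # Count how often each known label occurs inside each cluster.
    counts_by_cluster = {}
    for cluster_id, label in zip(clusters, labels):
        counts_by_cluster.setdefault(cluster_id, Counter())[label] += 1

    # A sample counts as matched if it carries its cluster's majority label.
    matched = sum(
        counts.most_common(1)[0][1] for counts in counts_by_cluster.values()
    )
    return matched / len(labels)


# Hypothetical toy data: 10 samples, 2 clusters, binary ER status.
clusters = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
er_status = ["neg", "neg", "neg", "pos", "pos", "pos", "pos", "pos", "pos", "neg"]

accuracy = cluster_label_accuracy(clusters, er_status)
print(accuracy)  # 8 of 10 samples match their cluster's majority label -> 0.8
```

Because every cluster takes its own majority label, this score can only stay the same or rise as the number of clusters grows, so it is most informative when the number of clusters equals the number of known classes, as with two clusters and binary ER status.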