Re-Clustering Documents to Enhance Search Accuracy with Imbalanced Abbreviation Data
Abbreviation ambiguity poses significant challenges when searching academic literature. This study evaluated the accuracy of clustering algorithms on imbalanced datasets with varying ratios of target groups. The corpus consisted of 1052 papers focused on the study of abbreviations. The "MSA" dataset was clustered using TF-IDF, cosine sim