陈俊帆
Paper
Prototype-Guided Pseudo Labeling for Semi-Supervised Text Classification
Posted: 2025-10-22
Published in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), CCF-A
Abstract: The semi-supervised text classification (SSTC) task aims at training text classification models with a few labeled data and massive unlabeled data. Recent works achieve this task by pseudo-labeling methods that assign pseudo-labels to unlabeled data as additional supervision. However, these models may suffer from incorrect pseudo-labels caused by underfitting of decision boundaries and generating biased pseudo-labels on imbalanced data. We propose a prototype-guided semi-supervised model to address the above problems, which integrates a prototype-anchored contrasting strategy and a prototype-guided pseudo-labeling strategy. Particularly, the prototype-anchored contrasting constructs prototypes to cluster text representations with the same class, forcing them to be high-density distributed, thus alleviating the underfitting of decision boundaries. And the prototype-guided pseudo-labeling selects reliable pseudo-labeled data around prototypes based on data distribution, thus alleviating the bias from imbalanced data. Empirical results on 4 commonly-used datasets demonstrate that our model is effective and outperforms state-of-the-art methods.
Co-authors: Weiyi Yang, 张日崇, 陈俊帆, Lihong Wang, Jaein Kim
Paper type: International academic conference
Page range: 16369-16382
Publication date: 2023-01-01