Beihang University Homepage Platform: Junfan Chen (陈俊帆) -- Chinese Homepage -- Adversarial Word Dilution as Text Data Augmentation in Low-Resource Regime

Paper

Adversarial Word Dilution as Text Data Augmentation in Low-Resource Regime

Posted: 2025-10-22

Published in: Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), CCF-A

Abstract: Data augmentation is widely used in text classification, especially in the low-resource regime where only a few examples per class are available during training. Despite this success, generating augmentations as hard positive examples, which may increase their effectiveness, remains under-explored. This paper proposes an Adversarial Word Dilution (AWD) method that generates hard positive examples as text data augmentations to train low-resource text classification models efficiently. The idea is to dilute the embeddings of strong positive words by weighted mixing with the unknown-word embedding, making the augmented inputs hard for the classification model to recognize as positive. The dilution weights are learned adversarially through a constrained min-max optimization process guided by the labels. Empirical studies on three benchmark datasets show that AWD generates more effective data augmentations and outperforms state-of-the-art text data augmentation methods. Additional analysis demonstrates that the augmentations generated by AWD are interpretable and can be flexibly extended to new examples without further training.
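The dilution step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy mean-pooled logistic classifier, the box constraint `[0, budget]` on the weights, and the names `dilute`, `positive_prob`, and `adversarial_weights` are all assumptions made for illustration. Only the inner maximization (finding weights that make a positive example harder) is shown; the outer minimization, which trains the classifier on the diluted examples, is omitted.

```python
import math

def dilute(word_embs, unk_emb, weights):
    """Mix each word embedding with the unknown-word embedding.

    weights[i] = 0 keeps word i unchanged; weights[i] = 1 replaces it
    entirely with the unknown-word embedding.
    """
    return [
        [(1.0 - w) * x + w * u for x, u in zip(emb, unk_emb)]
        for emb, w in zip(word_embs, weights)
    ]

def positive_prob(word_embs, clf_w):
    """Toy classifier (assumption): logistic regression on the mean embedding."""
    n = len(word_embs)
    mean = [sum(col) / n for col in zip(*word_embs)]
    score = sum(m * c for m, c in zip(mean, clf_w))
    return 1.0 / (1.0 + math.exp(-score))

def adversarial_weights(word_embs, unk_emb, clf_w, budget=0.5, steps=20, lr=1.0):
    """Inner maximization: projected gradient ascent on the loss -log p(positive).

    The box constraint [0, budget] is a stand-in for the paper's constraint,
    whose exact form is not given in the abstract.
    """
    n = len(word_embs)
    weights = [0.0] * n
    for _ in range(steps):
        p = positive_prob(dilute(word_embs, unk_emb, weights), clf_w)
        for i, emb in enumerate(word_embs):
            # d(score)/d(w_i) = clf_w . (unk - emb_i) / n
            direction = sum(c * (u - x) for c, u, x in zip(clf_w, unk_emb, emb))
            # d(-log p)/d(w_i) = (p - 1) * d(score)/d(w_i)
            grad = (p - 1.0) * direction / n
            # Ascent step, then projection back onto [0, budget].
            weights[i] = max(0.0, min(budget, weights[i] + lr * grad))
    return weights
```

In this sketch, words whose embeddings push the classifier strongly toward the positive class receive the largest dilution weights, while neutral or negative words are left untouched, which is one way to read the abstract's claim that the learned weights are interpretable.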

Co-authors: Junfan Chen (陈俊帆), Richong Zhang (张日崇), Zheyan Luo, Chunming Hu (胡春明), Yongyi Mao

Paper type: International academic conference

Pages: 12626-12634

Translated work: No

Publication date: 2023-01-01
