Paper
A Neural Expectation-Maximization Framework for Noisy Multi-Label Text Classification
Posted: 2025-10-22
Venue: IEEE Transactions on Knowledge and Data Engineering (TKDE), CCF-A
Abstract: Multi-label text classification (MLTC) has a wide range of real-world applications. Neural networks have recently advanced the performance of MLTC models, but training these models relies on sufficient, accurately labelled data, and manually annotating large-scale MLTC datasets is expensive and impractical for many applications. Weak supervision techniques have therefore been developed to reduce the cost of annotating text corpora; however, they introduce noisy labels into the training data and may degrade model performance. This paper addresses such noisy-label problems in MLTC in both single-instance and multi-instance settings. We build a novel Neural Expectation-Maximization Framework (nEM) that combines neural networks with probabilistic modelling. The nEM framework produces text representations using neural-network text encoders and is optimized with the Expectation-Maximization algorithm. It naturally accounts for noisy labels during learning by iteratively updating the model parameters and estimating the distribution of the ground-truth labels. We evaluate nEM on multi-instance noisy MLTC using a benchmark relation extraction dataset constructed by distant supervision, and on single-instance noisy MLTC using synthetic noisy datasets constructed by keyword supervision and label flipping. The experimental results demonstrate that nEM significantly improves upon baseline models in both single-instance and multi-instance noisy MLTC tasks. Further analysis suggests that nEM effectively reduces the noisy labels in MLTC datasets and thereby improves model performance.
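The alternation the abstract describes (estimating the distribution of the ground-truth labels, then updating model parameters) can be illustrated with a heavily simplified EM loop. This is a sketch, not the paper's nEM: it uses a single binary label, a plain logistic model instead of a neural text encoder, and assumes a known symmetric flip-noise rate `eps`; all names and data here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def e_step(probs, noisy_labels, eps):
    """Posterior that the true label is 1, given model probs and the
    observed noisy labels, under a symmetric flip-noise model with rate eps."""
    like1 = np.where(noisy_labels == 1, 1 - eps, eps)   # p(observed | true=1)
    like0 = np.where(noisy_labels == 1, eps, 1 - eps)   # p(observed | true=0)
    num = probs * like1
    return num / (num + (1 - probs) * like0)

def m_step(X, q, w, lr=0.5, steps=50):
    """Gradient ascent on the expected log-likelihood with soft targets q."""
    for _ in range(steps):
        p = sigmoid(X @ w)
        w = w + lr * X.T @ (q - p) / len(X)
    return w

# Toy data: true rule y = 1 iff x0 > 0; 30% of the observed labels are flipped.
X = rng.normal(size=(500, 2))
y_true = (X[:, 0] > 0).astype(float)
flip = rng.random(500) < 0.3
y_noisy = np.where(flip, 1 - y_true, y_true)

w = np.zeros(2)
for _ in range(10):                       # EM iterations
    probs = sigmoid(X @ w)                # current model beliefs
    q = e_step(probs, y_noisy, eps=0.3)   # E-step: denoised soft labels
    w = m_step(X, q, w)                   # M-step: refit on soft labels

acc = float(((sigmoid(X @ w) > 0.5) == y_true).mean())
print(acc)
```

Even though 30% of the training labels are wrong, the E-step's posterior pulls the soft targets back toward the true labels, so the fitted model recovers the underlying rule with high accuracy on the clean labels.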
Co-authors: 陈俊帆, 张日崇, Jie Xu, 胡春明, Yongyi Mao
Paper type: International journal
Pages: 10992-11003
Translation:
Publication date: 2023-01-01