Text Classification Based on Multi-granularity Label Perturbation (基于多粒度标签扰动的文本分类研究)

Yao Rujing (姚汝婧), Wang Fang (王芳)

Journal of Modern Information (现代情报), 2024, Vol. 44, Issue 1: 25-36, 12. DOI: 10.3969/j.issn.1008-0821.2024.01.003

Text Classification Based on Multi-granularity Label Perturbation

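Yao Rujing 1, Wang Fang 1

Author information

  • 1. Department of Information Resources Management, Business School, Nankai University, Tianjin 300071; Center for Network Society Governance, Nankai University, Tianjin 300071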

Abstract

[Purpose/Significance] Supervised deep learning algorithms are currently the main research methods for text classification. However, training them relies heavily on the accuracy of the sample labels. Because annotators differ in experience and judge subjectively, sample labels inevitably contain noise. Label perturbation is an effective way to deal with noisy labels, but current noisy-label learning algorithms based on label perturbation do not make effective use of information at multiple granularities, which limits their performance. [Method/Process] To address this problem, the paper proposes a multi-granularity label perturbation algorithm (MGLP), which combines sample-level and category-level perturbation methods. MGLP uses the idea of meta-learning to learn the fusion weights of the perturbation methods at different granularities, so that the weights adapt to the characteristics of different data. [Result/Conclusion] The paper conducts experiments on three text classification datasets, covering tweet sentiment classification, movie review sentiment classification, and citation intent classification. The results show that the proposed MGLP algorithm effectively improves the performance of deep learning models in text classification tasks and has broad application prospects in information organization and information analysis.
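The abstract does not give MGLP's formulas, so the following is only a minimal sketch of the general idea of fusing a sample-level and a category-level label perturbation, not the authors' algorithm. Everything in it is an assumption made for illustration: the function names, the convex-combination form, the `strength` parameter, and the choice of a model prediction as the sample-level signal. In particular, `fusion_weight` is a fixed number here, whereas the paper learns the fusion weights through meta-learning, whose details the abstract does not describe.

```python
import math
from typing import List, Sequence


def one_hot(label: int, num_classes: int) -> List[float]:
    """Return the unperturbed (hard) target distribution for a label."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def perturb_label(
    label: int,
    sample_dist: Sequence[float],
    class_dists: Sequence[Sequence[float]],
    fusion_weight: float,
    strength: float = 0.1,
) -> List[float]:
    """Soften a hard label with a fused two-granularity perturbation.

    Hypothetical formulation (not taken from the paper):
      sample_dist   - a distribution specific to this sample, e.g. a
                      model's predicted class probabilities for it
      class_dists   - one distribution per class, shared by every sample
                      carrying that label (category-level granularity)
      fusion_weight - share given to the sample-level term, in [0, 1]
      strength      - total probability mass moved off the hard label
    """
    if not 0.0 <= fusion_weight <= 1.0 or not 0.0 <= strength <= 1.0:
        raise ValueError("fusion_weight and strength must lie in [0, 1]")
    hard = one_hot(label, len(sample_dist))
    class_dist = class_dists[label]
    # Convex combination, so the result still sums to 1 when the inputs do.
    return [
        (1.0 - strength) * h
        + strength * (fusion_weight * s + (1.0 - fusion_weight) * c)
        for h, s, c in zip(hard, sample_dist, class_dist)
    ]


def cross_entropy(target: Sequence[float], predicted: Sequence[float]) -> float:
    """Cross-entropy against a soft target.

    Assumes predicted[k] > 0 wherever target[k] > 0.
    """
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0.0)


if __name__ == "__main__":
    prediction = [0.6, 0.3, 0.1]        # sample-level signal (assumed)
    uniform = [[1.0 / 3.0] * 3] * 3     # category-level signal (assumed)
    soft = perturb_label(0, prediction, uniform, fusion_weight=0.5)
    print("perturbed target:", soft)
    print("loss:", cross_entropy(soft, prediction))
```

Setting `fusion_weight` to 0 with uniform `class_dists` reduces this sketch to ordinary uniform label smoothing, which is one reason the convex-combination form was chosen for the illustration.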

Key words

text classification/deep learning/label perturbation/meta-learning/multi-granularity

Classification

Information Technology and Security Science

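Cite this article

Yao Rujing, Wang Fang. Text Classification Based on Multi-granularity Label Perturbation[J]. Journal of Modern Information, 2024, 44(1): 25-36, 12.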

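Funding

Major Project of the National Social Science Fund of China, "Research on Intelligent Governance of Digital Government Based on Data Sharing and Knowledge Reuse" (Grant No. 20ZDA039).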

Journal of Modern Information (现代情报)

Indexed in: OA / PKU Core (北大核心) / CHSSCD / CSSCI / CSTPCD

ISSN 1008-0821
