Computer Technology and Development (计算机技术与发展), 2026, Vol. 36, Issue 4: 32-40, 9. DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0326
基于增强命名实体识别的开源许可证条款识别方法
Open Source License Term Identification Method Based on Enhanced Named Entity Recognition
Abstract
Open-source licenses have developed significantly alongside the widespread adoption of open-source software, defining the rights and obligations of open-source software users. Developers who use open-source code without adhering to the corresponding license terms may face legal risks such as infringement. Therefore, identifying open-source license terms and understanding license content play a crucial role in risk management and intellectual property protection. Existing license term identification methods suffer from limited applicability and insufficient accuracy. We propose a license term identification method based on enhanced named entity recognition (LTNER) to achieve automated and precise identification of license terms in open-source licenses. The method constructs an open-source license term identification model, leveraging BERT's contextual capture capabilities and deep semantic understanding to flexibly analyze various license texts, and uses named entity recognition to determine the terms contained in the license. Experimental results show that the LTNER model outperforms the existing state-of-the-art tool LiDetector in terms of recall and F1 score, with a 14.68% increase in recall and a 6.71% increase in F1 score, validating the model's effectiveness in identifying open-source license terms.
Keywords
open source software / license terms / named entity recognition / BERT / conditional random field (CRF) / open source governance
Classification
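Information Technology and Security Science
Cite this article
CHENG Yuhao (程宇豪), HUANG Zijie (黄子杰), GAO Jianhua (高建华). Open Source License Term Identification Method Based on Enhanced Named Entity Recognition [J]. Computer Technology and Development, 2026, 36(4): 32-40, 9.
Funding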
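China Postdoctoral Science Foundation, General Program (2024M761927)
Shanghai "Science and Technology Innovation Action" Rising-Star Program, Sailing Special Project (24YF2719900)
Shanghai High-Level Institution Construction and Operation Plan, "Soft Science Research" Youth Project (25692112700)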