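Research on popularity stage prediction methods for open-source community projects: a case study of large language model projects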

Lu Xiaoyi, Chen Yang, Fu Xiaoming, Li Cong

Chinese Journal of Intelligent Science and Technology, 2025, Vol. 7, Issue 4: 505-516, 12. DOI: 10.11959/j.issn.2096-6652.202542

Research on popularity stage prediction methods for open-source community projects: a case study of large language model projects

Lu Xiaoyi (卢小艺) 1, Chen Yang (陈阳) 2, Fu Xiaoming (傅晓明) 3, Li Cong (李聪) 4

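Author information

  • 1. College of Computer Science and Artificial Intelligence, Fudan University, Shanghai 200438; Adaptive Networks and Control Laboratory, College of Future Information Technology, Fudan University, Shanghai 200438
  • 2. College of Computer Science and Artificial Intelligence, Fudan University, Shanghai 200438; Research Center for Social Intelligence, Fudan University, Shanghai 200438
  • 3. Research Center for Social Intelligence, Fudan University, Shanghai 200438; Institute of Computer Science, University of Göttingen, Göttingen 37073
  • 4. Adaptive Networks and Control Laboratory, College of Future Information Technology, Fudan University, Shanghai 200438; Research Center for Social Intelligence, Fudan University, Shanghai 200438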

Abstract

In recent years, the cumulative number of registered developers on the open-source community GitHub has exceeded 100 million, intensifying the competition among projects for user attention. Understanding the dynamics of project popularity in open-source communities helps researchers and developers grasp technological trends, and also provides a reference for resource allocation, collaboration decisions, and research investment. However, existing studies often treat popularity prediction as a continuous numerical regression or binary classification task, overlooking the stage-wise evolutionary patterns and gradual transitions exhibited in real project popularity dynamics. Taking open-source large language model (LLM) projects as the research object, an integrated framework for popularity stage segmentation and prediction was proposed, enabling the automatic characterization and forecasting of project popularity stages. Specifically, feature vectors based on the growth rate and cumulative scale of project attention within fixed windows were extracted, and a Gaussian mixture model (GMM) was employed for soft clustering to obtain four interpretable popularity stages with probabilistic labels. Subsequently, a prediction model based on the temporal fusion transformer (TFT) was designed, which leveraged sequential data of five behavioral indicators, such as attention, number of forks, and number of commits, to forecast the stage probability distribution in future windows. Experimental results demonstrate that the proposed method outperforms existing baselines in terms of cross-entropy, mean squared error, mean absolute error, and accuracy, while maintaining high precision across all four patterns of popularity evolution, thereby validating its generalization capability. The proposed approach provides a practical and scalable framework for popularity stage prediction of open-source projects, with potential for practical application.
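As an illustration of two ideas in the abstract, the sketch below computes per-window features from a cumulative attention series and scores a predicted stage distribution against a soft (probabilistic) label. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the relative-growth and log-scale definitions, and the example numbers are all hypothetical, and the abstract does not specify how growth rate or cumulative scale are actually computed.

```python
import math

def window_features(cumulative, window):
    """Return one (growth_rate, cumulative_scale) pair per fixed window.

    Assumed definitions (not taken from the paper):
      growth_rate      = relative increase of the count over the window
      cumulative_scale = log(1 + count at the end of the window)
    """
    feats = []
    for start in range(0, len(cumulative) - window, window):
        first = cumulative[start]
        last = cumulative[start + window]
        growth = (last - first) / max(first, 1)  # guard against a zero start
        scale = math.log1p(last)
        feats.append((growth, scale))
    return feats

def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy between a soft stage label and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, pred))

def mse(target, pred):
    """Mean squared error over the stage probabilities."""
    return sum((t - p) ** 2 for t, p in zip(target, pred)) / len(target)

def mae(target, pred):
    """Mean absolute error over the stage probabilities."""
    return sum(abs(t - p) for t, p in zip(target, pred)) / len(target)

def same_stage(target, pred):
    """True when both distributions put the most mass on the same stage."""
    return target.index(max(target)) == pred.index(max(pred))

# Hypothetical cumulative attention counts, one value per time step.
series = [10, 12, 15, 20, 30, 45, 60, 70, 75]
features = window_features(series, window=4)

# Hypothetical soft label and prediction over four popularity stages.
soft_label = [0.1, 0.7, 0.2, 0.0]
predicted = [0.2, 0.6, 0.1, 0.1]
scores = {
    "cross_entropy": cross_entropy(soft_label, predicted),
    "mse": mse(soft_label, predicted),
    "mae": mae(soft_label, predicted),
    "same_stage": same_stage(soft_label, predicted),
}
```

In a fuller pipeline the feature pairs would be passed to a Gaussian mixture model to obtain the soft labels. Because the labels are distributions and not single classes, cross-entropy, MSE, and MAE compare the full probability vectors, while accuracy compares only the most probable stage.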

Key words

open-source community; large language model; machine learning

Classification

Information technology and security science

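Cite this article

Lu Xiaoyi, Chen Yang, Fu Xiaoming, Li Cong. Research on popularity stage prediction methods for open-source community projects: a case study of large language model projects[J]. Chinese Journal of Intelligent Science and Technology, 2025, 7(4): 505-516, 12.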

Funding

The National Natural Science Foundation of China (No.62173095, No.U23A20331, No.62573133, No.62072115)

Shanghai Science and Technology Innovation Action Plan Project (No.22510713600)

Chinese Journal of Intelligent Science and Technology

ISSN 2096-6652
