Journal of East China Normal University (Natural Science), Issue (5): 1-13, 13. DOI: 10.3969/j.issn.1000-5641.2025.05.001
Research on the GitHub developer geographic location prediction method based on multi-dimensional feature fusion
Abstract
Developers' geographic location information is important for understanding the global distribution of open source activity and for formulating regional policies. However, a substantial number of developer accounts on GitHub lack location information, which limits comprehensive analysis of the geographic distribution of the global open source ecosystem. This study proposes a hierarchical geographic location prediction framework based on multi-dimensional feature fusion. By integrating three categories of features (temporal behavior, linguistic culture, and network characteristics), the framework establishes a four-tier progressive prediction mechanism consisting of rule-driven rapid positioning, name-based cultural inference, time zone cross-validation, and a deep learning ensemble. Experiments on a large-scale dataset built from 50,000 globally active developers show that the method predicted the geographic locations of 82.52% of the developers. Among these, the name-based cultural inference layer covered most users with an accuracy of 0.7629, whereas the deep learning ensemble layer handled the most complex cases with an accuracy of 0.7557. A comparison with predictions from the Moonshot large language model validated the advantage of the proposed method in complex geographic inference tasks.
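The four-tier progressive mechanism described in the abstract can be pictured as a cascade in which each tier either returns a location or defers to the next one. The Python sketch below illustrates only that control flow under stated assumptions. It is not the authors' implementation: the profile fields (`email`, `name`, `utc_offset`, `model_guess`), the lookup tables, and all function names are hypothetical placeholders, and the last tier is a stub standing in for a trained deep learning ensemble.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Prediction:
    country: str  # predicted country code
    tier: str     # name of the tier that produced the prediction


# A tier maps a developer profile (a plain dict here) to a country code,
# or returns None to defer to the next tier.
Tier = Callable[[dict], Optional[str]]


def rule_tier(profile: dict) -> Optional[str]:
    # Hypothetical rule: a country-code TLD in the public email address.
    tld_map = {"cn": "CN", "de": "DE", "jp": "JP"}
    email = profile.get("email") or ""
    tld = email.rsplit(".", 1)[-1].lower() if "." in email else ""
    return tld_map.get(tld)


def name_tier(profile: dict) -> Optional[str]:
    # Hypothetical toy surname table standing in for name-based inference.
    surname_map = {"müller": "DE", "tanaka": "JP"}
    tokens = (profile.get("name") or "").lower().split()
    return next((surname_map[t] for t in tokens if t in surname_map), None)


def timezone_tier(profile: dict) -> Optional[str]:
    # Hypothetical toy mapping from an observed UTC offset to a country.
    # Real offsets are shared by many countries, so this is illustrative only.
    offset_map = {5.5: "IN"}
    return offset_map.get(profile.get("utc_offset"))


def ensemble_tier(profile: dict) -> Optional[str]:
    # Stub for a learned model: it simply reads a precomputed guess, if any.
    return profile.get("model_guess")


TIERS: list[tuple[str, Tier]] = [
    ("rules", rule_tier),
    ("name", name_tier),
    ("timezone", timezone_tier),
    ("ensemble", ensemble_tier),
]


def predict(profile: dict, tiers: list[tuple[str, Tier]] = TIERS) -> Optional[Prediction]:
    # Try the tiers in order; the first one that answers wins.
    for tier_name, tier in tiers:
        country = tier(profile)
        if country is not None:
            return Prediction(country, tier_name)
    return None  # no tier could place this developer
```

Ordering the tiers from cheapest to most expensive means the costly model is consulted only for profiles that the simpler tiers cannot resolve, and a profile that no tier can place is left unpredicted.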
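Keywords
GitHub / multi-dimensional feature / deep learning / geographic location prediction
Classification
Information technology and security science
Cite this article
ZHAO Sijia (赵思嘉), HAN Fanyu (韩凡宇), WANG Wei (王伟). Research on the GitHub developer geographic location prediction method based on multi-dimensional feature fusion [J]. Journal of East China Normal University (Natural Science), 2025, (5): 1-13, 13.
Funding
National Natural Science Foundation of China (62137001, 62277017, 61977026)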