Journal of Guangdong University of Technology, 2017, Vol. 34, Issue (3): 89-95, 7. DOI: 10.12052/gdutxb.170029
A Research on Text Information Extraction from Annual Report Based on Domain Ontology
Abstract
Significant financial information can be retrieved from the vast amount of textual data in Chinese business accounting reports (annual reports). However, because this text is unstructured, it is usually difficult to obtain and analyze with traditional computing and database techniques. To address this issue, a unified domain-specific ontology is presented and combined with Chinese natural language processing (NLP) to transform unstructured accounting report text into a structured XBRL-based form along three dimensions: word attribute description, word relation organization, and related knowledge links.
Keywords
extensible business reporting language (XBRL) / domain ontology / financial report
Classification
Information technology and security science
Cite this article
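Liang Zhuoqian, Wang Dong, Zhu Hui, Pan Ding. A Research on Text Information Extraction from Annual Report Based on Domain Ontology [J]. Journal of Guangdong University of Technology, 2017, 34(3): 89-95, 7.
Funding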
National Natural Science Foundation of China (71171097, 71671048)
Fundamental Research Funds for the Central Universities (15JNLH005)
Natural Science Foundation of Guangdong Province (2015A030310506)