

Journal of Southeast University (Natural Science Edition), 2017, Vol. 47, Issue 4: 660-666 (7 pages). DOI: 10.3969/j.issn.1001-0505.2017.04.006

Data-driven method for fine-grained property alignment between Chinese open datasets

Huang Tinglei 1, Zhang Weili 2, Liang Xiao 1, Fu Kun 2

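Author information

  • 1. Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China
  • 2. Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China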

Abstract

To improve property alignment between heterogeneous Chinese open datasets, a data-driven method for fine-grained alignment is proposed. It exploits the extension and domain information of properties to identify equivalence, subsumption, and relevance relations between properties in a unified way. First, the data types of properties are determined using statistical theory, and a type-aware metric is defined to calculate the similarity of properties. On this basis, property relation recognition is modeled as a multi-class classification problem, and features are generated to represent the different property relations and to build a random forest classifier. Experimental results show that the proposed method reaches a precision of 94.6% in determining the data types of properties, and that the final F1 measures for recognizing equivalent, subsumptive, and relevant properties are 71.3%, 57.3%, and 59.9%, respectively. Compared with traditional approaches that focus only on equivalent properties, the fine-grained method improves the precision of recognizing equivalent properties and additionally recognizes subsumptive and relevant properties, which demonstrates its effectiveness on Chinese open datasets.
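The abstract does not give the formulas, so the following is only a minimal sketch of how type determination and type-aware similarity features of this kind might look. The type set (`number`, `date`, `string`), the 0.8 majority threshold, the regular expressions, and all function names are assumptions made for illustration, not the authors' definitions. The asymmetric containment scores are one plausible way to separate subsumption from equivalence.

```python
import re
from collections import Counter

# Hypothetical value patterns; the paper's actual type inventory is not stated.
NUMBER = re.compile(r"^-?\d+(\.\d+)?$")
DATE = re.compile(r"^\d{4}-\d{1,2}-\d{1,2}$")


def value_type(value):
    """Classify a single value as 'number', 'date', or 'string'."""
    v = value.strip()
    if NUMBER.match(v):
        return "number"
    if DATE.match(v):
        return "date"
    return "string"


def property_type(values, threshold=0.8):
    """Give a property the dominant type of its values when that type covers
    at least `threshold` of them (assumed rule); otherwise use 'string'."""
    if not values:
        return "string"
    counts = Counter(value_type(v) for v in values)
    best, n = counts.most_common(1)[0]
    return best if n / len(values) >= threshold else "string"


def jaccard(a, b):
    """Symmetric overlap of two value sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def containment(a, b):
    """Share of a's distinct values that also occur in b (asymmetric)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0


def features(values_a, values_b):
    """Build a small feature dict for a property pair; properties of
    different types are treated as having no value overlap."""
    if property_type(values_a) != property_type(values_b):
        return {"same_type": 0.0, "jaccard": 0.0, "a_in_b": 0.0, "b_in_a": 0.0}
    return {
        "same_type": 1.0,
        "jaccard": jaccard(values_a, values_b),
        "a_in_b": containment(values_a, values_b),
        "b_in_a": containment(values_b, values_a),
    }
```

In a pipeline like the one the abstract describes, feature vectors of this kind would be passed to a multi-class classifier such as a random forest. A pair with high `a_in_b` but lower `b_in_a` hints at subsumption, while high values in both directions hint at equivalence.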

Key words

Chinese property alignment/property data type determination/similarity of properties/heterogeneous data integration/construction of knowledge graphs

Classification

Information technology and security science

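Cite this article

Huang Tinglei, Zhang Weili, Liang Xiao, Fu Kun. Data-driven method for fine-grained property alignment between Chinese open datasets[J]. Journal of Southeast University (Natural Science Edition), 2017, 47(4): 660-666.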

Funding

Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2012AA011005).

Journal of Southeast University (Natural Science Edition)

Indexing: OA, Peking University Core Journals, CSCD, CSTPCD

ISSN 1001-0505
