Research on OCR Technology for Mixed Chinese-English Layouts Based on Feedback Merging (基于反馈合并的中英文混排版面OCR技术研究)

任荣梓 高航

计算机技术与发展 (Computer Technology and Development), 2017, Vol. 27, Issue 3: 39-43, 5. DOI: 10.3969/j.issn.1673-629X.2017.03.008

Investigation on Layout Analysis Technology of Chinese and English Mixed OCR Based on Feedback Merging

任荣梓 1, 高航 1

Author information

  • 1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China

Abstract

Optical Character Recognition (OCR) technology has been widely applied in many areas of social life, and OCR for a single character set has achieved major breakthroughs. However, because Chinese and English layouts differ markedly, the performance of existing mixed Chinese-English OCR remains unsatisfactory. To address the shortcomings of traditional OCR methods, and building on an analysis of the difficulties of segmenting mixed Chinese-English layouts, an improved segmentation method for mixed Chinese-English layout analysis based on feedback merging is proposed. The method preprocesses the image with Canny-operator-based binarization and median filtering, then segments the character region twice using the projection method; the specific segmentation techniques are studied in detail. Experimental results show that the proposed method successfully separates Chinese, English, and numeric characters in mixed documents. Its accuracy reaches 97%, about 8 percentage points higher than the traditional method, and it effectively handles touching (adhered) characters, which traditional methods segment poorly.
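The two-pass projection segmentation mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes an already binarized image (1 = ink, 0 = background), so the Canny and median-filter preprocessing is not shown. The abstract does not describe the feedback-merging rule itself, so `merge_narrow` and its `min_ratio` and `max_ratio` parameters are hypothetical stand-ins: they simply join narrow fragments whose combined width stays close to the line height, as the parts of a left-right structured Chinese character would. This is not the authors' implementation.

```python
from typing import List, Tuple

Image = List[List[int]]   # binary image: 1 = ink, 0 = background
Span = Tuple[int, int]    # half-open interval [start, end)


def projection_runs(profile: List[int]) -> List[Span]:
    """Return the half-open runs where the projection profile is non-zero."""
    runs: List[Span] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ends at a blank gap
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(img: Image) -> List[Span]:
    """First pass: horizontal projection (ink per row) splits text lines."""
    return projection_runs([sum(row) for row in img])


def segment_columns(img: Image, top: int, bottom: int) -> List[Span]:
    """Second pass: vertical projection (ink per column) inside one line."""
    width = len(img[0]) if img else 0
    profile = [sum(img[y][x] for y in range(top, bottom)) for x in range(width)]
    return projection_runs(profile)


def merge_narrow(boxes: List[Span], line_height: int,
                 min_ratio: float = 0.6, max_ratio: float = 1.1) -> List[Span]:
    """Hypothetical merge rule (an assumption, not taken from the paper):
    absorb the right neighbour while the current box is narrower than
    min_ratio * line_height and the merged box stays within
    max_ratio * line_height."""
    merged: List[Span] = []
    i = 0
    while i < len(boxes):
        start, end = boxes[i]
        while (i + 1 < len(boxes)
               and (end - start) < min_ratio * line_height
               and (boxes[i + 1][1] - start) <= max_ratio * line_height):
            end = boxes[i + 1][1]
            i += 1
        merged.append((start, end))
        i += 1
    return merged


def segment_page(img: Image) -> List[Tuple[Span, List[Span]]]:
    """Run both passes and the merge step; returns (line span, character spans)."""
    result = []
    for top, bottom in segment_lines(img):
        columns = segment_columns(img, top, bottom)
        result.append(((top, bottom), merge_narrow(columns, bottom - top)))
    return result
```

A width-only rule like this would also merge pairs of narrow Latin letters, which is why a practical system needs some feedback signal (for example, a recognition result) to accept or reject each merge. The abstract does not say which signal the authors use.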

Key words

character recognition/English and Chinese mixed/layout analysis/separation

Classification

Information Technology and Security Science

Cite this article

任荣梓, 高航. 基于反馈合并的中英文混排版面OCR技术研究[J]. 计算机技术与发展, 2017, 27(3): 39-43, 5.

Funding

Jiangsu Province Special Fund for the Transformation of Scientific and Technological Achievements (BA2012023)

计算机技术与发展 (Computer Technology and Development); OA; CSTPCD; ISSN 1673-629X
