计算机与现代化 (Computer and Modernization), Issue (4): 65-73, 9. DOI: 10.3969/j.issn.1006-2475.2016.04.014
一种支持通配符查询的XML模式匹配算法
An Efficient XML Pattern Matching Algorithm for Supporting Wildcard Query
Abstract
In XML query languages, wildcard queries containing “*” can effectively meet certain special query requirements. In the big data era, however, as XML files grow in size and structural complexity, existing algorithms that support wildcard queries need large amounts of memory to parse the XML file, and they also require many single-path matching operations and caching of local results. To address this, we propose a new XML pattern matching algorithm, WTwigList, that efficiently processes twig patterns containing wildcards. First, the hierarchical relationships of the wildcards in the query pattern are processed to reduce unnecessary wildcard matching. Then the XML file is parsed as a data stream and local Extended Dewey encoding is performed. After a filtering operation, an ordered list of leaf-node encodings is obtained, and the matching results are derived from list matching operations. Experimental results on both real-life and synthetic datasets demonstrate that WTwigList improves query efficiency, has advantages in space efficiency, and handles P-C (parent-child) relationships quickly and accurately.
Keywords
wildcard query / stream data processing / Extended Dewey encoding / XML pattern matching
Classification
Information Technology and Security Science
Citation
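陈冲 (Chen Chong), 蒋夏军 (Jiang Xiajun). 一种支持通配符查询的XML模式匹配算法 (An Efficient XML Pattern Matching Algorithm for Supporting Wildcard Query) [J]. 计算机与现代化 (Computer and Modernization), 2016, (4): 65-73, 9.
Funding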
Supported by the Natural Science Foundation of Jiangsu Province ()
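Illustrative sketch

The abstract describes parsing the XML file as a stream, labelling nodes, keeping an ordered list of leaf-node encodings, and matching against that list. The Python sketch below is only a rough illustration of that general idea and is not the WTwigList algorithm. It assumes plain Dewey labels instead of the Extended Dewey encoding named in the abstract, handles a single path with parent-child edges only (no twig branches), and uses hypothetical names (`leaf_paths`, `matches`, `query`, `DOC`) that do not come from the paper.

```python
import io
import xml.etree.ElementTree as ET


def leaf_paths(xml_text):
    """Stream-parse xml_text and return [(dewey_label, tag_path)] for leaf elements.

    Labels are plain Dewey labels (an assumption, not Extended Dewey):
    the root is (1,), its k-th child is (1, k), and so on.
    """
    tags = []        # tags of the currently open elements, root first
    label = []       # Dewey label of the innermost open element
    counters = [0]   # number of children seen so far at each open level
    has_child = []   # whether each open element has had a child element
    leaves = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if has_child:
                has_child[-1] = True
            counters[-1] += 1
            label.append(counters[-1])
            counters.append(0)
            tags.append(elem.tag)
            has_child.append(False)
        else:
            if not has_child[-1]:
                leaves.append((tuple(label), tuple(tags)))
            tags.pop()
            label.pop()
            counters.pop()
            has_child.pop()
            elem.clear()  # release the subtree that has already been processed
    return leaves


def matches(pattern, tag_path):
    """True if pattern matches the tail of tag_path; '*' matches any one tag."""
    if len(pattern) > len(tag_path):
        return False
    tail = tag_path[len(tag_path) - len(pattern):]
    return all(p == "*" or p == t for p, t in zip(pattern, tail))


def query(xml_text, pattern):
    """Return the Dewey labels of leaves whose path ends with pattern, in document order."""
    return [lab for lab, path in leaf_paths(xml_text) if matches(pattern, path)]


# Hypothetical sample document, for illustration only.
DOC = (
    "<lib>"
    "<book><title/><author/></book>"
    "<journal><title/></journal>"
    "<note/>"
    "</lib>"
)

if __name__ == "__main__":
    print(query(DOC, ["lib", "*", "title"]))
```

In this sketch the pattern `["lib", "*", "title"]` plays the role of a path query such as `lib/*/title`: the wildcard step stands for exactly one element between `lib` and `title`, and because every leaf carries its full label, the answer is read off the ordered leaf list without revisiting inner nodes.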