RDF Multi-query Optimization Algorithm Based on Triple Pattern Reordering
With statistics collection accelerated by the RDF storage index and the scope of semantic pruning narrowed, a distance-based triple pattern reordering algorithm is used to obtain the optimal join order for each individual query. Each query is then converted to a left-deep tree, and a left-deep tree identification algorithm finds the common subtrees and evaluates their cost. For valuable common subtrees, materialized views are established together with a corresponding update and replacement mechanism. Besides optimizing each single query to a certain extent, the reordering increases the likelihood that multiple queries share common result sets. By making full use of this sharing, the overall execution cost of the query set can be reduced. The experimental results show that the proposed algorithm achieves better query performance than existing query schemes on both single queries and multiple queries. The advantage of the multi-query optimization method is most pronounced when the RDF data set is large, the query set contains many queries, and the query statements are more complex.
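To make the idea of sharing work between left-deep trees concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes that each reordered query is given as a list of triple patterns in join order, that a common subtree of two left-deep trees corresponds to a shared prefix of those lists, and that variables are named identically across queries. The function names and the benefit score (patterns saved times the number of additional queries served) are hypothetical stand-ins for the paper's cost evaluation, which the abstract does not specify.

```python
from collections import defaultdict

# A triple pattern is a (subject, predicate, object) tuple; "?x" marks a variable.
# A left-deep plan is given by its join order: a list of triple patterns.


def shared_prefixes(plans, min_queries=2):
    """Map each join-order prefix to the set of query ids whose plan starts with it."""
    owners = defaultdict(set)
    for query_id, plan in plans.items():
        for k in range(1, len(plan) + 1):
            owners[tuple(plan[:k])].add(query_id)
    # Keep only prefixes that at least `min_queries` queries have in common.
    return {p: ids for p, ids in owners.items() if len(ids) >= min_queries}


def best_candidate(plans):
    """Pick the shared prefix with the highest (hypothetical) benefit score."""
    shared = shared_prefixes(plans)
    if not shared:
        return None

    def score(item):
        prefix, ids = item
        # Assumed benefit: joins saved for every query after the first one;
        # ties are broken in favour of the longer prefix.
        return (len(prefix) * (len(ids) - 1), len(prefix))

    return max(shared.items(), key=score)


# Example query set (hypothetical data).
a = ("?p", "rdf:type", "ex:Person")
b = ("?p", "ex:worksAt", "?org")
c = ("?org", "ex:locatedIn", "ex:Vienna")
d = ("?org", "ex:name", "?n")
e = ("?p", "ex:age", "?age")

plans = {"q1": [a, b, c], "q2": [a, b, d], "q3": [a, e]}
prefix, users = best_candidate(plans)
print(len(prefix), sorted(users))
```

In this example the prefix `[a, b]` is shared by `q1` and `q2`, so it would be the candidate for a materialized view. A real system would replace the score with estimated cardinalities and would canonicalize variable names before comparing patterns.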
Keywords: Reordering; Common subtree; Materialized view; Multi-query optimization
I am very grateful to my instructors Jinguang Gu, Fangfang Xu and Haidong Fu for their help and guidance.