Competitor Mining from Web Encyclopedia: A Graph Embedding Approach

Hong, Xin; Jin, Peiquan; Mu, Lin; Zhao, Jie; Wan, Shouhong

doi:10.1007/978-3-030-62005-9_5

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12342)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

Mining competitors from the web has been a valuable and emerging topic in big data and business analytics. While ordinary web pages may contain unreliable information such as fake news, in this paper we aim to extract competitors from web encyclopedias such as Wikipedia and DBpedia, which provide more credible information. We observe that the entities in a web encyclopedia form graph structures. Motivated by this observation, we propose to extract competitors using a graph embedding approach. We first present a general framework for mining competitors from web encyclopedias. Then, we propose to mine competitors based on the similarity among graph nodes, and further present a similarity computation method combining graph-node similarity and textual relevance. We implement the graph-embedding-based algorithm and compare the proposed method with four existing algorithms on real data sets crawled from Wikipedia and DBpedia. The results in terms of precision, recall, and F1-measure suggest the effectiveness of our proposal.
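
As a rough illustration of the idea of combining graph-node similarity with textual relevance, the sketch below ranks candidate entities by a weighted sum of two cosine similarities and then scores the ranking with precision, recall, and F1. The abstract does not state how the two signals are combined, so the weighted sum, the weight `alpha`, the function names, and the toy vectors are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_score(node_u, node_v, text_u, text_v, alpha=0.5):
    """Assumed combination: alpha * node similarity + (1 - alpha) * text similarity."""
    return alpha * cosine(node_u, node_v) + (1 - alpha) * cosine(text_u, text_v)


def rank_competitors(target, entities, alpha=0.5):
    """Rank every entity except `target` by combined score, highest first.

    `entities` maps a name to a (node_embedding, text_embedding) pair.
    """
    node_t, text_t = entities[target]
    scored = [
        (name, combined_score(node_t, node, text_t, text, alpha))
        for name, (node, text) in entities.items()
        if name != target
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def precision_recall_f1(predicted, gold):
    """Precision, recall, and F1 of a predicted competitor list against a gold set."""
    true_positives = len(set(predicted) & set(gold))
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy, hand-made vectors (hypothetical; real embeddings would be learned from the graph).
toy_entities = {
    "A": ([1.0, 0.0], [1.0, 0.0]),
    "B": ([1.0, 0.0], [1.0, 0.0]),
    "C": ([0.0, 1.0], [0.0, 1.0]),
}
ranking = rank_competitors("A", toy_entities)
```

In this sketch, `alpha` controls how much weight the graph-node signal receives relative to the textual one; any value in practice would have to be tuned on held-out data.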

Acknowledgement

This study is supported by the National Key Research and Development Program of China (2018YFB0704404) and the National Science Foundation of China (61672479). Peiquan Jin is the corresponding author.

Author information

Authors and Affiliations

University of Science and Technology of China, Hefei, 230027, Anhui, China
Xin Hong, Peiquan Jin, Lin Mu & Shouhong Wan
Key Laboratory of Electromagnetic Space Information, Chinese Academy of Sciences, Hefei, 230027, Anhui, China
Xin Hong, Peiquan Jin, Lin Mu & Shouhong Wan
School of Business, Anhui University, Hefei, 230601, Anhui, China
Jie Zhao

Corresponding author

Correspondence to Peiquan Jin.

Editor information

Editors and Affiliations

VU Amsterdam, Amsterdam, The Netherlands
Zhisheng Huang
VU Amsterdam, Amsterdam, The Netherlands
Wouter Beek
Victoria University, Melbourne, VIC, Australia
Hua Wang
Swinburne University of Technology, Hawthorn, VIC, Australia
Rui Zhou
Victoria University, Melbourne, VIC, Australia
Yanchun Zhang

About this paper

Cite this paper

Hong, X., Jin, P., Mu, L., Zhao, J., Wan, S. (2020). Competitor Mining from Web Encyclopedia: A Graph Embedding Approach. In: Huang, Z., Beek, W., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2020. WISE 2020. Lecture Notes in Computer Science, vol 12342. Springer, Cham. https://doi.org/10.1007/978-3-030-62005-9_5

DOI: https://doi.org/10.1007/978-3-030-62005-9_5
Published: 18 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62004-2
Online ISBN: 978-3-030-62005-9
eBook Packages: Computer Science, Computer Science (R0)
