DeepAM: Deep Semantic Address Representation for Address Matching

Shan, Shuangli; Li, Zhixu; Qiang, Yang; Liu, An; Xu, Jiajie; Chen, Zhigang

doi:10.1007/978-3-030-26072-9_4

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11641)

Included in the following conference series:

Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data

Abstract

Address matching, which aims to identify addresses in address databases that refer to the same location, is a crucial task in location-based businesses such as take-out services and express delivery. The task is challenging because a single location can be expressed in many different ways, especially in Chinese. Traditional address matching approaches, which rely on string similarity and learned matching rules, can hardly handle addresses with redundant, incomplete, or unusual expressions. In this paper, we propose to map every address into a fixed-size vector in a shared vector space using state-of-the-art deep sentence representation techniques, and then to measure the semantic similarity between addresses in that space. An attention mechanism is also applied to the model to highlight the important features of addresses in their semantic representations. Finally, we propose a novel way to obtain rich contexts for addresses from the web through web search engines, which can strongly enrich the semantic meaning that can be learned for each address. Our empirical study on two real-world address datasets demonstrates that our approach improves both precision (by up to 5%) and recall (by up to 8%) over the existing state-of-the-art methods.
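
The general idea of embedding addresses as fixed-size vectors and comparing them by similarity can be illustrated with a minimal sketch. This is not the DeepAM model: the hash-seeded character vectors stand in for learned embeddings, the attention query is a fixed placeholder rather than a trained parameter, and the names token_vector, attention_pool, embed_address, cosine_similarity, and DIM are all hypothetical. Because nothing here is trained, the similarity scores carry no semantic meaning; the sketch only shows the shape of the pipeline.

```python
import hashlib

import numpy as np

DIM = 64  # assumed embedding size, chosen only for illustration


def token_vector(token: str, dim: int = DIM) -> np.ndarray:
    """Deterministic pseudo-random vector for a token.

    Stand-in for a learned embedding: the seed is derived from a hash
    of the token, so the same token always maps to the same vector.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:4], "big")
    return np.random.default_rng(seed).standard_normal(dim)


def attention_pool(vectors: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Softmax-weighted average of token vectors scored against a query."""
    scores = vectors @ query / np.sqrt(vectors.shape[1])
    weights = np.exp(scores - scores.max())  # subtract max for stability
    weights /= weights.sum()
    return weights @ vectors


def embed_address(address: str, query: np.ndarray) -> np.ndarray:
    """Map a non-empty address string of any length to one fixed-size vector."""
    tokens = [ch for ch in address if not ch.isspace()]
    if not tokens:
        raise ValueError("address must contain at least one character")
    vectors = np.stack([token_vector(t) for t in tokens])
    return attention_pool(vectors, query)


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors, in [-1, 1]."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Placeholder attention query; a trained model would learn this vector.
query = token_vector("<query>")

first = embed_address("12 Renmin Road, Suzhou", query)
second = embed_address("Renmin Rd. No. 12, Suzhou City", query)
print(first.shape, cosine_similarity(first, second))
```

Both addresses end up as vectors of the same length regardless of how many characters they contain, which is what makes a single similarity measure applicable to any pair. In a trained system the embeddings and the attention query would be learned from data, and a threshold on the similarity score would decide whether two addresses match.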

Acknowledgments

This research is partially supported by the National Natural Science Foundation of China (Grant Nos. 61632016, 61572336, 61572335, 61772356), the Natural Science Research Project of Jiangsu Higher Education Institution (Nos. 17KJA520003, 18KJA520010), and the Open Program of Neusoft Corporation (No. SKLSAOP1801).

Author information

Authors and Affiliations

Institute of Artificial Intelligence, School of Computer Science and Technology, Soochow University, Suzhou, China
Shuangli Shan, Zhixu Li, An Liu & Jiajie Xu
Neusoft Corporation, Shenyang, China
Shuangli Shan
IFLYTEK Research, Suzhou, China
Zhixu Li
King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
Yang Qiang
State Key Laboratory of Cognitive Intelligence, iFLYTEK, Hefei, People’s Republic of China
Zhigang Chen

Corresponding author

Correspondence to Zhixu Li.

Editor information

Editors and Affiliations

University of Electronic Science and Technology of China, Chengdu, China
Jie Shao
Hong Kong Polytechnic University, Hong Kong, China
Man Lung Yiu
The University of Tokyo, Tokyo, Japan
Masashi Toyoda
Zhejiang University, Hangzhou, China
Dongxiang Zhang
National University of Singapore, Singapore, Singapore
Wei Wang
Peking University, Beijing, China
Bin Cui

Cite this paper

Shan, S., Li, Z., Qiang, Y., Liu, A., Xu, J., Chen, Z. (2019). DeepAM: Deep Semantic Address Representation for Address Matching. In: Shao, J., Yiu, M., Toyoda, M., Zhang, D., Wang, W., Cui, B. (eds) Web and Big Data. APWeb-WAIM 2019. Lecture Notes in Computer Science, vol 11641. Springer, Cham. https://doi.org/10.1007/978-3-030-26072-9_4

DOI: https://doi.org/10.1007/978-3-030-26072-9_4
Published: 18 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26071-2
Online ISBN: 978-3-030-26072-9
eBook Packages: Computer Science (R0)
