Fast Dependency Parsing Using Distributed Word Representations

Le-Hong, Phuong; Nguyen, Thi-Minh-Huyen; Nguyen, Thi-Luong; Ha, My-Linh

doi:10.1007/978-3-319-25660-3_22

Phuong Le-Hong, Thi-Minh-Huyen Nguyen, Thi-Luong Nguyen & My-Linh Ha

Conference paper
First Online: 26 November 2015

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9441)

Abstract

In this work, we propose using distributed word representations in a greedy, transition-based dependency parsing framework. Instead of a very large number of sparse indicator features, the multinomial logistic regression classifier employed by the parser learns and uses a small number of dense features, which makes it very fast. The distributed word representations are produced by a continuous skip-gram model using a neural network architecture. Experiments on a Vietnamese dependency treebank show that the parser is not only faster but also more accurate than a conventional transition-based dependency parser.
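
The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the arc-standard transition system, the three feature slots (top two stack items and first buffer item), the embedding dimension, the example sentence, and the randomly initialized (untrained) weights and embeddings are all assumptions made for illustration. The abstract does not specify the transition system or the feature template. The sketch shows how a softmax (multinomial logistic regression) classifier over a short dense vector can drive a greedy parse.

```python
import math
import random

ACTIONS = ["SHIFT", "LEFT-ARC", "RIGHT-ARC"]  # assumed arc-standard action set


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense_features(stack, buffer, tokens, embeddings, dim):
    """Concatenate embeddings of stack[-1], stack[-2], buffer[0] (zeros if absent)."""
    zero = [0.0] * dim
    slots = [
        stack[-1] if len(stack) >= 1 else None,
        stack[-2] if len(stack) >= 2 else None,
        buffer[0] if buffer else None,
    ]
    feats = []
    for idx in slots:
        feats.extend(zero if idx is None else embeddings.get(tokens[idx], zero))
    return feats


def legal_actions(stack, buffer):
    """Indices into ACTIONS that are valid in the current configuration."""
    legal = []
    if buffer:
        legal.append(0)  # SHIFT
    if len(stack) >= 2:
        if stack[-2] != 0:
            legal.append(1)  # LEFT-ARC (ROOT may not become a dependent)
        legal.append(2)  # RIGHT-ARC
    return legal


def parse(words, weights, embeddings, dim):
    """Greedy arc-standard parse; returns {dependent index: head index}."""
    tokens = ["<ROOT>"] + list(words)
    stack, buffer = [0], list(range(1, len(tokens)))
    heads = {}
    while buffer or len(stack) > 1:
        x = dense_features(stack, buffer, tokens, embeddings, dim)
        # One weight row per action: linear scores followed by softmax.
        probs = softmax([sum(w * v for w, v in zip(row, x)) for row in weights])
        action = max(legal_actions(stack, buffer), key=lambda a: probs[a])
        if action == 0:  # SHIFT: move first buffer item onto the stack
            stack.append(buffer.pop(0))
        elif action == 1:  # LEFT-ARC: second-top becomes dependent of top
            dep = stack.pop(-2)
            heads[dep] = stack[-1]
        else:  # RIGHT-ARC: top becomes dependent of second-top
            dep = stack.pop()
            heads[dep] = stack[-1]
    return heads


# Hypothetical toy setup: random embeddings and untrained random weights.
random.seed(0)
DIM = 4
sentence = ["Tôi", "đọc", "sách"]
embeddings = {w: [random.uniform(-1, 1) for _ in range(DIM)]
              for w in ["<ROOT>"] + sentence}
weights = [[random.uniform(-1, 1) for _ in range(3 * DIM)] for _ in ACTIONS]
print(parse(sentence, weights, embeddings, DIM))
```

Because the weights are untrained, the printed heads are structurally valid but linguistically meaningless. The point of the sketch is that each parsing step costs one small matrix-vector product over 3 × DIM dense values rather than a lookup over a very large sparse feature set.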

Notes

1. http://code.google.com/p/word2vec/

Acknowledgements

This research is partly funded by the Vietnam National University, Hanoi (VNU) under project number QG.15.04. The last author is funded by Hanoi University of Science (HUS) under project number TN.15.04. The authors would like to thank Dr. Dang Hoang Vu of FPT Research for providing us with the distributed representations of Vietnamese words. We are grateful to our anonymous reviewers for their comments, which helped us improve both the presentation and the content of the article.

Author information

Authors and Affiliations

VNU University of Science, Hanoi, Vietnam
Phuong Le-Hong, Thi-Minh-Huyen Nguyen & My-Linh Ha
Dalat University, Lamdong, Vietnam
Thi-Luong Nguyen
FPT Research, Hanoi, Vietnam
Phuong Le-Hong

Corresponding author

Correspondence to Phuong Le-Hong.

Editor information

Editors and Affiliations

Institute of Infocomm Research, Singapore, Singapore
Xiao-Li Li
Ho Chi Minh City University of Tech, Ho Chi Minh City, Vietnam
Tru Cao
School of Information Systems, Singapore Management University, Singapore, Singapore
Ee-Peng Lim
Nanjing University, Nanjing, China
Zhi-Hua Zhou
Japan Advanced Institute of Science and Technology, Nomi-shi, Ishikawa, Japan
Tu-Bao Ho
The University of Hong Kong, Hong Kong, China
David Cheung

About this paper

Cite this paper

Le-Hong, P., Nguyen, TMH., Nguyen, TL., Ha, ML. (2015). Fast Dependency Parsing Using Distributed Word Representations. In: Li, XL., Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D. (eds) Trends and Applications in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, vol 9441. Springer, Cham. https://doi.org/10.1007/978-3-319-25660-3_22

DOI: https://doi.org/10.1007/978-3-319-25660-3_22
Published: 26 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25659-7
Online ISBN: 978-3-319-25660-3
eBook Packages: Computer Science, Computer Science (R0)
