Abstract
This chapter presents a framework for translation rule selection based on document-level semantic knowledge, in particular the gist of a document. Translation rule selection is the task of selecting appropriate translation rules for an ambiguous source-language segment. We represent the gist of a document as its topic and introduce two topic-based models for translation rule selection that incorporate global topic information into translation disambiguation. We associate each synchronous translation rule with source- and target-side topic distributions. Using these distributions, we propose a topic dissimilarity model that favors less dissimilar rules by penalizing rules whose topic distributions differ greatly from those of the given document. To encourage the use of non-topic-specific translation rules, we also present a topic sensitivity model that balances rule selection between generic and topic-specific rules. Furthermore, we project target-side topic distributions onto the source-side topic model space so that we can benefit from topic information in both the source and target languages. We integrate the proposed topic dissimilarity and sensitivity models into hierarchical phrase-based machine translation for synchronous translation rule selection.
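The idea of penalizing rules whose topic distributions diverge from the document's can be illustrated with a small sketch. The abstract does not name a dissimilarity measure or a sensitivity measure, so the choices here are assumptions made only for illustration: Hellinger distance stands in for dissimilarity, and the entropy of a rule's topic distribution stands in for sensitivity (a flat, high-entropy distribution suggests a generic rule). The rule strings, topic vectors, and function names are all hypothetical.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (assumed measure)."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same topics")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


def entropy(p):
    """Shannon entropy in nats; used here as an assumed proxy for topic sensitivity."""
    return -sum(a * math.log(a) for a in p if a > 0)


def rank_rules(doc_topics, rules):
    """Order candidate rules from least to most dissimilar to the document."""
    return sorted(rules, key=lambda rule: hellinger(rule[1], doc_topics))


# Hypothetical document with three topics: finance, sports, nature.
doc_topics = [0.7, 0.2, 0.1]

# Hypothetical candidate rules for one ambiguous source segment,
# each paired with an invented rule-level topic distribution.
rules = [
    ("X -> <bank, riverside>", [0.1, 0.1, 0.8]),
    ("X -> <bank, financial institution>", [0.8, 0.1, 0.1]),
    ("X -> <bank, generic>", [1 / 3, 1 / 3, 1 / 3]),
]

for name, topics in rank_rules(doc_topics, rules):
    print(name, round(hellinger(topics, doc_topics), 3), round(entropy(topics), 3))
```

In a log-linear decoder, values like these would typically enter as features with tuned weights rather than as a hard ranking; the sorting above is only meant to make the penalty visible.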
Notes
1. Section 7.4 explains why our system penalizes candidate translations with high dissimilarities.
2. In order to simplify the decoder implementation, at most two nonterminals are allowed in hierarchical translation rules.
3. Since the glue rule and rules of unknown words are not extracted from training data, we just set the values of the four features for these rules to zero.
Copyright information
© 2015 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Xiong, D., Zhang, M. (2015). Translation Rule Selection with Document-Level Semantic Information. In: Linguistically Motivated Statistical Machine Translation. Springer, Singapore. https://doi.org/10.1007/978-981-287-356-9_7
DOI: https://doi.org/10.1007/978-981-287-356-9_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-355-2
Online ISBN: 978-981-287-356-9
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)