Translation Rule Selection with Document-Level Semantic Information

  • Chapter
Linguistically Motivated Statistical Machine Translation

Abstract

This chapter presents a framework for translation rule selection based on document-level semantic knowledge, in particular the gist of a document. Translation rule selection is the task of selecting appropriate translation rules for an ambiguous source-language segment. We represent the gist of a document as its topic, and accordingly introduce two topic-based models for translation rule selection that incorporate global topic information into translation disambiguation. We associate each synchronous translation rule with source- and target-side topic distributions. Using these distributions, we propose a topic dissimilarity model that favors less dissimilar rules by penalizing rules whose topic distributions are highly dissimilar to those of the given document. To encourage the use of non-topic-specific translation rules, we also present a topic sensitivity model that balances rule selection between generic and topic-specific rules. Furthermore, we project target-side topic distributions onto the source-side topic model space so that we can benefit from topic information in both the source and target languages. We integrate the proposed topic dissimilarity and sensitivity models into hierarchical phrase-based machine translation for synchronous translation rule selection.
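
The following is a minimal sketch of how the quantities named in the abstract could be computed for one rule and one document. The abstract does not state which measures are used, so everything concrete here is an assumption made for illustration: Hellinger distance stands in for topic dissimilarity, Shannon entropy stands in for topic sensitivity, and the projection is a simple weighted sum over a hypothetical topic correspondence matrix. The function names and the four-feature layout are likewise illustrative and are not taken from the chapter.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Assumed stand-in for "topic dissimilarity": 0 for identical
    distributions, 1 for distributions with disjoint support.
    """
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


def entropy(p):
    """Shannon entropy in nats.

    Assumed stand-in for "topic sensitivity": a peaked (topic-specific)
    distribution has low entropy, a flat (generic) one has high entropy.
    """
    return -sum(a * math.log(a) for a in p if a > 0.0)


def project(target_dist, correspondence):
    """Map a target-side topic distribution into the source-side topic space.

    correspondence[k_e][k_f] is a hypothetical P(source topic k_f | target
    topic k_e); each row is assumed to sum to 1.
    """
    n_src = len(correspondence[0])
    return [
        sum(target_dist[k_e] * correspondence[k_e][k_f]
            for k_e in range(len(target_dist)))
        for k_f in range(n_src)
    ]


def rule_features(doc_topics, rule_src_topics, rule_tgt_topics, correspondence):
    """Illustrative feature values for one rule against one document."""
    projected = project(rule_tgt_topics, correspondence)
    return {
        "src_dissimilarity": hellinger(doc_topics, rule_src_topics),
        "tgt_dissimilarity": hellinger(doc_topics, projected),
        "src_sensitivity": entropy(rule_src_topics),
        "tgt_sensitivity": entropy(projected),
    }


# Toy example with three topics (all numbers invented for illustration).
doc = [0.8, 0.1, 0.1]
topical_rule = [0.9, 0.05, 0.05]
generic_rule = [1 / 3, 1 / 3, 1 / 3]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

topical = rule_features(doc, topical_rule, topical_rule, identity)
generic = rule_features(doc, generic_rule, generic_rule, identity)
```

In this toy setting the topic-specific rule that matches the document's dominant topic receives a smaller dissimilarity than the generic rule, while the generic rule receives the larger entropy. A decoder could use the second quantity to avoid over-penalizing generic rules, which is the balancing role the abstract assigns to the sensitivity model.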

Notes

  1. Section 7.4 explains why our system penalizes candidate translations with high dissimilarities.

  2. In order to simplify the decoder implementation, at most two nonterminals are allowed in hierarchical translation rules.

  3. Since the glue rule and rules of unknown words are not extracted from training data, we just set the values of the four features for these rules to zero.

Author information

Correspondence to Deyi Xiong.

Copyright information

© 2015 Springer Science+Business Media Singapore

About this chapter

Cite this chapter

Xiong, D., Zhang, M. (2015). Translation Rule Selection with Document-Level Semantic Information. In: Linguistically Motivated Statistical Machine Translation. Springer, Singapore. https://doi.org/10.1007/978-981-287-356-9_7
