CLC number: TP391

On-line Access: 2017-12-04

Received: 2016-06-18

Revision Accepted: 2016-11-30

Crosschecked: 2017-11-03

 ORCID:

Chao Su

http://orcid.org/0000-0001-6771-329X

He-yan Huang

http://orcid.org/0000-0002-0320-7520

Frontiers of Information Technology & Electronic Engineering  2017 Vol.18 No.10 P.1534-1542

http://doi.org/10.1631/FITEE.1601349


Incorporating target language semantic roles into a string-to-tree translation model


Author(s):  Chao Su, Yu-hang Guo, He-yan Huang, Shu-min Shi, Chong Feng

Affiliation(s):  School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

Corresponding email(s):   suchao@bit.edu.cn, hhy63@bit.edu.cn

Key Words:  Machine translation, Semantic role, Syntax tree, String-to-tree


Chao Su, Yu-hang Guo, He-yan Huang, Shu-min Shi, Chong Feng. Incorporating target language semantic roles into a string-to-tree translation model[J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18(10): 1534-1542.

@article{Su2017,
title="Incorporating target language semantic roles into a string-to-tree translation model",
author="Chao Su and Yu-hang Guo and He-yan Huang and Shu-min Shi and Chong Feng",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="18",
number="10",
pages="1534--1542",
year="2017",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601349"
}

%0 Journal Article
%T Incorporating target language semantic roles into a string-to-tree translation model
%A Chao Su
%A Yu-hang Guo
%A He-yan Huang
%A Shu-min Shi
%A Chong Feng
%J Frontiers of Information Technology & Electronic Engineering
%V 18
%N 10
%P 1534-1542
%@ 2095-9184
%D 2017
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.1601349

TY  - JOUR
T1  - Incorporating target language semantic roles into a string-to-tree translation model
A1  - Chao Su
A1  - Yu-hang Guo
A1  - He-yan Huang
A1  - Shu-min Shi
A1  - Chong Feng
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 18
IS  - 10
SP  - 1534
EP  - 1542
SN  - 2095-9184
Y1  - 2017
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1601349
ER  - 


Abstract: 
The string-to-tree model is one of the most successful syntax-based statistical machine translation (SMT) models. It models the grammaticality of the output via target-side syntax. However, it uses no semantic information and tends to produce translations in which semantic roles are confused and chunks appear in the wrong order. In this paper, we propose two methods that use semantic roles to improve the string-to-tree translation model: (1) adding role labels to the syntax tree; (2) constructing a semantic role tree and then incorporating syntactic information into it. We then perform string-to-tree machine translation using the newly generated trees. Both methods enable the system to learn and select better translation rules using semantic information. Our experiments show significant improvements over the state-of-the-art string-to-tree translation system on both spoken-language and news corpora, and both proposed methods surpass the phrase-based system on large-scale training data.
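
As a rough illustration of method (1), the following minimal Python sketch attaches role labels to the nodes of a target-side syntax tree. The abstract does not specify the labeling scheme, so everything here is an assumption made for illustration: the `Node` class, the `NP-ARG0` label format, the PropBank-style role names, and the rule that a role is attached to the highest constituent whose token span exactly matches the role span.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


@dataclass
class Node:
    """Hypothetical constituency-tree node (not taken from the paper)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: str = ""          # set only on leaf (preterminal) nodes
    span: Span = (0, 0)     # filled in by assign_spans


def assign_spans(node: Node, start: int = 0) -> int:
    """Record the token span of every node; return the end offset."""
    if not node.children:
        node.span = (start, start + 1)
        return start + 1
    end = start
    for child in node.children:
        end = assign_spans(child, end)
    node.span = (start, end)
    return end


def annotate(node: Node, roles: Dict[Span, str],
             used: Optional[Set[Span]] = None) -> None:
    """Append a role to the highest node whose span matches a role span.

    Assumption: roles are given as exact token spans. The traversal is
    pre-order, so in a unary chain only the topmost node is relabeled.
    """
    if used is None:
        used = set()
    role = roles.get(node.span)
    if role is not None and node.span not in used:
        node.label = f"{node.label}-{role}"
        used.add(node.span)
    for child in node.children:
        annotate(child, roles, used)


def bracket(node: Node) -> str:
    """Serialize the tree in bracketed form."""
    if not node.children:
        return f"({node.label} {node.word})"
    inner = " ".join(bracket(c) for c in node.children)
    return f"({node.label} {inner})"


# Toy example: "he bought books" with made-up role spans.
tree = Node("S", [
    Node("NP", [Node("PRP", word="he")]),
    Node("VP", [
        Node("VBD", word="bought"),
        Node("NP", [Node("NNS", word="books")]),
    ]),
])
roles = {(0, 1): "ARG0", (1, 2): "PRED", (2, 3): "ARG1"}

assign_spans(tree)
annotate(tree, roles)
print(bracket(tree))
# (S (NP-ARG0 (PRP he)) (VP (VBD-PRED bought) (NP-ARG1 (NNS books))))
```

Trees relabeled in this way could then be passed to an ordinary string-to-tree rule extractor, so that extracted rules carry the enriched nonterminals. Whether the authors attach roles at this level of the tree is not stated in the abstract.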

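Chinese title (translated): A string-to-tree translation model incorporating target-language semantic roles

Chinese summary (translated): The string-to-tree model is one of the most successful syntax-based models in statistical machine translation. By modeling target-side syntactic information, it makes machine output more grammatical. However, it uses no semantic information, and its translations still contain errors such as confused semantic roles and disordered chunks. Two methods are proposed that use semantic roles to improve the string-to-tree model: (1) adding semantic role labels to the syntax tree; (2) first converting the semantic roles into a tree structure and then introducing syntactic information. The two new tree structures are used to train the string-to-tree machine translation model, so that the system can use semantic information to learn or select better translation rules. Experiments show that on both spoken-language and news corpora our methods outperform the conventional string-to-tree translation system, and on a large-scale news corpus they outperform the phrase-based machine translation system.

Chinese key words (translated): Machine translation; Semantic role; Syntax tree; String-to-tree model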
