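Implementation of a Uyghur-Chinese translation system based on neural networks
ZHANG Shenggang (张胜刚), Hasan Wumaier (艾山·吾买尔), Tuergen Yibulayin (吐尔根·依布拉音), Maihemuti Maimaiti (买合木提·买买提), Mierxiati Litifu (米尔夏提·力提甫)
Abstract:
Because of development cost and a limited user base, machine translation for minority languages is usually deployed as an online service built on top of open-source systems. Most publicly available neural machine translation source code is currently written in Theano, but Theano-based translation is too slow to meet user needs. This work therefore studies how to build a stable Uyghur-Chinese neural machine translation system on Theano. The translation model uses a multi-layer bidirectional network architecture with linear associative unit (LAU) neurons, the translation service interface is implemented with Django, and nginx + uWSGI provides load balancing to improve translation speed. Experimental results show that a system with 10 translation engines translates 1.3 to 1.55 times faster than a system with 5 translation engines. The results are a useful reference for using open-source systems to quickly build a translation system that can serve fewer than 10 million requests per day.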
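The load-balancing idea in the abstract can be illustrated with a minimal sketch. It assumes round-robin dispatch, which is nginx's default upstream policy; the abstract does not state which policy the authors configured. The names `make_engine` and `RoundRobinBalancer` are hypothetical stand-ins for illustration, not part of the described system, and the "engines" here are plain functions rather than real translation processes.

```python
from itertools import cycle
from typing import Callable, List

# An "engine" is modeled as a function from source text to translated text.
Engine = Callable[[str], str]


def make_engine(engine_id: int) -> Engine:
    """Hypothetical stand-in for one translation engine process."""
    def translate(text: str) -> str:
        # A real engine would run the NMT model; this only tags the input.
        return f"[engine {engine_id}] {text}"
    return translate


class RoundRobinBalancer:
    """Hands each request to the next engine in a fixed rotation."""

    def __init__(self, engines: List[Engine]) -> None:
        if not engines:
            raise ValueError("at least one engine is required")
        self._engines = cycle(engines)

    def translate(self, text: str) -> str:
        engine = next(self._engines)
        return engine(text)


# Three engines serving five requests: the rotation wraps after engine 2.
balancer = RoundRobinBalancer([make_engine(i) for i in range(3)])
results = [balancer.translate(f"sentence {n}") for n in range(5)]
```

With more engines behind the balancer, concurrent requests queue behind fewer predecessors, which is the mechanism by which adding engines can raise overall throughput.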
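Keywords: minority language; machine translation; Theano; neural network; open-source system; load balancing
Foundation: National Key Basic Research Program of China (2014CB340506); National Natural Science Foundation of China (61463048, 61331011, 61662077, 61462083); Open Project of the Xinjiang Laboratory of Multi-Language Information Technology (2016D03023); Young Doctor Project of the "Autonomous Region Young Science and Technology Innovation Talent Training Program" (QN2015BS004)
Author: ZHANG Shenggang (张胜刚), Hasan Wumaier (艾山·吾买尔), Tuergen Yibulayin (吐尔根·依布拉音), Maihemuti Maimaiti (买合木提·买买提), Mierxiati Litifu (米尔夏提·力提甫)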
DOI: 10.16652/j.issn.1004-373x.2018.24.039