Pre-editing English news texts for machine translation into Russian

Authors

  • Elena S. Kokanova, Northern (Arctic) Federal University, https://orcid.org/0000-0001-6623-5636
  • Maxim V. Berendyaev, Northern (Arctic) Federal University
  • Nikolay Yu. Kulikov, Northern (Arctic) Federal University

DOI:

https://doi.org/10.33910/2686-830X-2022-4-1-25-30

Keywords:

pre-editing, neural machine translation, news text, English, Russian

Abstract

The paper discusses the possible advantages of pre-editing English news texts for machine translation into Russian. Pre-editing is defined as the process of adapting a source text in order to achieve better machine translation quality. Two case studies were conducted, one in 2021 and the other in 2022. In both, texts from bbc.com were selected, pre-edited and translated using neural machine translation systems. Having analysed common pre-editing techniques and their impact on machine translation output in terms of certain error patterns, we conclude that in most cases pre-editing helps to eliminate a number of errors, improve the overall quality of the machine output and reduce the time and effort needed for post-editing. The case studies also showed that, although machine translation systems are constantly developing and changing, it is possible to identify common peculiarities of machine translation for a given style or type of text and a given language pair, analyse the error patterns and find appropriate pre-editing techniques that remain applicable to most machine translation systems for many months or years. Pre-editing does not in itself guarantee a high-quality translation, but together with post-editing it makes it possible to achieve results similar or equal to human translation while saving the translator’s time and effort.

References

SOURCES

Coughlan, S. (2020) Reaching 130 million girls with no access to school. BBC News, 08.03.2020. [Online]. Available at: https://www.bbc.com/news/education-51769845 (accessed 29.08.2021). (In English)

Jarrett, Ch. (2020) Why procrastination is about managing emotions, not time. BBC Worklife, 14.05.2020. [Online]. Available at: https://www.bbc.com/worklife/article/20200121-why-procrastination-is-about-managing-emotionsnot-time (accessed 29.08.2021). (In English)

Roberts, M. (2022) Two-thirds with Omicron say they have had Covid before. BBC News, 26.01.2022. [Online]. Available at: https://www.bbc.com/news/health-60132096 (accessed 28.01.2022). (In English)

Savage, M. (2019) Swedes typically stop living with their parents earlier than anywhere else in Europe. But can leaving home at a young age have a dark side? BBC Worklife, 22.08.2019. [Online]. Available at: https://www.bbc.com/worklife/article/20190821-why-so-many-young-swedes-live-alone (accessed 29.08.2021). (In English)

REFERENCES

Analysis Overview. (2021) Memsource Help Center. [Online]. Available at: http://help.memsource.com/hc/en-us/articles/360013675760 (accessed 28.08.2021). (In English)

Barkhudarov, L. S., Kolshanskij, G. V. (1958) K voprosu o vozmozhnostyakh mashinnogo perevoda [On the possibilities of machine translation]. Voprosy Yazykoznaniya, vol. 1, pp. 129–133. (In Russian)

Dew, K. N., Turner, A. M., Choi, Yo. K. et al. (2018) Development of machine translation technology for assisting health communication: A systematic review. Journal of Biomedical Informatics, vol. 85, pp. 56–67. https://doi.org/10.1016/j.jbi.2018.07.018 (In English)

Hays, D. G. (1960) Linguistic research at the RAND corporation. In: Proceedings of the National Symposium on Machine Translation (2–5 February, 1960). Los Angeles: Englewood Cliffs Publ.; N. J. Prentice-Hall Publ., pp. 13–25. (In English)

Kokanova, E. S., Berendyaev, M. V., Kulikov, N. Yu. (2019) Tipy oshibok pri nejronnom mashinnom perevode tekstov ob arkticheskikh konvoyakh [Types of errors in neural machine translation texts about Arctic convoys]. In: L. Yu. Shchipitsina (ed.). Razvitie severo-arkticheskogo regiona: problemy i resheniya v gumanitarnoj sfere. Materialy Vserossijskoj nauchno-prakticheskoj konferentsii (25–27 aprelya 2019) [Development of the North-Arctic region: Problems and solutions in the Human Studies. Proceedings of the All-Russian scientific and practical conference (April 25–27, 2019)]. Arkhangelsk: Northern (Arctic) Federal University Publ., pp. 80–84. (In Russian)

Machine translation tips. (2016) IBM Cloud Docs. [Online]. Available at: https://cloud.ibm.com/docs/GlobalizationPipeline?topic=GlobalizationPipeline-globalizationpipeline_tips (accessed 28.08.2021). (In English)

Marzouk, S., Hansen-Schirra, S. (2019) Evaluation of the impact of controlled language on neural machine translation compared to other MT architectures. Machine Translation, vol. 33, no. 3, pp. 179–203. https://doi.org/10.1007/s10590-019-09233-w (In English)

Mercader-Alarcon, J., Sanchez-Martinez, F. (2016) Analysis of translation errors and evaluation of pre-editing rules for the translation of English news texts into Spanish with Lucy LT. Revista Tradumatica: Tecnologies de la Traduccio, no. 14, pp. 172–186. http://dx.doi.org/10.5565/rev/tradumatica.164 (In English)

Miyata, R., Fujita, A. (2021) Understanding pre-editing for black-box neural machine translation. In: Proceedings of the 16th conference of the European Chapter of the Association for Computational Linguistics (EACL). [S. l.]: Association for Computational Linguistics Publ., pp. 1539–1550. [Online]. Available at: https://doi.org/10.48550/arXiv.2102.02955 (accessed 28.08.2021). (In English)

MT Post-Editing Guidelines. (2010) TAUS: The Language Data Network. [Online]. Available at: https://www.taus.net/academy/best-practices/postedit-best-practices/machine-translation-post-editing-guidelines (accessed 30.09.2021). (In English)

Published

2022-06-30

Issue

Section

Practice and Theory of Translation and Interpreting