Out of the BLEU: An Error Analysis of Statistical and Neural Machine Translation of WikiHow Articles from English into Arabic

Diab, Nessma

Abstract

Most studies that compare the quality of Neural Machine Translation (NMT) to that of Statistical Machine Translation (SMT) rely on automatic evaluation methods, mainly the bilingual evaluation understudy (BLEU), without performing any kind of human assessment. While BLEU is a good indicator of the overall performance of MT systems, it does not offer any detailed linguistic insights into the types of errors generated by those MT models. Such insights are crucial for researchers to identify areas for improvement and for language service providers to understand how upgrading to NMT gives them better results. This paper breaks free from BLEU by conducting an error analysis that compares the performance of Google SMT and NMT engines for English-into-Arabic translation. The corpus consists of six WikiHow articles. The analysis is guided by the DQF-MQM Harmonized Error Typology, which classifies translation errors into eight major categories: accuracy, fluency, terminology, style, design, locale convention, verity, and other (for any other issues). Such a fine-grained classification of translation errors enables the researcher to explore the error types generated by each MT model, the error types eliminated by NMT, and the new error types introduced by NMT. The paper focuses on the English-Arabic language pair because it is one of the least studied pairs in the comparative literature on SMT and NMT. The results show that NMT generates fewer grammatical errors and mistranslations than SMT, and that NMT output is more fluent and robust. However, SMT is more consistent in translating proper nouns and out-of-vocabulary words.

Other data

Title	Out of the BLEU: An Error Analysis of Statistical and Neural Machine Translation of WikiHow Articles from English into Arabic
Authors	Diab, Nessma
Keywords	DQF-MQM harmonized error typology; neural machine translation; statistical machine translation; translation quality assessment
Issue Date	Jul-2021
Publisher	Center for Developing English Language Teaching
Journal	CDELT Occasional Papers in the Development of English Education
Volume	75
Issue	1
Start page	181
End page	211
ISSN	2735-3591
DOI	10.21608/opde.2021.208437
