TY - GEN
T1 - A Monte Carlo Method for Metamorphic Testing of Machine Translation Services
AU - Pesu, Daniel
AU - Zhen, Jingfeng
AU - Zhou, Zhi Quan
AU - Towey, Dave
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/5/27
Y1 - 2018/5/27
N2 - With the growing popularity of machine translation services, it has become increasingly important to be able to assess their quality. However, the test oracle problem makes it difficult to conduct automated testing. In this paper, we propose a Monte Carlo method, in combination with metamorphic testing, to overcome the oracle problem. Using this method, we assessed the quality of three popular machine translation services, namely Google Translate, Microsoft Translator, and Youdao Translate. We set the source language to be English, and the target languages included Chinese, French, Japanese, Korean, Portuguese, Russian, Spanish, and Swedish. A sample of 33,600 observations (involving a total of 100,800 actual translations) was collected and analyzed using a 3 × 56 factorial design. Based on this data, our model found Google Translate to be the best (in terms of the metamorphic relation used) for each and every target language considered. A trend for Indo-European languages producing better results was also identified.
AB - With the growing popularity of machine translation services, it has become increasingly important to be able to assess their quality. However, the test oracle problem makes it difficult to conduct automated testing. In this paper, we propose a Monte Carlo method, in combination with metamorphic testing, to overcome the oracle problem. Using this method, we assessed the quality of three popular machine translation services, namely Google Translate, Microsoft Translator, and Youdao Translate. We set the source language to be English, and the target languages included Chinese, French, Japanese, Korean, Portuguese, Russian, Spanish, and Swedish. A sample of 33,600 observations (involving a total of 100,800 actual translations) was collected and analyzed using a 3 × 56 factorial design. Based on this data, our model found Google Translate to be the best (in terms of the metamorphic relation used) for each and every target language considered. A trend for Indo-European languages producing better results was also identified.
KW - Machine translation quality
KW - Monte Carlo method
KW - metamorphic testing
KW - natural languages
KW - oracle problem
UR - http://www.scopus.com/inward/record.url?scp=85051257640&partnerID=8YFLogxK
U2 - 10.1145/3193977.3193980
DO - 10.1145/3193977.3193980
M3 - Conference contribution
AN - SCOPUS:85051257640
T3 - Proceedings - International Conference on Software Engineering
SP - 38
EP - 45
BT - Proceedings 2018 ACM/IEEE 3rd International Workshop on Metamorphic Testing, MET 2018
PB - IEEE Computer Society
T2 - 3rd ACM/IEEE International Workshop on Metamorphic Testing, MET 2018, held in conjunction with the 40th International Conference on Software Engineering, ICSE 2018
Y2 - 27 May 2018
ER -