Systems and Means of Informatics
2023, Volume 33, Issue 3, pp 117-128
MACHINE TRANSLATION BY ChatGPT: MONITORING OF OUTCOME REPRODUCIBILITY
- A. Yu. Egorova
- I. M. Zatsman
- V. O. Romanenko
Abstract
The paper considers monitoring the reproducibility of results produced by the ChatGPT chatbot over a time interval when solving a mathematical task, generating code, and resolving a visual puzzle. It gives a brief review of experimental data on result reproducibility for these three applications. The data show that ChatGPT's outcomes for the same problem may change over time, and that significant changes may occur within a relatively short period, which emphasizes the need to monitor and evaluate the chatbot's behavior. The main goal of the paper is to study the reproducibility of machine translation outcomes produced by ChatGPT over a given time interval. The experimental data obtained during monitoring demonstrate some changes in the results, including a decline in the translation quality of the same text fragments over the interval. To monitor outcome reproducibility and evaluate the behavior of ChatGPT, a previously developed method for interval evaluation of machine translation is used.
About this article
Cover Date
2023-11-10
DOI
10.14357/08696527230310
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
ChatGPT applications; monitoring; outcome reproducibility; machine translation; interval evaluation
Authors
A. Yu. Egorova, I. M. Zatsman, and V. O. Romanenko
Author Affiliations
Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Moscow State Linguistic University, 38 Ostozhenka Str., Moscow 119034, Russian Federation