The Grammar Correction: A Comparison of T5, LLAMA and ChatGPT

Authors

  • Jihan Apriliani Nurhasanah, Institut Teknologi Kalimantan
  • Healty Susantiningdyah, Institut Teknologi Kalimantan
  • Iwan Saputra, Institut Teknologi Kalimantan
  • Ilham Rahmaddani Adhie Prayoga, Institut Teknologi Kalimantan
  • Muchammad Chandra Cahyo Utomo, Institut Teknologi Kalimantan

DOI:

https://doi.org/10.35718/iiair.v1i1.1308

Keywords:

Grammatical Error Correction, LLAMA2, T5, ChatGPT, Artificial Intelligence, Deep Learning, Large Language Model

Abstract

English proficiency is a crucial tool for accessing new knowledge and skills and for supporting self-directed learning across platforms and curricula. However, English language mastery in Indonesia has declined in recent years, as evidenced by falling rankings and scores relative to the Asian average and other ASEAN countries. Grammatical errors significantly reduce communication effectiveness, particularly in professional and academic environments that demand clarity and precision. To address this issue, AI-based Grammatical Error Correction (GEC) models offer a promising way to improve English learning outcomes. This study evaluates the performance of four GEC models: T5 Mini, T5 Tiny, LLAMA 2, and ChatGPT 3.5-turbo, focusing on their ability to detect and correct grammatical errors accurately and to provide relevant feedback. The results show that LLAMA 2 achieves the best performance, with the highest GLEU score (0.565), demonstrating its strength in formal grammar correction tasks. T5 Mini follows with a score of 0.524, offering a balance between accuracy and efficiency. T5 Tiny, scoring 0.518, is suitable for resource-constrained environments despite its lower accuracy. ChatGPT 3.5-turbo, while having the lowest GLEU score (0.491), excels at providing cohesive and relevant feedback in conversational contexts. This research provides insight into the strengths and weaknesses of each model, aiding the selection of the best solution to support automated English grammar learning.
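To make the reported scores easier to interpret, the following is a minimal sketch of a GLEU-style sentence score for a single reference. It rewards n-grams that the corrected sentence shares with the reference and penalizes n-grams that were kept from the source but do not appear in the reference. This is a simplified illustration, not the official scorer and not the authors' evaluation code; the function names `ngram_counts` and `simple_gleu` and the example sentences are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_gleu(source, hypothesis, reference, max_n=4):
    """Simplified single-reference GLEU-style score in [0, 1] (illustrative only)."""
    src, hyp, ref = source.split(), hypothesis.split(), reference.split()
    if not hyp:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        c = ngram_counts(hyp, n)
        r = ngram_counts(ref, n)
        s = ngram_counts(src, n)
        total = sum(c.values())
        if total == 0:
            continue  # hypothesis is shorter than n tokens
        # n-grams the hypothesis shares with the reference
        matched = sum(min(cnt, r[g]) for g, cnt in c.items())
        # n-grams kept from the source that the reference does not support
        penalty = sum(max(0, min(cnt, s[g]) - min(cnt, r[g])) for g, cnt in c.items())
        numerator = matched - penalty
        if numerator <= 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(numerator / total))

    # Brevity penalty, as in BLEU: penalize hypotheses shorter than the reference
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / len(log_precisions))


source = "She go to school yesterday"
reference = "She went to school yesterday"

print(simple_gleu(source, reference, reference))  # fully corrected output
print(simple_gleu(source, source, reference))     # uncorrected output
```

In this sketch a fully corrected sentence scores higher than one left unchanged, which is the behaviour that lets the metric rank the models by correction quality. A corpus-level score would aggregate n-gram counts over all sentences and references rather than averaging sentence scores.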

Figure: Model evaluation based on GLEU score

Published

30-04-2025

How to Cite

Nurhasanah, J. A., Susantiningdyah, H., Saputra, I., Prayoga, I. R. A., & Utomo, M. C. C. (2025). The Grammar Correction: A Comparison of T5, LLAMA and ChatGPT. Innovative Informatics and Artificial Intelligence Research, 1(1), 26–34. https://doi.org/10.35718/iiair.v1i1.1308