Detection of GenAI-produced and student-written C# code: A comparative study of classifier algorithms and code stylometry features

DOI: https://doi.org/10.23962/ajic.i35.21309

Keywords: C# code, generative AI (GenAI) code, student-written code, machine-learning, code classification, code stylometry features

Abstract
The prevalence of students using generative artificial intelligence (GenAI) to produce program code is such that certain courses are rendered ineffective because students can avoid learning the required skills. Meanwhile, detecting GenAI code and differentiating between GenAI-produced and human-written code are becoming increasingly challenging. This study tested the ability of six classifier algorithms to detect GenAI C# code and to distinguish it from C# code written by students at a South African university. A large dataset of verified student-written code was collated from first-year students at South Africa's University of the Free State, and corresponding GenAI code produced by Blackbox.AI, ChatGPT, and Microsoft Copilot was generated and collated. Code metric features were extracted using modified Roslyn APIs. The data was organised into four sets with an equal number of student-written and AI-generated code, and a machine-learning model was deployed with the four sets using six classifiers: extreme gradient boosting (XGBoost), k-nearest neighbors (KNN), support vector machine (SVM), AdaBoost, random forest, and soft voting (with XGBoost, KNN and SVM as inputs). It was found that the GenAI C# code produced by Blackbox.AI, ChatGPT, and Copilot could, with a high degree of accuracy, be identified and distinguished from student-written C# code through use of the classifier algorithms, with XGBoost performing strongest in detecting GenAI code and random forest performing best in identification of student-written code.
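The soft-voting scheme named in the abstract combines the probability estimates of its member classifiers (here XGBoost, KNN, and SVM) by averaging them and predicting the class with the highest mean probability. The following is a minimal pure-Python sketch of that mechanism only; the probability values are invented for illustration and are not taken from the study, which used trained classifiers rather than hand-set numbers.

```python
def soft_vote(prob_sets):
    """Average per-classifier class-probability lists and pick the best class.

    prob_sets: one probability list per classifier, each summing to 1,
    with the same class ordering across classifiers.
    """
    n_classes = len(prob_sets[0])
    means = [sum(p[c] for p in prob_sets) / len(prob_sets)
             for c in range(n_classes)]
    return means.index(max(means)), means

# Hypothetical probabilities for one code sample from three classifiers
# (index 0 = student-written, index 1 = GenAI-produced):
xgb = [0.30, 0.70]  # stand-in for an XGBoost predict_proba output
knn = [0.45, 0.55]  # stand-in for a KNN predict_proba output
svm = [0.20, 0.80]  # stand-in for an SVM predict_proba output

label, means = soft_vote([xgb, knn, svm])
print(label, [round(m, 4) for m in means])  # → 1 [0.3167, 0.6833]
```

In practice the same averaging is what scikit-learn's VotingClassifier with voting="soft" performs over fitted estimators; the sketch above just makes the arithmetic explicit.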
License
Copyright (c) 2025 Adewuyi Adetayo Adegbite, Eduan Kotzé

This work is licensed under a Creative Commons Attribution 4.0 International License.