Knowledge Representation Strategies for Reducing Hallucinations in Retrieval-Augmented Domain-Specific Question Answering
DOI:
https://doi.org/10.46604/ijeti.2025.15708
Keywords:
retrieval-augmented generation, knowledge representation, large language models, hallucination reduction, technical education
Abstract
To address hallucinated responses in large language models (LLMs), an artificial intelligence (AI) chatbot built on a retrieval-augmented generation system is designed to assist with subject-based certification instruction. Using the Level B certification curriculum for computer hardware repair as an example, this study develops six distinct knowledge base structures (Type0–Type5) and integrates them with two open-source 7B-parameter LLMs (LLaMA2 and Qwen2) through a custom-built question-and-answer system. Domain experts evaluate response accuracy on 10 standardized questions. Knowledge structure significantly affects performance, with the enriched Type5 base yielding the highest accuracy (Qwen2: 98 points; LLaMA2: 73 points). Statistical tests confirm that the improvements from knowledge base enhancement are significant, both across knowledge base types and between models. These findings highlight the critical role of knowledge representation and LLM selection in domain-specific AI applications and offer practical design guidelines for intelligent teaching assistants in technical education.
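The retrieval step that the abstract refers to can be illustrated with a minimal sketch. This is not the study's implementation: the knowledge-base entries, the bag-of-words cosine similarity, and the function names (`tokenize`, `cosine_similarity`, `retrieve`, `build_prompt`) are all assumptions made for illustration. A real system would likely use an embedding model for retrieval and pass the resulting prompt to an LLM such as LLaMA2 or Qwen2.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge-base entries, invented for illustration only;
# they are not taken from the study's Type0-Type5 knowledge bases.
KNOWLEDGE_BASE = [
    "A power supply unit converts AC mains power to the DC voltages used by the motherboard.",
    "Thermal paste fills the gap between the CPU and the heat sink to improve heat transfer.",
    "A POST beep code signals a hardware fault detected before the operating system loads.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(question, passages, top_k=1):
    """Return the top_k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine_similarity(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, passages, top_k=1):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(retrieve(question, passages, top_k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

prompt = build_prompt("Why is thermal paste applied to the CPU?", KNOWLEDGE_BASE)
```

In this sketch, how the knowledge base is structured determines what text reaches the model as context, which is one plausible reason the choice of knowledge representation could affect answer accuracy.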
License
Copyright (c) 2025 Cheng-Hsiu Li

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright Notice
Submission of a manuscript implies that the work described has not been published before and that it is not under consideration for publication elsewhere. Authors retain copyright in their articles with no restrictions. Authors may also post the final, peer-reviewed manuscript version (postprint) to any repository or website.

Since Jan. 01, 2019, IJETI has published new articles under the Creative Commons Attribution Non-Commercial 4.0 International (CC BY-NC 4.0) License.
The Creative Commons Attribution Non-Commercial (CC-BY-NC) License permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.


