Knowledge Representation Strategies for Reducing Hallucinations in Retrieval-Augmented Domain-Specific Question Answering

Cheng-Hsiu Li

doi:10.46604/ijeti.2025.15708

Authors

Cheng-Hsiu Li, Department of Information Management, National Taitung Junior College, Taitung, Taiwan, ROC

Keywords:

retrieval-augmented generation, knowledge representation, large language models, hallucination reduction, technical education

Abstract

To reduce hallucinated responses from large language models (LLMs), this study designs an artificial intelligence (AI) chatbot built on a retrieval-augmented generation system to support subject-based certification instruction. Using the Level B certification curriculum for computer hardware repair as an example, the study develops six distinct knowledge base structures (Type0–Type5) and integrates them with two open-source 7B-parameter LLMs (LLaMA2 and Qwen2) through a custom-built question-and-answer system. Domain experts evaluate the accuracy of responses to 10 standardized questions. Knowledge structure significantly affects performance, with the enriched Type5 base yielding the highest accuracy (Qwen2: 98 points; LLaMA2: 73 points). Statistical tests confirm significant differences both across knowledge base types and between the two models, with accuracy improving as the knowledge base is enhanced. These findings highlight the critical role of knowledge representation and LLM selection in domain-specific AI applications and offer practical design guidelines for intelligent teaching assistants in technical education.
