An Enhanced BiLSTM-Based Model with Bidirectional Attention and Ant Colony Optimization for English NLP

Authors

  • Hai-Xia Xu, School of Foreign Languages, Yancheng Institute of Technology, Yancheng, China

DOI:

https://doi.org/10.46604/ijeti.2025.14481

Keywords:

ant colony algorithm, deep learning, natural language processing, BiLSTM, BAM

Abstract

This study addresses limitations of traditional natural language processing (NLP) models, particularly in network structure design and hyperparameter tuning, which often prevent optimal performance across diverse tasks. To this end, the ant colony optimization (ACO) algorithm is introduced to tune the layer count and other training hyperparameters of the bidirectional long short-term memory (BiLSTM) network, enhancing both its flexibility and classification accuracy. To further improve the BiLSTM's bidirectional selectivity, a bidirectional attention mechanism (BAM) is incorporated, strengthening the model's capacity to integrate historical and future contextual information. The proposed ACO-BiLSTM-BAM model is validated on the Internet Movie Database (IMDb) movie review dataset, where it achieves a classification accuracy of 92.74%, a 12.05% improvement over the base BiLSTM model, with notable gains in discriminating sentiment at varying levels.
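To make the two components concrete, the sketch below shows one plausible PyTorch realization of a BiLSTM classifier with a bidirectional attention mechanism that scores the forward and backward hidden-state sequences separately. The layer sizes, the attention formulation, and all identifiers are illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch (assumed sizes and attention form) of a BiLSTM
# classifier with separate attention over the forward and backward
# hidden-state sequences, i.e., a bidirectional attention mechanism.
import torch
import torch.nn as nn

class BiLSTMBAM(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=64,
                 num_layers=2, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers,
                              batch_first=True, bidirectional=True)
        # One attention scorer per direction, so historical (forward) and
        # future (backward) context are weighted independently before fusion.
        self.attn_fwd = nn.Linear(hidden_dim, 1)
        self.attn_bwd = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    @staticmethod
    def _attend(states, scorer):
        # states: (batch, seq_len, hidden_dim) -> attention-weighted sum
        weights = torch.softmax(scorer(states), dim=1)  # over the time axis
        return (weights * states).sum(dim=1)            # (batch, hidden_dim)

    def forward(self, token_ids):
        out, _ = self.bilstm(self.embedding(token_ids))
        h_fwd, h_bwd = out.chunk(2, dim=-1)  # split the two LSTM directions
        context = torch.cat([self._attend(h_fwd, self.attn_fwd),
                             self._attend(h_bwd, self.attn_bwd)], dim=-1)
        return self.classifier(context)

# Example: logits for a batch of 4 token sequences of length 50.
model = BiLSTMBAM(vocab_size=20000)
logits = model(torch.randint(0, 20000, (4, 50)))  # shape: (4, 2)
```

A hyperparameter search in the spirit of ACO can likewise be sketched as pheromone-weighted sampling over discrete candidate values, with evaporation and score-proportional deposits after each colony pass. The candidate grids, the evaporation rate, and the evaluate() stand-in below are hypothetical placeholders, not the paper's settings.

```python
# Illustrative ACO-style search over discrete hyperparameter choices.
import random

choices = {"num_layers": [1, 2, 3],          # assumed candidate grids
           "hidden_dim": [32, 64, 128],
           "lr": [1e-3, 5e-4, 1e-4]}
pheromone = {k: [1.0] * len(v) for k, v in choices.items()}
rho, n_ants, n_iters = 0.1, 8, 20  # evaporation rate, colony size, passes

def evaluate(params):
    # Stand-in: would train the BiLSTM-BAM model with `params` and
    # return validation accuracy; here it returns a random score.
    return random.random()

def sample(key):
    # Each option is chosen with probability proportional to its pheromone.
    return random.choices(range(len(choices[key])), weights=pheromone[key])[0]

best = (None, -1.0)
for _ in range(n_iters):
    trails = []
    for _ in range(n_ants):
        picks = {k: sample(k) for k in choices}
        params = {k: choices[k][i] for k, i in picks.items()}
        score = evaluate(params)
        trails.append((picks, score))
        if score > best[1]:
            best = (params, score)
    # Evaporate, then let each ant deposit pheromone proportional to score.
    for k in pheromone:
        pheromone[k] = [(1 - rho) * p for p in pheromone[k]]
    for picks, score in trails:
        for k, i in picks.items():
            pheromone[k][i] += score

print("best hyperparameters:", best[0], "score:", round(best[1], 3))
```

In practice, evaluate() would train the BiLSTM-BAM model with the sampled settings and return accuracy on a held-out split of the IMDb data, so that higher-scoring configurations accumulate more pheromone over successive passes.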

Published

2025-08-29

How to Cite

[1] Hai-Xia Xu, “An Enhanced BiLSTM-Based Model with Bidirectional Attention and Ant Colony Optimization for English NLP,” Int. J. Eng. Technol. Innov., vol. 15, no. 4, pp. 427–438, Aug. 2025.
