A Wireless Sensor Network-Speech Recognition Scheme Using Deployments of Multiple Kinect Microphone Array-Sensors

  • Ing-Jr Ding
  • Shih-Kai Lin


Speech recognition has recently been applied successfully in many applications. With the development of the Kinect sensor by Microsoft, speech recognition can be extended to a ubiquitous environment in which a wireless sensor network (WSN) of Kinect sensors is deployed. This study develops a WSN speech recognition scheme using deployments of multiple Kinect microphone-array sensors. The presented Kinect-WSN speech recognition can effectively capture the acoustic data produced by a speaker and then perform the corresponding voice command control on a given target. Different strategies for deploying multiple Kinect microphone-array sensors to construct a ubiquitous Kinect-WSN speech recognition environment are investigated. Several acoustic sensing data fusion methods are also explored to achieve superior Kinect-WSN speech recognition performance. The efficiency and effectiveness of the presented method are evaluated in a 5 m × 5 m laboratory environment in which any of four test speakers may issue a voice command from anywhere in the room. The Kinect microphone-array WSN speech recognition developed in this work is well suited to a variety of control applications.
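
The abstract does not specify which fusion methods are used. As a purely illustrative sketch, assuming each Kinect sensor independently returns a recognized command together with a confidence score, two simple decision-level fusion rules might look like the following. The function names, the `(command, confidence)` format, and the sample values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def fuse_max_confidence(results):
    """Return the command reported by the single most confident sensor."""
    return max(results, key=lambda r: r[1])[0]


def fuse_weighted_vote(results):
    """Sum confidences per command across sensors and return the largest total."""
    totals = defaultdict(float)
    for command, confidence in results:
        totals[command] += confidence
    return max(totals, key=totals.get)


# Hypothetical (command, confidence) outputs from four Kinect sensors
# placed around the room; the values are made up for illustration.
results = [
    ("light on", 0.62),
    ("light off", 0.70),
    ("light on", 0.55),
    ("light on", 0.40),
]

print(fuse_max_confidence(results))
print(fuse_weighted_vote(results))
```

The two rules can disagree: one sensor close to a noise source may report a confident but wrong command, while a vote across all sensors favors the command most of them heard. This is one reason comparing several fusion strategies is worthwhile.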


How to Cite
Ding, I.-J., & Lin, S.-K. (2016). A Wireless Sensor Network-Speech Recognition Scheme Using Deployments of Multiple Kinect Microphone Array-Sensors. Proceedings of Engineering and Technology Innovation, 3, 25-27. Retrieved from http://ojs.imeti.org/index.php/PETI/article/view/237