A Wireless Sensor Network Speech Recognition Scheme Using Deployments of Multiple Kinect Microphone-Array Sensors
Speech recognition has recently been applied successfully in many applications. With the development of Microsoft's Kinect sensor, speech recognition can be further extended to a ubiquitous environment in which a wireless sensor network (WSN) of Kinect sensors is deployed. This study develops a WSN speech recognition scheme using deployments of multiple Kinect microphone-array sensors. The presented Kinect-WSN speech recognition can effectively capture acoustic data from a talking speaker and then issue the corresponding voice-command control to a target. Different strategies for deploying multiple Kinect microphone-array sensors to construct a ubiquitous Kinect-WSN speech recognition environment are investigated, and several acoustic sensing data fusion methods are explored to achieve superior recognition performance. The efficiency and effectiveness of the presented method are evaluated in a 5 m × 5 m laboratory environment in which any of four test speakers may issue a voice command from any position. The developed Kinect microphone-array WSN speech recognition scheme is well suited to a variety of control applications.
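The abstract mentions fusing acoustic sensing data from multiple Kinect microphone arrays without specifying the fusion rules. As a minimal sketch only, and not the paper's actual method, two commonly used strategies can be illustrated: selecting the sensor whose captured frame has the highest energy (a proxy for proximity to the speaker), and majority voting over the per-sensor recognition hypotheses. All function names here are hypothetical.

```python
import math

def rms_energy(samples):
    """Root-mean-square energy of one sensor's audio frame."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def select_best_sensor(frames):
    """Return the index of the microphone array whose frame has the
    highest RMS energy; that sensor is presumed closest to the speaker."""
    return max(range(len(frames)), key=lambda i: rms_energy(frames[i]))

def fuse_by_vote(hypotheses):
    """Decision-level fusion: majority vote over the recognition
    hypotheses produced independently by each sensor."""
    counts = {}
    for h in hypotheses:
        counts[h] = counts.get(h, 0) + 1
    return max(counts, key=counts.get)
```

Signal-level selection (picking one best stream before recognition) and decision-level voting (combining recognizer outputs) sit at opposite ends of the fusion spectrum; which performs better depends on room geometry and sensor placement, which is precisely what the paper's deployment study investigates.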