A REVIEW ON MULTIMODAL SPEAKER RECOGNITION

Khadar Nawas K

doi:10.22159/ajpcr.2017.v10s1.19761

Authors

Khadar Nawas K, School of Computing Science and Engineering, VIT University, Chennai Campus, Tamil Nadu, India.

DOI:

https://doi.org/10.22159/ajpcr.2017.v10s1.19761

Keywords:

Speaker recognition, Multimodal speaker recognition, Throat microphone, Bone microphone, VQ, GMM

Abstract

This paper presents a review of multimodal speaker recognition (SR). Speaker recognition has been studied for many decades and continues to attract the interest of many researchers. It comprises two stages: system training and system testing. The robustness of an SR system depends on the training and testing environments as well as on the quality of the speech. Air-conducted (AC) speech is the standard source from which features are extracted to recognize a speaker, so the performance of the SR system depends on the AC speech. To further improve the robustness and accuracy of the SR system, various other sources (modalities), such as a throat microphone, a bone-conduction microphone, an array of microphones, non-audible murmur, and non-auditory information such as video, are used to complement the standard AC microphone. This paper is purely a review of SR and these complementary modalities.
