A REVIEW ON MULTIMODAL SPEAKER RECOGNITION


Khadar Nawas K

Abstract


This paper presents a review of multimodal speaker recognition (SR). Speaker recognition has been studied for many decades and continues to attract the interest of many researchers. It consists of two phases: system training and system testing. The robustness of an SR system depends on the training and testing environments as well as on the quality of the speech. Air-conducted (AC) speech is the standard source from which a speaker is recognized by extracting features, so the performance of the SR system depends on the AC speech. To further improve the robustness and accuracy of the SR system, various other sources (modalities), such as a throat microphone, a bone-conduction microphone, a microphone array, non-audible murmur, and non-auditory information such as video, are used to complement the standard AC microphone. This paper is purely a review of SR and these complementary modalities.
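The two phases named in the abstract, training and testing, can be illustrated with a small vector quantization (VQ) sketch. This is only an illustration under stated assumptions, not the method of any particular paper: the feature vectors are synthetic random data standing in for features extracted from real AC speech, and the function names (`train_codebook`, `distortion`, `identify`, `fake_features`), the speaker labels, the codebook size, and the feature dimension are all hypothetical choices.

```python
import numpy as np


def train_codebook(features, k=4, iters=20, seed=0):
    """Training phase: fit a k-entry VQ codebook to one speaker's features (plain k-means)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct training vectors chosen at random.
    codebook = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # Distance from every feature vector to every codeword.
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # leave a codeword unchanged if no vector was assigned to it
                codebook[j] = members.mean(axis=0)
    return codebook


def distortion(features, codebook):
    """Average distance from each feature vector to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())


def identify(features, codebooks):
    """Testing phase: pick the enrolled speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))


# Synthetic stand-in for extracted speech features (assumption: 12-dimensional vectors).
rng = np.random.default_rng(42)


def fake_features(center, n=200, dim=12):
    return center + rng.normal(scale=1.0, size=(n, dim))


centers = {"speaker_a": np.zeros(12), "speaker_b": np.full(12, 5.0)}

# Training: one codebook per enrolled speaker.
codebooks = {name: train_codebook(fake_features(c)) for name, c in centers.items()}

# Testing: score an unseen utterance against every codebook.
test_utterance = fake_features(centers["speaker_b"], n=50)
predicted = identify(test_utterance, codebooks)
print(predicted)
```

In a multimodal setting, one plausible extension would be to train a separate codebook per source (for example AC and throat microphone) and combine the per-source distortions before choosing a speaker; that combination step is not shown here.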


Keywords


Speaker recognition; Multimodal speaker recognition; Throat microphone; Bone microphone; VQ; GMM






About this article


DOI

10.22159/ajpcr.2017.v10s1.19761

Date

01-04-2017


Journal

Asian Journal of Pharmaceutical and Clinical Research
Special Issue April 2017 Page: 382-384

Print ISSN

0974-2441

Online ISSN

2455-3891

Authors & Affiliations

Khadar Nawas K
School of Computing Science and Engineering, VIT University, Chennai Campus, Tamil Nadu, India.

