Speaker Diarization based on Black-Hole Entropy Fuzzy Clustering using Cepstral Features
V. Subba Ramaiah1, S. Srinivasa Rao2, V.S.N.Kumar Devaraju3
1Dr. V. Subba Ramaiah*, CSE, Mahatma Gandhi Institute of Technology, JNTUH, Hyderabad, India.
2Dr. S. Srinivasa Rao, ECE, Mahatma Gandhi Institute of Technology, JNTUH, Hyderabad, India.
3V.S.N.Kumar Devaraju, ECE, Mahatma Gandhi Institute of Technology, JNTUH, Hyderabad, India.
Manuscript received on March 30, 2020. | Revised Manuscript received on April 05, 2020. | Manuscript published on April 30, 2020. | PP: 1055-1061 | Volume-9 Issue-4, April 2020. | Retrieval Number: D7832049420/2020©BEIESP | DOI: 10.35940/ijeat.D7832.049420
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Speaker diarization is the process of identifying which speaker is talking in each part of an audio sequence. This paper proposes a speaker diarization method that combines multiple kernel weighted Mel frequency cepstral coefficient (MKMFCC) parameterization with Black-hole entropy fuzzy clustering (BHEFC). First, the MKMFCC descriptor extracts cepstral features from the input audio signal. BHEFC then clusters these features into speaker groups. Because the feature parameterization uses both the high-energy and low-energy frames of the audio signal for speaker indexing, the speakers are separated accurately. The proposed system is evaluated using F-measure, diarization error rate, and false alarm rate. When the wavelength is varied, the proposed MKMFCC with BHEFC obtains a minimum diarization error rate of 0.2447, a maximum F-measure of 0.8526, and a minimum false alarm rate of 0.4299. When the frame length is varied, it obtains a minimum diarization error rate of 0.2447, a maximum F-measure of 0.8526, and a minimum false alarm rate of 0.4298, in comparison with the existing methods.
Keywords: Black-hole entropy fuzzy clustering, multiple kernel weighted Mel frequency cepstral coefficient, Speaker diarization.
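
The clustering stage outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the BHEFC update rules, so the code below uses plain fuzzy c-means as a stand-in; the entropy and black-hole components are not reproduced. The function names, the fuzzifier m = 2, and the toy two-dimensional vectors (standing in for per-frame MKMFCC features) are all assumptions made for illustration only.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (a stand-in, NOT the paper's BHEFC).

    points: list of equal-length feature vectors (one per audio frame).
    Returns (centers, memberships), where memberships[i][k] is the degree
    to which frame i belongs to cluster (speaker group) k.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    n = len(points)

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers are membership-weighted means, with weights u^m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Memberships are recomputed from distances to the centers.
        shift = 0.0
        for i, p in enumerate(points):
            dist = [max(math.dist(p, c), 1e-12) for c in centers]
            new_row = []
            for k in range(n_clusters):
                denom = sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                            for j in range(n_clusters))
                new_row.append(1.0 / denom)
            shift = max(shift,
                        max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once memberships no longer change noticeably.
        if shift < tol:
            break
    return centers, u


def hard_labels(memberships):
    """Assign each frame to the cluster with the highest membership."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in memberships]


# Toy "features": two well-separated groups standing in for two speakers.
toy_frames = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
toy_centers, toy_u = fuzzy_c_means(toy_frames, n_clusters=2)
toy_labels = hard_labels(toy_u)
```

In a diarization setting, each input vector would be the cepstral feature vector of one frame, and consecutive frames with the same hard label would be merged into speaker segments. Cluster indices are arbitrary, so they identify groups of frames rather than named speakers.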