Abstract:
Detecting gender from human behavior is a complex problem in digital technology research.
Various methods have been used to identify relevant features from which a model can be trained
to classify gender from a voice signal. In this study, Mel Frequency Cepstral Coefficients
(MFCC) were used to distinguish between male and female voices in Bengali. A total of 120
voice samples were collected in WAV (.wav) format, each five seconds long. Features were
extracted from the digitized data using MFCC.
Singular Value Decomposition (SVD) was then applied to reduce each sample to a single row of
14 coefficients. The extracted features were used to train a supervised learning algorithm, a
Support Vector Machine (SVM), to classify the gender of the speaker; it achieved 86.7%
accuracy on our training data set. A GUI built with the MATLAB App Designer tool provides a
user-friendly interface for users to input speech signals and view the predicted gender. The
results show that the proposed model achieves high accuracy in gender detection on the test
data set. For male voices, accuracy is 83.33% on real-life data and 90% on recorded data. For
female voices, accuracy is 80% on real-life data and 86.67% on recorded data.
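
The SVD reduction step can be illustrated with a minimal sketch. The abstract does not say how the single row of 14 coefficients is derived, so the sketch assumes it is the vector of singular values of a (frames x 14) MFCC matrix. The function name `svd_feature_row`, the frame count of 498, and the random placeholder data are hypothetical, and Python with NumPy stands in for the MATLAB environment used in the study.

```python
import numpy as np


def svd_feature_row(mfcc: np.ndarray) -> np.ndarray:
    """Collapse a (frames x coefficients) MFCC matrix into one feature row.

    Assumption: the row is the vector of singular values, which has
    min(frames, coefficients) entries, sorted in descending order.
    """
    # compute_uv=False skips the singular vectors and returns only the values.
    return np.linalg.svd(mfcc, compute_uv=False)


# Placeholder data: random numbers standing in for the MFCCs of one
# five-second recording (498 frames is an arbitrary, hypothetical count).
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((498, 14))

row = svd_feature_row(mfcc)
print(row.shape)  # (14,)
```

Under this assumption every recording yields a fixed-length vector regardless of how many frames it contains, which is what a classifier such as an SVM requires as input.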