Using the Feature Extractor

Classify music as blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, or rock

2024-12-15.00-12-09.mp4

I include along with an interface for the tool made to extract features, sample code for a DNN and SVM, as well as the source for the website used to host my trained model. The method was to extract mel frequency capstone coeficients from the data for use in a Concurrent Neural Network (CNN) for preprocessing and feature reduction to be used as inputs to a Support Vector Machine (SVM) for classification. I was able to get 91.5% accuracy on the test set using an RBF Kernel. The SVM outperformed a Dense Nueral Network (DNN).

GTZAN Dataset — I only use the raw audio and labels.

app.py contains a flask app that when run in the browser will return a spectrogram image and genre prediction.

Using the Feature Extractor

I don't include my trained model in this repository

from FeatureExtract import FeatureExtract

fe = FeatureExtract()
x, y = fe.load_data('path/to/gtzan_wavs', 'path/to/gtzan_csv')
fe.train(features=x, labels=y, predict=False)
features = fe.extract(x) # returns a tf.Tensor
features = features.numpy() # to get a numpy array

Subsequent use of the mfcc values can be made quicker

These are not the values extracted from the CNN! Use these to train multiple CNNs.

fe.save_csv('mfccs.csv', features, labels)
x, y = fe.load_csv('mfccs.csv')

To save and load the CNN model after training

fe.load_model(path)
fe.save_model(path=None, overwrite=False)

You may then use this CNN feature extractor as input to some other classifier.

Example use with SVM Classifier

fe = FeatureExtract()
fe.load_model('cnn5.keras')
x, y = fe.load_csv('features.csv')

features = fe.extract(x).numpy()
print(features.shape)

label_encoder = LabelEncoder()
labels_encoded = label_encoder.fit_transform(y)

x_train, x_test, y_train, y_test = train_test_split(features, labels_encoded, test_size=0.2, random_state=42, stratify=labels_encoded)

clf = svm.SVC()
clf.fit(x_train, y_train)
predictions = clf.predict(x_test)

# confusion matrix
predicted_labels = label_encoder.inverse_transform(predictions)
true_labels = label_encoder.inverse_transform(y_test)
cm = confusion_matrix(true_labels, predicted_labels, labels=label_encoder.classes_)
display = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=label_encoder.classes_)
display.plot(cmap=plt.cm.Blues)
plt.xticks(rotation=45)
plt.title('Confusion Matrix')
plt.show()

# accuracy
accuracy = np.sum(predictions == y_test) / len(y_test)
print (f"accuracy: {accuracy}")

Collaborative Paper

Comparison of Support Vector Machines, Neural Networks, and Naive Bayes for Classifying Audio Snippets Within Music Genres

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
output		output
static		static
templates		templates
Comparison of Support Vector Machines, Neural Networks, and Naive Bayes for Classifying Audio Snippets Within Music Genres.pdf		Comparison of Support Vector Machines, Neural Networks, and Naive Bayes for Classifying Audio Snippets Within Music Genres.pdf
FeatureExtract.py		FeatureExtract.py
README.md		README.md
app.py		app.py
dnn.py		dnn.py
mfccs.csv		mfccs.csv
requirements.txt		requirements.txt
svm.py		svm.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Using the Feature Extractor

Subsequent use of the mfcc values can be made quicker

To save and load the CNN model after training

Example use with SVM Classifier

Collaborative Paper

About

Uh oh!

Releases

Packages

Uh oh!

Languages

edibblepdx/music-genre-classifier

Folders and files

Latest commit

History

Repository files navigation

Using the Feature Extractor

Subsequent use of the mfcc values can be made quicker

To save and load the CNN model after training

Example use with SVM Classifier

Collaborative Paper

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages