Mel spectrogram with Python
Reference: https://www.youtube.com/watch?v=fMqL5vckiU0&list=PL-wATfeyAMNrtbkCNsLcpoAyBBRJZVlnf
1. Import Packages & Datasets
import librosa
import librosa.display
import IPython.display as ipd
import matplotlib.pyplot as plt
scale_file = "audio/scale.wav"
scale, sr = librosa.load(scale_file)
2. Mel filter banks
num_bands = 10
filter_banks = librosa.filters.mel(n_fft=2048, sr=22050, n_mels=num_bands)
print(filter_banks.shape)
(10,1025)
- 10 : number of mel bands
- 1025 = 2048/2 + 1 : number of frequency bins per frame
- 2048 : size of each frame (2048 samples in a single frame)
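The 1025 follows from the real-valued FFT: a frame of `n_fft` samples yields `n_fft/2 + 1` non-negative frequency bins, spaced `sr/n_fft` Hz apart. A minimal sketch of that arithmetic in plain Python (the helper name `fft_bin_frequencies` is hypothetical, not a librosa function):

```python
def fft_bin_frequencies(sr, n_fft):
    """Center frequency (Hz) of each non-negative FFT bin for a frame of n_fft samples."""
    n_bins = n_fft // 2 + 1  # bins from 0 Hz up to the Nyquist frequency
    return [k * sr / n_fft for k in range(n_bins)]


freqs = fft_bin_frequencies(sr=22050, n_fft=2048)
print(len(freqs))   # 1025
print(freqs[0])     # 0.0
print(freqs[-1])    # 11025.0 (Nyquist frequency, sr / 2)
```

Each row of the filter bank holds one weight per bin, which is why its second dimension is 1025.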
Visualize Mel filter banks
plt.figure(figsize=(25, 10))
librosa.display.specshow(filter_banks,
                         sr=sr,
                         x_axis="linear")
plt.colorbar(format="%+2.f")
plt.show()
3. Mel spectrogram
Apply the Mel filter banks ($M$) to the spectrogram ($Y$)
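Conceptually, this step is a matrix product: each mel band is a weighted sum of the power-spectrogram bins in every frame. The sketch below uses random placeholder arrays (not real audio and not a real filter bank) only to show how the shapes combine. `M` and `Y` here are stand-ins with the dimensions that appear in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

n_mels, n_bins, n_frames = 10, 1025, 342

# Placeholder for the mel filter bank: one row of non-negative weights per band.
M = rng.random((n_mels, n_bins))
# Placeholder for the power spectrogram |STFT|^2: one column per frame.
Y = rng.random((n_bins, n_frames))

# Each output entry is a weighted sum over the frequency bins of one frame.
mel_spec = M @ Y
print(mel_spec.shape)  # (10, 342)
```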
mel_spectrogram = librosa.feature.melspectrogram(
    y=scale,  # the audio must be passed by keyword in librosa 0.10 and later
    sr=sr,
    n_fft=2048,
    hop_length=512,
    n_mels=10,
)
print(mel_spectrogram.shape)
(10,342)
- 10 : number of bands
- 342 : number of frames
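The frame count depends on the signal length and `hop_length`. Assuming librosa's default centered framing (`center=True`), the number of frames is `1 + n_samples // hop_length`. A minimal sketch of that arithmetic (`num_frames` is a hypothetical helper, and the 8-second clip is an illustrative length, not the length of `scale.wav`):

```python
def num_frames(n_samples, hop_length=512):
    """Frame count for an STFT with centered (padded) frames."""
    return 1 + n_samples // hop_length


# Illustrative clip: 8 seconds at 22050 Hz.
print(num_frames(8 * 22050))  # 345

# Signal lengths that would produce 342 frames with hop_length=512.
lengths = [n for n in range(170000, 180000) if num_frames(n) == 342]
print(lengths[0], lengths[-1])  # 174592 175103
```

Under these assumptions, a 342-frame output corresponds to roughly 7.9 seconds of audio at 22050 Hz.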
Convert it to a log (decibel) scale
log_mel_spectrogram = librosa.power_to_db(mel_spectrogram)
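`power_to_db` maps power values to decibels. A simplified sketch of the idea in plain Python, assuming the documented defaults `ref=1.0`, `amin=1e-10`, and `top_db=80.0`. The name `power_to_db_sketch` is hypothetical, and it works on a flat list rather than an array:

```python
import math


def power_to_db_sketch(power, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert power values to decibels relative to ref."""
    # Floor tiny values at amin so log10 never sees zero.
    db = [10.0 * math.log10(max(amin, p)) - 10.0 * math.log10(max(amin, ref))
          for p in power]
    # Limit the dynamic range to top_db below the loudest value.
    floor = max(db) - top_db
    return [max(d, floor) for d in db]


print(power_to_db_sketch([1.0, 10.0, 100.0]))  # [0.0, 10.0, 20.0]
print(power_to_db_sketch([0.0, 1.0]))          # [-80.0, 0.0]
```

The second call shows the effect of `top_db`: a silent bin is clipped to 80 dB below the peak instead of falling to an arbitrarily large negative value.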
Visualization
plt.figure(figsize=(25, 10))
librosa.display.specshow(log_mel_spectrogram,
                         x_axis="time",
                         y_axis="mel",
                         sr=sr)
plt.colorbar(format="%+2.f")
plt.show()