Mel Spectrogram with Python

Reference: https://www.youtube.com/watch?v=fMqL5vckiU0&list=PL-wATfeyAMNrtbkCNsLcpoAyBBRJZVlnf


1. Import Packages & Dataset

import librosa
import librosa.display
import IPython.display as ipd
import matplotlib.pyplot as plt

scale_file = "audio/scale.wav"
# librosa.load resamples to 22050 Hz by default and returns (samples, sample rate)
scale, sr = librosa.load(scale_file)


2. Mel filter banks

num_bands = 10
filter_banks = librosa.filters.mel(n_fft=2048, sr=22050, n_mels=num_bands)
print(filter_banks.shape)
(10, 1025)
  • 10 : number of Mel bands (n_mels)

  • 1025 : number of frequency bins ( = 2048/2 + 1 )

    • 2048 : frame size (n_fft)

      ( = 2048 samples in a single frame )
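To see where the (10, 1025) shape comes from, here is a minimal NumPy-only sketch of a triangular Mel filter bank. It is a simplified illustration, not librosa's implementation: it assumes the HTK-style Mel formula and unnormalized triangles, whereas `librosa.filters.mel` uses the Slaney variant with area normalization by default. The function names are hypothetical.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style Hz -> Mel conversion (assumption; librosa defaults to Slaney)
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def simple_mel_filter_bank(sr=22050, n_fft=2048, n_mels=10):
    # Centre frequency of every FFT bin: n_fft/2 + 1 values from 0 Hz to sr/2
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    # n_mels + 2 points equally spaced on the Mel scale (edges and centres)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)

    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, centre, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (centre - left)      # slope up to the centre
        falling = (right - fft_freqs) / (right - centre)   # slope down from the centre
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

bank = simple_mel_filter_bank()
print(bank.shape)  # one row per Mel band, one column per frequency bin
```

Each row is one triangle over the 1025 frequency bins, so the values differ from librosa's output but the shape is the same.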


Visualize Mel filter banks

plt.figure(figsize=(25, 10))
librosa.display.specshow(filter_banks, 
                         sr=sr, 
                         x_axis="linear")
plt.colorbar(format="%+2.f")
plt.show()

[Figure: Mel filter banks]


3. Mel Spectrogram

Apply the Mel filter banks ($M$) to the spectrogram ($Y$)
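In matrix form, this step is a single product. The shapes below assume that $Y$ is the power spectrogram $|\mathrm{STFT}|^2$ (librosa's default `power=2.0`); $S_{\text{mel}}$ and $T$ (the number of frames) are symbols introduced here for illustration.

$$
\underbrace{S_{\text{mel}}}_{n_{\text{mels}} \times T}
= \underbrace{M}_{n_{\text{mels}} \times (n_{\text{fft}}/2 + 1)}
\; \underbrace{Y}_{(n_{\text{fft}}/2 + 1) \times T}
$$

With the settings used here, $M$ is $10 \times 1025$ and $Y$ is $1025 \times T$, so the result has 10 rows and one column per frame.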

# The audio must be passed by keyword (y=...) in librosa >= 0.10
mel_spectrogram = librosa.feature.melspectrogram(y=scale, sr=sr,
                                                 n_fft=2048,
                                                 hop_length=512,
                                                 n_mels=10)
print(mel_spectrogram.shape)
(10, 342)
  • 10 : number of Mel bands
  • 342 : number of frames
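The number of frames follows from the signal length and the hop size. The sketch below frames a signal the way a centred STFT does, assuming zero padding of n_fft // 2 samples on each side. The helper `frame_signal` and the signal length 174,943 are hypothetical, chosen only so the count matches the 342 frames above; the real length of scale.wav is not stated in this post.

```python
import numpy as np

def frame_signal(y, n_fft=2048, hop_length=512):
    # Pad n_fft // 2 samples on each side so frame t is centred on sample t * hop_length
    padded = np.pad(y, n_fft // 2, mode="constant")
    n_frames = 1 + (len(padded) - n_fft) // hop_length
    # Row t holds samples [t * hop_length, t * hop_length + n_fft) of the padded signal
    idx = np.arange(n_fft)[None, :] + hop_length * np.arange(n_frames)[:, None]
    return padded[idx]

# Hypothetical signal length (not taken from the original audio file)
n_samples = 174_943
y = np.zeros(n_samples)

frames = frame_signal(y)
print(frames.shape)          # (number of frames, samples per frame)
print(1 + n_samples // 512)  # closed-form frame count for a centred STFT
```

With centring, the frame count reduces to 1 + n_samples // hop_length, which is why changing hop_length changes the second dimension of the Mel spectrogram while n_mels fixes the first.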


Convert to a log (decibel) scale

log_mel_spectrogram = librosa.power_to_db(mel_spectrogram)
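As a rough picture of what this conversion does, here is a NumPy sketch of the decibel formula, assuming the default settings `ref=1.0`, `amin=1e-10`, and `top_db=80.0`. The function name `power_to_db_sketch` is hypothetical; it mirrors the idea rather than replacing `librosa.power_to_db`.

```python
import numpy as np

def power_to_db_sketch(S, ref=1.0, amin=1e-10, top_db=80.0):
    S = np.asarray(S, dtype=float)
    # 10 * log10(S / ref), with a floor at amin to avoid log(0)
    log_spec = 10.0 * np.log10(np.maximum(amin, S))
    log_spec -= 10.0 * np.log10(np.maximum(amin, ref))
    if top_db is not None:
        # Clip everything more than top_db below the peak
        log_spec = np.maximum(log_spec, log_spec.max() - top_db)
    return log_spec

power = np.array([1.0, 0.1, 1e-3, 1e-12])
print(power_to_db_sketch(power))
```

For these inputs the values come out at roughly 0, -10, -30 and -80 dB: the last entry is first floored at amin and then clipped to 80 dB below the peak.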


Visualization

plt.figure(figsize=(25, 10))
librosa.display.specshow(log_mel_spectrogram, 
                         x_axis="time",
                         y_axis="mel", 
                         sr=sr)
plt.colorbar(format="%+2.f")
plt.show()

[Figure: log-Mel spectrogram]
