Extract audio features
Read in an audio signal.
[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");
Create an audioFeatureExtractor to extract the centroid of the Bark spectrum, the kurtosis of the Bark spectrum, and the pitch of an audio signal.
aFE = audioFeatureExtractor("SampleRate",fs, ...
    "SpectralDescriptorInput","barkSpectrum", ...
    "spectralCentroid",true, ...
    "spectralKurtosis",true, ...
    "pitch",true)
aFE = 
  audioFeatureExtractor with properties:

   Properties
                     Window: [1024x1 double]
              OverlapLength: 512
                 SampleRate: 44100
                  FFTLength: []
    SpectralDescriptorInput: 'barkSpectrum'

   Enabled Features
     spectralCentroid, spectralKurtosis, pitch

   Disabled Features
     linearSpectrum, melSpectrum, barkSpectrum, erbSpectrum, mfcc, mfccDelta
     mfccDeltaDelta, gtcc, gtccDelta, gtccDeltaDelta, spectralCrest, spectralDecrease
     spectralEntropy, spectralFlatness, spectralFlux, spectralRolloffPoint, spectralSkewness, spectralSlope
     spectralSpread, harmonicRatio

   To extract a feature, set the corresponding property to true.
   For example, obj.mfcc = true, adds mfcc to the list of enabled features.
Call extract to extract the features from the audio signal. Normalize the features by their mean and standard deviation.
features = extract(aFE,audioIn);
features = (features - mean(features,1))./std(features,[],1);
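The normalization above is a per-column z-score. As a minimal sketch, the same expression can be tried on a small made-up matrix using only base MATLAB; the values in X are hypothetical stand-ins for extracted features, not output of the extractor. Each column should end up with approximately zero mean and unit standard deviation.

```matlab
% Hypothetical 4-by-2 feature matrix: rows are hops, columns are features.
X = [1 10; 2 20; 3 30; 4 40];

% Subtract each column's mean and divide by each column's standard deviation.
% std(X,[],1) uses the default N-1 normalization along dimension 1.
Xn = (X - mean(X,1))./std(X,[],1);

colMean = mean(Xn,1);    % approximately zero for every column
colStd  = std(Xn,[],1);  % approximately one for every column
```

Note that a feature that is constant over time has zero standard deviation, so this expression would return NaN for that column.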
Plot the normalized features over time.
idx = info(aFE);
duration = size(audioIn,1)/fs;

subplot(2,1,1)
t = linspace(0,duration,size(audioIn,1));
plot(t,audioIn)

subplot(2,1,2)
t = linspace(0,duration,size(features,1));
plot(t,features(:,idx.spectralCentroid), ...
    t,features(:,idx.spectralKurtosis), ...
    t,features(:,idx.pitch));
legend("Spectral Centroid","Spectral Kurtosis","Pitch")
xlabel("Time (s)")
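The plot spreads the feature vectors evenly over the signal duration with linspace. If a time stamp per analysis window is preferred, one possible sketch computes the midpoint of each window from the window length and overlap. The midpoint convention and the value of numHops are assumptions made for illustration, not something the extractor reports.

```matlab
% Framing parameters from the example above; numHops is hypothetical.
fs            = 44100;
windowLength  = 1024;
overlapLength = 512;
numHops       = 5;

hopLength = windowLength - overlapLength;

% Assumed convention: frame k starts at sample (k-1)*hopLength + 1 and is
% time-stamped at the midpoint of its window.
k = (0:numHops-1).';
tFrames = (k*hopLength + windowLength/2)/fs;   % seconds, numHops-by-1
```

In the plotting code, such a vector could stand in for the second t, with numHops set to size(features,1).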
aFE - Input object, specified as an audioFeatureExtractor object.
audioIn - Input audio, specified as a column vector or a matrix of independent channels (columns).
Data Types: single | double
features - Extracted audio features, returned as an L-by-M-by-N array, where:
L - Number of feature vectors (hops)
M - Number of features extracted per analysis window
N - Number of channels
Data Types: single | double
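The first dimension, L, depends on the analysis window and the overlap. As a rough sketch, assuming the common framing rule in which only complete windows are analyzed (an assumption, not something stated above), the expected output size can be estimated from the window length and overlap shown in the object display. The sample count used here is hypothetical (15 seconds at 44.1 kHz).

```matlab
% Window and overlap taken from the example; numSamples is hypothetical.
fs            = 44100;
numSamples    = 15*fs;
windowLength  = 1024;   % corresponds to numel(aFE.Window)
overlapLength = 512;    % corresponds to aFE.OverlapLength

hopLength = windowLength - overlapLength;

% Assumed framing rule: count only windows that fit entirely in the signal.
numHops = floor((numSamples - windowLength)/hopLength) + 1;

numFeatures = 3;  % spectralCentroid, spectralKurtosis, pitch (one column each)
numChannels = 1;  % mono input
expectedSize = [numHops numFeatures numChannels];
```

Comparing expectedSize with size(features) is a quick sanity check on the framing parameters before indexing into the array.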