spafe.features.lpc

  • Description : Linear Prediction Components and Cepstral Coefficients (LPCs and LPCCs) extraction algorithm implementation.

  • Copyright (c) 2019-2023 Ayoub Malek. This source code is licensed under the terms of the BSD 3-Clause License. For a copy, see <https://github.com/SuperKogito/spafe/blob/master/LICENSE>.

spafe.features.lpc.lpc(sig, fs: int = 16000, order=13, pre_emph: bool = True, pre_emph_coeff: float = 0.97, window: Optional[SlidingWindow] = None)

Compute the linear prediction coefficients (LPC) from an audio signal.

Parameters
  • sig (numpy.ndarray) – a mono audio signal (Nx1) from which to compute features.

  • fs (int) – the sampling frequency of the signal we are working with. (Default is 16000).

  • order (int) – order of the LP model and number of cepstral components. (Default is 13).

  • pre_emph (bool) – apply pre-emphasis if True. (Default is True).

  • pre_emph_coeff (float) – pre-emphasis filter coefficient. (Default is 0.97).

  • window (SlidingWindow) – sliding window object. (Default is None).
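
The pre-emphasis step controlled by pre_emph and pre_emph_coeff is commonly a first-order high-pass filter, y[n] = x[n] - c * x[n-1]. The following is a minimal NumPy sketch of that standard form, not necessarily how spafe implements it; the helper name pre_emphasize is hypothetical and not part of the library.

```python
import numpy as np


def pre_emphasize(sig, coeff=0.97):
    """Hypothetical helper: y[0] = x[0], y[n] = x[n] - coeff * x[n-1]."""
    sig = np.asarray(sig, dtype=float)
    # Keep the first sample, then subtract the scaled previous sample.
    return np.append(sig[0], sig[1:] - coeff * sig[:-1])


# A constant signal is strongly attenuated after the first sample.
flat = np.ones(4)
emphasized = pre_emphasize(flat, coeff=0.97)
```

With coeff close to 1 the filter suppresses slowly varying content and boosts high frequencies, which is why it is usually applied before linear prediction analysis of speech.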

Returns

  • (numpy.ndarray) : 2d array of LPC features (num_frames x num_ceps).

  • (numpy.ndarray) : The error term is the square root of the squared prediction error.

Return type

(tuple)

Note

Figure: Architecture of the linear prediction components extraction algorithm.
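
For readers who want to see what the per-frame linear prediction step can look like, here is a sketch of the classical autocorrelation method solved with the Levinson-Durbin recursion. It is an illustration under stated assumptions, not spafe's implementation: the function name lpc_frame is hypothetical, the coefficients follow the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the second return value is the residual energy, which may differ from the error term that lpc returns.

```python
import numpy as np


def lpc_frame(frame, order):
    """Hypothetical helper: return ([1, a_1, ..., a_order], residual energy)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation of the frame for lags 0..order.
    r = np.array([np.dot(frame[: n - lag], frame[lag:]) for lag in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            # Silent or perfectly predicted frame: stop early.
            break
        # Reflection coefficient for model order i.
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        # Update a_1..a_(i-1) from the previous-order solution, then set a_i.
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        # Residual energy shrinks by (1 - k^2) at every step.
        err *= 1.0 - k * k
    return a, err


# Decaying exponential: a first-order model predicts it almost exactly.
coeffs, residual = lpc_frame(0.5 ** np.arange(64), order=2)
```

For the decaying exponential above, the recursion finds a_1 close to -0.5 and a_2 close to 0, meaning that each sample is predicted as half of the previous one.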

Examples
from scipy.io.wavfile import read
from spafe.features.lpc import lpc
from spafe.utils.preprocessing import SlidingWindow
from spafe.utils.vis import show_features

# read audio
fpath = "../../../tests/data/test.wav"
fs, sig = read(fpath)

# compute lpcs
lpcs, _ = lpc(sig,
              fs=fs,
              pre_emph=0,
              pre_emph_coeff=0.97,
              window=SlidingWindow(0.030, 0.015, "hamming"))

# visualize features
show_features(lpcs, "Linear prediction coefficients", "LPCs Index", "Frame Index")
Figure (lpc-1.png): plot produced by the example above.

spafe.features.lpc.lpc2lpcc(a, e, nceps)

Convert linear prediction coefficients (LPC) to linear prediction cepstral coefficients (LPCC) as described in [Rao] and [Makhoul].

Parameters
  • a (numpy.ndarray) – linear prediction coefficients.

  • e – prediction error term of the linear prediction model.

  • nceps (int) – number of cepstral coefficients.

Returns

linear prediction cepstral coefficients (LPCC).

Return type

(numpy.ndarray)

Note

\[\begin{split}C_{m}=\left\{\begin{array}{ll} \log_{e}(p), & \text{if } m = 0 \\ a_{m} + \sum_{k=1}^{m-1} \frac{k}{m} C_{k} a_{m-k}, & \text{if } 1 \leq m \leq p \\ \sum_{k=m-p}^{m-1} \frac{k}{m} C_{k} a_{m-k}, & \text{if } m > p \end{array}\right.\end{split}\]
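
The recursion can be sketched in a few lines of NumPy. The sketch assumes the standard form of the recursion, in which the cepstral term inside the sum is indexed by k and the first branch covers 1 ≤ m ≤ p, and it assumes that a holds predictor coefficients a_1..a_p of an all-pole model H(z) = G / (1 - Σ a_k z^-k); coefficients stored with the opposite sign would need to be negated first. The name lpc_to_cepstrum is hypothetical, and the zeroth coefficient is passed in by the caller rather than computed, so this is not a drop-in replacement for lpc2lpcc.

```python
import numpy as np


def lpc_to_cepstrum(a, c0, nceps):
    """Hypothetical helper: a = [a_1, ..., a_p], c0 = zeroth cepstral value."""
    a = np.asarray(a, dtype=float)
    p = len(a)
    c = np.zeros(nceps)
    c[0] = c0
    for m in range(1, nceps):
        # Only terms with 1 <= m - k <= p contribute, so k starts at max(1, m - p).
        k_start = max(1, m - p)
        acc = sum((k / m) * c[k] * a[m - k - 1] for k in range(k_start, m))
        # The direct a_m term exists only while m does not exceed the model order.
        c[m] = acc + (a[m - 1] if m <= p else 0.0)
    return c


# Single-pole model with a_1 = 0.5: the cepstrum is known to be 0.5**m / m.
ceps = lpc_to_cepstrum([0.5], c0=0.0, nceps=5)
```

The single-pole case is a convenient sanity check because the logarithm of 1 / (1 - 0.5 z^-1) expands to a power series whose m-th coefficient is 0.5^m / m.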

References

[Makhoul] Makhoul, J. (1975). Linear prediction: A tutorial review. Proceedings of the IEEE, 63(4), 561–580. doi:10.1109/proc.1975.9792

[Rao] Rao, K. S., Reddy, V. R., & Maity, S. (2015). Language Identification Using Spectral and Prosodic Features. SpringerBriefs in Electrical and Computer Engineering. doi:10.1007/978-3-319-17163-0

spafe.features.lpc.lpcc(sig: ndarray, fs: int = 16000, order=13, pre_emph: bool = True, pre_emph_coeff: float = 0.97, window: Optional[SlidingWindow] = None, lifter: Optional[int] = None, normalize: Optional[Literal['mvn', 'ms', 'vn', 'mn']] = None) → ndarray

Compute the linear prediction cepstral coefficients (LPCC) from an audio signal.

Parameters
  • sig (numpy.ndarray) – input mono audio signal (Nx1).

  • fs (int) – the sampling frequency of the signal. (Default is 16000).

  • order (int) – order of the LP model and number of cepstral components. (Default is 13).

  • pre_emph (bool) – apply pre-emphasis if True. (Default is True).

  • pre_emph_coeff (float) – pre-emphasis filter coefficient. (Default is 0.97).

  • window (SlidingWindow) – sliding window object. (Default is None).

  • lifter (int) – apply liftering if specified. (Default is None).

  • normalize (str) – apply normalization if provided. (Default is None).

Returns

2d array of LPCC features (num_frames x num_ceps)

Return type

(numpy.ndarray)

Tip

  • normalize : can take the following options [“mvn”, “ms”, “vn”, “mn”].
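
To illustrate what the lifter and normalize post-processing steps typically do to a (num_frames x num_ceps) feature matrix, here is a sketch using the common sinusoidal lifter and mean-variance normalization. Both helpers are hypothetical and are not spafe functions. The lifter value of 22 is only a conventional choice, and the exact behavior behind each of the option names listed above is an assumption not confirmed by this page.

```python
import numpy as np


def sinusoidal_lifter(ceps, lifter=22):
    """Hypothetical helper: scale column n by 1 + (lifter / 2) * sin(pi * n / lifter)."""
    n = np.arange(ceps.shape[1])
    weights = 1.0 + (lifter / 2.0) * np.sin(np.pi * n / lifter)
    return ceps * weights


def mean_variance_normalize(ceps, eps=1e-12):
    """Hypothetical helper: zero mean and unit variance per coefficient, across frames."""
    return (ceps - ceps.mean(axis=0)) / (ceps.std(axis=0) + eps)


# Synthetic features standing in for real LPCCs (num_frames x num_ceps).
rng = np.random.default_rng(0)
feats = rng.normal(loc=3.0, scale=2.0, size=(50, 13))
liftered = sinusoidal_lifter(feats, lifter=22)
normalized = mean_variance_normalize(liftered)
```

Liftering rescales higher-order coefficients so that all columns have a comparable range, while normalization across frames removes per-recording offsets and scale differences before the features are passed to a classifier.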

Note

Returned values are in the frequency domain

Figure: Architecture of the linear prediction cepstral coefficients extraction algorithm.

Examples
from scipy.io.wavfile import read
from spafe.features.lpc import lpcc
from spafe.utils.preprocessing import SlidingWindow
from spafe.utils.vis import show_features

# read audio
fpath = "../../../tests/data/test.wav"
fs, sig = read(fpath)

# compute lpccs
lpccs = lpcc(sig,
             fs=fs,
             pre_emph=0,
             pre_emph_coeff=0.97,
             window=SlidingWindow(0.03, 0.015, "hamming"))

# visualize features
show_features(lpccs, "Linear Prediction Cepstral Coefficients", "LPCCs Index", "Frame Index")
Figure (lpc-2.png): plot produced by the example above.