Root mean square normalization in Python


Audio normalization is a fundamental audio processing technique that applies a constant gain to an audio recording in order to bring its amplitude to a target level. A commonly used variant is Root Mean Square (RMS) normalization. This blog post introduces RMS normalization and provides a Python implementation of it.

What is RMS normalization?

In general there are two principal types of audio normalization:

  • Peak normalization, which adjusts the recording based on its highest signal level.

  • Loudness normalization, which adjusts the recording based on its perceived loudness.
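To make the difference between the two measurements concrete, here is a minimal sketch that computes a peak level and an RMS level in dB for a synthetic signal. The helper names `peak_db` and `rms_db` and the test signal are illustrative assumptions, and levels are taken relative to a full scale of 1.0.

```python
import numpy as np

def peak_db(x):
    """Peak level in dB relative to a full scale of 1.0."""
    return 20 * np.log10(np.max(np.abs(x)))

def rms_db(x):
    """RMS level in dB relative to a full scale of 1.0."""
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

# One second of a 440 Hz sine at 8 kHz with amplitude 0.5.
fs = 8000
t = np.arange(fs) / fs
sine = 0.5 * np.sin(2 * np.pi * 440 * t)

# The same signal with a single full-scale click added.
click = sine.copy()
click[100] = 1.0

print("sine : peak %.2f dB, rms %.2f dB" % (peak_db(sine), rms_db(sine)))
print("click: peak %.2f dB, rms %.2f dB" % (peak_db(click), rms_db(click)))
```

In this sketch the single click raises the peak level by about 6 dB while the RMS level barely moves, which is why a peak-based gain and an energy-based gain can differ noticeably for the same recording. This difference is what separates peak normalization from loudness normalization.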

RMS normalization falls under the latter: the perceived loudness is estimated from the root mean square of the signal, and that estimate is used to compute the gain applied during normalization. Since the gain is constant and applied across the entire recording, the normalization does not change the signal-to-noise ratio or the relative dynamics [1]. The approach to RMS normalization can be summarized in the following mathematical formula [2]:

\begin{equation} y[n]=\sqrt{\frac{N \cdot \left(10^{\frac{r}{20}}\right)^{2}}{\sum_{i=0}^{N-1} x^{2}[i]}} \cdot x[n] \end{equation}

where:

  • \(x[n]\) is the original signal.

  • \(y[n]\) is the normalized signal.

  • \(N\) is the length of \(x[n]\).

  • \(r\) is the target RMS level in dB, i.e. a linear RMS level of \(10^{r/20}\).
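As a quick numerical check of the idea, the following sketch applies a constant RMS gain to a synthetic signal and measures the result. It assumes the usual amplitude convention of 20 dB per decade (linear target \(10^{r/20}\)); the helper names `rms` and `rms_gain` and the noise-like test signal are illustrative assumptions, not part of any library.

```python
import numpy as np

def rms(x):
    """Root mean square of a signal."""
    return np.sqrt(np.mean(x ** 2))

def rms_gain(x, r_db):
    """Constant gain that brings x to an RMS level of r_db (dB re 1.0)."""
    target = 10 ** (r_db / 20.0)  # dB -> linear amplitude
    return np.sqrt(len(x) * target ** 2 / np.sum(x ** 2))

# Noise-like test signal with a loud first half and a quiet second half.
rng = np.random.default_rng(0)
x = 0.05 * rng.standard_normal(16000)
x[8000:] *= 0.1

# Normalize to a target of -20 dB.
y = rms_gain(x, -20.0) * x

# The measured level should match the target.
level_db = 20 * np.log10(rms(y))

# The loud/quiet ratio is the same before and after: relative dynamics survive.
ratio_before = rms(x[:8000]) / rms(x[8000:])
ratio_after = rms(y[:8000]) / rms(y[8000:])

print("level after normalization: %.2f dB" % level_db)
print("loud/quiet ratio before: %.3f, after: %.3f" % (ratio_before, ratio_after))
```

Because every sample is multiplied by the same factor, any ratio between two parts of the signal (speech versus background noise, loud versus quiet passages) is left unchanged.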

How to implement it in Python?

Implementing the RMS normalization is fairly simple in Python and the algorithm can be summarized in the following steps:

  • Read audio as an array.

  • Compute the linear RMS level and its scaling factor.

  • Normalize using the scaling factor.

  • Write the resulting array as an audio file.

import os
import numpy as np

# read_file and write_file are Pydiogment's I/O helpers; the import path
# below is an assumption about the package layout.
from pydiogment.utils.io import read_file, write_file


def normalize(infile, rms_level=0):
    """
    Normalize the signal to a target RMS level.

    Args:
        - infile    (str) : input filename/path.
        - rms_level (int) : target rms level in dB.
    """
    # read input file
    fs, sig = read_file(filename=infile)

    # linear rms level (dB -> amplitude uses a factor of 20) and scaling factor;
    # the power is summed in float64 so integer samples cannot overflow
    r = 10**(rms_level / 20.0)
    a = np.sqrt((len(sig) * r**2) / np.sum(sig.astype(np.float64)**2))

    # normalize
    y = sig * a

    # construct file names
    output_file_path = os.path.dirname(infile)
    name_attribute = "output_file.wav"

    # export data to file
    write_file(output_file_path=output_file_path,
               input_file_name=infile,
               name_attribute=name_attribute,
               sig=y,
               fs=fs)

This implementation is available as part of the Pydiogment library.
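For readers who want to try the four steps without Pydiogment, here is a minimal self-contained sketch built on the standard-library `wave` module and NumPy. It is not the library's code: the function name `normalize_wav`, the restriction to 16-bit PCM WAV files, the silent-signal check, and the clipping step before writing are all assumptions made for this illustration.

```python
import wave
import numpy as np

def normalize_wav(infile, outfile, rms_level=-20.0):
    """RMS-normalize a 16-bit PCM WAV file (hypothetical helper)."""
    # Read audio as an array scaled to [-1, 1).
    with wave.open(infile, "rb") as wf:
        params = wf.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wf.readframes(params.nframes)
    sig = np.frombuffer(frames, dtype="<i2").astype(np.float64) / 32768.0

    # Compute the linear RMS level and its scaling factor.
    r = 10 ** (rms_level / 20.0)
    power = np.sum(sig ** 2)
    if power == 0:
        raise ValueError("cannot normalize a silent signal")
    a = np.sqrt(len(sig) * r ** 2 / power)

    # Normalize, then clip to the range a 16-bit sample can represent.
    y = np.clip(sig * a, -1.0, 32767.0 / 32768.0)

    # Write the resulting array as an audio file with the original parameters.
    with wave.open(outfile, "wb") as wf:
        wf.setparams(params)
        wf.writeframes((y * 32768.0).round().astype("<i2").tobytes())
    return a
```

A call such as `normalize_wav("speech.wav", "speech_norm.wav", rms_level=-20)` (file names are placeholders) would write the normalized copy and return the gain that was applied. The clipping step matters because a constant gain chosen from the RMS level can push individual peaks past full scale when the target level is high.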

Conclusion

This blog post gave a short introduction to RMS normalization, a technique commonly used in speech processing to improve the quality of recordings. We also provided a small implementation of the approach that is part of the Pydiogment library.


Updated on 08 April 2022

👨‍💻 Edited and reviewed on 08.04.2022

References and further reading

[1] Matt Shelvock. Audio Mastering as Musical Practice. Master's thesis, The University of Western Ontario: The School of Graduate and Postdoctoral Studies, London, Ontario, Canada, 2012. URL: https://ir.lib.uwo.ca/etd/530.

[2] Ayoub Malek and Hasna Marwa Malek. Pydiogment: A Python package for audio augmentation. 2020. [Online; accessed 30.04.2020]. URL: https://github.com/SuperKogito/pydiogment/blob/master/paper/paper.pdf.