Pitch Detection via Autocorrelation Method:
A commonly used method to estimate pitch (fundamental frequency, F0) detects the highest value of the autocorrelation function within a lag region of interest. The approach works because our perception of pitch is strongly related to the periodicity of the waveform in the time domain, and the autocorrelation measures that periodicity directly from the waveform.
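The statistical autocorrelation of a sinusoidal random process

$$x[n] = \cos(\omega_0 n + \varphi) \qquad (1)$$

with $\omega_0 = 2\pi/T_0$ and a random phase $\varphi$ uniformly distributed over $[0, 2\pi)$, is given by

$$R[m] = E\{x[n]\,x[n+m]\} = \tfrac{1}{2}\cos(\omega_0 m)$$

which has maxima for $m = lT_0$, that is, at the pitch period and its integer multiples, so that we can find the pitch period by locating the highest value of the autocorrelation away from $m = 0$. Similarly, it can be shown that any wide-sense stationary (WSS) periodic process with period $T_0$ also has an autocorrelation that exhibits its maxima at $m = lT_0$.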
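As a quick numerical sanity check of this property, the following Python sketch averages x[n] x[n+m] for a unit-amplitude sinusoid over a grid of equally spaced phases, which stands in for the expectation over a uniformly distributed random phase. The period of 50 samples, the function name `sinusoid_autocorr`, and its parameters are illustrative assumptions, not part of the original text.

```python
import math

def sinusoid_autocorr(m, period, n=0, num_phases=360):
    """Average x[n] * x[n + m] over equally spaced phases in [0, 2*pi).

    x[n] = cos(w0 * n + phi) with w0 = 2*pi / period. The phase grid
    approximates the expectation over a uniformly distributed phase.
    """
    w0 = 2.0 * math.pi / period
    total = 0.0
    for k in range(num_phases):
        phi = 2.0 * math.pi * k / num_phases
        total += math.cos(w0 * n + phi) * math.cos(w0 * (n + m) + phi)
    return total / num_phases

period = 50  # hypothetical pitch period T0, in samples
values = [sinusoid_autocorr(m, period) for m in range(0, 151)]

# Lags at which the autocorrelation reaches its maximum value of 1/2.
peak_lags = [m for m, r in enumerate(values) if abs(r - 0.5) < 1e-9]
print(peak_lags)  # [0, 50, 100, 150]
```

The result does not depend on the time index `n`, which is what wide-sense stationarity requires; the peaks fall at lag 0 and at every integer multiple of the period.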
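In practice, we need to obtain an estimate $\hat{R}[m]$ from knowledge of only $N$ samples.
The empirical autocorrelation function is given by

$$\hat{R}[m] = \frac{1}{N}\sum_{n=0}^{N-1-|m|} w[n]\,x[n]\,w[n+|m|]\,x[n+|m|]$$

where $w[n]$ is a real window function of length $N$.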
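A minimal Python sketch of a windowed empirical autocorrelation of this kind is given below. It assumes a real-valued frame, normalizes by the frame length N, and defaults to a rectangular window; the function name `empirical_autocorr` and its parameters are hypothetical, and a production implementation would more typically use NumPy or an FFT-based method.

```python
import math

def empirical_autocorr(x, window=None, max_lag=None):
    """Windowed empirical autocorrelation for lags 0..max_lag.

    r[m] = (1/N) * sum_{n=0}^{N-1-m} w[n] x[n] w[n+m] x[n+m]
    Only non-negative lags are returned; for real signals r[-m] = r[m].
    """
    n_samples = len(x)
    if window is None:
        window = [1.0] * n_samples  # rectangular window (assumed default)
    if len(window) != n_samples:
        raise ValueError("window and frame must have the same length")
    if max_lag is None:
        max_lag = n_samples - 1
    max_lag = min(max_lag, n_samples - 1)

    xw = [w * s for w, s in zip(window, x)]  # windowed frame
    r = []
    for m in range(max_lag + 1):
        acc = sum(xw[n] * xw[n + m] for n in range(n_samples - m))
        r.append(acc / n_samples)
    return r

# Usage: a 400-sample cosine with a period of 50 samples (illustrative values).
frame = [math.cos(2.0 * math.pi * n / 50.0) for n in range(400)]
r = empirical_autocorr(frame, max_lag=120)
print(round(r[0], 4), round(r[50], 4))  # 0.5 0.4375
```

Because fewer products enter the sum as the lag grows, the peak at lag 50 is lower than the value at lag 0 even though the signal is perfectly periodic.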
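For the random process in Eq. (1) and a rectangular window, the empirical autocorrelation results in an expected value of

$$E\{\hat{R}[m]\} = \left(1 - \frac{|m|}{N}\right)\frac{\cos(\omega_0 m)}{2}, \qquad |m| < N$$

whose maximum coincides with the pitch period for $m > m_0$.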
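Since the fundamental frequency can be as low as 40 Hz (for a very low-pitched male voice) or as high as 600 Hz (for a very high-pitched female or child’s voice), the search for the maximum is restricted to the lags that correspond to this range, that is, to pitch periods between 1/600 s and 1/40 s (about 1.7 ms to 25 ms).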
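The restricted search can be sketched in Python as follows. The 40 Hz and 600 Hz limits come from the text; the 8000 Hz sampling rate, the 400-sample frame, the 160 Hz test tone, and the function name `estimate_pitch` are assumptions made only for illustration. The sketch uses a rectangular window and performs no voicing decision or smoothing.

```python
import math

def estimate_pitch(frame, sample_rate, f_min=40.0, f_max=600.0):
    """Return (f0_hz, lag) for the autocorrelation peak within the search region.

    The lag range is derived from the frequency limits:
    lag = sample_rate / f0, so f_max gives the smallest lag and f_min the largest.
    """
    n_samples = len(frame)
    min_lag = max(1, math.ceil(sample_rate / f_max))
    max_lag = min(n_samples - 1, math.floor(sample_rate / f_min))
    if min_lag > max_lag:
        raise ValueError("frame too short for the requested search range")

    best_lag, best_value = min_lag, float("-inf")
    for m in range(min_lag, max_lag + 1):
        # Empirical autocorrelation at lag m (rectangular window, 1/N scaling).
        value = sum(frame[n] * frame[n + m] for n in range(n_samples - m)) / n_samples
        if value > best_value:
            best_lag, best_value = m, value
    return sample_rate / best_lag, best_lag

# Usage: synthetic 160 Hz tone sampled at an assumed rate of 8000 Hz.
sample_rate = 8000
frame = [math.cos(2.0 * math.pi * 160.0 * n / sample_rate) for n in range(400)]
f0, lag = estimate_pitch(frame, sample_rate)
print(f0, lag)  # 160.0 50
```

With these values the search covers lags 14 to 200 samples. On real speech, frames that are weakly voiced or unvoiced have no dominant peak in this range, so the returned value for such frames should not be trusted without a separate voicing check.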
Figure: Waveform (my voice recording of “bee”) and unsmoothed pitch track obtained with the autocorrelation method. The pitch values in the weakly voiced or unvoiced regions are essentially random.
Figure: Waveform and autocorrelation function for frame 5 of the previous figure. The estimated pitch is 156.863 Hz.