I know this is an old post, but if you are looking for f0 (pitch), the easiest method is to use the harmonic product spectrum. It's really pretty simple. You take an FFT of the signal you are looking at. Next, make a copy of that spectrum at half the size (skip every other sample) and combine it bin by bin with a copy of the original spectrum (multiply the magnitudes, or equivalently add the log magnitudes). Then take another copy of the original FFT, this time keeping only every third sample, and combine that in as well, and so on a few times.
In the resulting "FFT", the largest peak is the pitch.
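This method works and is incredibly quick, but you get really rubbish resolution, since the estimate can only land on an FFT bin and so is no finer than the sample rate divided by the FFT size (it's not bad for most things, though).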
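To make the steps concrete, here is a minimal sketch in Python with NumPy. The function name `hps_pitch`, the Hann window and the default of four harmonics are my own assumptions, not something fixed by the method. It multiplies the decimated copies together, which is where the "product" in the name comes from; adding log magnitudes would pick out the same peak.

```python
import numpy as np

def hps_pitch(frame, sample_rate, num_harmonics=4):
    """Estimate f0 (Hz) of one audio frame with a harmonic product spectrum.

    Illustrative sketch: the name and the defaults are assumptions.
    """
    # Window the frame to reduce spectral leakage, then take the magnitude FFT.
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))

    hps = spectrum.copy()
    for h in range(2, num_harmonics + 1):
        decimated = spectrum[::h]            # keep every h-th bin
        hps[:len(decimated)] *= decimated    # harmonic h now lines up with f0

    # Only bins covered by every decimated copy are meaningful.
    usable = len(spectrum[::num_harmonics])
    peak_bin = int(np.argmax(hps[1:usable])) + 1   # skip the DC bin

    # Convert the bin index to Hz; resolution is sample_rate / len(frame).
    return peak_bin * sample_rate / len(frame)
```

Calling `hps_pitch(frame, 8000)` on a 4096-sample frame gives an answer that is only ever a multiple of about 1.95 Hz, which is the resolution limit mentioned above.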
The method I'm currently using for speech pitch tracking is based on an autocorrelation. Basically, you take an autocorrelation of the window (as in a window-functioned window of audio) you wish to analyse, and then you measure the distance from the zero-lag peak to the next highest peak. (This is complicated, as usually the peak will lie between 2 samples, so you need to break out the interpolation.) This distance is the period of your fundamental (pitch) in samples, so f0 is the sample rate divided by it. It is very fiddly getting this right, but it gives by FAR the best resolution I've come across. It's a very intensive calculation, though! The bonus of this method is that by applying a Levinson-Durbin recursion to the resulting autocorrelation data you can easily convert to an LPC representation. Find the roots of the LPC polynomial and you have the formants F1 through Fn as well.
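As a rough sketch of the above in Python with NumPy (not my actual tracker): `autocorr_pitch` finds the strongest autocorrelation peak inside an assumed pitch range and refines it with a parabola through the peak and its two neighbours, and `levinson_durbin` turns the same autocorrelation data into LPC coefficients. The function names, the 60 to 500 Hz search range and the choice of parabolic interpolation are all assumptions made for illustration.

```python
import numpy as np

def autocorr_pitch(frame, sample_rate, fmin=60.0, fmax=500.0):
    """Estimate f0 (Hz) from the autocorrelation of one windowed frame.

    Illustrative sketch: fmin/fmax and the interpolation are assumptions.
    Returns (f0, acf) so the autocorrelation can be reused for LPC.
    """
    windowed = frame * np.hanning(len(frame))
    n = len(windowed)
    # Direct autocorrelation; index 0 of the slice is lag zero.
    acf = np.correlate(windowed, windowed, mode="full")[n - 1:]

    # Search only lags that correspond to plausible pitch periods.
    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(int(sample_rate / fmin), n - 2)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))

    # The true peak usually lies between samples: fit a parabola
    # through the peak and its neighbours and take its vertex.
    a, b, c = acf[lag - 1], acf[lag], acf[lag + 1]
    denom = a - 2.0 * b + c
    offset = 0.5 * (a - c) / denom if denom != 0 else 0.0

    return sample_rate / (lag + offset), acf

def levinson_durbin(r, order):
    """LPC coefficients [1, a1, ..., a_order] from autocorrelation r[0..order]."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this model order.
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k   # remaining prediction error energy
    return a, err
```

Typical use would be `f0, acf = autocorr_pitch(frame, sample_rate)` followed by `lpc, err = levinson_durbin(acf, order)`; passing `lpc` to `np.roots` then gives the poles from which the formants are read off.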
Goz, member since 07 Sep 2004