HRTF engine?
#1
Posted 23 March 2010 - 10:26 PM
The concept behind HRTF (head-related transfer function) is that your brain can determine the direction and distance of a sound from the amplitude AND phase discrepancies between what each ear hears. Most stereo music mixes rely solely on amplitude panning, and as far as I can tell the games I play do the same thing.
I've been playing around with the idea of writing a matrix of algorithms that calculates the phase difference at each ear using HRTF formulae, and using it as a sound rendering engine for a 3D game (first-person would be the best way to utilize this). Of course it would require the user to have a set of headphones, because it wouldn't work with speakers.
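To make the "phase difference at each ear" idea concrete, here is a minimal sketch of the timing part only. It assumes Woodworth's spherical-head approximation, an average head radius of 0.0875 m and a speed of sound of 343 m/s; those values and the function names are illustrative assumptions, and a real HRTF also includes frequency-dependent level and pinna effects that this ignores.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed)
HEAD_RADIUS = 0.0875    # m, a commonly quoted average head radius (assumed)


def woodworth_itd(azimuth_deg, head_radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Interaural time difference in seconds for a distant source.

    azimuth_deg: 0 is straight ahead, +90 is directly to the right.
    Only valid for -90..90 degrees. A positive result means the sound
    reaches the right ear first.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be within -90..90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius / c) * (theta + math.sin(theta))


def itd_to_phase(itd_s, freq_hz):
    """Phase difference in radians that a time difference causes at one frequency."""
    return 2.0 * math.pi * freq_hz * itd_s
```

With these assumed constants, a source directly to one side gives a time difference of roughly 0.65 ms, which is the kind of cue amplitude panning alone never produces.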
I thought I'd prototype this in Max/MSP and see if I could compile it to an application that spatializes a mono sound source in 3 dimensions, then try to recreate the same program in a traditional programming language for use in whatever application may utilize it (kind of like Havok for physics). Has anybody seen or heard anything about this? Or is anybody interested in the idea?
I need some feedback.
If you're curious about what the heck I'm talking about, Google "virtual haircut", download the MP3 and give it a listen with some headphones (not earbuds).
#2
Posted 23 March 2010 - 11:28 PM
#4
Posted 24 March 2010 - 02:42 AM
#5
Posted 24 March 2010 - 03:02 AM
Reedbeta said:
There should be projects that try to read webcam input to track whether the user has turned their head. Examples are Xbox games that track the user's face or silhouette and require calibration before use (step out of the camera's view, the game takes a shot, step back into view, it takes another shot, then it uses the difference between the two frames to find the player's contour).
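The two-shot calibration described above amounts to frame differencing. A minimal sketch, assuming greyscale frames stored as nested lists of 0..255 values and an arbitrary threshold of 30; the function names are hypothetical and no particular camera API is implied:

```python
def foreground_mask(background, frame, threshold=30):
    """Return a 0/1 mask of pixels that differ from the empty-room shot."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around the set pixels, or None."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))
```

A real tracker would still need to cope with lighting changes and camera noise, which a fixed threshold handles poorly.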
#6
Posted 24 March 2010 - 03:42 AM
#7
Posted 24 March 2010 - 08:12 AM
This would be an iterative process, and I might end up spending much of my spare time the next couple years at university doing it.
But I wasn't too worried about the movement, because if the application is a 3D game your eyes are always fixed on the screen. That would be a different story if you had a VR helmet, for example.
The other hurdle is that HRTF as a concept is universal, but the formulae aren't, because every person's HRTF is different. We all have differently shaped heads and our brains work differently in interpreting those phase and amplitude cues. What the HRTF researchers use is an average value that should work for most people. I think to get around that I'd have to make a completely new set of algorithms that changes the matrix of algorithms depending on the measurements of your head. Or, to make it simpler, just have a few different presets.
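The preset idea could be as simple as picking the closest of a few stored head sizes. A rough sketch, assuming the head is treated as a sphere and the user supplies a head circumference; the preset names and radii are made-up illustrative values, not measured data:

```python
import math

# Hypothetical presets: name -> head radius in metres (illustrative only).
PRESETS = {"small": 0.080, "medium": 0.0875, "large": 0.095}


def radius_from_circumference(circumference_m):
    """Radius of a sphere with the given circumference."""
    return circumference_m / (2.0 * math.pi)


def pick_preset(circumference_m, presets=PRESETS):
    """Return the name of the preset whose radius is closest to the user's."""
    radius = radius_from_circumference(circumference_m)
    return min(presets, key=lambda name: abs(presets[name] - radius))
```

Head size is only one of the things that vary between listeners (ear shape matters too), so this is a starting point rather than real personalisation.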
It works fine for me, but one person I spoke to said the illusion works for them wherever the sound passes, except straight in front; there it just sounds like a stereo mix rather than a phantom sound source in front of them. It could be completely different for each person.
#8
Posted 24 March 2010 - 11:48 AM
#9
Posted 24 March 2010 - 02:27 PM
If you think of your typical consumer audio standards you can add dimensions to them.
Let's ignore the cues we mentally process that tell us things like depth (a reverberant recording, for example, gives us a mental cue that one sound is farther away than another), because technically there is no spatialization there. It's just a mental trick.
Mono would be 0 dimensions. It is just a point: all the sound comes from one source.
Stereo would be 1 dimension, because you can plot the sound source on one line.
4.1, 5.1, 6.1, 7.1, etc. would be 2-dimensional because you can plot a sound in X,Y but you have no height. You would have to add an elevated speaker array to get a true 3-dimensional representation of the sound.
So rendering a 5.1 mix as a stereo binaural mix that uses HRTF would probably give you a half-decent illusion of where things were in 2 dimensions, but you still lose that 3rd dimension, which to me would probably ruin the entire effect, because sound doesn't work that way.
What I'm talking about is taking a mono sound and being able to position it somewhere in 3 dimensions while only using a stereo feed, which in this case would have to be headphones.
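In signal terms, positioning a mono sound this way usually means convolving it with one impulse response per ear. A minimal sketch in plain Python; the two impulse responses below are toy values invented for illustration (a real engine would load measured responses for the wanted direction), and the function names are hypothetical:

```python
def convolve(signal, kernel):
    """Direct-form convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binaural_render(mono, hrir_left, hrir_right):
    """Filter a mono signal for each ear and return (left, right) sample pairs."""
    left = convolve(mono, hrir_left)
    right = convolve(mono, hrir_right)
    n = max(len(left), len(right))
    left += [0.0] * (n - len(left))    # pad so both channels have equal length
    right += [0.0] * (n - len(right))
    return list(zip(left, right))


# Toy responses for a source on the right: the right ear hears it first and
# at full level, the left ear two samples later and quieter.
TOY_HRIR_RIGHT = [1.0, 0.0, 0.0]
TOY_HRIR_LEFT = [0.0, 0.0, 0.5]
```

For realtime use the direct loop would be far too slow; FFT-based convolution is the usual replacement, but the structure stays the same.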
I suppose if there are soundcards that will calculate HRTF in realtime then that would basically already do what I'm talking about. But it would have to be able to read information from the application as far as position of sound, the geometry of its surroundings, the makeup of its surroundings etc. and simulate that 3d space for your brain.
So I'm talking about doing that ^^^ at a software level.
And to develop an application that would allow you to do it in realtime, or to render it to a file.
#10
Posted 24 March 2010 - 02:55 PM
Anyway, it would be cool to have head tracking with HRTF in an FPS (: Even though you look straight at the screen most of the time, you can still turn your head left/right ~30 degrees while still looking at the screen. So it would be an extra natural control for checking audio cues around you by turning your head (:
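Feeding head tracking into the audio mostly comes down to subtracting the head yaw from the source direction before picking the filter. A small sketch, assuming angles in degrees with 0 straight ahead and positive to the right; the function names are hypothetical:

```python
def wrap_degrees(angle):
    """Wrap an angle to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def head_relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Direction of the source as heard by a head turned by head_yaw_deg."""
    return wrap_degrees(source_azimuth_deg - head_yaw_deg)
```

So a sound dead ahead on screen ends up 30 degrees to the left of a listener who has turned their head 30 degrees to the right.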
#11
Posted 24 March 2010 - 03:37 PM
JarkkoL said:
Good point.
My prof is actually excited about this aspect of it and I've got all the tools of the university available to me.
#12
Posted 24 March 2010 - 03:52 PM
#13
Posted 24 March 2010 - 04:43 PM
I'll share what I find.
#14
Posted 10 April 2012 - 06:30 AM
First up is http://slab3d.sonisphere.com/. Slab3d performs spatial 3D-sound processing, allowing the arbitrary placement of sound sources in auditory space. It is released free to the public under the NASA Open Source Agreement.
I just found this about an hour ago but my initial testing with it shows promise.
My first suggestion is to look into using the Microsoft Kinect for head tracking, and possibly even head scanning for personalised HRTFs. With some (admittedly difficult) work, this could be a very slick and user-friendly solution.
My second suggestion, for further development after getting the head tracking and audio processing working, is to replace the headphones with a pair of targeted parametric speaker arrays, such as http://www.soundlazer.com/. At this point we would already have ear coordinates and may be able to exploit our existing DSP framework to neutralise the newly introduced natural HRTF effects and replace them with those from the virtual environment. The use of a microphone to capture and compensate for room response would also be useful.
My dream would be to see an open standard/API/whatever to make fully immersive head-tracked virtual audio easy to incorporate into any software/hardware product. Games are the obvious choice for this technology, but it could also be used to improve immersion and lower fatigue when listening to music with headphones, or any other application where a virtual audio environment would be useful.
Good luck with your research and development. I would have loved to work on this myself, but in truth I don't really have the programming or academic experience necessary for something like this. So here I am setting my ideas free into the public domain, where I pray they will blossom and prosper with the help of others. Please keep us up to date on your progress.
#15
Posted 10 April 2012 - 08:27 AM
Head tracking can be done with a webcam and some software. You run a few filters on the input image and reduce it to a black and white image with blobs for the eyes, nostrils and mouth. That's the only tricky bit: the filters can be affected by ambient light conditions and give false readings if you are not careful.
As far as I remember, I did:
Background removal, grey scale, rectify, sharpen, blob detect.
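The last stages of that chain can be sketched without any imaging library. This covers only grey scale, a threshold and blob detection (background removal, rectify and sharpen are left out), and assumes images stored as nested lists; the function names and the idea that facial features show up as dark pixels are illustrative assumptions:

```python
def to_grey(rgb_image):
    """Average the R, G and B values of each pixel."""
    return [[sum(px) / 3.0 for px in row] for row in rgb_image]


def threshold(grey, level):
    """Mark dark pixels (candidate eyes, nostrils, mouth) as 1, the rest as 0."""
    return [[1 if v < level else 0 for v in row] for row in grey]


def find_blobs(binary):
    """Return the centroid (row, col) of every 4-connected blob of 1s."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            stack, pixels = [(y, x)], []
            seen[y][x] = True
            while stack:  # flood fill one blob
                cy, cx = stack.pop()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            centroids.append((
                sum(p[0] for p in pixels) / len(pixels),
                sum(p[1] for p in pixels) / len(pixels),
            ))
    return centroids
```

The relative positions of the blob centroids are what you would then use to estimate which way the head is facing.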
I may have missed a filter; it was a few years ago now. I stopped working in this field when the company ran out of money to pay me.
The other thing you can try is a 3D camera. They are not that expensive anymore and give excellent results.
#16
Posted 02 May 2012 - 01:31 PM
You could check time of arrival and time difference of arrival relative to the sound source. Grab a baseline by asking the user to be steady relative to screen, sample, then calibrate by asking them to turn head, etc. Thinking out loud here...
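Time difference of arrival is typically estimated by cross-correlating two recordings of the same sound. A rough sketch, assuming two microphones (say one near each ear) that both pick up a known reference sound; that setup and the function names are assumptions for illustration only:

```python
def cross_correlation_lag(a, b, max_lag):
    """Lag in samples (within +/-max_lag) by which b trails a; negative if b leads."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += sample * b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def tdoa_seconds(a, b, sample_rate_hz, max_lag):
    """Time difference of arrival between the two recordings, in seconds."""
    return cross_correlation_lag(a, b, max_lag) / sample_rate_hz
```

Comparing the measured difference against the steady baseline taken during calibration would then indicate how far the head has turned.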
Or, lots of phones have accelerometers. Ask them to calibrate by putting their phones on their heads. Hmm, excuse me while I go visit the trademark and patent offices...
#17
Posted 02 May 2012 - 02:22 PM
(I know some of the VR helmets of the early '90s had integrated head tracking, possibly via magnetic field sensors or something.)
#18
Posted 02 May 2012 - 02:29 PM
The headset was so heavy though, and the strap attachment had to be so tight it was akin to torture.
There are some very good direct injection headsets now, though they tend to generate a virtual screen some distance in front of you.
#19
Posted 06 May 2012 - 04:14 AM