Capture Speech with DirectSound Problem

CrazyPlaya 101 Dec 10, 2008 at 09:53

Hi all,

I'm a newbie at audio programming.
I'm trying to capture speech from the microphone and stream fragments of the capture buffer over the network.
I send each fragment when its notification event fires.
My capture buffer is 2560 bytes and is separated into 4 segments.
Every segment holds 20 ms of speech. The sample rate is 16 kHz with 16 bits per sample.
Here is the code of my notification thread.

    // Query the capture buffer; data up to readPos is safe to read
    DWORD capturePos = 0, readPos = 0;
    if (FAILED(m_captureBuffer->GetCurrentPosition(&capturePos, &readPos))) return;
    // Bytes captured since the last segment we handed out (ring-buffer distance)
    DWORD diff = (readPos >= m_NextOffset)
        ? readPos - m_NextOffset
        : readPos + m_BufferSize - m_NextOffset;
    if (diff < m_NotifySize) return;
    // Lock one segment of the capture buffer
    VOID* lockedBufferPointer = NULL;
    DWORD lockedBufferSize = 0;
    if (FAILED(m_captureBuffer->Lock(m_NextOffset, m_NotifySize, &lockedBufferPointer, &lockedBufferSize, NULL, NULL, 0L))) return;
    // Number of 16-bit samples in the locked segment
    const int SIZE = lockedBufferSize / 2;
    // Write to wave file
    write((BYTE*)lockedBufferPointer, lockedBufferSize);
    // Put data to output sink
    pSink->on_NewAudioDataCaptured((short*)lockedBufferPointer, SIZE);
    // Unlock the capture buffer
    m_captureBuffer->Unlock(lockedBufferPointer, lockedBufferSize, NULL, 0);
    // Move the capture offset along
    m_NextOffset += lockedBufferSize;
    m_NextOffset %= m_BufferSize;

My problem is that the speech I'm sending is very noisy after capturing.
What could I be doing wrong when capturing?
I don't know how to go on, and it's hard to find good information about capturing with DirectSound.
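
One thing that may be worth checking in a handler like the one above: it processes at most one segment per call, so if the thread runs late and two segments are pending, the second one waits for the next event. Below is a minimal sketch of the offset arithmetic in plain C++, with no DirectSound calls. `pendingBytes` and `drainSegments` are hypothetical helper names, not part of any API, and the sizes in the note afterwards are the ones from this post.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bytes between nextOffset and readPos in a circular buffer of bufferSize bytes.
std::uint32_t pendingBytes(std::uint32_t readPos, std::uint32_t nextOffset,
                           std::uint32_t bufferSize) {
    assert(bufferSize > 0 && readPos < bufferSize && nextOffset < bufferSize);
    return readPos >= nextOffset ? readPos - nextOffset
                                 : readPos + bufferSize - nextOffset;
}

// Returns the start offsets of all complete segments that are ready to be
// read, and advances nextOffset past them (hypothetical helper).
std::vector<std::uint32_t> drainSegments(std::uint32_t readPos,
                                         std::uint32_t& nextOffset,
                                         std::uint32_t bufferSize,
                                         std::uint32_t notifySize) {
    assert(notifySize > 0 && bufferSize % notifySize == 0);
    std::vector<std::uint32_t> ready;
    std::uint32_t pending = pendingBytes(readPos, nextOffset, bufferSize);
    while (pending >= notifySize) {
        ready.push_back(nextOffset);                          // segment to lock and send
        nextOffset = (nextOffset + notifySize) % bufferSize;  // wrap around the ring
        pending -= notifySize;
    }
    return ready;
}
```

With the 2560-byte buffer and 640-byte segments from this post, a read position of 700 and a next offset of 1920 yields two pending segments, at offsets 1920 and 0. In the real handler each returned offset would get its own Lock/Unlock pair.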


10 Replies

CrazyPlaya 101 Dec 10, 2008 at 15:19

OK, I have it. The fragment was too small at 20 ms. Now I use 60 ms and it sounds good.
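
For reference, the segment size in bytes follows directly from the format. Here is a small sketch of the arithmetic, assuming mono PCM (which is what the 2560-byte, 4 x 20 ms figures in the first post imply). `segmentBytes` is a hypothetical helper, not a DirectSound call.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of PCM audio needed to hold `millis` milliseconds.
std::uint32_t segmentBytes(std::uint32_t sampleRate, std::uint32_t bitsPerSample,
                           std::uint32_t channels, std::uint32_t millis) {
    assert(bitsPerSample % 8 == 0);
    // Bytes per sample frame (one sample for every channel).
    std::uint32_t blockAlign = channels * bitsPerSample / 8;
    // Whole sample frames in the requested duration.
    std::uint32_t frames = static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(sampleRate) * millis / 1000);
    return frames * blockAlign;
}
```

At 16 kHz, 16-bit mono, `segmentBytes(16000, 16, 1, 20)` gives 640 bytes, so four segments make the 2560-byte buffer from the first post. For 60 ms it gives 1920 bytes per segment, or 7680 bytes if the buffer keeps four segments.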

rouncer 104 Dec 10, 2008 at 16:05

I've made music programs before, splitting the sound up into sine waves with the furior transform :)

Don't worry, loading a WAV yourself is heaps easier than using DirectX products.

[EDIT] Just learn off the samples, it's what I do. [/EDIT]

But there's no Dolby in DirectSound.

alphadog 101 Dec 10, 2008 at 17:54

I’ve sometimes used furious transforms. It never turns out well… :)

vrnunes 102 Dec 10, 2008 at 18:15

It is very easy to get 'clicky' sounds with improper sampling and/or mixing.

In this specific case, I think the playback quality is being affected by network latency. Good to see you've got a solution already.

rouncer 104 Dec 11, 2008 at 11:08

I've implemented it, with no phase smearing, or very little.
Very good results. :)

I also coded the furious transform myself, even though I can't even
say the name properly. Hehe.

I'm actually implementing a new wave file format that stores the oscillator
positions of every harmonic over time, so it will help with latency for
pre-recorded sounds (samples), just not live inputs. Those are the problem.

rouncer 104 Dec 11, 2008 at 11:24

I don't recommend using the FFT; the DFT is simpler and does a better job.

Reedbeta 167 Dec 11, 2008 at 17:44

Rouncer, the FFT is just a fast algorithm to compute the DFT. It gives the same results.

The FFT is more complex to implement, but you can also just use libfftw, which is very easy.
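
To make that concrete, here is a minimal sketch that computes the same spectrum twice: once straight from the DFT definition, and once with a textbook recursive radix-2 FFT. The function names (`naiveDft`, `fft`, `maxDifference`) are made up for this example and have nothing to do with libfftw; the FFT here only handles power-of-two lengths.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Signal = std::vector<std::complex<double>>;

const double kPi = 3.14159265358979323846;

// Direct O(n^2) evaluation of the DFT definition.
Signal naiveDft(const Signal& x) {
    const std::size_t n = x.size();
    Signal out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            double angle = -2.0 * kPi * static_cast<double>(k * t) /
                           static_cast<double>(n);
            sum += x[t] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two.
Signal fft(const Signal& x) {
    const std::size_t n = x.size();
    assert(n > 0 && (n & (n - 1)) == 0);
    if (n == 1) return x;
    Signal even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }
    Signal e = fft(even), o = fft(odd);
    Signal out(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        double angle = -2.0 * kPi * static_cast<double>(k) /
                       static_cast<double>(n);
        std::complex<double> tw = std::polar(1.0, angle) * o[k];
        out[k] = e[k] + tw;          // lower half of the spectrum
        out[k + n / 2] = e[k] - tw;  // upper half reuses the same products
    }
    return out;
}

// Largest magnitude difference between two spectra of equal length.
double maxDifference(const Signal& a, const Signal& b) {
    assert(a.size() == b.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        worst = std::max(worst, std::abs(a[i] - b[i]));
    return worst;
}
```

Feeding both functions the same power-of-two-length input and calling `maxDifference` on the results should give a value on the order of floating-point rounding error, not an audible difference.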

rouncer 104 Dec 12, 2008 at 12:52

maybe it does change the sound…

Nils_Pipenbrinck 101 Dec 12, 2008 at 13:57


rouncer wrote: "maybe it does change the sound…"

No, it does not. As Reedbeta said, it's the same thing, just a faster algorithm. Think bubblesort and quicksort. Both algorithms compute the same result eventually, but quicksort is faster (most of the time at least).

FFTs are a little tricky to use if you don't fully understand which one to pick. An FFT that is speed-optimized to give only 8-bit results but takes 16-bit inputs will sound bad due to round-off errors and truncation, for example. These are fine for graphics but suck for audio.
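
To put a rough number on the truncation part, here is a sketch that throws away the low 8 bits of 16-bit samples and measures the resulting signal-to-noise ratio. It only models truncating the values, not the internals of any particular FFT, and `truncateTo8Bits`, `snrDb` and `sineTone` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Keep only the top 8 bits of a 16-bit sample (the low byte is cleared).
std::int16_t truncateTo8Bits(std::int16_t s) {
    return static_cast<std::int16_t>(s & ~0xFF);
}

// Signal-to-noise ratio in dB between the original and the degraded signal.
double snrDb(const std::vector<std::int16_t>& original,
             const std::vector<std::int16_t>& degraded) {
    assert(original.size() == degraded.size() && !original.empty());
    double signal = 0.0, noise = 0.0;
    for (std::size_t i = 0; i < original.size(); ++i) {
        double s = original[i];
        double e = s - degraded[i];
        signal += s * s;
        noise += e * e;
    }
    if (noise == 0.0) return std::numeric_limits<double>::infinity();
    return 10.0 * std::log10(signal / noise);
}

// One second of a sine tone with the given peak amplitude.
std::vector<std::int16_t> sineTone(double freqHz, int sampleRate, double amplitude) {
    const double pi = 3.14159265358979323846;
    std::vector<std::int16_t> out(static_cast<std::size_t>(sampleRate));
    for (std::size_t i = 0; i < out.size(); ++i) {
        double v = amplitude * std::sin(2.0 * pi * freqHz * static_cast<double>(i) / sampleRate);
        out[i] = static_cast<std::int16_t>(std::lround(v));
    }
    return out;
}
```

Truncating a tone from `sineTone` sample by sample and passing both versions to `snrDb` shows how much noise the lost bits add. As a rule of thumb, each dropped bit costs roughly 6 dB of signal-to-noise ratio, which is why 8-bit results are tolerable for graphics but clearly audible in sound.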

rouncer 104 Dec 12, 2008 at 19:12

I know you're probably right, but what I found in my experiments in making a phase vocoder (which turned out successful, if you remember me asking silly questions about the FFT a while back) is that the longer a section you pass to it, the more whooshy the sound comes back out of it. Once I finally coded the DFT myself, I could actually pick off each harmonic one by one, and that's what gave it to me.

But I'll definitely learn the FFT eventually, so I don't really know what I'm talking about yet.

The whoosh you get out of it sounds a little like reverb, except it's a bit different and sounds really nasty. I'm making music with it now, and it's totally awesome. Harmonic-domain effects are the future of digital.

Turning every sine wave into a square wave individually is a new kind of expensive distortion, and there's opportunity for chorusing as well.