Capture Speech with DirectSound Problem

CrazyPlaya 101 Dec 10, 2008 at 09:53

Hi all,

I'm a newbie at audio programming.
I'm trying to capture speech from the microphone and stream fragments of the capture buffer over the network.
I send each fragment when the notification event fires.
My capture buffer is 2560 bytes and is divided into 4 segments.
Each segment holds 20 ms of speech. The sample rate is 16 kHz with 16 bits per sample.
Here is the code of my notification thread:

    // Ask DirectSound how far the buffer has been filled
    DWORD capturePos = 0, readPos = 0;
    if (FAILED(m_captureBuffer->GetCurrentPosition(&capturePos, &readPos))) return;

    // Bytes that are safe to read, allowing for wraparound of the circular buffer
    const DWORD diff = (readPos >= m_NextOffset)
        ? readPos - m_NextOffset
        : readPos + m_BufferSize - m_NextOffset;

    if (diff < m_NotifySize) return;

    // Lock one segment of the capture buffer. This assumes m_BufferSize is a
    // multiple of m_NotifySize, so the locked region never wraps around.
    VOID* lockedBufferPointer = NULL;
    DWORD lockedBufferSize = 0;
    if (FAILED(m_captureBuffer->Lock(m_NextOffset, m_NotifySize,
            &lockedBufferPointer, &lockedBufferSize, NULL, NULL, 0L))) return;

    // Write the raw bytes to the wave file
    write((BYTE*)lockedBufferPointer, lockedBufferSize);

    // Pass the samples to the output sink (16-bit samples, so 2 bytes each)
    const int sampleCount = lockedBufferSize / 2;
    pSink->on_NewAudioDataCaptured((short*)lockedBufferPointer, sampleCount);

    // Unlock the capture buffer
    m_captureBuffer->Unlock(lockedBufferPointer, lockedBufferSize, NULL, 0);

    // Move the capture offset along
    m_NextOffset += lockedBufferSize;
    m_NextOffset %= m_BufferSize;
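
The wraparound arithmetic in the snippet above can be checked on its own, without DirectSound. The following is only a sketch: `availableBytes` and `advanceOffset` are hypothetical helper names that do not appear in the original code, and plain 32-bit integers stand in for `DWORD`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of bytes between nextOffset (where the next
// read starts) and readPos (how far data is safe to read), taking the
// wraparound of the circular capture buffer into account.
std::uint32_t availableBytes(std::uint32_t readPos,
                             std::uint32_t nextOffset,
                             std::uint32_t bufferSize)
{
    assert(bufferSize > 0 && readPos < bufferSize && nextOffset < bufferSize);
    return readPos >= nextOffset ? readPos - nextOffset
                                 : readPos + bufferSize - nextOffset;
}

// Hypothetical helper: move the read offset forward by the bytes just
// consumed, wrapping back to the start of the buffer when the end is reached.
std::uint32_t advanceOffset(std::uint32_t nextOffset,
                            std::uint32_t consumed,
                            std::uint32_t bufferSize)
{
    assert(bufferSize > 0);
    return (nextOffset + consumed) % bufferSize;
}
```

With the 2560-byte buffer and 640-byte segments from the post, `availableBytes(100, 1920, 2560)` gives 740 (the read position has already wrapped), and `advanceOffset(1920, 640, 2560)` gives 0.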

My problem is that the speech I send is very noisy after capturing.
What could I be doing wrong when capturing?
I'm stuck, and it's hard to find good information about capturing with DirectSound.

Greetings
Karsten

10 Replies

CrazyPlaya 101 Dec 10, 2008 at 15:19

OK, I've got it. The fragment was too small at 20 ms. Now I use 60 ms and it sounds good.
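
For reference, the segment sizes involved can be worked out from the format given in the first post. This is a minimal sketch that assumes mono audio (the post does not say, but 4 segments of 20 ms at 16 kHz and 16 bits only add up to 2560 bytes with one channel); `segmentBytes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: size in bytes of one notification segment of PCM
// audio for a given format and duration.
std::uint32_t segmentBytes(std::uint32_t sampleRateHz,
                           std::uint32_t bitsPerSample,
                           std::uint32_t channels,
                           std::uint32_t durationMs)
{
    assert(bitsPerSample % 8 == 0 && channels > 0);
    const std::uint64_t bytesPerSecond =
        static_cast<std::uint64_t>(sampleRateHz) * (bitsPerSample / 8) * channels;
    return static_cast<std::uint32_t>(bytesPerSecond * durationMs / 1000);
}
```

`segmentBytes(16000, 16, 1, 20)` is 640 bytes and `segmentBytes(16000, 16, 1, 60)` is 1920 bytes, so if the buffer is still split into 4 segments it grows from 2560 to 7680 bytes.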

rouncer 104 Dec 10, 2008 at 16:05

I've made music programs before and split the sound up into sine waves with the furior transform :)

Don't worry, loading a WAV file yourself is heaps easier than using the DirectX products.

[EDIT] Just learn from the samples, it's what I do. [/EDIT]

But there's no Dolby in DirectSound.

alphadog 101 Dec 10, 2008 at 17:54

I’ve sometimes used furious transforms. It never turns out well… :)

vrnunes 102 Dec 10, 2008 at 18:15

It is very easy to get 'clicky' sounds with improper sampling and/or mixing.

In this specific case, I think the playback quality is being affected by network latency. Good to see you've got a solution already.

rouncer 104 Dec 11, 2008 at 11:08

I've implemented it, with no phase smearing, or very little.
Very good results. :)

I also coded the furious transform myself, even though I can't even
say the name properly. Hehe.

I'm actually implementing a new wave file format that stores the oscillator
positions of every harmonic over time, so it will help with latency for
pre-recorded sounds (samples), just not live inputs. Those are the problem.

rouncer 104 Dec 11, 2008 at 11:24

I don't recommend using the FFT; the DFT is simpler and does a better job.

Reedbeta 167 Dec 11, 2008 at 17:44

Rouncer, the FFT is just a fast algorithm to compute the DFT. It gives the same results.

The FFT is more complex to implement, but you can also just use libfftw, which is very easy.
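
To make the point concrete, here is a small sketch that computes the same transform twice, once straight from the DFT definition and once with a textbook recursive radix-2 FFT, and measures how far apart the results are. All function names are made up for this illustration, and the FFT shown is the simplest variant rather than anything a library such as libfftw would use internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
const double kPi = 3.14159265358979323846;

// Direct O(n^2) evaluation of the DFT definition.
std::vector<Complex> naiveDft(const std::vector<Complex>& x)
{
    const std::size_t n = x.size();
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * kPi * double(k) * double(t) / double(n);
            sum += x[t] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 Cooley-Tukey FFT. The length must be a power of two.
std::vector<Complex> fft(const std::vector<Complex>& x)
{
    const std::size_t n = x.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    if (n == 1) return x;

    // Split into even-indexed and odd-indexed samples and transform each half.
    std::vector<Complex> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = x[2 * i];
        odd[i] = x[2 * i + 1];
    }
    const std::vector<Complex> e = fft(even);
    const std::vector<Complex> o = fft(odd);

    // Combine the two half-size transforms with the twiddle factors.
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        const double angle = -2.0 * kPi * double(k) / double(n);
        const Complex t = Complex(std::cos(angle), std::sin(angle)) * o[k];
        out[k] = e[k] + t;
        out[k + n / 2] = e[k] - t;
    }
    return out;
}

// Largest absolute difference between corresponding bins of two spectra.
double maxDifference(const std::vector<Complex>& a, const std::vector<Complex>& b)
{
    assert(a.size() == b.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        worst = std::max(worst, std::abs(a[i] - b[i]));
    return worst;
}

// Test input: two sine components at 3 and 7 cycles per block.
std::vector<Complex> testSignal(std::size_t n)
{
    std::vector<Complex> x(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double phase = 2.0 * kPi * double(i) / double(n);
        x[i] = Complex(std::sin(3.0 * phase) + 0.5 * std::sin(7.0 * phase), 0.0);
    }
    return x;
}
```

Calling `maxDifference(naiveDft(x), fft(x))` on `testSignal(64)` should give a value at the level of floating-point rounding, which is the sense in which the two methods give the same results.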

rouncer 104 Dec 12, 2008 at 12:52

maybe it does change the sound…

Nils_Pipenbrinck 101 Dec 12, 2008 at 13:57

@rouncer

maybe it does change the sound…

No, it does not. As Reedbeta said, it's the same thing, just a faster algorithm. Think bubblesort and quicksort: both algorithms compute the same result eventually, but quicksort is faster (most of the time, at least).

FFTs are a little tricky to use if you don't fully understand which one to pick. For example, an FFT that is speed-optimized to give only 8-bit results but takes 16-bit inputs will sound bad because of round-off errors and truncation. Such FFTs are fine for graphics but no good for audio.
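
The effect of reduced precision can be illustrated without an FFT at all. The sketch below only rounds a sine wave to a given bit depth and measures the error that the rounding introduces; it is an assumption-laden stand-in for what a low-precision FFT does internally, and the function names, the 440 Hz tone, and the 16 kHz rate are chosen just for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Round a value in [-1, 1] to the nearest level of a signed integer grid
// with the given number of bits, then scale it back to [-1, 1].
double quantize(double value, int bits)
{
    assert(bits >= 2 && bits <= 24);
    const double scale = double((1L << (bits - 1)) - 1);
    return std::round(value * scale) / scale;
}

// RMS error introduced by quantizing a full-scale 440 Hz sine, sampled at
// 16 kHz, to the given bit depth.
double rmsQuantizationError(int bits, std::size_t sampleCount)
{
    assert(sampleCount > 0);
    const double kPi = 3.14159265358979323846;
    double sumOfSquares = 0.0;
    for (std::size_t i = 0; i < sampleCount; ++i) {
        const double v = std::sin(2.0 * kPi * 440.0 * double(i) / 16000.0);
        const double error = quantize(v, bits) - v;
        sumOfSquares += error * error;
    }
    return std::sqrt(sumOfSquares / double(sampleCount));
}
```

Comparing `rmsQuantizationError(8, 16000)` with `rmsQuantizationError(16, 16000)` shows the 8-bit error to be a couple of hundred times larger, roughly the 6 dB per bit that is usually quoted for quantization noise.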

rouncer 104 Dec 12, 2008 at 19:12

I know you're probably right, but what I found in my experiments making a phase vocoder (which turned out successful, if you remember me asking silly questions about the FFT a while back) is that the longer the section you pass to it, the more whooshy the sound that comes back out. When I finally coded the DFT myself, I could pick off each harmonic one by one, and that's what did it for me.

But I'll definitely learn the FFT eventually, so I don't really know what I'm talking about yet.

The whoosh you get out of it sounds a little like reverb, except it's a bit different and sounds really nasty. I'm making music with it now and it's totally awesome. Harmonic-domain effects are the future of digital.

Turning every sine wave into a square wave individually is a new, expensive kind of distortion, and there's an opportunity for chorusing as well.