Latency and fuzzy echoes in mic recording

rouncer 103 May 12, 2012 at 13:41

This code snippet captures audio from the mic and then pumps it to the output stream.
It has a little latency, and the output has lots of glitches.
If I switch it around and capture the sound one frame before I write it, the output cleans up, but then there's even more latency. I need as little latency as possible so I can use the computer as an amp for my guitar or voice… Is this possible with DirectSound?

void write_plays(void)
{
    
    
    
    LPVOID lpvPtr1;
 DWORD dwBytes1;
 LPVOID lpvPtr2;
 DWORD dwBytes2;

    LPVOID clpvPtr1;
 DWORD cdwBytes1;
 LPVOID clpvPtr2;
 DWORD cdwBytes2;


 static int offset=0;
 static int last_clock=timeGetTime();
 int clock=timeGetTime();
 
 int tp=clock-last_clock;
 
 int write=(int)((float)tp/1000.0f*44100.0f);
 
 static int cwp=0;


 HRESULT hr;

    DWORD cp,rp;

    capture->GetCurrentPosition(&cp,&rp);
    
    DWORD locksize;

    if(rp>cp)
    {
     locksize=(capture_size-rp)+cp;
    }
    else
    {
     locksize=cp-rp;
    }


    rwp=cwp;

    hr=capture->Lock(rp, locksize, &clpvPtr1, &cdwBytes1, &clpvPtr2, &cdwBytes2, 0);
 if(hr==S_OK)
    {
  SHORT* us=(SHORT*)clpvPtr1;
  SHORT* us2=(SHORT*)clpvPtr2;
  DWORD  b=(cdwBytes1>>1);
  DWORD  b2=(cdwBytes2>>1);


  int j;
  for(j=0;j<locksize;j++)
  {    
   int wab;
   if(j*2<(signed)b)
   {
    wab=us[j*2+0];
    wab+=us[j*2+1];
    wab/=2;
    capture_copy[cwp]=(short)wab;
    cwp++;
    if(cwp==1000000) cwp=0;
   }
   else if((j*2-b)<b2)
   {
    wab=us2[(j*2-b)+0];
    wab+=us2[(j*2-b)+1];
    wab/=2;
    capture_copy[cwp]=(short)wab;
    cwp++;
    if(cwp==1000000) cwp=0;
   }
 
        }
  capture->Unlock(clpvPtr1, cdwBytes1, clpvPtr2, cdwBytes2);
    }

    DWORD pc,wc;

    primary->GetCurrentPosition(&pc,&wc);

 // Obtain write pointer.
 hr=primary->Lock(wc%primary_size, primary_size, &lpvPtr1, &dwBytes1, &lpvPtr2, &dwBytes2, 0);
 if(hr==S_OK)
 {
  SHORT* us=(SHORT*)lpvPtr1;
  SHORT* us2=(SHORT*)lpvPtr2;
  DWORD  b=(dwBytes1>>1);
  DWORD  b2=(dwBytes2>>1);


  int j;
  for(j=0;j<write;j++)
  {    
   int wa=0;

            int read=rwp+j;
            if(read>=1000000) read-=1000000;
            if(read<0) read+=1000000;
            wa+=capture_copy[read];

   if(wa>32766) wa=32766;
   if(wa<-32766) wa=-32766;
   short wab=(short)wa;

         ram_copy[ram_write]=wab;

            ram_write++;

            if(ram_write==100000000) ram_write=0;

//   short wab=saw;
         
   if(j*2<(signed)b)
   {
    us[j*2+0]=wab;
    us[j*2+1]=wab;
   }
   else if((j*2-b)<b2)
   {
    us2[(j*2-b)+0]=wab;
    us2[(j*2-b)+1]=wab;
   }
  }
  primary->Unlock(lpvPtr1, dwBytes1, lpvPtr2, dwBytes2);
 }


 offset+=write;
 last_clock=clock;
}
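
The function above averages each stereo frame down to one mono sample on capture and clamps the value before writing it out. A minimal, standalone sketch of just those two steps follows. The names `downmixToMono` and `clampToInt16` are made up for illustration, interleaved 16-bit stereo input is assumed, and the clamp here uses the full 16-bit range rather than the ±32766 used in the post.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Averages interleaved 16-bit stereo frames (L, R, L, R, ...) into mono.
// `frames` is the number of stereo frames, so the input holds 2 * frames samples.
std::vector<std::int16_t> downmixToMono(const std::int16_t* interleaved, std::size_t frames)
{
    std::vector<std::int16_t> mono(frames);
    for (std::size_t i = 0; i < frames; ++i)
    {
        // Sum in int so two large samples cannot overflow 16 bits.
        const int sum = interleaved[2 * i] + interleaved[2 * i + 1];
        mono[i] = static_cast<std::int16_t>(sum / 2);
    }
    return mono;
}

// Clamps an intermediate value (for example after mixing or gain) to 16 bits.
std::int16_t clampToInt16(int value)
{
    if (value > 32767) value = 32767;
    if (value < -32768) value = -32768;
    return static_cast<std::int16_t>(value);
}
```

Keeping the sum in an `int` matters only when more than one source is mixed or gain is applied; a plain copy of captured samples never leaves the 16-bit range.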

17 Replies

__________Smile_ 101 May 12, 2012 at 14:31

I tried the same thing years ago and eventually gave up: Windows simply doesn't have a realtime sound API. There are some hacks used by JACK, but I don't know how well they cover different sound cards.

rouncer 103 May 12, 2012 at 15:47

Damn… yeah, I cleaned it up, but it's a SECOND late, which is bad latency… I'm not giving up yet though. If I could somehow get it to maybe a third of a second…

__________Smile_ 101 May 12, 2012 at 16:20

The human ear can hear delays of several tens of milliseconds, so 1/3 of a second is as bad as 1. Try the JACK API; if it works, it'll be much better.

Reedbeta 167 May 12, 2012 at 17:34

I’ve been using VSTHost with ASIO4ALL for this. The ASIO driver decreases the latency quite a lot (although it’s still noticeable), and should work with almost any Windows setup. You can use the free VST SDK to build your own audio processing plugins, though I haven’t done this.

rouncer 103 May 12, 2012 at 17:57

Working with DirectSound a bit more, I think I should be able to get it to 10 ms. I'm pretty sure; I've just got to work out how the thing works… I'll be back if I'm successful.

rouncer 103 May 26, 2012 at 10:03

No victory yet. But what the hell is ASIO? Do you need a special sound card for it? My computer doesn't seem to work with Fruity Loops' ASIO4ALL.

rouncer 103 May 26, 2012 at 10:29

Actually, I got it down to about 100 milliseconds… but it's still not really good enough for playing a guitar into or whatever. Using the computer as an amp is actually impossible, which sucks. Unless it's not?

Reedbeta 167 May 26, 2012 at 18:08

Hmm, that’s too bad. I got good enough latency for guitar playing with ASIO4ALL, but I haven’t measured exactly what it is. You’d probably need some kind of high-end hardware to do much better. Consumer PC hardware and software are just not designed for real-time, minimum-latency audio processing.

geon 101 May 26, 2012 at 19:25

@Reedbeta wrote: "You’d probably need some kind of high-end hardware to do much better."

I believe there is an iPhone app that does realtime autotune on hardware as early as the 3GS.

I also remember playing with a mic on my 486 dx2 on win 3.11 and using the computer as an amp. I also did exactly that on my P2 with win2k. They might have had physical wires for that, but I’m not sure.

TheNut 179 May 26, 2012 at 19:34

Are you sure you can’t go lower than 100msec? I haven’t touched my DirectX Sound capture code in a long time, but I just wrote a crude demo capturing audio at 10msec. It could probably go lower, but I’d have to invest more time to do it properly. The sound buffer should also be able to go just as low, so long as you can write the audio data to hardware fast enough. I remember doing some mic stuff way back in the day and I don’t recall this being a problem. 100msec is for mobile phones, not desktop PCs ;)

rouncer 103 May 27, 2012 at 02:05

TheNut… so you actually got DirectSound to perform at 10 ms? I wouldn't mind having a look at your code!

DWORD cp2,rpw2;
capture->GetCurrentPosition(&cp2,&rpw2);

rp=rpw2-(wri+1)*4; //wri is the amount of samples to read

capture->Lock(rp, capture_size, &clpvPtr1, &cdwBytes1, &clpvPtr2, &cdwBytes2, 0);

That's the position in the capture buffer I'm reading from, just behind the read cursor… Is this the quickest possible way to get to the data?

(Note: on the thread I'm reading and outputting every 10 ms, but there is more latency than that for some reason.)
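
One thing to watch in the snippet above: if `rp` is a `DWORD` (as in the first post), then `rpw2-(wri+1)*4` wraps to a huge value whenever the read cursor is closer to the start of the buffer than the amount being subtracted. A minimal sketch of a wrap-safe version of that subtraction follows. The helper name `offsetBehindCursor` is made up for illustration, and it assumes the amount stepped back never exceeds the buffer size.

```cpp
#include <cstdint>

// Returns the byte offset that lies `bytesBehind` bytes before `readCursor`
// in a circular buffer of `bufferBytes` bytes.
// Assumes bufferBytes > 0 and bytesBehind <= bufferBytes.
std::uint32_t offsetBehindCursor(std::uint32_t readCursor,
                                 std::uint32_t bytesBehind,
                                 std::uint32_t bufferBytes)
{
    // Adding the buffer size first keeps the unsigned subtraction from
    // underflowing; 64-bit math avoids overflow in the addition itself.
    const std::uint64_t shifted =
        static_cast<std::uint64_t>(readCursor) + bufferBytes - bytesBehind;
    return static_cast<std::uint32_t>(shifted % bufferBytes);
}
```

With a 1000-byte buffer, stepping 40 bytes back from cursor 10 gives offset 970 rather than a value near 4 billion.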

TheNut 179 May 27, 2012 at 11:43

The API use is the same, but going about copying the data is a bit different. First, we’ll start from the beginning. This is how I initialize the DSound capture buffer.

bool CreateCaptureBuffer (int numChannels, int samplesPerSecond, int bitsPerSample)
{
    // Create Capture Device
    //
    // GUIDs:   DSDEVID_DefaultCapture      - System-wide default audio capture device.
    //          DSDEVID_DefaultVoiceCapture - Default voice capture device.
    if ( FAILED(DirectSoundCaptureCreate(&DSDEVID_DefaultVoiceCapture, &mDirectCapture, 0)) )
        return false;

    // Set direct capture buffer description
    memset(&mWF, 0, sizeof(WAVEFORMATEX));
    mWF.cbSize          = 0;
    mWF.nChannels       = numChannels;
    mWF.nSamplesPerSec  = samplesPerSecond;
    mWF.wBitsPerSample  = bitsPerSample;
    mWF.nBlockAlign     = numChannels * bitsPerSample / 8;
    mWF.nAvgBytesPerSec = mWF.nBlockAlign * samplesPerSecond;
    mWF.wFormatTag      = WAVE_FORMAT_PCM;

    // Set capture buffer description
    memset(&mDSDesc, 0, sizeof(DSCBUFFERDESC));
    mDSDesc.dwSize          = sizeof(DSCBUFFERDESC);
    mDSDesc.dwBufferBytes   = mWF.nAvgBytesPerSec;
    mDSDesc.lpwfxFormat     = &mWF;

    // Create capture buffer
    if ( FAILED(mDirectCapture->CreateCaptureBuffer(&mDSDesc, &mDSCaptureBuffer, 0)) )
        return false;

    return true;
}

In my test app, I used a mono, 44,100 Hz, 16-bit capture buffer. As you can see from the code, I’m allocating a 1-second buffer (88,200 bytes), which leaves a comfortable margin. If my CPU starts to bottleneck, I can slow the capture polling down to at most one poll per second before captured data gets overwritten.
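
The buffer sizes quoted here follow directly from the format. As a rough sketch of the arithmetic, mirroring the `nBlockAlign` and `nAvgBytesPerSec` formulas in the code above (the function names are made up for illustration):

```cpp
#include <cstdint>

// Bytes per frame (one sample for every channel); same formula as nBlockAlign.
constexpr std::uint32_t blockAlign(std::uint32_t channels, std::uint32_t bitsPerSample)
{
    return channels * bitsPerSample / 8;
}

// Bytes per second of audio; same formula as nAvgBytesPerSec.
constexpr std::uint32_t bytesPerSecond(std::uint32_t channels,
                                       std::uint32_t samplesPerSecond,
                                       std::uint32_t bitsPerSample)
{
    return blockAlign(channels, bitsPerSample) * samplesPerSecond;
}

// Bytes needed to hold `ms` milliseconds, rounded down to a whole frame.
constexpr std::uint32_t bytesForMilliseconds(std::uint32_t ms,
                                             std::uint32_t channels,
                                             std::uint32_t samplesPerSecond,
                                             std::uint32_t bitsPerSample)
{
    return static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(samplesPerSecond) * ms / 1000
        * blockAlign(channels, bitsPerSample));
}

// Mono, 44,100 Hz, 16-bit: one second is 88,200 bytes and 10 ms is 882 bytes.
static_assert(bytesPerSecond(1, 44100, 16) == 88200, "1 s of mono 16-bit audio");
static_assert(bytesForMilliseconds(10, 1, 44100, 16) == 882, "10 ms of mono 16-bit audio");
```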

Now, for the capture code:

int Read (char *buffer, int size)
{
    void *capturedData1 = 0;
    void *capturedData2 = 0;

    int capturedLength1 = 0;
    int capturedLength2 = 0;
    int capturePos = 0;
    int readPos = 0;
    int captureSize = 0;

    // Get the current capture & read positions
    if ( FAILED(mDSCaptureBuffer->GetCurrentPosition((LPDWORD)&capturePos, (LPDWORD)&readPos)) )
        return 0;

    // Calculate the size of the captured buffer
    captureSize = readPos - mCaptureOffset;
    // Cycle back?
    if ( captureSize < 0 )
        captureSize += mDSDesc.dwBufferBytes;
    // Too late, missed data
    if ( captureSize > size )
        captureSize = size;

    // Lock the buffer for copy
    if ( FAILED(mDSCaptureBuffer->Lock(mCaptureOffset, captureSize, &capturedData1, (LPDWORD)&capturedLength1, &capturedData2, (LPDWORD)&capturedLength2, 0)) )
        return 0;

    // Deal with first buffer
    if ( capturedData1 )
    {
        memcpy(buffer, capturedData1, capturedLength1);

        // Move the capture offset
        mCaptureOffset += capturedLength1;
        mCaptureOffset %= mDSDesc.dwBufferBytes;    // Circular buffer
    }

    // Deal with second buffer
    if ( capturedData2 )
    {
        memcpy(&buffer[capturedLength1], capturedData2, capturedLength2);

        // Move the capture offset
        mCaptureOffset += capturedLength2;
        mCaptureOffset %= mDSDesc.dwBufferBytes;    // Circular buffer
    }

    // Unlock the buffer
    mDSCaptureBuffer->Unlock(capturedData1, capturedLength1, capturedData2, capturedLength2);

    // Return the copied amount
    return (capturedLength1 + capturedLength2);
}

I don’t play around here. I get pointers to both buffers and copy them quickly. If the first one is set, then I copy over capturedLength1 bytes. If the second buffer is set, then I append capturedLength2 bytes. The total number of captured bytes is the sum of both lengths, which should be the same as captureSize. In my test app, the size argument was set to 10 ms of audio, or 882 bytes. Each time I called this method, 882 bytes were successfully read. If my CPU was not fast enough to do this, I would start to lose captured data. If that’s what you’re experiencing, check that your copy operations and other code are fast enough to poll at that interval.
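
To see the offset bookkeeping without any DirectSound calls, here is a minimal sketch that replays the same arithmetic on a plain byte ring. `RingReader` is a hypothetical stand-in, not part of the code above: `offset` plays the role of mCaptureOffset, `readCursor` is the value GetCurrentPosition would report, and the two `memcpy` calls correspond to the two regions Lock returns.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a capture buffer: circular storage plus the
// consumer's own offset into it. The ring must not be empty.
struct RingReader
{
    std::vector<char> ring;   // circular storage
    std::size_t offset = 0;   // next unread byte

    // Bytes between our offset and the read cursor, allowing for wrap-around.
    // Equal positions are treated as "nothing new", as in the Read method.
    std::size_t available(std::size_t readCursor) const
    {
        return (readCursor + ring.size() - offset) % ring.size();
    }

    // Copies up to maxBytes into out in at most two pieces, then advances.
    std::size_t read(std::size_t readCursor, char* out, std::size_t maxBytes)
    {
        const std::size_t total = std::min(available(readCursor), maxBytes);
        const std::size_t first = std::min(total, ring.size() - offset);
        std::memcpy(out, ring.data() + offset, first);         // up to end of ring
        std::memcpy(out + first, ring.data(), total - first);  // wrapped remainder
        offset = (offset + total) % ring.size();
        return total;
    }
};
```

If `available` keeps returning much more than the polling interval's worth of bytes, the consumer is falling behind the capture hardware.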

rouncer 103 Jun 02, 2012 at 05:55

Where does mCaptureOffset come from? I don't see where it's initialized…

TheNut 179 Jun 02, 2012 at 10:28

It’s a member variable set to 0 when I call the StartCapture method (unlisted code), just before periodically calling the Read method.

rouncer 103 Jun 02, 2012 at 13:02

Ah, thanks.

rouncer 103 Jun 03, 2012 at 15:59

Hmm, for some reason the gap between the capture offset and the read cursor is never 882 bytes; it's jumping around all over the place! What have I got wrong?

while(1)
{
    Sleep(10);

    DWORD cp2,rpw2;
    capture->GetCurrentPosition(&cp2,&rpw2);

    int wri=rpw2-capture_offset;

    if(wri<0)
    {
        wri+=capture_size;
    }

    int old_cwp=cwp;

    HRESULT hr=capture->Lock(capture_offset, wri, &clpvPtr1, &cdwBytes1, &clpvPtr2, &cdwBytes2, 0);
    if(hr==S_OK)
    {
        SHORT* us=(SHORT*)clpvPtr1;
        SHORT* us2=(SHORT*)clpvPtr2;

        if(us)
        {
            int j;
            for(j=0;j<cdwBytes1/2;j++)
            {
                int wab=us[j];
                capture_copy[cwp]=(short)wab;
                cwp++;
                if(cwp==10000000) cwp=0;
            }
        }

        if(us2)
        {
            int j;
            for(j=0;j<cdwBytes2/2;j++)
            {
                int wab=us2[j];
                capture_copy[cwp]=(short)wab;
                cwp++;
                if(cwp==10000000) cwp=0;
            }
        }
        capture->Unlock(clpvPtr1, cdwBytes1, clpvPtr2, cdwBytes2);

        capture_offset+=wri;
        capture_offset%=capture_size;
    }
}

rouncer 103 Jun 13, 2012 at 05:00

Funnily enough, capture_size wasn't being set, so I set it manually and it started working.

Here's something I did messing around live.

http://soundcloud.co…r81/phaservoice

Still an 81.6 millisecond delay, unfortunately.

It takes 3600 samples to get to the speaker.
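
The two figures agree if the stream runs at 44,100 samples per second, the rate used in the first post: 3600 / 44100 ≈ 0.0816 s. A small sketch of the conversion in both directions (the function names are made up for illustration):

```cpp
// Converts a delay measured in samples to milliseconds at a given sample rate.
constexpr double samplesToMilliseconds(double samples, double sampleRate)
{
    return samples * 1000.0 / sampleRate;
}

// Converts a delay in milliseconds to a sample count at a given sample rate.
constexpr double millisecondsToSamples(double milliseconds, double sampleRate)
{
    return milliseconds * sampleRate / 1000.0;
}
```

By the same arithmetic, a 10 ms target at 44,100 Hz leaves a budget of 441 samples for the whole path from mic to speaker.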