There are several caveats for this discussion:
* This approach requires DirectShow. The DirectX SDK up to 9.0b included DirectShow but an extras package is now required for 9.0c and later.
* This approach is not optimal. Some memory copies and design trade-offs are accepted in order to keep the code simpler for this context.
* The code supplied uses some ATL wrapper templates to keep the code concise. CComPtr and CComQIPtr are reference counting aware and provide clean wrappers to COM interfaces. They release on destruction/scope-loss.
* The code supplied uses one G3D class for code clarity. This gem assumes you already know how to generate textures.
* This does not load WMV files properly, because they need the special WMV reader filter. Support can be added with a simple file extension test that replaces the source filter loading with a load of that filter.
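The file extension test from the last caveat could look roughly like the sketch below. The helper name hasWmvExtension is hypothetical and not part of the class in this gem. It only decides whether a path looks like a WMV file; loading the WMV-specific reader in place of AddSourceFilter is left out.

```cpp
#include <cwctype>
#include <string>

// Hypothetical helper (not part of the original class): returns true when
// the path ends in ".wmv", compared case-insensitively.
bool hasWmvExtension(const std::wstring& filename) {
    const std::wstring::size_type dot = filename.find_last_of(L'.');
    if (dot == std::wstring::npos) {
        return false;
    }
    // Ignore a dot that belongs to a directory name rather than the file
    const std::wstring::size_type slash = filename.find_last_of(L"\\/");
    if (slash != std::wstring::npos && slash > dot) {
        return false;
    }
    std::wstring ext = filename.substr(dot);
    for (wchar_t& c : ext) {
        c = static_cast<wchar_t>(std::towlower(c));
    }
    return ext == L".wmv";
}
```

A loader could call this first and branch to a different source filter when it returns true.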
I don't want this to run long, which is partly why I left out cameras (I also didn't have test equipment or time available this weekend). What I am going to show is how to load an arbitrary video file, convert each frame to a usable 24-bit RGB format and then play the file. Once the file is playing, I will show you one approach to grabbing the current frame.
Because we are typically dealing with compressed video formats, decompression carries an automatic performance cost. There is typically another performance hit when converting to 24-bit RGB before grabbing the frame, because most DirectShow filters like to pass data in the YUV or DX color space. For that reason, I suggest a small video resolution or a low playback rate. I ran basic tests at simulated playback rates of 10fps, 15fps and 25fps. Any of those will do; the key is to limit the number of times per second that a frame is converted into a texture. My tests were all in G3D, so I won't add them here.
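One way to limit how often a frame becomes a texture is a small throttle that the render loop checks before grabbing. This is only a sketch: FrameThrottle and shouldUpdate are hypothetical names, and the caller is assumed to supply a monotonically increasing time in seconds from whatever clock the application already uses.

```cpp
// Hypothetical helper: decides whether enough time has passed to convert
// another video frame into a texture.
class FrameThrottle {
public:
    explicit FrameThrottle(double updatesPerSecond)
        : interval(1.0 / updatesPerSecond), nextUpdate(0.0) {}

    // nowSeconds comes from any monotonically increasing clock.
    // Returns true at most updatesPerSecond times per second.
    bool shouldUpdate(double nowSeconds) {
        if (nowSeconds < nextUpdate) {
            return false;
        }
        nextUpdate = nowSeconds + interval;
        return true;
    }

private:
    double interval;    // Minimum seconds between texture updates
    double nextUpdate;  // Earliest time the next update is allowed
};
```

The render loop would grab a new frame only when shouldUpdate returns true and otherwise keep drawing the previous texture.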
The class setup
This is a dummy class I am creating for the gem out of my code. The benefit of a class is that it encapsulates the main interfaces. I did not test playing different files one after the other. The helper methods I provide are ConnectPins and FindPin.
#include <windows.h>
#include <atlcomcli.h>
#include <dshow.h>
#include <qedit.h>
// For std::wstring
#include <string>
class VideoWrapper {
CComPtr<ICaptureGraphBuilder2> graphBuilder;
CComPtr<IFilterGraph2> filterGraph;
CComPtr<ISampleGrabber> sampleGrabber;
bool videoInitialized;
int videoWidth;
int videoHeight;
long* frameBuffer;
long bufferSize;
/** G3D Texture object */
TextureRef videoTexture;
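public:
VideoWrapper() {
// This must be called before
// interfaces can be accessed
CoInitialize(NULL);
videoInitialized = false;
videoWidth = 0;
videoHeight = 0;
frameBuffer = NULL;
bufferSize = 0;
}
~VideoWrapper() {
// Clean up before COM shuts down, and
// pair every CoInitialize with a CoUninitialize
uninitVideo();
CoUninitialize();
}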
bool loadVideoFile(const std::wstring& filename);
bool loadVideoCamera();
TextureRef grabFrameTexture();
void uninitVideo();
bool ConnectPins(IBaseFilter* outputFilter,
unsigned int outputNum,
IBaseFilter* inputFilter,
unsigned int inputNum);
void FindPin(IBaseFilter* baseFilter,
PIN_DIRECTION direction,
int pinNumber,
IPin** destPin);
};
Video Setup and Render
This takes a std::wstring to simplify the code since DirectShow takes Unicode strings. I originally wrote some code to convert to Unicode but didn't want to confuse the topic.
bool VideoWrapper::loadVideoFile(const std::wstring& filename) {
if (videoInitialized) {
uninitVideo();
}
// Create the main object that runs the graph
graphBuilder.CoCreateInstance(CLSID_CaptureGraphBuilder2);
filterGraph.CoCreateInstance(CLSID_FilterGraph);
graphBuilder->SetFiltergraph(filterGraph);
CComPtr<IBaseFilter> sourceFilter;
// This takes the absolute filename path and
// loads the appropriate file reader and splitter
// depending on the file type.
filterGraph->AddSourceFilter(filename.c_str(),
L"Video Source",
&sourceFilter);
// Create the Sample Grabber which we will use
// To take each frame for texture generation
CComPtr<IBaseFilter> grabberFilter;
grabberFilter.CoCreateInstance(CLSID_SampleGrabber);
grabberFilter->QueryInterface(IID_ISampleGrabber, reinterpret_cast<void**>(&sampleGrabber));
filterGraph->AddFilter(grabberFilter, L"Sample Grabber");
// We have to set the 24-bit RGB desire here
// So that the proper conversion filters
// Are added automatically.
AM_MEDIA_TYPE desiredType;
memset(&desiredType, 0, sizeof(desiredType));
desiredType.majortype = MEDIATYPE_Video;
desiredType.subtype = MEDIASUBTYPE_RGB24;
desiredType.formattype = FORMAT_VideoInfo;
sampleGrabber->SetMediaType(&desiredType);
sampleGrabber->SetBufferSamples(TRUE);
// Use pin connection methods instead of
// ICaptureGraphBuilder2::RenderStream because of
// the SampleGrabber setting we're using.
if (!ConnectPins(sourceFilter, 0, grabberFilter, 0)) {
uninitVideo();
return false;
}
// A Null Renderer does not display the video
// But it allows the Sample Grabber to run
// And it will keep proper playback timing
// Unless specified otherwise.
CComPtr<IBaseFilter> nullRenderer;
nullRenderer.CoCreateInstance(CLSID_NullRenderer);
filterGraph->AddFilter(nullRenderer, L"Null Renderer");
if (!ConnectPins(grabberFilter, 0, nullRenderer, 0)) {
uninitVideo();
return false;
}
// Just a little trick so that we don't have to know
// The video resolution when calling this method.
bool mediaConnected = false;
AM_MEDIA_TYPE connectedType;
if (SUCCEEDED(sampleGrabber->GetConnectedMediaType(&connectedType))) {
if (connectedType.formattype == FORMAT_VideoInfo) {
VIDEOINFOHEADER* infoHeader = (VIDEOINFOHEADER*)connectedType.pbFormat;
videoWidth = infoHeader->bmiHeader.biWidth;
videoHeight = infoHeader->bmiHeader.biHeight;
mediaConnected = true;
}
CoTaskMemFree(connectedType.pbFormat);
}
if (!mediaConnected) {
uninitVideo();
return false;
}
// Tell the whole graph to start sending video
// Apart from making sure the source filter can load
// This is the only failure point we care about unless
// You need to do more extensive development and debugging.
CComQIPtr<IMediaControl> mediaControl(filterGraph);
if (SUCCEEDED(mediaControl->Run())) {
videoInitialized = true;
return true;
} else {
uninitVideo();
return false;
}
}
/** For a later time but probably faster displays. */
bool VideoWrapper::loadVideoCamera() {
return false;
}
TextureRef VideoWrapper::grabFrameTexture() {
if (videoInitialized) {
// Only need to do this once
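if (!frameBuffer) {
// The Sample Grabber requires an arbitrary buffer size
// that we only know at runtime.
// (width * height * 3) bytes will not work.
// This call can fail until the first sample is buffered.
if (FAILED(sampleGrabber->GetCurrentBuffer(&bufferSize, NULL)) || bufferSize <= 0) {
return NULL;
}
// bufferSize is in bytes, so round up to whole longs
frameBuffer = new long[(bufferSize + sizeof(long) - 1) / sizeof(long)];
}
if (FAILED(sampleGrabber->GetCurrentBuffer(&bufferSize, frameBuffer))) {
return NULL;
}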
// G3D Texture creation for code simplification, the format is obvious.
return Texture::fromMemory(
"Video Frame",
(const uint8*)frameBuffer,
TextureFormat::RGB8,
videoWidth,
videoHeight,
TextureFormat::AUTO,
Texture::TILE,
Texture::BILINEAR_NO_MIPMAP);
}
return NULL;
}
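void VideoWrapper::uninitVideo() {
videoInitialized = false;
// Stop the graph first, then release the interfaces.
// Checking the pointer (not videoInitialized) also cleans
// up after a load that failed part way through.
if (filterGraph) {
CComQIPtr<IMediaControl> mediaControl(filterGraph);
if (mediaControl) {
mediaControl->Stop();
}
}
sampleGrabber.Release();
filterGraph.Release();
graphBuilder.Release();
delete[] frameBuffer;
frameBuffer = NULL;
}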
bool VideoWrapper::ConnectPins(IBaseFilter* outputFilter,
unsigned int outputNum,
IBaseFilter* inputFilter,
unsigned int inputNum) {
CComPtr<IPin> inputPin;
CComPtr<IPin> outputPin;
if (!outputFilter || !inputFilter) {
return false;
}
FindPin(outputFilter, PINDIR_OUTPUT, outputNum, &outputPin);
FindPin(inputFilter, PINDIR_INPUT, inputNum, &inputPin);
if (inputPin && outputPin) {
return SUCCEEDED(filterGraph->Connect(outputPin, inputPin));
} else {
return false;
}
}
void VideoWrapper::FindPin(IBaseFilter* baseFilter,
PIN_DIRECTION direction,
int pinNumber,
IPin** destPin) {
CComPtr<IEnumPins> enumPins;
*destPin = NULL;
if (SUCCEEDED(baseFilter->EnumPins(&enumPins))) {
ULONG numFound;
IPin* tmpPin;
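// Next returns S_FALSE (which still passes SUCCEEDED) when
// no pins remain, so compare against S_OK explicitly
while (enumPins->Next(1, &tmpPin, &numFound) == S_OK) {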
PIN_DIRECTION pinDirection;
tmpPin->QueryDirection(&pinDirection);
if (pinDirection == direction) {
if (pinNumber == 0) {
// Return the pin's interface
*destPin = tmpPin;
break;
}
pinNumber--;
}
tmpPin->Release();
}
}
}
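One likely reason the comment in grabFrameTexture says (width * height * 3) bytes will not work is row padding: each row of a 24-bit Windows bitmap is padded to a multiple of 4 bytes, and a positive biHeight normally means the rows are stored bottom-up. The sketch below is an assumption about what a tighter copy could look like, not something the code above does. rgb24Stride and packRows are hypothetical helpers, and the byte order within each pixel is left untouched.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Bytes per row of a 24-bit bitmap: width * 3, rounded up to a multiple of 4.
std::size_t rgb24Stride(int width) {
    return (static_cast<std::size_t>(width) * 3 + 3) & ~static_cast<std::size_t>(3);
}

// Hypothetical helper: copies a padded, bottom-up frame into a tightly
// packed, top-down buffer of width * height * 3 bytes.
std::vector<unsigned char> packRows(const unsigned char* src, int width, int height) {
    const std::size_t stride = rgb24Stride(width);
    const std::size_t packed = static_cast<std::size_t>(width) * 3;
    std::vector<unsigned char> dst(packed * static_cast<std::size_t>(height));
    for (int y = 0; y < height; ++y) {
        // Row y of the output is row (height - 1 - y) of the source
        const unsigned char* srcRow = src + stride * static_cast<std::size_t>(height - 1 - y);
        std::memcpy(dst.data() + packed * static_cast<std::size_t>(y), srcRow, packed);
    }
    return dst;
}
```

This adds one more memory copy per frame, so it is only worth doing if the padded or flipped layout actually shows up as skewed or upside-down video on the texture.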
Libraries and Includes needed
strmiids.lib - From DirectX's lib directory.
windows.h - From the PlatformSDK or Visual C++
atlcomcli.h - (Part of ATL) From the Platform SDK or Visual C++
dshow.h - From DirectX's include directory
qedit.h - From DirectX's include directory
Follow up
This is only for displaying existing video on a texture. This allows for perspective projection of the video instead of always just displaying 2D. If you want to just display the video in a 2D box over a window, this can be done in a much more efficient manner without textures. If you want to actually take a texture or frame buffer and convert it into a video file then that requires much more extensive COM and DirectShow Filter creation that is out of the scope of this gem entirely.
I look forward to any questions and further development.
Corey Taylor
G3D 6.07 3D Engine












