# OpenCL woes

### #1 Vilem Otte

Valued Member

• Members
• 345 posts

Posted 14 September 2012 - 12:14 AM

Okay, it's 2 am here right now, and I've run into a strange issue (it may not actually be an issue at all, but I've been sitting on this for 2 hours and I can't solve it)...

I create the kernel in the standard way and pass the arguments like this:

```c
clSetKernelArg(cl_kernel, 0, sizeof(cl_mem), (void*)&rt_kdtree_mem_c);
clSetKernelArg(cl_kernel, 1, sizeof(cl_mem), (void*)&rt_kdtree_mem_a);
clSetKernelArg(cl_kernel, 2, sizeof(cl_mem), (void*)&rt_kdtree_mem_B);
clSetKernelArg(cl_kernel, 3, sizeof(cl_mem), (void*)&rt_tree_bounds);
clSetKernelArg(cl_kernel, 4, sizeof(int), (void*)&rt_kdtree_items);
```
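Each `clSetKernelArg` call returns an error code, so a wrong argument index or size would show up there. Below is a minimal sketch of how those codes could be checked. `set_arg_error_name` and `check_set_arg` are hypothetical helpers, not part of OpenCL, and the numeric values are hardcoded from the standard `CL/cl.h` constants only so that the snippet builds without the OpenCL headers.

```c
#include <stdio.h>

/* Hypothetical helper: maps some of the error codes clSetKernelArg can
 * return to their names. The numbers mirror the constants in CL/cl.h and
 * are hardcoded here only so the sketch builds without OpenCL headers. */
static const char *set_arg_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -38: return "CL_INVALID_MEM_OBJECT";
    case -48: return "CL_INVALID_KERNEL";
    case -49: return "CL_INVALID_ARG_INDEX";
    case -50: return "CL_INVALID_ARG_VALUE";
    case -51: return "CL_INVALID_ARG_SIZE";
    default:  return "unknown error";
    }
}

/* Hypothetical helper: logs a failed call to stderr.
 * Returns 1 if err signals success, 0 otherwise. */
static int check_set_arg(int err, unsigned arg_index)
{
    if (err != 0) {
        fprintf(stderr, "clSetKernelArg(%u) failed: %s (%d)\n",
                arg_index, set_arg_error_name(err), err);
        return 0;
    }
    return 1;
}
```

With the real headers included, a call would then look something like `check_set_arg(clSetKernelArg(cl_kernel, 3, sizeof(cl_mem), (void*)&rt_tree_bounds), 3);`.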


And my kernel looks like this (just for testing):


```c
#pragma OPENCL EXTENSION cl_amd_printf : enable

__kernel void main(__global float* c, __global float* a, __global float* b, __global float* bounds, const int num)
{
    const int idx = get_global_id(0);

    printf("Idx %d, Bounds: %f %f %f\n", idx, bounds[0], bounds[1], bounds[2]);

    if(idx < num)
    {
        c[idx] = a[idx] + b[idx];
    }
}
```
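To check what the test kernel actually computes, it may help to have a host-side reference to compare against. The following is only a sketch in plain C: `vec_add_reference` is a hypothetical name, and the loop stands in for the work-items that `get_global_id(0)` would enumerate on the device.

```c
#include <stddef.h>

/* Hypothetical host-side reference for the test kernel: visits every
 * work-item index in [0, global_size) and applies the same idx < num
 * guard, so elements at or beyond num are left untouched. */
static void vec_add_reference(float *c, const float *a, const float *b,
                              int num, size_t global_size)
{
    for (size_t idx = 0; idx < global_size; ++idx) {
        if ((int)idx < num) {
            c[idx] = a[idx] + b[idx];
        }
    }
}
```

The output of this function can be compared element by element with whatever is read back from the `c` buffer after the kernel has run.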


What I really can't understand is why my bounds arrive in the kernel as {-125, 0, 0, .... 0}, while on the application side they are actually {-125, -5, -125, 1, ... 1}. I.e. only the first value gets passed.

The data is uploaded in the standard way:

```c
clEnqueueWriteBuffer(cl_queue, rt_tree_bounds, CL_TRUE, 0, sizeof(float) * 8, data, 0, NULL, &cl_event);
clReleaseEvent(cl_event);
```


And of course I call clFinish before I execute the kernel.

After two hours I still can't figure it out, although I'm sure there is some childish bug in here... If anyone can spot it, I'd be very glad.
My blog about game development (and not just game development) - http://gameprogramme...y.blogspot.com/

If you don't know how to speed up application, go "roarrrrrr!", hit the compiler with the club and use -O3 :D

### #2 Vilem Otte

Posted 14 September 2012 - 12:24 AM

Okay, I'll answer my own question about this childish bug...

I was still initializing the device with CL_DEVICE_TYPE_GPU while using the cl_amd_printf extension in the kernel code. The printed values came out wrong, but when I read the same values back from the device with a standard clEnqueueReadBuffer, they were correct. So now I'm on CL_DEVICE_TYPE_CPU and everything works fine.

And for next time: the bug is NEVER in the place where it shows up. (With a bit of humor, of course.)
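The read-back check mentioned above can be sketched as a plain comparison on the host. This is a sketch under assumptions: `first_mismatch` is a hypothetical helper, and `actual` is assumed to have been filled by a blocking `clEnqueueReadBuffer` on `rt_tree_bounds` before the call.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the first element where
 * expected and actual differ by more than tol, or -1 if all n elements
 * match. A NaN difference is also reported as a mismatch. */
static long first_mismatch(const float *expected, const float *actual,
                           size_t n, float tol)
{
    for (size_t i = 0; i < n; ++i) {
        float d = expected[i] - actual[i];
        if (d < 0.0f) {
            d = -d;
        }
        /* !(d <= tol) is true for d > tol and for NaN. */
        if (!(d <= tol)) {
            return (long)i;
        }
    }
    return -1;
}
```

A result of -1 for the buffer contents, combined with wrong values in the `printf` output, points at the printing path and not at the data transfer, which matches what was observed here.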
