# OpenCL woes

Okay, it's 2 a.m. here and I've run into a strange issue (it may not even be a real issue, but I've been sitting on this for two hours and can't solve it)...

I create the kernel in the usual way and pass the arguments like this:

```c
clSetKernelArg(cl_kernel, 0, sizeof(cl_mem), (void*)&rt_kdtree_mem_c);
clSetKernelArg(cl_kernel, 1, sizeof(cl_mem), (void*)&rt_kdtree_mem_a);
clSetKernelArg(cl_kernel, 2, sizeof(cl_mem), (void*)&rt_kdtree_mem_B);
clSetKernelArg(cl_kernel, 3, sizeof(cl_mem), (void*)&rt_tree_bounds);
clSetKernelArg(cl_kernel, 4, sizeof(int), (void*)&rt_kdtree_items);
```


And my kernel looks like this (just for testing):


```c
#pragma OPENCL EXTENSION cl_amd_printf : enable

__kernel void main(__global float* c, __global float* a, __global float* b, __global float* bounds, const int num)
{
    const int idx = get_global_id(0);

    printf("Idx %d, Bounds: %f %f %f\n", idx, bounds[0], bounds[1], bounds[2]);

    if(idx < num)
    {
        c[idx] = a[idx] + b[idx];
    }
}
```


What I really can't understand is why my bounds arrive in the kernel as {-125, 0, 0, .... 0}, when on the application side they are actually {-125, -5, -125, 1, ... 1}. I.e., only the first value seems to get passed.

The data is uploaded in the standard way:

```c
clEnqueueWriteBuffer(cl_queue, rt_tree_bounds, CL_TRUE, 0, sizeof(float) * 8, data, 0, NULL, &cl_event);
clReleaseEvent(cl_event);
```


And of course I call clFinish before I execute the kernel.

I still can't figure it out after two hours, although I'm sure there is some childish bug in here... If anyone can spot it, I'd be very glad.

Okay, I'll answer my own question about this childish bug...

I was still initializing the device with CL_DEVICE_TYPE_GPU while using the cl_amd_printf extension in the kernel code. Of course the printf output showed wrong values, but when I read the values back from the device with a standard clEnqueueReadBuffer, I got the right ones. So now I'm on CL_DEVICE_TYPE_CPU and everything works fine.
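For anyone hitting something similar, here is a minimal sketch of the read-back check in plain C. It assumes the buffer has already been copied into a host array with a blocking clEnqueueReadBuffer. The helper `first_mismatch` and the array name `readback` are hypothetical names made up for illustration, not part of OpenCL or of my real code.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenCL API): returns the index of the first
 * element where `actual` differs from `expected` by more than `tol`,
 * or -1 if all n elements match. A NaN difference counts as a mismatch.
 *
 * Intended use, assuming the names from the post plus a host array
 * `float readback[8]`:
 *
 *   clEnqueueReadBuffer(cl_queue, rt_tree_bounds, CL_TRUE, 0,
 *                       sizeof(float) * 8, readback, 0, NULL, NULL);
 *   long bad = first_mismatch(data, readback, 8, 0.0f);
 */
long first_mismatch(const float *expected, const float *actual,
                    size_t n, float tol)
{
    for (size_t i = 0; i < n; ++i) {
        float d = expected[i] - actual[i];
        if (d < 0.0f) {
            d = -d;              /* absolute difference without math.h */
        }
        if (!(d <= tol)) {       /* also true when d is NaN */
            return (long)i;
        }
    }
    return -1;
}
```

If this returns -1 while the kernel-side printf still shows garbage, the upload itself is probably fine and the printing path is the more likely suspect, which is what it turned out to be in my case.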

And for next time: the bug is NEVER in the place where it shows up. (With a bit of humor, of course.)
