OpenCL woes


1 reply to this topic

#1 Vilem Otte

    Valued Member

  • Members
  • 345 posts

Posted 14 September 2012 - 12:14 AM

Okay, right now it's 2 am here, and I've run into a strange issue (it may not actually be an issue at all, but I've been sitting on this for 2 hours and I can't solve it)...

I create the kernel and set its arguments in the usual way:

clSetKernelArg(cl_kernel, 0, sizeof(cl_mem), (void*)&rt_kdtree_mem_c);
clSetKernelArg(cl_kernel, 1, sizeof(cl_mem), (void*)&rt_kdtree_mem_a);
clSetKernelArg(cl_kernel, 2, sizeof(cl_mem), (void*)&rt_kdtree_mem_B);
clSetKernelArg(cl_kernel, 3, sizeof(cl_mem), (void*)&rt_tree_bounds);
clSetKernelArg(cl_kernel, 4, sizeof(int), (void*)&rt_kdtree_items);

And my kernel looks like this (just for testing):


#pragma OPENCL EXTENSION cl_amd_printf : enable

__kernel void main(__global float* c, __global float* a, __global float* b, __global float* bounds, const int num)
{
  const int idx = get_global_id(0);

  printf("Idx %d, Bounds: %f %f %f\n", idx, bounds[0], bounds[1], bounds[2]);

  if(idx < num)
  {
    c[idx] = a[idx] + b[idx];
  }
}

What I really can't understand is why my bounds show up in the kernel as {-125, 0, 0, .... 0}, when on the application side they are actually {-125, -5, -125, 1, ... 1}. In other words, only the first value seems to get passed.

The data is uploaded in the standard way...

clEnqueueWriteBuffer(cl_queue, rt_tree_bounds, CL_TRUE, 0, sizeof(float) * 8, data, 0, NULL, &cl_event);
clReleaseEvent(cl_event);

And of course I call clFinish before I execute the kernel.

I still can't figure it out after two hours; on the other hand, I'm sure there is some childish bug in here... If anyone can spot it, I'd be very glad. :)
My blog about game development (and not just game development) - http://gameprogramme...y.blogspot.com/

If you don't know how to speed up application, go "roarrrrrr!", hit the compiler with the club and use -O3 :D

#2 Vilem Otte

Posted 14 September 2012 - 12:24 AM

Okay, I'll answer my own question about this childish bug...

I was still initializing the device with CL_DEVICE_TYPE_GPU while using the cl_amd_printf extension in the kernel code. Of course the values printed were wrong, but when I read those values back from the device with a standard clEnqueueReadBuffer I got the right ones. So now I'm on CL_DEVICE_TYPE_CPU and everything works alright.
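For anyone hitting the same thing, the read-back check can be sketched roughly like this. It is only a minimal sketch: first_mismatch is a hypothetical helper name (not part of OpenCL), and the clEnqueueReadBuffer call is left in a comment so the snippet compiles without an OpenCL SDK. It assumes the same cl_queue, rt_tree_bounds and data as in the first post.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenCL function): returns the index of the
   first element where the read-back copy differs from the host copy by
   more than tol, or -1 if all n elements match. */
static long first_mismatch(const float *host, const float *readback,
                           size_t n, float tol)
{
    for (size_t i = 0; i < n; ++i) {
        float diff = host[i] - readback[i];
        if (diff < 0.0f)
            diff = -diff;          /* absolute difference, no libm needed */
        if (diff > tol)
            return (long)i;
    }
    return -1;
}

/* Intended use after the upload (needs a live OpenCL context, so it is
   only shown as a comment here):

   float readback[8];
   clEnqueueReadBuffer(cl_queue, rt_tree_bounds, CL_TRUE, 0,
                       sizeof(float) * 8, readback, 0, NULL, NULL);
   long bad = first_mismatch(data, readback, 8, 0.0f);
   // bad == -1 means the buffer on the device matches the host data,
   // so a wrong printf output would point at printf, not at the upload.
*/
```

If the helper returns -1 while the kernel-side printf still shows different numbers, the upload itself is probably fine and the printing path is the thing to suspect.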

And for next time: the bug is NEVER in the place where it shows up. :D (with a bit of humor, of course)




