NDRange & single workitem
http://downloads.ti.com/mctools/esd/docs/opencl/execution/kernels-workgroups-workitems.html
If a kernel is executed by a single work-item (for example, when running on a CPU), the following function can be used:
cl_int clEnqueueTask (cl_command_queue command_queue, cl_kernel kernel, cl_uint num_events_in_wait_list, const cl_event *event_wait_list, cl_event *event)
Introduction to OpenCL
https://blog.csdn.net/u012361418/article/details/46475885
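Kernel execution
A command that executes a kernel must be enqueued to a command queue
clEnqueueNDRangeKernel()
Data-parallel execution model
Describes the index space over which the kernel executes
Requires the NDRange dimensions and the work-group size
clEnqueueTask()
Task-parallel execution model (multiple queued tasks)
The kernel executes as a single work-item
clEnqueueNativeKernel()
Task-parallel execution model
Executes a native C/C++ function that was not compiled by the OpenCL compiler
This mode does not use a kernel object, so the arguments must be passed in explicitly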
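To make the difference between the data-parallel and the single-work-item styles concrete, here is a minimal host-only sketch. It does not call the OpenCL API: `run_ndrange`, `run_task`, `WorkItem`, `double_all` and `double_all_task` are hypothetical names invented for illustration, and the work-items run sequentially rather than in parallel. It relies on the fact that clEnqueueTask behaves like clEnqueueNDRangeKernel with work_dim = 1 and global and local sizes of 1.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the IDs a work-item can query in a 1-D NDRange.
struct WorkItem {
    std::size_t global_id;
    std::size_t local_id;
    std::size_t group_id;
};

using Kernel = std::function<void(const WorkItem&)>;

// Sequential emulation of a 1-D clEnqueueNDRangeKernel: the kernel body runs
// once per work-item. Assumes local_size != 0 and that global_size is a
// multiple of local_size.
void run_ndrange(std::size_t global_size, std::size_t local_size,
                 const Kernel& kernel) {
    for (std::size_t group = 0; group < global_size / local_size; ++group) {
        for (std::size_t local = 0; local < local_size; ++local) {
            kernel(WorkItem{group * local_size + local, local, group});
        }
    }
}

// Emulation of clEnqueueTask: an NDRange with exactly one work-item.
void run_task(const Kernel& kernel) {
    run_ndrange(1, 1, kernel);
}

// Data-parallel style: each work-item handles the element at its global ID.
std::vector<int> double_all(const std::vector<int>& in, std::size_t local_size) {
    std::vector<int> out(in.size());
    run_ndrange(in.size(), local_size, [&](const WorkItem& wi) {
        out[wi.global_id] = 2 * in[wi.global_id];
    });
    return out;
}

// Task style: the single work-item loops over all the data itself.
std::vector<int> double_all_task(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    run_task([&](const WorkItem&) {
        for (std::size_t i = 0; i < in.size(); ++i) {
            out[i] = 2 * in[i];
        }
    });
    return out;
}
```

Both helpers produce the same result. What differs is where the loop lives: in the NDRange version the loop belongs to the runtime (emulated here), while in the task version it sits inside the kernel body.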
OpenCL heterogeneous parallel computing programming notes (2): command queues and memory objects
http://blog.csdn.net/catalyst_zx/article/details/52818557
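cl_int ret; //holds the return code of each API call
cl_context_properties context_props[] = { CL_CONTEXT_PLATFORM, (cl_context_properties)platform[1], 0 }; //context properties, used when the context is created (not command-queue properties)
size_t ext_size; //size in bytes of the queried command-queue info
char *ext_data; //buffer that receives the queried command-queue info
cl_command_queue_info command_queue_info = CL_QUEUE_CONTEXT; //the info to query
//create the command queue (context and device are assumed to exist already; device is a cl_device_id)
cl_command_queue command_queue = clCreateCommandQueue(context, device, 0, &ret);
//get the size of the command-queue info
ret = clGetCommandQueueInfo(command_queue, command_queue_info, 0, NULL, &ext_size);
//allocate space for ext_data
ext_data = new char[ext_size];
//get the command-queue info
ret = clGetCommandQueueInfo(command_queue, command_queue_info, ext_size, ext_data, NULL);
//CL_QUEUE_CONTEXT yields a cl_context handle, not a string, so print it as a handle
std::cout << *reinterpret_cast<cl_context *>(ext_data) << std::endl;
delete[] ext_data; //release the buffer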
get_global_size
https://www.cnblogs.com/biglucky/p/3755189.html
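uint get_work_dim() : returns the number of dimensions of the NDRange.
size_t get_global_size(uint dimension) : returns the total number of work-items in the requested dimension.
size_t get_global_id(uint dimension) : returns the index of the current work-item in the global space, in the requested dimension.
size_t get_local_size(uint dimension) : returns the size of the work-group in the requested dimension.
size_t get_local_id(uint dimension) : returns the index of the current work-item within its work-group, in the requested dimension.
size_t get_num_groups(uint dimension) : returns the number of work-groups in the requested dimension; this equals get_global_size divided by get_local_size.
size_t get_group_id(uint dimension) : returns the index of the current work-group in the global space, in the requested dimension.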
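The relationships between these values can be checked with a small host-side sketch. It is not OpenCL kernel code: `Ids` and `ids_for` are hypothetical names, and it assumes one dimension, a global offset of zero, and a global size that is a multiple of the local size.

```cpp
#include <cstddef>

// Hypothetical record of what the work-item functions would return for one
// work-item in a single dimension.
struct Ids {
    std::size_t global_size;  // get_global_size(0)
    std::size_t local_size;   // get_local_size(0)
    std::size_t num_groups;   // get_num_groups(0)
    std::size_t global_id;    // get_global_id(0)
    std::size_t group_id;     // get_group_id(0)
    std::size_t local_id;     // get_local_id(0)
};

// Derives the group and local IDs from a global ID.
// Assumes local_size != 0, global_size % local_size == 0, zero global offset.
Ids ids_for(std::size_t global_id, std::size_t global_size,
            std::size_t local_size) {
    Ids ids{};
    ids.global_size = global_size;
    ids.local_size = local_size;
    ids.num_groups = global_size / local_size;
    ids.global_id = global_id;
    ids.group_id = global_id / local_size;
    ids.local_id = global_id % local_size;
    return ids;
}
```

For example, with 8 work-items in groups of 4, the work-item with global ID 5 is the second work-item (local ID 1) of the second group (group ID 1). In general, group_id * local_size + local_id gives back the global ID.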
Re: basic question regarding get_global_id
get_global_id returns the index of the current thread (work-item). The parameter is just the dimension of the array of threads. When you enqueue a kernel, one of the parameters is a size_t array, global_work_size. If global_work_size has 2 elements (a 2-dimensional array of threads) then each thread will have a 2-dimensional identifier. Let's say global_work_size[0] = 3 and global_work_size[1] = 3; then there will be 3 * 3 = 9 threads in total, in a grid something like this:
-----------------
| 0,0 | 0,1 | 0,2 |
-----------------
| 1,0 | 1,1 | 1,2 |
-----------------
| 2,0 | 2,1 | 2,2 |
-----------------
So the thread in the top left corner will have get_global_id(0) = 0 and get_global_id(1) = 0. The one just below it will have get_global_id(0) = 1 and get_global_id(1) = 0.
Edit: have a look at this link for another picture that might help: http://www.khronos.org/message_board...hp?f=28&t=5375.
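----------------------------------------------------
| global_id(0,0) | global_id(0,1) | global_id(0,2) |
----------------------------------------------------
| global_id(1,0) | global_id(1,1) | global_id(1,2) |
----------------------------------------------------
| global_id(2,0) | global_id(2,1) | global_id(2,2) |
----------------------------------------------------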
For Thread(1,2), no single call returns the pair (1,2); each dimension is queried separately:
get_global_id(0) = 1
get_global_id(1) = 2
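The grid from the forum reply can be reproduced on the host with a short sketch. It only emulates the index space and does not use the OpenCL API. `GlobalId2D`, `enumerate_global_ids` and `label` are hypothetical names, and drawing get_global_id(0) as the row simply follows the picture in the post.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pair of values a work-item would get from get_global_id(0)
// and get_global_id(1) in a 2-D NDRange.
struct GlobalId2D {
    std::size_t id0;
    std::size_t id1;
};

// Lists every global ID of a size0 x size1 NDRange, row by row as drawn in
// the post (id0 selects the row, id1 the column).
std::vector<GlobalId2D> enumerate_global_ids(std::size_t size0,
                                             std::size_t size1) {
    std::vector<GlobalId2D> ids;
    ids.reserve(size0 * size1);
    for (std::size_t id0 = 0; id0 < size0; ++id0) {
        for (std::size_t id1 = 0; id1 < size1; ++id1) {
            ids.push_back(GlobalId2D{id0, id1});
        }
    }
    return ids;
}

// Formats an ID the way the grid cells are written, e.g. "1,2".
std::string label(const GlobalId2D& id) {
    return std::to_string(id.id0) + "," + std::to_string(id.id1);
}
```

Calling `enumerate_global_ids(3, 3)` yields the nine cells of the grid. The entry labelled "1,2" has id0 = 1 and id1 = 2, which matches the two separate get_global_id calls above.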