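OpenCL performance test
Stereo (binocular) camera images can have very high resolution, so I ran a quick test of OpenCL compute performance (sadly, I have no NVIDIA GPU on hand).
Environment: macOS x86_64, Darwin 17.7.0
CPU: Intel Core i5-5350U
GPU: Intel HD Graphics 6000
OpenCV: OpenCV 4.2
Compiler: clang++
Below are the results for one 4480x6720 image and one 640x480 image, as raw program output:
Test results (for reference only)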
Note on units: the timing code used for these runs multiplied the elapsed seconds by 100 rather than 1000 before printing, so every value labeled "ms" in the log and tables below is in units of 10 ms. The values given in seconds in the last row of Table 2 are already converted (105973 × 10 ms = 1059.73 s).
Pixels: 4480x6720, loops: 100
Size:4480x6720
OpenCL time consume = 85.6954ms
Size:4480x6720
CPU time consume = 120.533ms
Size:4480 x 6720
OpenCL loops time consume = 1050.15 ms
Size:4480x6720
CPU loop time consume = 1058.4ms
-------------Process Complete------------
Pixels: 4480x6720, loops: 100
Size:4480x6720
OpenCL time consume = 86.6929 ms
Size:4480x6720
CPU time consume = 118.128 ms
Size:4480 x 6720
OpenCL loops time consume = 1055.27 ms
Size:4480x6720
CPU loops time consume = 1065.48 ms
-------------Process Complete------------
Pixels: 4480x6720, loops: 1000
Size:4480x6720
OpenCL time consume = 86.6043 ms
Size:4480x6720
CPU time consume = 119.366 ms
Size:4480 x 6720
OpenCL loops time consume = 97815.7 ms
Size:4480x6720
CPU loops time consume = 105973 ms
-------------Process Complete------------
Pixels: 640x480, loops: 100
Size:640x480
OpenCL time consume = 3.01333 ms
Size:640x480
CPU time consume = 1.55183 ms
Size:640 x 480
OpenCL loops time consume = 109.044 ms
Size:640x480
CPU loops time consume = 84.6049 ms
-------------Process Complete------------
Pixels: 640x480, loops: 100
Size:640x480
OpenCL time consume = 3.2602 ms
Size:640x480
CPU time consume = 1.29072 ms
Size:640 x 480
OpenCL loops time consume = 107.968 ms
Size:640x480
CPU loops time consume = 86.4707 ms
-------------Process Complete------------
Pixels: 640x480, loops: 1000
Size:640x480
OpenCL time consume = 3.23095 ms
Size:640x480
CPU time consume = 1.17594 ms
Size:640 x 480
OpenCL loops time consume = 1164.68 ms
Size:640x480
CPU loops time consume = 940.939 ms
-------------Process Complete------------
Pixels: 640x480, loops: 10000
Size:640x480
OpenCL time consume = 3.64011 ms
Size:640x480
CPU time consume = 1.2437 ms
Size:640 x 480
OpenCL loops time consume = 10745.1 ms
Size:640x480
CPU loops time consume = 8566.75 ms
-------------Process Complete------------
Summary tables
Table 1:
Resolution | CPU, 1 run | OpenCL, 1 run | CPU, 100 loops | OpenCL, 100 loops |
---|---|---|---|---|
640x480 | 1.55 ms | 3.01 ms | 84.60 ms | 109.04 ms |
640x480 | 1.29 ms | 3.26 ms | 86.47 ms | 107.96 ms |
4480x6720 | 120.53 ms | 85.69 ms | 1058.4 ms | 1050.15 ms |
4480x6720 | 118.12 ms | 86.69 ms | 1065.48 ms | 1055.27 ms |
Table 2:
Resolution | Loops | CPU, 1 run | OpenCL, 1 run | CPU, loops | OpenCL, loops |
---|---|---|---|---|---|
640x480 | 100 | 1.55 ms | 3.01 ms | 84.60 ms | 109.04 ms |
640x480 | 100 | 1.29 ms | 3.26 ms | 86.47 ms | 107.96 ms |
640x480 | 1000 | 1.17 ms | 3.23 ms | 940.93 ms | 1164.68 ms |
640x480 | 10000 | 1.24 ms | 3.64 ms | 8566.75 ms | 10745.10 ms |
4480x6720 | 100 | 120.53 ms | 85.69 ms | 1058.4 ms | 1050.15 ms |
4480x6720 | 100 | 118.12 ms | 86.69 ms | 1065.48 ms | 1055.27 ms |
4480x6720 | 1000 | 119.36 ms | 86.60 ms | 1059.73 s | 978.15 s |
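Chart analysis
I made a chart of this data.
Because the values span such a wide range that the differences are hard to see at a glance, I also plotted them on a log10 axis.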
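Preliminary analysis
- At low resolution the CPU is faster than the GPU (OpenCL); at high resolution the GPU takes the lead, as the charts show.
- At high resolution (above 1080p) with many loops (1000 or more), the GPU (OpenCL) is clearly faster than the CPU: in the last row of Table 2, the 1000-loop run finished a full 81.58 s sooner.
Of course, platforms differ, so this conclusion may not generalize.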
Finally, here is the test code, a simple Canny edge detection:
edge_test.cpp
#include <iostream>
#include <string>
#include <opencv2/opencv.hpp>
using namespace cv;

void opencl_process(std::string &filename);            // process one image (OpenCL)
void cpu_process(std::string &filename);               // process one image (CPU)
void loops_opencl(std::string &filename, int &times);  // process in a loop (OpenCL)
void loops_cpu(std::string &filename, int &times);     // process in a loop (CPU)

int main(int argc, char** argv)
{
    std::string filename = "image.jpg";  // placeholder: path to the test image
    int times = 100;                     // placeholder: number of loop iterations
    opencl_process(filename);
    cpu_process(filename);
    loops_opencl(filename, times);
    loops_cpu(filename, times);
    std::cout << "-------------Process Complete------------" << std::endl;
    return 0;
}

void opencl_process(std::string &filename){
    double start = (double)getTickCount();
    UMat img, gray;
    // copy from Mat to UMat so the following calls can run through OpenCL
    imread(filename, IMREAD_COLOR).copyTo(img);
    cvtColor(img, gray, COLOR_BGR2GRAY);
    GaussianBlur(gray, gray, Size(7, 7), 1.5);
    Canny(gray, gray, 0, 50);
    // ticks / frequency gives seconds; multiply by 1000 to get milliseconds
    double time_consume = ((double)getTickCount() - start) / getTickFrequency();
    std::cout << "Size:" << gray.cols << "x" << gray.rows << std::endl;
    std::cout << "OpenCL time consume = " << time_consume * 1000 << " ms" << std::endl;
}

void cpu_process(std::string &filename){
    double start = (double)getTickCount();
    Mat img, gray;
    imread(filename, IMREAD_COLOR).copyTo(img);
    cvtColor(img, gray, COLOR_BGR2GRAY);
    GaussianBlur(gray, gray, Size(7, 7), 1.5);
    Canny(gray, gray, 0, 50);
    double time_consume = ((double)getTickCount() - start) / getTickFrequency();
    std::cout << "Size:" << gray.cols << "x" << gray.rows << std::endl;
    std::cout << "CPU time consume = " << time_consume * 1000 << " ms" << std::endl;
}

void loops_opencl(std::string &filename, int &times){
    double start = (double)getTickCount();
    UMat img, gray;
    // i < times, so this runs exactly as many iterations as loops_cpu
    for (int i = 0; i < times; i++){
        imread(filename, IMREAD_COLOR).copyTo(img);
        cvtColor(img, gray, COLOR_BGR2GRAY);
        GaussianBlur(gray, gray, Size(7, 7), 1.5);
        Canny(gray, gray, 0, 50);
    }
    double time_consume = ((double)getTickCount() - start) / getTickFrequency();
    std::cout << "Size:" << gray.cols << " x " << gray.rows << std::endl;
    std::cout << "OpenCL loops time consume = " << time_consume * 1000 << " ms" << std::endl;
}

void loops_cpu(std::string &filename, int &times){
    double start = (double)getTickCount();
    Mat img, gray;
    for (int i = 0; i < times; i++){
        imread(filename, IMREAD_COLOR).copyTo(img);
        cvtColor(img, gray, COLOR_BGR2GRAY);
        GaussianBlur(gray, gray, Size(7, 7), 1.5);
        Canny(gray, gray, 0, 50);
    }
    double time_consume = ((double)getTickCount() - start) / getTickFrequency();
    std::cout << "Size:" << gray.cols << "x" << gray.rows << std::endl;
    std::cout << "CPU loops time consume = " << time_consume * 1000 << " ms" << std::endl;
}
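Just change the filename and the loop count.
Reference build command:
clang++ -std=c++11 edge_test.cpp -o edge_test `pkg-config --cflags --libs opencv4`
Note: this targets OpenCV 4, which requires the C++11 compile option to be enabled.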