GPGPU Lab VIII. Introduction to OpenCL

Initial steps

Course homepage, Introduction to OpenCL II.
Download the lab starter package (lab8_base.zip) and extract it into the D:\GPGPU\ directory.
Open D:\GPGPU\labs\lab8\lab8_opencl\lab8_opencl.sln.
Project properties > Configuration Properties > Debugging > Working Directory = $(ProjectDir)\..\..\bin

Platform

// OpenCL platform
cl_platform_id platform;

// Queries a platform property; the caller must free() the returned buffer.
char* getPlatformInfo(cl_platform_id platform, cl_platform_info paramName){
    size_t infoSize = 0;
    // First call: ask only for the size of the property
    CL_SAFE_CALL( clGetPlatformInfo(platform, paramName, 0, NULL, &infoSize) );
    char* info = (char*)malloc(infoSize);
    // Second call: fetch the property itself
    CL_SAFE_CALL( clGetPlatformInfo(platform, paramName, infoSize, info, NULL) );
    return info;
}

cl_platform_id createPlatform(){
    cl_platform_id platform;
    // Use the first available platform
    CL_SAFE_CALL( clGetPlatformIDs(1, &platform, NULL) );
    char* version = getPlatformInfo(platform, CL_PLATFORM_VERSION);
    std::cout << version << std::endl;
    free(version);
    return platform;
}
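The slides use CL_SAFE_CALL without showing its definition; it presumably comes from the lab base package. The following is only a sketch of how such a macro could look. To keep it compilable without the OpenCL headers, `status_t` and `STATUS_SUCCESS` are hypothetical stand-ins for `cl_int` and `CL_SUCCESS`, and `checkStatus` and `SAFE_CALL_SKETCH` are illustrative names that do not appear in the lab code.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>

// Hypothetical stand-ins so the sketch builds without <CL/cl.h>;
// in real host code these would be cl_int and CL_SUCCESS.
typedef int status_t;
const status_t STATUS_SUCCESS = 0;

// Reports a failed call on stderr and returns whether the call succeeded.
inline bool checkStatus(status_t status, const char* call,
                        const char* file, int line) {
    assert(call != NULL && file != NULL);
    if (status == STATUS_SUCCESS) {
        return true;
    }
    std::cerr << file << ":" << line << ": " << call
              << " failed with error code " << status << std::endl;
    return false;
}

// Assumed shape of the macro: evaluate the call once, abort on failure.
#define SAFE_CALL_SKETCH(call)                                   \
    do {                                                         \
        if (!checkStatus((call), #call, __FILE__, __LINE__)) {   \
            std::exit(EXIT_FAILURE);                             \
        }                                                        \
    } while (0)
```

Wrapping every API call this way keeps the error handling out of the main flow of the host code, at the price of terminating the program on the first failure.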

OpenCL devices

// OpenCL devices of the platform
cl_device_id device_id;

// Queries a device property; the caller must free() the returned buffer.
void* getDeviceInfo(cl_device_id device_id, cl_device_info paramName){
    size_t infoSize = 0;
    CL_SAFE_CALL( clGetDeviceInfo(device_id, paramName, 0, NULL, &infoSize) );
    char* info = (char*)malloc(infoSize);
    CL_SAFE_CALL( clGetDeviceInfo(device_id, paramName, infoSize, info, NULL) );
    return info;
}

cl_device_id createDevice(cl_platform_id platform, cl_device_type type){
    cl_device_id device_id;
    // Use the first device of the requested type
    CL_SAFE_CALL( clGetDeviceIDs(platform, type, 1, &device_id, NULL) );
    cl_uint* max_compute_units = (cl_uint*)getDeviceInfo(device_id, CL_DEVICE_MAX_COMPUTE_UNITS);
    std::cout << "Max compute units: " << *max_compute_units << std::endl;
    free(max_compute_units);
    return device_id;
}

Context

// OpenCL context
cl_context context;

cl_context createContext(cl_device_id device_id){
    cl_context context = 0;
    context = clCreateContext(0, 1, &device_id, NULL, NULL, NULL);
    if(!context){
        std::cerr << "Context creation failed!\n";
        exit(EXIT_FAILURE);
    }
    return context;
}

Command queue

// OpenCL command queue
cl_command_queue commands;

cl_command_queue createCommandQueue(cl_context context, cl_device_id device){
    cl_command_queue command_queue = 0;
    // Use the device passed as parameter, not the global device_id
    command_queue = clCreateCommandQueue(context, device, 0, NULL);
    if(!command_queue){
        std::cerr << "Command queue creation failed!\n";
        exit(EXIT_FAILURE);
    }
    return command_queue;
}

OpenCL program

// OpenCL program
cl_program program;

bool fileToString(const char* path, char*& out, int& len) {
    std::ifstream file(path, std::ios::ate | std::ios::binary);
    if(!file.is_open()) {
        return false;
    }
    len = file.tellg();
    out = new char[ len+1 ];
    file.seekg (0, std::ios::beg);
    file.read(out, len);
    file.close();
    out[len] = 0;
    return true;
}
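fileToString returns a buffer allocated with new[], which the caller has to release with delete[]. As an alternative sketch (not the lab's own helper), the same loading step can be written with std::string so that the memory is managed automatically. The name `fileToStdString` is made up for this example.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Reads the whole file at `path` into `out`.
// Returns false if the file cannot be opened; `out` is left untouched then.
bool fileToStdString(const std::string& path, std::string& out) {
    assert(!path.empty());
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    if (!file.is_open()) {
        return false;
    }
    std::ostringstream contents;
    contents << file.rdbuf();   // copy the complete stream buffer
    out = contents.str();
    return true;
}
```

With this variant, a `const char* src = source.c_str();` pointer can be handed to clCreateProgramWithSource through `&src`, as long as the std::string outlives that call.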

OpenCL program

cl_program createProgram(cl_context context, cl_device_id device_id, const char* fileName){
    char* programSource = NULL;
    int len = 0;
    if(!fileToString(fileName, programSource, len)){
        std::cerr << "Error loading program: " << fileName << std::endl;
        exit(EXIT_FAILURE);
    }

    cl_program program = 0;
    program = clCreateProgramWithSource(context, 1, (const char**)&programSource, NULL, NULL);
    // The source is copied into the program object, so the host copy can be freed
    delete[] programSource;
    if (!program) {
        std::cerr << "Error: Failed to create compute program!" << std::endl;
        exit(EXIT_FAILURE);
    }

    cl_int err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
    if (err != CL_SUCCESS) {
        size_t logLen;
        char buffer[2048];
        std::cerr << "Error: Failed to build program executable!" << std::endl;
        // Print the compiler output for the device
        clGetProgramBuildInfo(program, device_id, CL_PROGRAM_BUILD_LOG, sizeof(buffer), buffer, &logLen);
        std::cerr << buffer << std::endl;
        exit(1);
    }
    return program;
}

OpenCL kernel

// OpenCL kernel
cl_kernel createKernel(cl_program program, const char* kernelName){
    cl_kernel kernel;
    cl_int err;
    kernel = clCreateKernel(program, kernelName, &err);
    if (!kernel || err != CL_SUCCESS) {
        std::cerr << "Error: Failed to create compute kernel!" << std::endl;
        exit(1);
    }
    return kernel;
}

main()

// OpenCL init
platform = createPlatform();
device_id = createDevice(platform, CL_DEVICE_TYPE_GPU);
context = createContext(device_id);
commands = createCommandQueue(context, device_id);
program = createProgram(context, device_id, "programs.cl");

// OpenCL processing

// OpenCL cleanup
clReleaseProgram(program);
clReleaseCommandQueue(commands);
clReleaseContext(context);
return 0;

Global addressing

// simple global address
void globalAddress(){
    cl_kernel globalAddressKernel = createKernel(program, "globalAddress");

    const int data_size = 1024;
    float* data = (float*)malloc(sizeof(float)*data_size);
    cl_mem clData = clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(float) * data_size, NULL, NULL);

    CL_SAFE_CALL( clSetKernelArg(globalAddressKernel, 0, sizeof(cl_mem), &clData) );

    // Largest work-group size this kernel can run with on the device
    size_t workgroupSize = 0;
    CL_SAFE_CALL( clGetKernelWorkGroupInfo(globalAddressKernel, device_id, CL_KERNEL_WORK_GROUP_SIZE, sizeof(workgroupSize), &workgroupSize, NULL) );
    size_t workSize = data_size;

    CL_SAFE_CALL( clEnqueueNDRangeKernel(commands, globalAddressKernel, 1, NULL, &workSize, &workgroupSize, 0, NULL, NULL) );
    clFinish(commands);

    // Blocking read of the result
    CL_SAFE_CALL( clEnqueueReadBuffer(commands, clData, CL_TRUE, 0, sizeof(float) * data_size, data, 0, NULL, NULL) );

    FILE* outFile = fopen("globalAddress.txt", "w");
    for(int i = 0; i < data_size; ++i){
        fprintf(outFile, "%f\n", data[i]);
    }
    fclose(outFile);

    clReleaseKernel(globalAddressKernel);
    clReleaseMemObject(clData);   // release the device buffer as well
    free(data);
}

Global addressing (programs.cl)

__kernel void globalAddress(__global float* data){
    int id = get_global_id(0);
    data[id] = id;
}

Global addressing

Local addressing

// local address
void localAddress(){
    cl_kernel localAddressKernel = createKernel(program, "localAddress");

    const int data_size = 1024;
    float* data = (float*)malloc(sizeof(float)*data_size);
    cl_mem clData = clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(float) * data_size, NULL, NULL);

    CL_SAFE_CALL( clSetKernelArg(localAddressKernel, 0, sizeof(cl_mem), &clData) );

    size_t workgroupSize = 0;
    CL_SAFE_CALL( clGetKernelWorkGroupInfo(localAddressKernel, device_id, CL_KERNEL_WORK_GROUP_SIZE, sizeof(workgroupSize), &workgroupSize, NULL) );
    // Use a quarter of the maximum so that there are more work-groups
    workgroupSize = workgroupSize / 4;
    size_t workSize = data_size;

    CL_SAFE_CALL( clEnqueueNDRangeKernel(commands, localAddressKernel, 1, NULL, &workSize, &workgroupSize, 0, NULL, NULL) );
    clFinish(commands);

    CL_SAFE_CALL( clEnqueueReadBuffer(commands, clData, CL_TRUE, 0, sizeof(float) * data_size, data, 0, NULL, NULL) );

    FILE* outFile = fopen("localAddress.txt", "w");
    for(int i = 0; i < data_size; ++i){
        fprintf(outFile, "%f\n", data[i]);
    }
    fclose(outFile);

    clReleaseKernel(localAddressKernel);
    clReleaseMemObject(clData);   // release the device buffer as well
    free(data);
}

Local addressing (programs.cl)

__kernel void localAddress(__global float* data){
    int id = get_local_id(0);
    data[get_local_id(0) + get_group_id(0) * get_local_size(0)] = id;
}
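The index used by the localAddress kernel, local ID + group ID * local size, equals the global ID when no global offset is set, so each element receives its position inside its own work-group. A CPU-only sketch of the expected buffer contents follows; `emulateLocalAddress` is a made-up helper that replaces the parallel work-items with nested loops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU stand-in for a 1D NDRange without global offset.
// Mirrors the localAddress kernel: the value written is the local ID and the
// target index is local ID + group ID * local size (the global ID).
std::vector<float> emulateLocalAddress(std::size_t globalSize,
                                       std::size_t localSize) {
    assert(localSize > 0 && globalSize % localSize == 0);
    std::vector<float> data(globalSize, 0.0f);
    const std::size_t groupCount = globalSize / localSize;
    for (std::size_t group = 0; group < groupCount; ++group) {
        for (std::size_t local = 0; local < localSize; ++local) {
            const std::size_t global = local + group * localSize;
            data[global] = static_cast<float>(local);
        }
    }
    return data;
}
```

For a global size of 1024 and a work-group size of 256, the values run from 0 to 255 and then restart at 0 at indices 256, 512 and 768, giving a sawtooth pattern instead of the ramp produced by the globalAddress kernel.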

Local addressing

2D addressing

// 2D address
void address2D(){
    cl_kernel address2DKernel = createKernel(program, "address2D");

    const int data_size[2] = {1024, 1024};
    cl_float4* data = (cl_float4*)malloc(sizeof(cl_float4) * data_size[0] * data_size[1]);
    cl_mem clData = clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(cl_float4) * data_size[0] * data_size[1], NULL, NULL);

    CL_SAFE_CALL( clSetKernelArg(address2DKernel, 0, sizeof(cl_mem), &clData) );

    // 8x8 work-groups over a 1024x1024 global range
    size_t workgroupSize[2] = {8, 8};
    size_t workSize[2] = { (size_t)data_size[0], (size_t)data_size[1] };

    CL_SAFE_CALL( clEnqueueNDRangeKernel(commands, address2DKernel, 2, NULL, workSize, workgroupSize, 0, NULL, NULL) );
    clFinish(commands);

    CL_SAFE_CALL( clEnqueueReadBuffer(commands, clData, CL_TRUE, 0, sizeof(cl_float4) * data_size[0] * data_size[1], data, 0, NULL, NULL) );

    FILE* outFile = fopen("2DAddress.txt", "w");
    for(int i = 0; i < data_size[0] * data_size[1]; ++i){
        fprintf(outFile, "G: [%f, %f] L: [%f, %f]\n", data[i].s[0], data[i].s[1], data[i].s[2], data[i].s[3]);
    }
    fclose(outFile);

    clReleaseKernel(address2DKernel);
    clReleaseMemObject(clData);   // release the device buffer as well
    free(data);
}

2D addressing (programs.cl)

__kernel void address2D(__global float4* data){
    int localIDX = get_local_id(0);
    int localIDY = get_local_id(1);
    int globalIDX = get_global_id(0);
    int globalIDY = get_global_id(1);
    // Row-major linear index: x + width * y
    data[globalIDX + get_global_size(0) * globalIDY] = (float4)(globalIDX, globalIDY, localIDX, localIDY);
}
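The address2D kernel stores each work-item's IDs at the row-major position x + width * y. The sketch below shows that mapping and its inverse on the CPU, which can help when checking lines of 2DAddress.txt by hand. `Address2D`, `linearIndex` and `decodeIndex` are illustrative names, and the local IDs are derived under the assumption of no global offset and a global size divisible by the work-group size.

```cpp
#include <cassert>
#include <cstddef>

// Global and local coordinates of one work-item, as stored by address2D.
struct Address2D {
    std::size_t globalX, globalY;
    std::size_t localX, localY;
};

// Row-major linear index used by the kernel: x + width * y.
inline std::size_t linearIndex(std::size_t x, std::size_t y, std::size_t width) {
    return x + width * y;
}

// Recovers the IDs that the work-item writing to `index` would have reported.
Address2D decodeIndex(std::size_t index, std::size_t width,
                      std::size_t groupWidth, std::size_t groupHeight) {
    assert(width > 0 && groupWidth > 0 && groupHeight > 0);
    Address2D a;
    a.globalX = index % width;
    a.globalY = index / width;
    a.localX = a.globalX % groupWidth;    // position inside the work-group
    a.localY = a.globalY % groupHeight;
    return a;
}
```

With the 1024 x 1024 range and 8 x 8 work-groups from the host code, the work-item at global position (13, 21) writes to index 13 + 1024 * 21 and reports the local position (5, 5).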

Data processing

// square
void square(){
    cl_kernel squareKernel = createKernel(program, "square");

    const int data_size = 1024;

    // Input: 0, 1, 2, ...
    float* inputData = (float*)malloc(sizeof(float) * data_size);
    for(int i = 0; i < data_size; ++i){
        inputData[i] = i;
    }
    cl_mem clInputData = clCreateBuffer(context, CL_MEM_READ_ONLY, sizeof(float) * data_size, NULL, NULL);
    CL_SAFE_CALL( clEnqueueWriteBuffer(commands, clInputData, CL_TRUE, 0, sizeof(float) * data_size, inputData, 0, NULL, NULL) );

    float* data = (float*)malloc(sizeof(float)*data_size);
    cl_mem clData = clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(float) * data_size, NULL, NULL);

    CL_SAFE_CALL( clSetKernelArg(squareKernel, 0, sizeof(cl_mem), &clInputData) );
    CL_SAFE_CALL( clSetKernelArg(squareKernel, 1, sizeof(cl_mem), &clData) );
    CL_SAFE_CALL( clSetKernelArg(squareKernel, 2, sizeof(int), &data_size) );

    size_t workgroupSize = 0;
    CL_SAFE_CALL( clGetKernelWorkGroupInfo(squareKernel, device_id, CL_KERNEL_WORK_GROUP_SIZE, sizeof(workgroupSize), &workgroupSize, NULL) );
    size_t workSize = data_size;

    CL_SAFE_CALL( clEnqueueNDRangeKernel(commands, squareKernel, 1, NULL, &workSize, &workgroupSize, 0, NULL, NULL) );
    clFinish(commands);

    CL_SAFE_CALL( clEnqueueReadBuffer(commands, clData, CL_TRUE, 0, sizeof(float) * data_size, data, 0, NULL, NULL) );

    // Validate the result on the CPU
    int wrong = 0;
    for(int i = 0; i < data_size; ++i){
        if(data[i] != inputData[i] * inputData[i]){
            wrong++;
        }
    }
    std::cout << "Wrong squares: " << wrong << std::endl;

    clReleaseKernel(squareKernel);
    clReleaseMemObject(clInputData);   // release both device buffers
    clReleaseMemObject(clData);
    free(data);
    free(inputData);
}

Data processing (programs.cl)

__kernel void square(__global float* inputData, __global float* outputData, const int data_size){
    int id = get_global_id(0);
    // Guard against work-items beyond the end of the data
    if(id < data_size){
        outputData[id] = inputData[id] * inputData[id];
    }
}
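The `id < data_size` guard in the square kernel becomes important once the data size is not a multiple of the work-group size. In OpenCL 1.x, when an explicit local size is passed, the global size has to be evenly divisible by it, so host code commonly rounds the global size up and lets the kernel skip the surplus work-items. The sketch below shows that arithmetic; `roundUp` and `paddedItems` are hypothetical helpers, not part of the lab code.

```cpp
#include <cassert>
#include <cstddef>

// Smallest multiple of `multiple` that is greater than or equal to `value`.
inline std::size_t roundUp(std::size_t value, std::size_t multiple) {
    assert(multiple > 0);
    const std::size_t remainder = value % multiple;
    return remainder == 0 ? value : value + (multiple - remainder);
}

// Number of surplus work-items that the `id < data_size` guard has to skip.
inline std::size_t paddedItems(std::size_t dataSize, std::size_t workgroupSize) {
    return roundUp(dataSize, workgroupSize) - dataSize;
}
```

In the lab the data size is 1024, which the usual power-of-two work-group sizes divide evenly, so no padding occurs. For 1000 elements and a work-group size of 256, the global size would be rounded up to 1024 and the guard would skip 24 work-items.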

2D function evaluation

// 2D function
void function2D(){
    cl_kernel function2DKernel = createKernel(program, "function2D");

    const int data_size[2] = {1024, 1024};
    cl_float4* data = (cl_float4*)malloc(sizeof(cl_float4) * data_size[0] * data_size[1]);
    cl_mem clData = clCreateBuffer(context, CL_MEM_WRITE_ONLY, sizeof(cl_float4) * data_size[0] * data_size[1], NULL, NULL);

    CL_SAFE_CALL( clSetKernelArg(function2DKernel, 0, sizeof(cl_mem), &clData) );

    size_t workSize[2] = { (size_t)data_size[0], (size_t)data_size[1] };
    // Local size is NULL: the implementation chooses the work-group size
    CL_SAFE_CALL( clEnqueueNDRangeKernel(commands, function2DKernel, 2, NULL, workSize, NULL, 0, NULL, NULL) );
    clFinish(commands);

    CL_SAFE_CALL( clEnqueueReadBuffer(commands, clData, CL_TRUE, 0, sizeof(cl_float4) * data_size[0] * data_size[1], data, 0, NULL, NULL) );

    FILE* outFile = fopen("function2D.txt", "w");
    for(int i = 0; i < data_size[0] * data_size[1]; ++i){
        // s[0..2] are the x, y, z components; the array form works with every compiler
        fprintf(outFile, "%f %f %f\n", data[i].s[0], data[i].s[1], data[i].s[2]);
    }
    fclose(outFile);

    clReleaseKernel(function2DKernel);
    clReleaseMemObject(clData);   // release the device buffer as well
    free(data);
}

2D function evaluation (programs.cl)

__kernel void function2D(__global float4* data){
    int2 id = (int2)(get_global_id(0), get_global_id(1));
    int2 globalSize = (int2)(get_global_size(0), get_global_size(1));
    // Map the grid position to [0, 6) x [0, 6); float literals avoid double arithmetic
    float2 point = (float2)(id.x / (float)globalSize.x * 6.0f, id.y / (float)globalSize.y * 6.0f);
    data[id.x + id.y * globalSize.x] = (float4)(id.x, id.y, sin(point.x) * cos(point.y), 0);
}
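To check the contents of function2D.txt, the value computed by one work-item can be reproduced on the CPU. The sketch below is such a reference; `function2DReference` is a name made up for this example. Device math functions may differ from the host's in the last bits, so a comparison against GPU output should use a small tolerance rather than exact equality.

```cpp
#include <cassert>
#include <cmath>

// CPU reference for one work-item of function2D: maps the grid position to
// [0, 6) x [0, 6) and evaluates sin(x) * cos(y), as the kernel does.
inline float function2DReference(int idX, int idY, int sizeX, int sizeY) {
    assert(sizeX > 0 && sizeY > 0);
    const float x = idX / static_cast<float>(sizeX) * 6.0f;
    const float y = idY / static_cast<float>(sizeY) * 6.0f;
    return std::sin(x) * std::cos(y);
}
```

For the 1024 x 1024 grid of the lab, the work-item at (512, 0) evaluates the function at the point (3, 0), so its third output component should be close to sin(3).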

2D function evaluation

GNUPlot: splot 'function2D.txt' every 1000 using 1:2:3 with dots