The previous posts on the NVIDIA Jetson Nano development board (see previous blogs 1, 2, 3) covered how to use the board and its CUDA parallel computing platform.
In this section, we take a practical approach to answering the question:
"Why exactly do we need to program with CUDA when it comes to parallel computing?"
We focus on the performance of parallel computing on the GPU with CUDA C and compare it with CPU computation in plain C. The purpose of this blog is to show the advantages of the GPU over the CPU for complex mathematical operations. Performance is measured by the time each program takes to execute.
The post is organized as follows:
- Differences between the CPU & GPU
- Writing code for integer addition in both C & CUDA C
- Writing code for multiplication of 200×200 matrices in both C & CUDA C
- Comparative analysis of the performance of C & CUDA C
- Conclusion
Comparison between CPU & GPU computation:
Code for "Addition of Integers" in C & CUDA C:
———————————————————————————————————————————————————————————————–
Compile and execute the C code using the commands below:
$ gcc filename.c -o filename
$ ./filename
———————————————————————————————————————————————————————————————–
———————————————————————————————————————————————————————————————–
Compile and execute the CUDA C code using the commands below:
$ nvcc filename.cu
$ ./a.out
———————————————————————————————————————————————————————————————–
Code for "Multiplication of 200×200 Matrix" using C & CUDA C:
———————————————————————————————————————————————————————————————–
———————————————————————————————————————————————————————————————–
Based on the comparison, the results are summarized in the table below:
Comparative analysis of code execution in C & CUDA C
Conclusion:
Each program was executed multiple times in order to observe the variation in execution time. For the integer addition example, we can conclude that the CPU executes this simple computation faster than the GPU (approximately 0.25 ms). For the multiplication of 200×200 matrices, the CUDA C version (i.e. GPU computing) executes in approximately 16 ms, which is almost 70 to 80 times faster than the CPU computation.
Thus GPU computation seems to have an advantage over CPU computation when it deals with complex signals involving complicated mathematical equations or matrices.
To purchase the NVIDIA Jetson Nano Developer Kit:
https://www.tenettech.com/product/nvidia-jetson-nano-developer-kit