To get the execution SM for a given kernel, it can invoke:
# PRUDA usage by example
In this section, we show how PRUDA can be used to run two periodic real-time tasks on the GPU: the first computes an array sum and the second an element-wise array multiplication.
First, to use PRUDA, you must include the following header files:
```c
#include"../inc/user.h"
#include"../inc/tools.h"
#define N 8
```

The user then defines the kernels themselves in the classical CUDA way:

```c
__global__ void add(int *a, int *b, int *c) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    while (tid < N) {
        c[tid] = a[tid] + b[tid];
        tid += blockDim.x;   // advance by the number of threads per block
    }
}

__global__ void mul(int *a, int *b, int *c, int h) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    while (tid < N) {
        c[tid] = a[tid] * b[tid];
        tid += blockDim.x;   // advance by the number of threads per block
    }
}
```
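For comparison only, in plain CUDA (without PRUDA) these kernels would be launched directly with an explicit execution configuration. The sketch below assumes device buffers `dev_a`, `dev_b`, `dev_c` and host arrays `a`, `b`, `c` allocated as in the main function shown next; under PRUDA, the launch is instead driven through the kernel list described further below, which carries the block and thread counts.

```c
// classical (non-PRUDA) launch of the add kernel, for comparison:
// copy inputs to the GPU, launch 1 block of N threads, copy the result back
cudaMemcpy(dev_a, a, N * sizeof(int), cudaMemcpyHostToDevice);
cudaMemcpy(dev_b, b, N * sizeof(int), cudaMemcpyHostToDevice);
add<<<1, N>>>(dev_a, dev_b, dev_c);
cudaDeviceSynchronize();
cudaMemcpy(c, dev_c, N * sizeof(int), cudaMemcpyDeviceToHost);
```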
Next, the user writes the main function, first allocating memory on both the CPU and the GPU by means of malloc, cudaMalloc, or cudaMallocManaged, as follows:
```c
int main() {
    // initializing pointers
    int *a, *b, *c;
    int *dev_a, *dev_b, *dev_c, ac;
    ...
    ...
    b = (int *)malloc(N * sizeof(int));
    c = (int *)malloc(N * sizeof(int));
    ... // init vars
    // allocating GPU memory
    cudaMalloc((void **)&dev_a, N * sizeof(int));
    cudaMalloc((void **)&dev_b, N * sizeof(int));
    cudaMalloc((void **)&dev_c, N * sizeof(int));
```
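As mentioned above, cudaMallocManaged is an alternative to the separate host and device allocations. A minimal sketch of the same setup with unified memory, assuming the same element count N; a single set of pointers is then visible from both the CPU and the GPU, so no explicit copies are needed:

```c
// alternative: unified memory, accessible from both CPU and GPU
int *a, *b, *c;
cudaMallocManaged((void **)&a, N * sizeof(int));
cudaMallocManaged((void **)&b, N * sizeof(int));
cudaMallocManaged((void **)&c, N * sizeof(int));
// ... initialize a and b directly from the host
```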
The user must now create the kernel list. A predefined list is provided in the file *user.cu*. The user calls **get_listing()** and then initializes each kernel with the kernel code, the number of blocks, the number of threads per block, and finally the kernel parameters, as follows:
```c
// initializing the list of kernels
init_kernel_listing();