CUDA(Ⅸ):CUDA Tools for Optimal Performance and Productivity

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/qq_24990189/article/details/89708838

Contents

1.Profiling tools:

CPU:

GPU:

2.CPU Timing Functions:

Use high-precision OS calls:

Be careful when using CPU-based timing calls to measure CUDA activity:

CUDA Timing Functions with CUDA Event API:

3.NVIDIA Visual Profiler

Overview:

Usage:


1.Profiling tools:

CPU:

Intel VTune Amplifier XE, GNU gprof

GPU:

NVIDIA Visual Profiler, NVIDIA Nsight (for CUDA code)

2.CPU Timing Functions:

Use high-precision OS calls:

gettimeofday() on Linux

QueryPerformanceCounter() on Windows
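As a rough illustration of the two OS calls above, the following sketch wraps them in one helper that returns wall-clock time in seconds. The names wall_time_seconds and busy_sum are hypothetical and do not come from the original post; busy_sum is only a stand-in CPU workload to have something to time.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/time.h>
#endif

// Hypothetical helper (not from the original post):
// returns the current wall-clock time in seconds.
static double wall_time_seconds() {
#ifdef _WIN32
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);   // counter ticks per second
    QueryPerformanceCounter(&count);    // current tick count
    return (double)count.QuadPart / (double)freq.QuadPart;
#else
    struct timeval tv;
    gettimeofday(&tv, nullptr);         // seconds + microseconds
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
#endif
}

// Hypothetical CPU-only workload used as the code under measurement.
static double busy_sum(int n) {
    volatile double acc = 0.0;          // volatile keeps the loop from being optimized away
    for (int i = 0; i < n; ++i) {
        acc = acc + i * 0.5;
    }
    return acc;
}
```

A typical use is to call wall_time_seconds() before and after the code of interest, for example around busy_sum(1000000), and print the difference. This only measures CPU-side work reliably; measuring GPU work this way needs the synchronization discussed next.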

Be careful when using CPU-based timing calls to measure CUDA activity:

CUDA activity is often asynchronous (kernel launches, asynchronous memory copies)

Proper synchronization is needed before the results are meaningful
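The caveat above can be sketched as follows: a kernel launch returns to the host immediately, so the CPU clock must not be read until cudaDeviceSynchronize() has returned. This is a minimal sketch assuming Linux (gettimeofday) and a CUDA-capable device; scale_kernel, wall_time_seconds and time_kernel_with_cpu_clock are hypothetical names introduced for illustration.

```cuda
#include <cstdio>
#include <sys/time.h>
#include <cuda_runtime.h>

// Hypothetical kernel used only for illustration: scales each element in place.
__global__ void scale_kernel(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: wall-clock time in seconds (Linux only).
static double wall_time_seconds() {
    struct timeval tv;
    gettimeofday(&tv, nullptr);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

// Times one kernel launch with the CPU clock.
// Returns elapsed seconds, or -1.0 if a CUDA call fails.
double time_kernel_with_cpu_clock(int n) {
    float *d_data = nullptr;
    if (cudaMalloc((void **)&d_data, n * sizeof(float)) != cudaSuccess) return -1.0;
    cudaMemset(d_data, 0, n * sizeof(float));
    cudaDeviceSynchronize();            // finish earlier GPU work before starting the clock

    int block = 256;
    int grid = (n + block - 1) / block;

    double t0 = wall_time_seconds();
    scale_kernel<<<grid, block>>>(d_data, n, 2.0f);  // asynchronous: returns right away
    cudaError_t err = cudaDeviceSynchronize();       // wait until the kernel has finished
    double t1 = wall_time_seconds();

    cudaFree(d_data);
    return (err == cudaSuccess) ? (t1 - t0) : -1.0;
}
```

Without the second cudaDeviceSynchronize(), t1 - t0 would mostly reflect the launch overhead rather than the kernel's execution time.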

CUDA Timing Functions with CUDA Event API:

cudaEvent_t start, stop;
float elapsed;  // elapsed time in milliseconds
cudaEventCreate(&start);
cudaEventCreate(&stop);
cudaEventRecord(start, 0);

fool_kernel<<<grid, block>>>();

cudaEventRecord(stop, 0);

// cudaEventSynchronize() is required since cudaEventRecord() is asynchronous
cudaEventSynchronize(stop);
cudaEventElapsedTime(&elapsed, start, stop);
printf("Elapsed time %f (seconds)\n", elapsed / 1000);

// Release the events once the measurement is done
cudaEventDestroy(start);
cudaEventDestroy(stop);

3.NVIDIA Visual Profiler

Overview:

Unified CPU and GPU Timeline

Guided Performance Analysis

Available for Linux, Mac OS X, and Windows

NVIDIA Nsight provides similar performance information

Usage:

wlsh@wlsh-ThinkStation:~$ nvvp

 

 
